[SERVER-52523] implement in-memory bucket catalog to support time-series collections Created: 31/Oct/20  Updated: 29/Oct/23  Resolved: 25/Nov/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Task Priority: Major - P3
Reporter: Benety Goh Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-52522 transform inserts in a time-series co... Closed
is related to SERVER-53072 Expire entries in the bucket catalog ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2020-11-30
Participants:
Linked BF Score: 36

 Description   

The bucket catalog serves two main purposes:

  • Allow efficient discovery of buckets that are not full for a given metadata value and time.
  • Synchronize and batch concurrent updates to the same bucket.

Proposed implementation details:

  • The global bucket catalog has an in-memory thread-safe ordered map indexed by a tuple <nss, metadata, _id>. For each bucket it contains:
    • A vector of measurements to be inserted.
    • The data size of the bucket, i.e. the total BSON size of the data object in the bucket's BSON serialization, including measurements to be inserted.
    • The total number of measurements in the bucket, including uncommitted measurements and measurements to be inserted.
    • The number of committed measurements in the bucket.
    • The number of current writers.
    • A set containing all new top level field names of the measurements to be inserted.
    • The set of top level field names of the measurements that have been inserted into the bucket.
    • The most recent commit info (timestamp, cluster time, etc.) required for the return value of the update.
  • The catalog also has an "idle bucket" queue with references to all buckets that do not have writers. This queue allows entries in the bucket catalog to be expired once their total size exceeds some (large) threshold. On step-down this queue is flushed, so that the bucket catalog is empty.
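
The proposal above could be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code that was committed for this ticket: Namespace, Metadata, BucketId and Measurement are hypothetical stand-ins for the server's own types, all method names are invented, a single mutex stands in for whatever synchronization the real catalog uses, the commit info is omitted, and the expiry threshold is assumed to apply to the total data size of all buckets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <mutex>
#include <optional>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for the server's namespace, BSON and ObjectId types.
using Namespace = std::string;
using Metadata = std::string;
using BucketId = std::uint64_t;
using Measurement = std::string;
using BucketKey = std::tuple<Namespace, Metadata, BucketId>;

struct Bucket {
    std::vector<Measurement> pending;   // measurements to be inserted
    std::size_t dataSize = 0;           // data size, including pending measurements
    std::uint32_t numMeasurements = 0;  // all measurements, including pending ones
    std::uint32_t numCommitted = 0;     // committed measurements only
    std::uint32_t numWriters = 0;       // current writers
    std::set<std::string> newFields;    // new top-level fields in pending measurements
    std::set<std::string> fields;       // top-level fields already inserted
};

class BucketCatalog {
public:
    // Finds a bucket for (ns, meta) with room left; relies on the map's key ordering.
    std::optional<BucketKey> findOpen(const Namespace& ns, const Metadata& meta,
                                      std::uint32_t maxMeasurements) const {
        std::lock_guard<std::mutex> lk(_mutex);
        for (auto it = _buckets.lower_bound(BucketKey{ns, meta, BucketId{0}});
             it != _buckets.end() && std::get<0>(it->first) == ns &&
             std::get<1>(it->first) == meta; ++it) {
            if (it->second.numMeasurements < maxMeasurements) return it->first;
        }
        return std::nullopt;
    }

    // Registers a writer and queues one measurement, creating the bucket if needed.
    void insert(const BucketKey& key, const Measurement& m,
                const std::set<std::string>& fieldNames) {
        std::lock_guard<std::mutex> lk(_mutex);
        Bucket& b = _buckets[key];
        if (b.numWriters++ == 0) _idle.remove(key);  // bucket is no longer idle
        b.pending.push_back(m);
        b.dataSize += m.size();
        _totalSize += m.size();
        ++b.numMeasurements;
        for (const auto& f : fieldNames)
            if (b.fields.count(f) == 0) b.newFields.insert(f);
    }

    // Commits everything pending in the bucket as one batch and releases one writer.
    void commit(const BucketKey& key) {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _buckets.find(key);
        assert(it != _buckets.end() && it->second.numWriters > 0);
        Bucket& b = it->second;
        b.numCommitted += static_cast<std::uint32_t>(b.pending.size());
        b.pending.clear();
        b.fields.insert(b.newFields.begin(), b.newFields.end());
        b.newFields.clear();
        if (--b.numWriters == 0) _idle.push_back(key);  // bucket became idle
    }

    // Drops idle buckets, oldest first, while the catalog exceeds maxTotalSize.
    void expireIdle(std::size_t maxTotalSize) {
        std::lock_guard<std::mutex> lk(_mutex);
        while (_totalSize > maxTotalSize && !_idle.empty()) eraseFrontIdle();
    }

    // Step-down: flush the idle queue (empties the catalog if no writers remain).
    void clear() {
        std::lock_guard<std::mutex> lk(_mutex);
        while (!_idle.empty()) eraseFrontIdle();
    }

    std::size_t size() const { std::lock_guard<std::mutex> lk(_mutex); return _buckets.size(); }

private:
    void eraseFrontIdle() {
        auto it = _buckets.find(_idle.front());
        _totalSize -= it->second.dataSize;
        _buckets.erase(it);
        _idle.pop_front();
    }

    mutable std::mutex _mutex;
    std::map<BucketKey, Bucket> _buckets;  // ordered by <nss, metadata, _id>
    std::list<BucketKey> _idle;            // buckets with no writers
    std::size_t _totalSize = 0;            // sum of dataSize over all buckets
};
```

In this sketch the ordered key is what makes discovery cheap: all buckets for one <nss, metadata> pair are adjacent in the map, so findOpen only scans that range. A writer would call insert, perform the actual bucket update, and then call commit, at which point a bucket with no remaining writers moves to the idle queue.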


 Comments   
Comment by Githook User [ 28/Nov/20 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-52523 fix mac os visibility builder
Branch: master
https://github.com/mongodb/mongo/commit/90c0b86cc3d75d66eee164ab786a5126e03233b9

Comment by Githook User [ 25/Nov/20 ]

Author: Gregory Noma (gregorynoma) <gregory.noma@gmail.com>

Message: SERVER-52523 Implement in-memory time-series bucket catalog
Branch: master
https://github.com/mongodb/mongo/commit/82aea1d428e3a06994d6624464b74b92e47eae2d
