[SERVER-81962] Sort unordered bulk inserts by time field for time-series Created: 06/Oct/23  Updated: 14/Nov/23  Resolved: 14/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Gregory Wlodarek
Assignee: Gregory Wlodarek
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-79480 Insert measurements into the best-fit... Closed
Assigned Teams:
Storage Execution
Sprint: Execution Team 2023-11-27
Participants:

 Description   

Compressed time-series buckets are sorted by time. The BSONColumnBuilders are append-only, so inserting measurements that are already sorted by the time field should be more performant.



 Comments   
Comment by Gregory Wlodarek [ 14/Nov/23 ]

After some investigation, we determined that this isn't needed. The idea was to sort unordered bulk inserts by the time field, because once we start maintaining the compression state in individual buckets, sortedness matters at insert time. Within a WriteBatch, however, we don't need to sort measurements until we're ready to commit. As long as each measurement going into a bucket has a time field greater than or equal to that of the last committed measurement in the bucket, we can add it to WriteBatch::BatchMeasurements and sort the batch when it is finalized, at which point we append the measurements to the BSONColumnBuilders.
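
For illustration only, the deferred-sort idea can be sketched in Python. This is a toy model, not the server's C++ code: the `Bucket` and `WriteBatch` classes, their methods, and the integer `time` values are all hypothetical stand-ins, and a plain append-only list takes the place of the BSONColumnBuilders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Bucket:
    """Toy bucket; an append-only list stands in for the column builders."""
    committed_times: list = field(default_factory=list)

    @property
    def last_committed_time(self) -> Optional[int]:
        # Time of the most recently committed measurement, if any.
        return self.committed_times[-1] if self.committed_times else None


@dataclass
class WriteBatch:
    """Toy write batch that collects measurements unsorted (hypothetical)."""
    bucket: Bucket
    measurements: list = field(default_factory=list)

    def add(self, measurement: dict) -> bool:
        # Accept only measurements at or after the last committed time,
        # so that sorting the batch alone keeps the bucket ordered.
        last = self.bucket.last_committed_time
        if last is not None and measurement["time"] < last:
            return False  # the caller would need to place it elsewhere
        self.measurements.append(measurement)
        return True

    def finalize(self) -> None:
        # Sort once at commit time, then append in time order.
        for m in sorted(self.measurements, key=lambda m: m["time"]):
            self.bucket.committed_times.append(m["time"])
        self.measurements.clear()
```

In this sketch, measurements can arrive in any order within a batch. Only the comparison against the last committed time is checked at insert time, and the single sort happens in `finalize`.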
