[SERVER-65481] Add efficient bulk builder for column indexes Created: 12/Apr/22  Updated: 29/Oct/23  Resolved: 08/Jul/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Task Priority: Major - P3
Reporter: Ian Boros Assignee: Justin Seyster
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Problem/Incident
Related
is related to SERVER-67979 Less strict check for 'numSpills' in ... Closed
Backwards Compatibility: Fully Compatible
Sprint: QE 2022-05-16, QE 2022-05-30, QE 2022-06-13, QE 2022-06-27, QE 2022-07-11, QE 2022-07-25
Participants:
Linked BF Score: 151

 Description   

When bulk building column indexes, we can take advantage of the fact that a collection scan produces results in RecordId (RID) order. (We should confirm this with the storage exec team.)

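As we walk the collection, we maintain a table that maps each path to a list of cells ordered by RID (path -> [list of cells ordered by RID]). For each document we read from the collection, we produce all of its cells and append each one to the list for its path. Because the scan visits documents in RID order, appending keeps every list sorted by RID.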

At the end, we sort the table by path (its key) and then insert each list into the index in that order. This avoids a blocking sort over the entire set of keys: only the paths need to be sorted.
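
The approach described above can be sketched as follows. This is a minimal illustration and not the server's implementation: it assumes documents are nested dicts without arrays, reduces a cell to a simple (RID, value) pair, and uses the hypothetical names `shred` and `bulk_build`. It also checks the RID-order assumption explicitly, since the whole scheme depends on it.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Simplified stand-in for a column index cell: (RID, value).
Cell = Tuple[int, Any]


def shred(doc: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted path, value) for every scalar leaf of a nested document."""
    for field, value in doc.items():
        path = f"{prefix}.{field}" if prefix else field
        if isinstance(value, dict):
            yield from shred(value, path)
        else:
            yield path, value


def bulk_build(scan: Iterable[Tuple[int, Dict[str, Any]]]) -> List[Tuple[str, int, Any]]:
    """Return index entries in (path, RID) order without sorting individual keys."""
    cells_by_path: Dict[str, List[Cell]] = defaultdict(list)
    last_rid = None
    for rid, doc in scan:
        # The scheme relies on the scan returning documents in increasing RID order.
        if last_rid is not None and rid <= last_rid:
            raise ValueError("collection scan is not in RID order")
        last_rid = rid
        for path, value in shred(doc):
            # Appending keeps each per-path list ordered by RID.
            cells_by_path[path].append((rid, value))

    # Only the distinct paths are sorted; the cells are already in RID order.
    entries: List[Tuple[str, int, Any]] = []
    for path in sorted(cells_by_path):
        for rid, value in cells_by_path[path]:
            entries.append((path, rid, value))
    return entries
```

The sort cost in this sketch depends on the number of distinct paths rather than the number of cells, which is the saving the description points at. The in-memory table still grows with the collection, so a real builder would need some way to bound or spill it; that is outside the scope of this sketch.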



 Comments   
Comment by Githook User [ 08/Jul/22 ]

Author:

{'name': 'Justin Seyster', 'email': 'justin.seyster@mongodb.com', 'username': 'jseyster'}

Message: SERVER-65481 Bulk shredding and loading for column store indexes
Branch: master
https://github.com/mongodb/mongo/commit/7e41399b007ac630ac3b0f0764c8b628be9f1be7

Comment by Githook User [ 30/Jun/22 ]

Author:

{'name': 'Sviatlana Zuiko', 'email': 'sviatlana.zuiko@mongodb.com', 'username': 'szuiko'}

Message: Revert "SERVER-65481 Bulk shredding and loading for column store indexes"

This reverts commit cb9472afc30d32d1c18691d64899c1aa72cdc43d.
Branch: master
https://github.com/mongodb/mongo/commit/ff5ce6771bd53616ed644ee794ba69c2fe6d91c3

Comment by Githook User [ 29/Jun/22 ]

Author:

{'name': 'Justin Seyster', 'email': 'justin.seyster@mongodb.com', 'username': 'jseyster'}

Message: SERVER-65481 Bulk shredding and loading for column store indexes
Branch: master
https://github.com/mongodb/mongo/commit/cb9472afc30d32d1c18691d64899c1aa72cdc43d
