[SERVER-33418] Have index build completion also commit multikey information Created: 21/Feb/18  Updated: 29/Oct/23  Resolved: 21/Feb/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.7.3

Type: Bug Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: rollback-non-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-33503 Timestamp non-bulk multikey index com... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2018-02-26
Participants:
Linked BF Score: 0

 Description   

Index builds on an existing collection are done primarily in two stages (corresponding to stages 2 and 3 in the log output). The first stage scans the collection, calculates each document's index keys and inserts the results into the bulk builder. The bulk builder sorts all the keys and remembers multikey information. The second stage scans through the bulk builder's output (which is in sorted order) and inserts it into the storage engine's index builder. The index build is broken down this way because storage engines may not be optimized for bulk loading randomly ordered data, but are optimized for bulk loads where elements are inserted in increasing order.
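
The two stages can be sketched roughly as follows. This is a toy Python model, not the server's C++ code: `index_keys`, `bulk_build`, the list-based "index", and the rule that any array value marks the index as multikey are all assumptions made for illustration.

```python
def index_keys(doc, field):
    """Return (keys, saw_array) for one document.

    Assumption: an array value contributes one key per element and
    marks the index as multikey; any other value contributes one key.
    """
    value = doc[field]
    if isinstance(value, list):
        return list(value), True
    return [value], False


def bulk_build(docs, field):
    """Toy two-stage build over an in-memory 'collection'."""
    # Stage one: scan the collection, compute keys, and hand them to the
    # bulk builder, which remembers whether any document was multikey.
    staged = []
    multikey = False
    for doc in docs:
        keys, saw_array = index_keys(doc, field)
        multikey = multikey or saw_array
        staged.extend((key, doc["_id"]) for key in keys)
    staged.sort()  # the bulk builder's output is in sorted order

    # Stage two: feed the sorted output to the storage engine's index
    # builder, which only ever sees entries in increasing order.
    index = []
    for entry in staged:
        if index and entry < index[-1]:
            raise ValueError("bulk load requires increasing order")
        index.append(entry)
    return index, multikey


docs = [{"_id": 1, "a": 5}, {"_id": 2, "a": [3, 9]}, {"_id": 3, "a": 1}]
index, multikey = bulk_build(docs, "a")
print(index, multikey)
```

Note that `multikey` is known as soon as stage one finishes, which is why the original code wrote it to the catalog at that point.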

When building an index on an existing collection, the multikey information is set between stages one and two. This write updates the catalog document for the collection but is not replicated, so it must be explicitly assigned a timestamp. However, reading the logical clock to choose that timestamp is error-prone when the system is processing many updates and the stable timestamp is advancing quickly.

Happily, the multikey update does not need to happen at that point. It can be delayed until the index build completes, which on a primary is a replicated write. A timestamp assigned as part of a replicated operation does not carry the risk of being stale.
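
The difference between the two placements can be illustrated with a toy model. Everything here is hypothetical: `ToyStorage`, its methods, and the rules that a write at or before the stable timestamp is rejected and that the stable timestamp cannot pass an in-flight replicated write are simplifying assumptions, not the server's or the storage engine's actual API.

```python
class ToyStorage:
    """Hypothetical model of a logical clock and a stable timestamp."""

    def __init__(self):
        self.clock = 0
        self.stable = 0
        self.open_slots = set()   # timestamps of in-flight replicated writes
        self.catalog_writes = []  # (timestamp, note)

    def tick(self):
        self.clock += 1
        return self.clock

    def reserve_slot(self):
        # A replicated write gets its timestamp up front.
        ts = self.tick()
        self.open_slots.add(ts)
        return ts

    def release_slot(self, ts):
        self.open_slots.discard(ts)

    def advance_stable(self, target):
        # Assumption: stable stays strictly behind in-flight replicated writes.
        limit = min(self.open_slots) - 1 if self.open_slots else self.clock
        self.stable = max(self.stable, min(target, limit))

    def write_catalog(self, note, ts):
        if ts <= self.stable:
            raise ValueError(f"timestamp {ts} is stale (stable={self.stable})")
        self.catalog_writes.append((ts, note))


def busy_system(storage):
    # Other updates arrive and the stable timestamp moves forward.
    storage.tick()
    storage.tick()
    storage.advance_stable(storage.clock)


def set_multikey_from_clock(storage):
    # Old placement: read the clock, then perform an unreplicated write.
    reading = storage.tick()
    busy_system(storage)
    storage.write_catalog("multikey", reading)


def set_multikey_at_commit(storage):
    # New placement: reuse the timestamp of the replicated commit.
    ts = storage.reserve_slot()
    busy_system(storage)
    storage.write_catalog("multikey", ts)
    storage.write_catalog("index ready", ts)
    storage.release_slot(ts)
    return ts
```

In this model, `set_multikey_from_clock(ToyStorage())` raises `ValueError` because the clock reading is overtaken by the stable timestamp, while `set_multikey_at_commit(ToyStorage())` succeeds because the reserved timestamp holds the stable timestamp back until the commit finishes.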



 Comments   
Comment by Githook User [ 21/Feb/18 ]

Author: Daniel Gottlieb <daniel.gottlieb@mongodb.com> (dgottlieb)

Message: SERVER-33418: Set multikey at index commit time.
Branch: master
https://github.com/mongodb/mongo/commit/cf546a4ca0e96fb3bf68d44115fcde9a274ca450

Generated at Thu Feb 08 04:33:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.