[SERVER-34172] Turn primary index build ghost writes into noop oplog writes. Created: 28/Mar/18  Updated: 29/Oct/23  Resolved: 31/Mar/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Bug Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: rollback-non-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-48465 restore noop write msg format for sin... Closed
related to SERVER-42799 obtain timestamp for cleaning up inde... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2018-04-09
Participants:
Linked BF Score: 50

 Description   

An index build performs two metadata writes: one when it starts and one when it completes (or fails). On primaries, a successful completion is naturally timestamped because it happens in the same transaction that writes an oplog entry. On secondaries, the start of an index build is naturally timestamped because it happens while processing an oplog entry.

On primaries, starting and failing an index build also need to be timestamped. This is currently done by reading the logical clock and using that value as the timestamp. This is racy: the stable timestamp may advance past that value after the logical clock is read but before the timestamp is set on the index metadata write.
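
To make the window concrete, here is a toy model of the race in Python. It is only a sketch: `ToyNode`, `replicate_other_writes` and `write_is_safe` are hypothetical names, not server code, and the rule that a write is only safe when its timestamp is strictly newer than the stable timestamp is a simplifying assumption of the model.

```python
class ToyNode:
    """Hypothetical stand-in for a primary: a logical clock and a stable timestamp."""

    def __init__(self, clock=10, stable=5):
        self.clock = clock
        self.stable = stable

    def read_logical_clock(self):
        return self.clock

    def replicate_other_writes(self, up_to):
        # Other operations commit, so both the clock and the stable
        # timestamp move forward (never backward).
        self.clock = max(self.clock, up_to)
        self.stable = max(self.stable, up_to)

    def write_is_safe(self, ts):
        # Model assumption: a write must be timestamped after the stable timestamp.
        return ts > self.stable


def racy_metadata_write(node, interleaved=None):
    ts = node.read_logical_clock()   # step 1: pick a timestamp
    if interleaved is not None:
        interleaved(node)            # the window in which the stable timestamp can move
    return ts, node.write_is_safe(ts)  # step 2: apply the timestamp to the write
```

With nothing interleaved, the write at timestamp 10 is safe. If other writes advance the stable timestamp to 12 inside the window, the same write lands behind it.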

Instead, if the write goes through the oplog as a no-op entry, it is timestamped without the possibility of a race, because the logical clock is read and the timestamp is set under a mutex. (Why this prevents the stable timestamp from racing ahead is beyond what I'd like to explain here, but I'm happy to explain it in person.)
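
A minimal Python sketch of the "read the clock and set the timestamp under a mutex" idea follows. `ToyOplog` and `log_noop` are hypothetical names and the entry layout is illustrative only. The sketch does not model the stable timestamp, since the ticket leaves that part unexplained.

```python
import threading


class ToyOplog:
    """Hypothetical oplog: timestamp assignment and append happen under one mutex."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._clock = 0
        self.entries = []

    def log_noop(self, msg):
        with self._mutex:
            # Reading the clock and stamping the entry are a single critical
            # section, so no other writer can slip in between the two steps.
            self._clock += 1
            ts = self._clock
            self.entries.append({"ts": ts, "op": "n", "o": {"msg": msg}})
        return ts


def run_writers(oplog, writers=8, per_writer=50):
    """Hammer the toy oplog from several threads at once."""
    def work(i):
        for j in range(per_writer):
            oplog.log_noop(f"index build event {i}-{j}")

    threads = [threading.Thread(target=work, args=(i,)) for i in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

However the threads interleave, the entries come out in strictly increasing timestamp order with no gaps or duplicates, which is the property the racy read-then-write sequence cannot guarantee.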

Note that it is legal for secondaries to read the logical clock and use that value to timestamp the metadata update on index completion. Either the index build is in the foreground and no other operations are being processed, or it is in the background and acquiring a lock to perform the write prevents replication from processing batches (via the Parallel Batch Writer Mode lock, a.k.a. the "peanut butter" lock).
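
The background-build case on secondaries can be sketched the same way. This toy Python model uses a single plain mutex in place of the batch lock, which is a simplification, and `ToySecondary` and its methods are hypothetical names rather than server code.

```python
import threading


class ToySecondary:
    """Hypothetical secondary: batch application and index completion share one lock."""

    def __init__(self, clock=10):
        self.batch_lock = threading.Lock()  # stand-in for the batch lock
        self.clock = clock

    def try_apply_batch(self, new_time):
        # Replication can only apply a batch (and advance the clock)
        # while it holds the lock.
        if not self.batch_lock.acquire(blocking=False):
            return False
        try:
            self.clock = max(self.clock, new_time)
            return True
        finally:
            self.batch_lock.release()

    def timestamp_index_completion(self, interleaved=None):
        with self.batch_lock:
            ts = self.clock              # read the logical clock
            if interleaved is not None:
                interleaved(self)        # anything attempted in the window
            # Report the chosen timestamp and whether the clock stayed put.
            return ts, self.clock == ts
```

While the index build holds the lock, an attempted batch application is refused and the clock value that was read is still current when the write is timestamped. Once the lock is released, batches apply normally.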



 Comments   
Comment by Githook User [ 31/Mar/18 ]

Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@mongodb.com>

Message: SERVER-34172: Use noop writes to timestamp index completion/failure on primaries.
Branch: master
https://github.com/mongodb/mongo/commit/48c552d8b037fe3eb821290c3f924d413e44e7db
