[SERVER-33806] Oldest timestamp can move ahead of the commit point. Created: 12/Mar/18  Updated: 29/Oct/23  Resolved: 13/Mar/18

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 3.7.3

Type: Bug Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: rollback-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-47844 Update _setStableTimestampForStorage ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl 2018-03-26
Participants:
Linked BF Score: 0

 Description   

The `oldest timestamp` is the earliest time for which the storage engine maintains history; it can service any read with read_timestamp >= oldest_timestamp. The `commit point`/`committed snapshot` in replication is the timestamp that a majority of voting nodes have durably replicated. To service majority reads (reads of data that cannot be rolled back), replication advances the commit point and then updates the `stable_timestamp`. A subtle detail is that updating the stable timestamp also internally updates the oldest timestamp to the same value.
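The relationship between the two timestamps can be sketched with a toy model. This is an illustration only, not the storage engine's or the server's API: the `ToyStorageEngine` class and its method names are hypothetical, and timestamps are plain integers.

```python
class ToyStorageEngine:
    """Hypothetical stand-in for a storage engine that keeps versioned history."""

    def __init__(self):
        self.oldest_timestamp = 0
        self.stable_timestamp = 0

    def set_stable_timestamp(self, ts):
        # The subtle detail from the description: advancing the stable
        # timestamp also advances the oldest timestamp to the same value.
        self.stable_timestamp = ts
        self.oldest_timestamp = ts

    def can_read_at(self, read_timestamp):
        # History is only kept back to oldest_timestamp.
        return read_timestamp >= self.oldest_timestamp


engine = ToyStorageEngine()
engine.set_stable_timestamp(10)
print(engine.can_read_at(10))  # True
print(engine.can_read_at(9))   # False
```

In this model, any read at a timestamp below the last stable timestamp becomes unserviceable as soon as the stable timestamp is set.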

However, there are conditions where ReplicationCoordinatorImpl::updateCommittedSnapshot_inLock does not, in fact, move the commit point forward. This inaction is not captured in the return value, and the calling function unconditionally goes on to set the stable timestamp. This leaves the server in a state where a majority read would fail, because the server is no longer keeping enough history to satisfy a read at the commit point.
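The failure mode can be reproduced in a small toy model. Everything here is a hypothetical sketch rather than the server's code: `BuggyCoordinator`, `reject_updates`, and the other names are invented for illustration, and `reject_updates` simply stands in for whatever condition prevents the commit point from advancing.

```python
class ToyEngine:
    """Hypothetical engine: setting stable also moves oldest."""

    def __init__(self):
        self.oldest_timestamp = 0

    def set_stable_timestamp(self, ts):
        self.oldest_timestamp = ts


class BuggyCoordinator:
    """Hypothetical model of the behavior described in this ticket."""

    def __init__(self, engine):
        self.engine = engine
        self.commit_point = 0
        self.reject_updates = False  # assumption: models any "do not advance" condition

    def update_committed_snapshot(self, ts):
        # Returns nothing, so the caller cannot tell whether anything changed.
        if self.reject_updates:
            return
        self.commit_point = ts

    def on_new_commit_point(self, ts):
        self.update_committed_snapshot(ts)
        # Unconditional: runs even when the commit point did not move.
        self.engine.set_stable_timestamp(ts)

    def majority_read_possible(self):
        # A majority read happens at the commit point, so history must reach it.
        return self.commit_point >= self.engine.oldest_timestamp


coord = BuggyCoordinator(ToyEngine())
coord.on_new_commit_point(5)
print(coord.majority_read_possible())  # True

coord.reject_updates = True
coord.on_new_commit_point(8)
print(coord.commit_point, coord.engine.oldest_timestamp)  # 5 8
print(coord.majority_read_possible())  # False
```

After the second call the commit point is still 5 while the oldest timestamp is 8, which is the state in which a majority read cannot be satisfied.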

Notably, the `disableSnapshotting` failpoint can cause a consumer test, read_committed_on_secondary.js, to fail.

It's unclear whether the contract of `setStableTimestamp` should explicitly state that the value may not be set ahead of the commit point, or whether the storage engine should instead expose to steady-state replication a separate way to advance the oldest timestamp, in which case this relationship would have to be enforced there.



 Comments   
Comment by Githook User [ 13/Mar/18 ]

Author: Daniel Gottlieb (dgottlieb) <daniel.gottlieb@mongodb.com>

Message: SERVER-33806: Only update the stable/oldest timestamps when replication accepts its new commit point.
Branch: master
https://github.com/mongodb/mongo/commit/c3bf7f3d621cbe2824db460bae9c80b75c4d7870
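The idea in the commit message, updating the stable/oldest timestamps only when replication accepts the new commit point, can be sketched as follows. This is not the code from the linked commit: it is a hypothetical toy model in which `FixedCoordinator`, `reject_updates`, and the boolean return value are assumptions made for illustration.

```python
class ToyEngine:
    """Hypothetical engine: setting stable also moves oldest."""

    def __init__(self):
        self.oldest_timestamp = 0

    def set_stable_timestamp(self, ts):
        self.oldest_timestamp = ts


class FixedCoordinator:
    """Hypothetical model: timestamps move only with an accepted commit point."""

    def __init__(self, engine):
        self.engine = engine
        self.commit_point = 0
        self.reject_updates = False  # assumption: models any "do not advance" condition

    def update_committed_snapshot(self, ts):
        # Report whether the commit point actually advanced.
        if self.reject_updates or ts <= self.commit_point:
            return False
        self.commit_point = ts
        return True

    def on_new_commit_point(self, ts):
        # Only touch the stable (and therefore oldest) timestamp on success.
        if self.update_committed_snapshot(ts):
            self.engine.set_stable_timestamp(ts)

    def majority_read_possible(self):
        return self.commit_point >= self.engine.oldest_timestamp


coord = FixedCoordinator(ToyEngine())
coord.on_new_commit_point(5)
coord.reject_updates = True
coord.on_new_commit_point(8)
print(coord.commit_point, coord.engine.oldest_timestamp)  # 5 5
print(coord.majority_read_possible())  # True
```

With the guard in place, a rejected update leaves both timestamps at 5, so a read at the commit point remains serviceable.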

Comment by Eric Milkie [ 12/Mar/18 ]

I think the stable timestamp must never be set ahead of the commit point. Therefore, we should fix that logic first, and that will in turn fix the oldest_timestamp logic as well.

Generated at Thu Feb 08 04:34:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.