[SERVER-36101] Replication should not depend on the presence of lastStableCheckpointTimestamp in status reports to identify recoverable rollback capable nodes Created: 12/Jul/18  Updated: 29/Oct/23  Resolved: 26/Jul/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.1.2

Type: Task Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-11924 Docs for SERVER-36101: Replication sh... Closed
Gantt Dependency
has to be done before SERVER-36194 Remove the deprecated 'lastStableChec... Closed
has to be done after SERVER-35805 Maintain checkpoint thread timestamp ... Closed
Related
related to SERVER-33165 Don't return from ReplSetTest.initiat... Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage NYC 2018-07-30
Participants:

 Description   

This might require surfacing some other indicator from the storage engine up into replication, as well as replication test infrastructure changes.



 Comments   
Comment by Githook User [ 25/Jul/18 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@10gen.com', 'username': 'DiannaHohensee'}

Message: SERVER-36101 Replication should not depend on the presence of lastStableCheckpointTimestamp in status reports to identify recoverable rollback capable nodes
Branch: master
https://github.com/mongodb/mongo/commit/e7c2cbf88bc07549d634613049358214dbbaac4b

Comment by Dianna Hohensee (Inactive) [ 17/Jul/18 ]

New approach: deprecate exposure of lastStableCheckpointTimestamp and add a replacement field, lastStableRecoveryTimestamp, to provide a generic interface that storage engines can choose how to implement.
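A minimal sketch of what this field transition could look like from a client's point of view, assuming a mocked replSetGetStatus response; the helper name and fallback behavior are illustrative, not the actual server patch:

```javascript
// Hypothetical sketch: a mocked replSetGetStatus response carrying both the
// deprecated WiredTiger-specific field and its generic replacement.
const mockStatus = {
  ok: 1,
  // Deprecated: checkpoint-specific name, retained temporarily for compatibility.
  lastStableCheckpointTimestamp: { t: 1532000000, i: 1 },
  // Replacement: generic name each storage engine can implement as it chooses.
  lastStableRecoveryTimestamp: { t: 1532000000, i: 1 },
};

// Illustrative helper: prefer the new field, fall back to the deprecated one
// when talking to an older node, and return null if neither is present.
function stableRecoveryTimestamp(status) {
  return status.lastStableRecoveryTimestamp ||
         status.lastStableCheckpointTimestamp ||
         null;
}

console.log(stableRecoveryTimestamp(mockStatus)); // -> { t: 1532000000, i: 1 }
```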

Comment by Dianna Hohensee (Inactive) [ 12/Jul/18 ]

It looks like replSetGetStatus already returns stableTimestamp in the guise of res.optimes.readConcernMajorityOpTime.ts here, so I could just add a check for a non-zero value of that field to this check. Then the shell helper function will act accordingly and be satisfied.

For reference, because I was curious, the requires_persistence tag prevents inMemory from running with tests that restart server nodes, so awaitLastStableCheckpointTimestamp() just needs to await a stable timestamp in the inMemory scenario.

Then periodic_kill_secondaries.py and replicaset.py need to be updated. replicaset.py would change similarly to the above. periodic_kill_secondaries.py needs the replSetTest command to return a stable timestamp.
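The check described above can be sketched as follows; this is a hypothetical mock of the shell-helper logic, not the real test infrastructure, and the function name and status shapes are illustrative assumptions:

```javascript
// Hypothetical sketch: for an inMemory node there is no checkpoint timestamp
// to wait for, so treat a non-zero res.optimes.readConcernMajorityOpTime.ts
// (the stable timestamp) as "node is ready".
function hasStableTimestamp(res) {
  const ts = res.optimes &&
             res.optimes.readConcernMajorityOpTime &&
             res.optimes.readConcernMajorityOpTime.ts;
  // A {t: 0, i: 0} timestamp means the stable timestamp has not been set yet.
  return Boolean(ts) && !(ts.t === 0 && ts.i === 0);
}

// Mocked replSetGetStatus fragments for illustration.
const notReady = { optimes: { readConcernMajorityOpTime: { ts: { t: 0, i: 0 } } } };
const ready    = { optimes: { readConcernMajorityOpTime: { ts: { t: 1531411200, i: 4 } } } };

console.log(hasStableTimestamp(notReady), hasStableTimestamp(ready)); // -> false true
```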

Comment by Judah Schvimer [ 12/Jul/18 ]

It might be okay to just surface the 'stableTimestamp' as well as the 'lastStableCheckpointTimestamp', and non-durable storage engines can wait on the former rather than the latter.

Generated at Thu Feb 08 04:42:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.