Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.2.2, 4.3.3
Affects Version/s: None
Component/s: Replication, Storage
Labels: None
Backwards Compatibility: Fully Compatible
Backport Requested: v4.2
Sprint: Repl 2019-12-02, Repl 2019-12-16
Case:
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Retrying recoverFromOplogAsStandalone with takeUnstableCheckpointOnShutdown after the parameter combination has already run successfully should do nothing and let you shut the node down cleanly. Shutdown will take another unstable checkpoint, but that should effectively be a no-op, since no work has been done and the node already has an up-to-date unstable checkpoint. The proposal is therefore:

If a node starts up with both recoverFromOplogAsStandalone and takeUnstableCheckpointOnShutdown, and the storage engine does not have a stable checkpoint, we will check whether all of the replication metadata indicates that the data files contain a fully up-to-date unstable checkpoint. If so, we will go into read-only mode without doing any replication recovery (since none should be needed) and allow automation to shut down the node at its leisure, as normal. Otherwise, if the replication metadata doesn't indicate that the unstable checkpoint is a safe one requiring no replication recovery, we will fassert as we do today. This makes the parameter combination idempotent in the "success" case.
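
The decision described above can be summarized as a small sketch. This is only an illustration of the proposed branching, not the server's actual code: `StartupState`, `StartupAction`, and `decide_startup_action` are hypothetical names, and the metadata check is reduced to a single boolean because the ticket does not spell out which replication metadata fields are compared.

```python
from dataclasses import dataclass
from enum import Enum


class StartupAction(Enum):
    # Enter read-only mode and skip replication recovery.
    READ_ONLY_NO_RECOVERY = "read_only_no_recovery"
    # Abort startup, as the node does today.
    FASSERT = "fassert"
    # This proposal does not change behavior for this case.
    NOT_COVERED = "not_covered"


@dataclass(frozen=True)
class StartupState:
    # Hypothetical model of the inputs named in the proposal.
    recover_from_oplog_as_standalone: bool
    take_unstable_checkpoint_on_shutdown: bool
    has_stable_checkpoint: bool
    # Assumption: stands in for "all of the replication metadata indicates
    # a fully up-to-date unstable checkpoint".
    metadata_shows_up_to_date_unstable_checkpoint: bool


def decide_startup_action(state: StartupState) -> StartupAction:
    both_params = (
        state.recover_from_oplog_as_standalone
        and state.take_unstable_checkpoint_on_shutdown
    )
    # The proposal only covers startup with both parameters set and
    # no stable checkpoint in the storage engine.
    if not both_params or state.has_stable_checkpoint:
        return StartupAction.NOT_COVERED
    if state.metadata_shows_up_to_date_unstable_checkpoint:
        # Retry after a successful run: nothing to recover.
        return StartupAction.READ_ONLY_NO_RECOVERY
    # The unstable checkpoint cannot be shown to be safe.
    return StartupAction.FASSERT
```

Under these assumptions, a retry after a successful run maps to `READ_ONLY_NO_RECOVERY`, which is what makes the parameter combination idempotent in the "success" case.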

Assignee: Judah Schvimer
Reporter: Judah Schvimer
Participants: Githook User, Judah Schvimer, Louisa Berger, Phil Jordan
Votes: 0
Watchers: 14

Created: Nov 13 2019 07:59:44 PM UTC
Updated: Oct 29 2023 10:14:58 PM UTC
Resolved: Dec 03 2019 02:58:20 PM UTC
Confidence Status Last Update: 14/Nov/19 8:08 PM
