-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Summary
TLA+ spec + jstest for SERVER-120978: when setFCV is a no-op transition (already at target), the majority-commit wait uses a stale opTime, so the command can return before the FCV change is durably committed. Subtle but load-bearing in mixed-version fleets.
Files
- src/mongo/tla_plus/Replication/SetFCVNoOpMajorityWait/SetFCVNoOpMajorityWait.tla (289 lines) — FCV transition state machine. Actions: PersistFCVWrite, ReplicateFCV, AdvanceCommit, Stepdown, ElectNew, SetFCVNoOp, SetFCVWrite. Invariant SetFCVResultReflectsDurableState. Bug toggle AllowStaleOpTimeOnNoOp switches between the buggy clientLastOp wait and the fix's SystemLastOpTime pin (the setLastOpToSystemLastOpTime line at set_feature_compatibility_version_command.cpp:392).
- MCSetFCVNoOpMajorityWait. {tla,cfg}
- OWNERS.yml → 10gen/server-replication-reviewers
- README.md (275 words)
- jstests/replsets/set_fcv_noop_returns_before_durable.js (137 lines) — pauses replication via stopReplProducer, forces a replSetStepDown mid-FCV-write, then majority-reads the FCV doc on every node after a no-op setFCV reports ok:1.
TLC verdicts
- Green (AllowStaleOpTimeOnNoOp = FALSE): 1,740,696 states, 0 violations, 5–7s.
- Bug (=TRUE): invariant violated at depth 6 with 4-state trace — two PersistFCVWrites push SystemLastOpTime to 2 while clientLastOp stays at 0; SetFCVNoOp waits on 0 and returns.
- is related to
-
SERVER-120978 setFCV waits on wrong opTime to be majority committed if it performs no writes
-
- Open
-