-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Summary
TLA+ spec + jstest for SERVER-126266 — possible deadlock between stepdown and primary-driven index build (PDIB). Load-bearing for the 9.0 PDIB release.
Files
- src/mongo/tla_plus/IndexBuilds/StepdownPDIBDeadlock/StepdownPDIBDeadlock.tla (378 lines) — three abstract threads (PDIB, Stepdown, BgSync) contend on one RSTL (fair queue) plus a commit-quorum vote set whose arrival is gated on BgSync holding IX. PDIB walks idle → scanning → voting → committing → done/aborted. WaitsFor captures RSTL-queue blocking + PDIB-waiting-on-BgSync commit-quorum starvation; HasCycle enumerates length-2 and length-3 cycles directly.
- MC + green cfg + bug cfg with deadlock check on
- OWNERS.yml → 10gen/server-replication-reviewers
- README.md (298 words)
- jstests/replsets/stepdown_during_pdib_no_deadlock.js (148 lines) — uses hangIndexBuildBeforeSignalPrimaryForCommitReadiness to park PDIB, fires non-force replSetStepDown, asserts completion under 60s, verifies the rs is not wedged.
Invariants
TypeOK, RSTLWellFormed, DeadlockFreedom (no wait-for cycle), EveryStepdownCompletes (liveness, ~>).
TLC verdicts
- Green: 268 states / 133 distinct / depth 12 — no error.
- Bug: Error: Deadlock reached at depth 5 — counterexample PDIBStart → StepdownStart → PDIBFinishScan → BgSyncEnqueue then no enabled action, matching the SERVER-126266 wait-for cycle exactly.
- is related to
-
SERVER-126266 resolve possible deadlock between stepdown and index build
-
- Needs Scheduling
-