-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Summary
TLA+ specification + regression jstest pinning the oplog visibility thread lifecycle race in which getRecordStore() side-effects start/stop on WiredTigerOplogManager during durable recovery (SERVER-122142).
Files
- src/mongo/tla_plus/Replication/OplogVisibilityThread/OplogVisibilityThread.tla (285 lines) — N concurrent readers call getRecordStore(); readers drive BeginStart / FinishStart / BeginStop / FinishStop on a single visibility thread, with a pinned set tracking readers that hold a record-store pin. Constant AllowConcurrentStartStop toggles the bug.
- MCOplogVisibilityThread.tla / .cfg (green) + MCOplogVisibilityThread_bug.cfg (bug)
- OWNERS.yml → 10gen/server-storage-execution
- README.md
- jstests/replsets/oplog_visibility_concurrent_start_stop.js (150 lines)
Invariants
- AtMostOneStartStopInFlight
- NoThreadTeardownWhileReaderPinned
- PinnedReaderSeesRunningThread
- EpochMonotonic
- TypeOK
TLC verdicts
- Green: No error has been found (2726 states generated / 1129 distinct)
- Bug: Invariant AtMostOneStartStopInFlight is violated (counterexample at 248 generated / 105 distinct)
jstest
Pauses JournalFlusher, drives sibling-collection DDL in local plus repeated replSetResizeOplog, hammers the oplog with 8 concurrent reverse-cursor readers, asserts every reader iteration completes without CursorNotFound / OperationFailed surfaced from a torn-down visibility thread.
- is related to
-
SERVER-122142 Oplog visibility thread is susceptible to concurrent start/stop
-
- In Progress
-