-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
The existing src/mongo/tla_plus/ corpus covers consensus (RaftMongo), reconfiguration, chunk migration, retryable txns moving with chunks, range deletions on secondaries, and ticketed concurrency. It does not cover any aspect of the change-stream cursor lifecycle.
This ticket proposes adding a new TLA+ specification at src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/ that models the well-known lazy-cursor-arming hazard: when a driver implements db.coll.watch() lazily and does not pin the cursor with startAtOperationTime, the server-side cursor's open point becomes the time of the first getMore, not the time watch() returned to the caller. Any change event with a clusterTime in the open window [client_watch_returned, first_getMore_arrives_at_server) is silently dropped.
The proposed spec parameterizes the model with a boolean UseStartAt and shows by TLC model-checking that:
- With UseStartAt = FALSE, the NoEventLoss invariant is violated by a short counterexample (~depth 9, 285 states).
- With UseStartAt = TRUE, TLC completes clean (629 distinct states, depth 11).
This gives the change-streams team a formal artifact pinning the cursor-arming invariant that drivers must satisfy, and a machine-checked counterexample that drivers (or future server-side cursor refactors) can be tested against.
Scope
- Pure addition to src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/:
- ResumeTokenArming.tla (~326 lines)
- MCResumeTokenArming.tla (~31 lines)
- MCResumeTokenArming.cfg (~26 lines)
- OWNERS.yml
- README.md
- No C++ source changes.
- No test changes outside the TLA+ corpus.
Run
cd src/mongo/tla_plus ./download-tlc.sh # one-time; pulls tla2tools.jar ./model-check.sh ChangeStreams/ResumeTokenArming
Default cfg ships with UseStartAt = FALSE to make the counterexample visible to the reviewer. Toggle to TRUE to verify the fixed-mode invariance.
Acceptance criteria
- Files land under src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/.
- ./model-check.sh ChangeStreams/ResumeTokenArming produces the documented UseStartAt=FALSE counterexample.
- Owners file routes the spec to the appropriate team.
- README is self-contained for a kernel engineer with no prior change-streams context.
Related work
No prior TLA+ spec covers change streams in this repo. Adjacent in spirit:
- src/mongo/tla_plus/Replication/RaftMongo/ — uses the same actor/oplog idiom.
- src/mongo/tla_plus/Sharding/RangeDeletionsSecondaryNodes/ — uses the same background-actor / consumer idiom.
Implementation status
Draft of all five files authored and TLC-verified locally. Forbidden-term review complete. PR to follow on this ticket.