Loading...

XML

Word

Printable

JSON

Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- change-streams
- tla-plus

CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The existing src/mongo/tla_plus/ corpus covers consensus (RaftMongo), reconfiguration, chunk migration, retryable txns moving with chunks, range deletions on secondaries, and ticketed concurrency. It does not cover any aspect of the change-stream cursor lifecycle.

This ticket proposes adding a new TLA+ specification at src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/ that models the well-known lazy-cursor-arming hazard: when a driver implements db.coll.watch() lazily and does not pin the cursor with startAtOperationTime, the server-side cursor's open point becomes the time of the first getMore, not the time watch() returned to the caller. Any change event with a clusterTime in the open window [client_watch_returned, first_getMore_arrives_at_server) is silently dropped.

The proposed spec parameterizes the model with a boolean UseStartAt and shows by TLC model-checking that:

With UseStartAt = FALSE, the NoEventLoss invariant is violated by a short counterexample (~depth 9, 285 states).
With UseStartAt = TRUE, TLC completes clean (629 distinct states, depth 11).

This gives the change-streams team a formal artifact pinning the cursor-arming invariant that drivers must satisfy, and a machine-checked counterexample that drivers (or future server-side cursor refactors) can be tested against.

Scope

Pure addition to src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/:
- ResumeTokenArming.tla (~326 lines)
- MCResumeTokenArming.tla (~31 lines)
- MCResumeTokenArming.cfg (~26 lines)
- OWNERS.yml
- README.md
No C++ source changes.
No test changes outside the TLA+ corpus.

Run

cd src/mongo/tla_plus
./download-tlc.sh    # one-time; pulls tla2tools.jar
./model-check.sh ChangeStreams/ResumeTokenArming

Default cfg ships with UseStartAt = FALSE to make the counterexample visible to the reviewer. Toggle to TRUE to verify the fixed-mode invariance.

Acceptance criteria

Files land under src/mongo/tla_plus/ChangeStreams/ResumeTokenArming/.
./model-check.sh ChangeStreams/ResumeTokenArming produces the documented UseStartAt=FALSE counterexample.
Owners file routes the spec to the appropriate team.
README is self-contained for a kernel engineer with no prior change-streams context.

Related work

No prior TLA+ spec covers change streams in this repo. Adjacent in spirit:

src/mongo/tla_plus/Replication/RaftMongo/ — uses the same actor/oplog idiom.
src/mongo/tla_plus/Sharding/RangeDeletionsSecondaryNodes/ — uses the same background-actor / consumer idiom.

Implementation status

Draft of all five files authored and TLC-verified locally. Forbidden-term review complete. PR to follow on this ticket.

Assignee:: Unassigned
Reporter:: Mehar Grewal
Participants:: Mehar Grewal
Votes:: 0 Vote for this issue
Watchers:: 1 Start watching this issue

Created:: May 14 2026 07:48:47 AM UTC
Updated:: May 14 2026 07:48:47 AM UTC

Details

Description

Scope

Run

Acceptance criteria

Related work

Implementation status

Attachments

Activity

People

Dates