Type: Sub-task
Resolution: Won't Do
Priority: Major - P3
Affects Version/s: None
Component/s: None
Storage Execution
Storage Execution 2026-05-11, Storage Execution 2026-05-25, Storage Execution 2026-06-08, Storage Execution 2026-06-22
Overview
Add a new storage-engine method, waitForOplogVisibilityToAdvancePast(opCtx, ts), that blocks until the oplog visibility timestamp has advanced strictly past a caller-provided timestamp. The wait ends early on engine shutdown, on rollback (visibility moving strictly backwards), or on operation-context interrupt. It is intended for tailing readers that track their own watermark and want to be woken exactly when there is something new for them to read, without crossing oplog holes.
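The wait conditions described above can be modeled with a small standalone sketch. This is not the server implementation: OplogVisibilityModel, WaitResult, setVisibility, shutdown, and interrupt are hypothetical names, Timestamp is a plain integer, and the operation-context interrupt is reduced to a single flag on the model instead of a per-operation opCtx.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-ins for illustration; not the server's real types.
using Timestamp = std::uint64_t;
enum class WaitResult { kAdvanced, kRolledBack, kShutdown, kInterrupted };

class OplogVisibilityModel {
public:
    // Blocks until visibility is strictly greater than ts, or until an
    // early-wake condition (shutdown, interrupt, rollback) is observed.
    WaitResult waitForOplogVisibilityToAdvancePast(Timestamp ts) {
        std::unique_lock<std::mutex> lk(_mutex);
        const std::uint64_t rollbacksAtEntry = _rollbackCount;
        _cv.wait(lk, [&] {
            return _shutdown || _interrupted || _rollbackCount != rollbacksAtEntry ||
                _visible > ts;
        });
        if (_shutdown) return WaitResult::kShutdown;
        if (_interrupted) return WaitResult::kInterrupted;
        if (_rollbackCount != rollbacksAtEntry) return WaitResult::kRolledBack;
        return WaitResult::kAdvanced;
    }

    // Publishes a new visibility point. A strictly smaller value is treated
    // as a rollback and wakes every waiter.
    void setVisibility(Timestamp newVisible) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (newVisible < _visible) ++_rollbackCount;
            _visible = newVisible;
        }
        _cv.notify_all();
    }

    void shutdown() {
        { std::lock_guard<std::mutex> lk(_mutex); _shutdown = true; }
        _cv.notify_all();
    }

    // Simplification: a real interrupt would target one operation context.
    void interrupt() {
        { std::lock_guard<std::mutex> lk(_mutex); _interrupted = true; }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    Timestamp _visible = 0;
    std::uint64_t _rollbackCount = 0;
    bool _shutdown = false;
    bool _interrupted = false;
};

// Usage: visibility is already past the caller's watermark, so the call
// returns without blocking.
inline void demoImmediateWake() {
    OplogVisibilityModel model;
    model.setVisibility(10);
    assert(model.waitForOplogVisibilityToAdvancePast(5) == WaitResult::kAdvanced);
}
```

In this sketch a rollback is detected through a counter captured at entry, so a waiter is woken even if visibility moves backwards to a value that is still above its watermark. Whether the proposed method would behave that way is an assumption, not something the ticket states.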
Background
The existing StorageEngine::waitForAllEarlierOplogWritesToBeVisible waits relative to the current latest oplog timestamp: each call opens a reverse cursor on local.oplog.rs to find the tail, then blocks until visibility catches up to that tail. This shape is awkward for a continuous tailing reader, which already knows its own position and only needs to be woken when that position advances. Concretely:
- The reverse-cursor lookup is unnecessary work for a tailer.
- The semantics ("wait for everything that existed at call time") force the call to loop internally under sustained write load, since the tail keeps moving away from the captured snapshot.
- There is no clean signal for "visibility moved past my saved ts".
The new primitive is the building block that makes a single long-lived scanner cursor practical without burning CPU on inserts that do not advance visibility.
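A tailing loop built on such a primitive might look like the sketch below. Everything here is hypothetical: ScriptedEngine stands in for the storage engine and replays a scripted sequence of visibility values instead of blocking, the oplog is a sorted vector of integer timestamps, and tailOplog is an illustrative reader, not proposed server code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-ins for illustration; not the server's real types.
using Timestamp = std::uint64_t;
enum class WaitOutcome { kAdvanced, kRolledBack, kShutdown };

struct ScriptedEngine {
    std::vector<Timestamp> oplog;           // entry timestamps, ascending
    std::deque<Timestamp> visibilitySteps;  // scripted visibility updates
    Timestamp visible = 0;

    // Instead of blocking, consumes scripted updates until visibility is
    // strictly past ts. An empty script is treated as engine shutdown.
    WaitOutcome waitForOplogVisibilityToAdvancePast(Timestamp ts) {
        while (visible <= ts) {
            if (visibilitySteps.empty()) return WaitOutcome::kShutdown;
            const Timestamp next = visibilitySteps.front();
            visibilitySteps.pop_front();
            const bool rolledBack = next < visible;
            visible = next;
            if (rolledBack) return WaitOutcome::kRolledBack;
        }
        return WaitOutcome::kAdvanced;
    }
};

// A tailer that keeps one forward position (its "cursor") and its own
// watermark, and only scans after visibility has moved past the watermark.
std::vector<Timestamp> tailOplog(ScriptedEngine& engine) {
    std::vector<Timestamp> seen;
    Timestamp watermark = 0;
    std::size_t cursor = 0;
    while (engine.waitForOplogVisibilityToAdvancePast(watermark) ==
           WaitOutcome::kAdvanced) {
        const Timestamp visible = engine.visible;
        // Read only entries at or below the visibility point.
        while (cursor < engine.oplog.size() && engine.oplog[cursor] <= visible) {
            seen.push_back(engine.oplog[cursor++]);
        }
        // Everything up to `visible` has been consumed, so wait past it next.
        watermark = visible;
    }
    return seen;
}

// Usage: the repeated visibility value 2 does not wake the tailer a second
// time, and entry 8 is never read because visibility never reaches it.
inline void demoTail() {
    ScriptedEngine engine;
    engine.oplog = {1, 2, 3, 5, 8};
    engine.visibilitySteps = {2, 2, 5};
    const std::vector<Timestamp> seen = tailOplog(engine);
    assert((seen == std::vector<Timestamp>{1, 2, 3, 5}));
}
```

The reader never looks up the oplog tail. It advances its watermark to the last visibility point it scanned, which is the property the background section says the existing wait cannot offer.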