- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Cursors
- Storage Engines - Foundations
Description:
When cursor->modify() is called with a read timestamp that falls between a tombstone and a subsequent re-insert on the same key, it incorrectly returns WT_NOTFOUND rather than WT_ROLLBACK.
Scenario:
- Key K is inserted at ts=T1.
- Key K is removed at ts=T2 (tombstone at T2).
- Key K is re-inserted at ts=T3 (T3 > T2).
- A write transaction with read_ts=R (T2 ≤ R < T3) calls cursor->modify() on key K.
At read_ts=R the tombstone at T2 is visible and the re-insert at T3 is not. The modify encounters the tombstone and returns WT_NOTFOUND.
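The visibility outcome in the scenario above can be reproduced with a small toy model. This is only a sketch, not WiredTiger code: the `Update` class, the `visible_update` helper, and the concrete timestamp values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Update:
    ts: int                # commit timestamp of the update
    value: Optional[str]   # None models a tombstone (a remove)


def visible_update(chain: List[Update], read_ts: int) -> Optional[Update]:
    """Return the newest committed update with ts <= read_ts, or None."""
    candidates = [u for u in chain if u.ts <= read_ts]
    return max(candidates, key=lambda u: u.ts) if candidates else None


# Hypothetical timestamps satisfying T1 < T2 <= R < T3.
T1, T2, T3 = 10, 20, 30
R = 25

# Insert at T1, remove at T2, re-insert at T3.
chain = [Update(T1, "v1"), Update(T2, None), Update(T3, "v2")]

seen = visible_update(chain, R)
key_exists = seen is not None and seen.value is not None
print(seen, key_exists)
```

In this model the reader at R lands on the tombstone at T2, so the key appears absent even though a committed re-insert exists at T3.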
Expected behaviour:
The re-insert at T3 is a committed but invisible update. __curfile_update_check should detect it and return WT_ROLLBACK, signalling the caller to retry with a higher read timestamp.
Actual behaviour:
WT_NOTFOUND is returned. When the in-memory update chain has been cleared (e.g. after a checkpoint), __curfile_update_check falls back to the on-page time window, which may reflect only the tombstone state and therefore fail to detect the later committed but invisible re-insert. __wti_cursor_valid then sees the visible tombstone and returns WT_NOTFOUND instead of WT_ROLLBACK.
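The difference between the expected and actual behaviour can be sketched as a toy conflict check. This is a simplified model under stated assumptions, not the logic of the real functions: `modify_outcome` is a hypothetical helper, the return codes are string stand-ins, and `window_only` is an assumed representation of an on-page time window that no longer carries the re-insert's timestamp.

```python
from typing import Iterable

# String stand-ins for the WiredTiger return codes (not the real values).
WT_ROLLBACK = "WT_ROLLBACK"
WT_NOTFOUND = "WT_NOTFOUND"
SUCCESS = "OK"


def modify_outcome(known_commit_ts: Iterable[int],
                   visible_is_tombstone: bool,
                   read_ts: int) -> str:
    """Toy model: run the conflict check first, then the visibility check."""
    # Any committed update newer than the read timestamp is a write conflict.
    if any(ts > read_ts for ts in known_commit_ts):
        return WT_ROLLBACK
    # No conflict known: a visible tombstone means the key does not exist.
    if visible_is_tombstone:
        return WT_NOTFOUND
    return SUCCESS


# Hypothetical timestamps satisfying T1 < T2 <= R < T3.
T1, T2, T3, R = 10, 20, 30, 25

full_history = [T1, T2, T3]  # the check can see the re-insert at T3
window_only = [T1, T2]       # assumed: only the tombstone state is reflected

expected = modify_outcome(full_history, True, R)
actual = modify_outcome(window_only, True, R)
print(expected, actual)
```

With the full history the check reports a conflict; with the truncated information it falls through to the tombstone and reports the key as missing, which matches the symptom described here.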
Impact:
The caller treats a WT_NOTFOUND return from modify as a silent no-op rather than a conflict requiring retry. A write transaction can therefore miss a write-write conflict and silently fail to apply a modification that should either succeed or be retried. For MongoDB, any workload that mixes reads and writes in the same transaction on a key whose delete and re-insert straddle the transaction's read timestamp is at risk of silently dropping a modify, which could result in data corruption.
Related: WT-17247 (same class of bug for cursor->remove())
- is related to: WT-17247 Layered cursor writes on follower do not check stable cell's full time window (Open)