Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT12.0.0, 9.0.0-rc0
Affects Version/s: None
Component/s: Test Python
Labels: None
Assigned Teams: Storage Engines, Storage Engines - Foundations
Total Hours with Assigned Team: 580.023
Epic Link: SPM-4736
Sprint: SE Foundations - 2026-06-23
Story Points: 13

Motivation — the testing gap

The follower layered cursor (src/cursor/cur_layered.c) is one of the most complex pieces of the disaggregated-storage read path. Every read on a follower has to merge two constituents — an in-memory ingest table and a checkpoint-backed stable table — with tombstone shadowing, direction-tracked iteration, and checkpoint advances that can reshape the data underneath a live cursor. The correctness of a read depends on subtle interactions between cursor position, transaction/snapshot state, and the ingest/stable split.

Today this component is covered almost entirely by Python integration tests that exercise natural, well-behaved cursor usage — typically resetting the cursor between operations on small, freshly-built tables. For a component this intricate that is not enough: we lack both the breadth (diverse, randomized scenarios) and the adversarial coverage needed to expose rare merge bugs. The cursor shapes most likely to break the merge are exactly the ones hand-written tests do not produce.

Proposal — cursor-oriented stress testing

Introduce a cursor-oriented stress test whose explicit purpose is to find subtle merge bugs by driving the layered cursor through the weirdest valid operation sequences we can generate, reproducibly from a single seed. The design emphasizes:

Long-lived positioned cursors. Long chains of operations (update / remove / next / prev / transaction switches) that change state but deliberately keep the cursor positioned; resetting the position is rare. Surviving mutation without losing or corrupting position is where the hard bugs hide.
Randomized transactions. begin / commit / rollback in the middle of a chain, read-timestamp views, isolation levels, and multi-session prepared-transaction conflicts.
Extreme scenario injections. At seeded points, do violent things to the table under the cursor: evict the entire ingest table, remove all content (both via tombstones and via truncate), bulk-grow the table, or advance a checkpoint mid-iteration.
Deterministic execution. A fixed random seed and single-threaded execution, so that every discovered failure can be reproduced easily and investigated under gdb.
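
As a rough illustration of the seed-driven part of this design, the sketch below generates a reproducible chain of operation names in which position-keeping operations dominate and resets and injections are rare. It is not the test's actual generator: the operation names, the probabilities, and the `generate_ops` helper are all assumptions made for this example, and no WiredTiger API is involved.

```python
import random

# Hypothetical operation names; the real test's operation set is richer.
POSITION_KEEPING_OPS = ["update", "remove", "next", "prev", "txn_switch"]
INJECTIONS = ["evict_ingest", "remove_all", "truncate_all",
              "bulk_grow", "checkpoint_advance"]

# Assumed values, chosen only to make resets and injections rare.
RESET_PROBABILITY = 0.02
INJECTION_PROBABILITY = 0.01


def generate_ops(seed, length):
    """Return a reproducible list of operation names for the given seed."""
    # A private RNG instance, so unrelated code cannot perturb the sequence.
    rng = random.Random(seed)
    ops = []
    for _ in range(length):
        roll = rng.random()
        if roll < INJECTION_PROBABILITY:
            ops.append(rng.choice(INJECTIONS))
        elif roll < INJECTION_PROBABILITY + RESET_PROBABILITY:
            ops.append("reset")
        else:
            # The common case: mutate or move, but stay positioned.
            ops.append(rng.choice(POSITION_KEEPING_OPS))
    return ops


first = generate_ops(seed=1234, length=1000)
second = generate_ops(seed=1234, length=1000)
```

Because the generator draws only from its own seeded `random.Random` instance, rerunning with the same seed yields the same chain, which is what makes a failure replayable under a debugger.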

The idea that makes this tractable: every single operation is compared against an independent oracle — a plain, non-layered WiredTiger table that mirrors the same writes — checking key, value, and return code, with a leader-vs-follower cross-check on top. Because the oracle is itself WiredTiger, it gets read-timestamps, isolation, and prepared-transaction semantics right with no hand-rolled MVCC model. Any divergence is a candidate bug, and because everything is seed-driven a failure reproduces (and can later be shrunk to a minimal op sequence).
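
The every-operation comparison can be sketched with a toy model. In the ticket the oracle is a real non-layered WiredTiger table; here both sides are plain Python dictionaries so the example runs on its own. `ToyLayered`, `ToyOracle`, `NOTFOUND`, and `run` are hypothetical names for this illustration only, and the toy covers neither cursor position nor transactions.

```python
import random

TOMBSTONE = object()   # hypothetical delete marker in the toy ingest layer
NOTFOUND = "NOTFOUND"  # stand-in for a not-found return code


class ToyLayered:
    """Toy stand-in for a layered table: ingest entries shadow stable ones."""

    def __init__(self):
        self.ingest, self.stable = {}, {}

    def search(self, key):
        if key in self.ingest:
            value = self.ingest[key]
            return (NOTFOUND, None) if value is TOMBSTONE else (0, value)
        if key in self.stable:
            return (0, self.stable[key])
        return (NOTFOUND, None)

    def insert(self, key, value):
        self.ingest[key] = value
        return 0

    def remove(self, key):
        if self.search(key)[0] == NOTFOUND:
            return NOTFOUND
        self.ingest[key] = TOMBSTONE  # hides any stable value for this key
        return 0

    def checkpoint(self):
        # Fold ingest into stable, dropping tombstoned keys.
        for key, value in self.ingest.items():
            if value is TOMBSTONE:
                self.stable.pop(key, None)
            else:
                self.stable[key] = value
        self.ingest.clear()


class ToyOracle:
    """Independent single-layer model that mirrors the same writes."""

    def __init__(self):
        self.data = {}

    def search(self, key):
        return (0, self.data[key]) if key in self.data else (NOTFOUND, None)

    def insert(self, key, value):
        self.data[key] = value
        return 0

    def remove(self, key):
        if key not in self.data:
            return NOTFOUND
        del self.data[key]
        return 0

    def checkpoint(self):
        pass  # a checkpoint must not change what the oracle returns


def run(seed, steps):
    """Apply the same seeded operations to both models, comparing each result."""
    rng = random.Random(seed)
    layered, oracle = ToyLayered(), ToyOracle()
    for step in range(steps):
        op = rng.choice(["insert", "remove", "search", "checkpoint"])
        if op == "checkpoint":
            args = ()
        elif op == "insert":
            args = (rng.randrange(16), rng.randrange(1000))
        else:
            args = (rng.randrange(16),)
        got = getattr(layered, op)(*args)
        want = getattr(oracle, op)(*args)
        if got != want:
            raise AssertionError(
                f"seed={seed} step={step} op={op}{args}: {got!r} != {want!r}")
    return steps


completed = run(seed=42, steps=2000)
```

The failure message carries the seed and step index, so a divergence points directly at the operation to replay. Comparing after every operation, rather than only at the end, keeps the first point of divergence visible.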

Why this is worth doing

Randomized, every-operation-compared stress testing is the most cost-effective way to flush out the rare, state-dependent merge bugs that natural integration tests structurally miss, while staying debuggable. It complements rather than replaces the existing integration tests, broadening our coverage of a high-risk component. The first implementation lives in the Python suite (reusing the existing leader/follower disagg helpers); the durable long-term home is test/model, which would add workload shrinking and speed.

The main challenges discovered so far:

A significant amount of AI-generated code that requires careful review.
Existing issues being uncovered during the process, which need to be investigated and addressed along the way.

is duplicated by: WT-17442 Layered cursors rework - testing (Closed)

related to: WT-17838 Layered cursors stress testing - follow up (Open)

Assignee: Ivan Kochin
Reporter: Ivan Kochin
Votes: 0
Watchers: 1

Created: Jun 09 2026 02:53:35 AM UTC
Updated: Jun 24 2026 07:00:21 AM UTC
Resolved: Jun 19 2026 06:52:41 AM UTC
