Add cursor-oriented stress testing for layered (disaggregated) cursors


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Test Python
    • Storage Engines, Storage Engines - Foundations
    • 7.603
    • SE Foundations - 2026-06-23
    • 13

      Motivation — the testing gap

      The follower layered cursor (src/cursor/cur_layered.c) is one of the most complex pieces of the disaggregated-storage read path. Every read on a follower has to merge two constituents — an in-memory ingest table and a checkpoint-backed stable table — with tombstone shadowing, direction-tracked iteration, and checkpoint advances that can reshape the data underneath a live cursor. The correctness of a read depends on subtle interactions between cursor position, transaction/snapshot state, and the ingest/stable split.

      Today this component is covered almost entirely by Python integration tests that exercise natural, well-behaved cursor usage — typically resetting the cursor between operations on small, freshly-built tables. For a component this intricate that is not enough: we lack both the breadth (diverse, randomized scenarios) and the adversarial coverage needed to expose rare merge bugs. The cursor shapes most likely to break the merge are exactly the ones hand-written tests do not produce.

      Proposal — cursor-oriented stress testing

      Introduce a cursor-oriented stress test whose explicit purpose is to find subtle merge bugs by driving the layered cursor through the weirdest valid operation sequences we can generate, reproducibly from a single seed. The design emphasizes:

      • Long-lived positioned cursors. Long chains of operations (update / remove / next / prev / transaction switches) that change state but deliberately keep the cursor positioned; resetting the position is rare. Surviving mutation without losing or corrupting position is where the hard bugs hide.
      • Randomized transactions. begin / commit / rollback in the middle of a chain, read-timestamp views, isolation levels, and multi-session prepared-transaction conflicts.
      • Extreme scenario injections. At seeded points, do violent things to the table under the cursor: evict the entire ingest table, remove all content (both via tombstones and via truncate), bulk-grow the table, or advance a checkpoint mid-iteration.
      • Deterministic execution. A fixed random seed and single-threaded execution, so that every discovered failure can be reproduced easily and investigated under gdb.
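
      The design points above can be illustrated with a minimal sketch of seed-driven operation generation. It is not the ticket's implementation: the operation names, the probabilities, and the generate_ops helper are all hypothetical, chosen only to show how one seed can yield a long, mostly position-keeping chain with rare resets and occasional injections.

```python
import random

# Hypothetical operation vocabulary; the real test's names may differ.
CURSOR_OPS = ["next", "prev", "update", "remove"]
INJECTIONS = ["evict_ingest", "remove_all", "truncate", "bulk_grow", "checkpoint"]


def generate_ops(seed, length, reset_prob=0.02, inject_prob=0.05):
    """Return a reproducible list of operation names for the given seed."""
    rng = random.Random(seed)  # private RNG, independent of global random state
    ops = []
    in_txn = False
    for _ in range(length):
        roll = rng.random()
        if roll < reset_prob:
            op = "reset"  # deliberately rare: the cursor usually stays positioned
        elif roll < reset_prob + inject_prob:
            op = rng.choice(INJECTIONS)
        else:
            # Offer only the transaction ops that are valid in the current state.
            txn_ops = ["commit", "rollback"] if in_txn else ["begin"]
            op = rng.choice(CURSOR_OPS + txn_ops)
        if op == "begin":
            in_txn = True
        elif op in ("commit", "rollback"):
            in_txn = False
        ops.append(op)
    return ops
```

      Because the generator owns its random.Random instance and runs on a single thread, the same seed always produces the same sequence, which is what makes a failing run replayable under a debugger.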

      The idea that makes this tractable: every single operation is compared against an independent oracle — a plain, non-layered WiredTiger table that mirrors the same writes — checking key, value, and return code, with a leader-vs-follower cross-check on top. Because the oracle is itself WiredTiger, it gets read-timestamps, isolation, and prepared-transaction semantics right with no hand-rolled MVCC model. Any divergence is a candidate bug, and because everything is seed-driven a failure reproduces (and can later be shrunk to a minimal op sequence).
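
      As a rough illustration of the per-operation oracle check, the sketch below uses a toy model only. ToyLayered is a hypothetical stand-in for the ingest/stable merge with tombstone shadowing, and a plain dict stands in for the oracle; the proposal itself uses a real non-layered WiredTiger table as the oracle, which this sketch does not attempt to reproduce (no transactions, timestamps, or cursor positioning are modelled).

```python
import random

TOMBSTONE = object()  # marks a key deleted in the ingest layer


class ToyLayered:
    """Toy two-layer table: ingest entries shadow stable entries."""

    def __init__(self):
        self.ingest, self.stable = {}, {}

    def put(self, key, value):
        self.ingest[key] = value

    def remove(self, key):
        self.ingest[key] = TOMBSTONE  # shadows any stable copy of the key

    def checkpoint(self):
        # Fold ingest into stable, applying tombstones, then empty ingest.
        for key, value in self.ingest.items():
            if value is TOMBSTONE:
                self.stable.pop(key, None)
            else:
                self.stable[key] = value
        self.ingest.clear()

    def search(self, key):
        """Return (found, value); the ingest layer takes precedence."""
        value = self.ingest.get(key, self.stable.get(key, TOMBSTONE))
        return (False, None) if value is TOMBSTONE else (True, value)

    def scan(self):
        """Return all visible (key, value) pairs in key order."""
        keys = sorted(set(self.ingest) | set(self.stable))
        return [(k, self.search(k)[1]) for k in keys if self.search(k)[0]]


def run(seed, steps=200, keyspace=20):
    """Drive both tables from one seed and compare after every operation."""
    rng = random.Random(seed)
    layered, oracle = ToyLayered(), {}
    for step in range(steps):
        op = rng.choice(["put", "remove", "checkpoint", "search"])
        key = rng.randrange(keyspace)
        if op == "put":
            layered.put(key, step)
            oracle[key] = step
        elif op == "remove":
            layered.remove(key)
            oracle.pop(key, None)
        elif op == "checkpoint":
            layered.checkpoint()  # must not change what a reader sees
        else:
            expected = (key in oracle, oracle.get(key))
            assert layered.search(key) == expected, (seed, step, op, key)
        # Any divergence reports the seed and step needed to replay it.
        assert layered.scan() == sorted(oracle.items()), (seed, step, op, key)
    return True
```

      The assertion message carries the seed and step, so a divergence in this toy can be replayed exactly; the same idea underlies the seed-driven reproduction described above.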

      Why this is worth doing

      Randomized stress testing that checks every operation against an oracle is the most cost-effective way to flush out the rare, state-dependent merge bugs that natural integration tests structurally miss, and it stays debuggable because every run is seed-driven. It complements rather than replaces the existing integration tests, broadening our coverage of a high-risk component. The first implementation lives in the Python suite (reusing the existing leader/follower disagg helpers); the durable long-term home is test/model, which would add workload shrinking and speed.

      The main challenges discovered so far:

      • A significant amount of AI-generated code that requires careful review.
      • Existing issues being uncovered during the process, which need to be investigated and addressed along the way.

            Assignee:
            Ivan Kochin
            Reporter:
            Ivan Kochin
