For disagg, create a way to debug ingest table errors


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Test Csuite
    • Storage Engines - Foundations
    • SE Foundations - 2026-03-13, SE Foundations - 2026-03-27
    • 5

      On a follower, the ingest table (theoretically) exists only in memory. This makes it hard to debug the data in the system when we notice that something is wrong. Consider a test format run in "multi-mode", using a predictable set of operations on both leader and follower processes. While the operations continue, the leader creates a checkpoint and the follower picks it up. Both sides stop after applying N operations and want to compare.

      The leader and follower can each do an in-memory scan of their entire dataset and produce a checksum. Suppose the checksums differ: how do we narrow the difference down to a single differing key? (If there are many differences, the first differing key will do.)
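
      To make the goal concrete, here is a minimal sketch of "find the first differing key", assuming both datasets can somehow be presented to one process as key-ordered streams of (key, value) pairs. The names first_difference and MISSING are hypothetical and not part of WiredTiger; how each side would actually produce its stream is exactly the open question in this ticket.

```python
# Hypothetical helper, not a WiredTiger API: walk two key-ordered scans
# in lockstep and report the first key on which they disagree.

MISSING = object()  # marks "this side has no such key"


def first_difference(leader_scan, follower_scan):
    """Return (key, leader_value, follower_value) for the first mismatch.

    Both arguments are iterables of (key, value) pairs in ascending key
    order. Returns None when the two scans are identical.
    """
    leader_iter, follower_iter = iter(leader_scan), iter(follower_scan)
    lead = next(leader_iter, None)
    foll = next(follower_iter, None)
    while lead is not None or foll is not None:
        # Key present only on the leader (or the follower ran out first).
        if foll is None or (lead is not None and lead[0] < foll[0]):
            return (lead[0], lead[1], MISSING)
        # Key present only on the follower (or the leader ran out first).
        if lead is None or foll[0] < lead[0]:
            return (foll[0], MISSING, foll[1])
        # Same key on both sides: compare the values.
        if lead[1] != foll[1]:
            return (lead[0], lead[1], foll[1])
        lead = next(leader_iter, None)
        foll = next(follower_iter, None)
    return None
```

      Because the walk stops at the first mismatch, it never needs to hold either dataset in memory, but it does require both scans to be visible in one place.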

      This is currently an unsolved problem. We could checkpoint both processes and then have a single process open multiple WT connections to compare the files (tools/wt_cmp_dir is such a script). Checkpointing puts everything into disagg or on disk, right? Oops, the ingest table has no disk presence. So lots of data might appear different when it shouldn't, or, if we stopped at the checkpoint we just picked up, both sides would appear the same even if their ingest tables differed.

      We need a way to identify and examine the differences in layered tables from two different active connections.
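
      One possible direction, sketched here only as an illustration: instead of a single checksum per process, each process could produce one digest per key bucket from its own in-memory scan, so that only the small list of digests has to be exchanged and compared. Everything below is an assumption: bucket_digests and differing_buckets are hypothetical names, keys and values are assumed to be raw bytes, and the bucket count of 64 is arbitrary.

```python
import hashlib
import zlib


def bucket_digests(scan, nbuckets=64):
    """Return one SHA-256 hex digest per key bucket.

    `scan` yields (key, value) pairs of bytes in ascending key order.
    A key's bucket depends only on the key, so both processes assign
    the same key to the same bucket regardless of what else differs.
    """
    hashers = [hashlib.sha256() for _ in range(nbuckets)]
    for key, value in scan:
        hasher = hashers[zlib.crc32(key) % nbuckets]
        # Length-prefix each field so (b"ab", b"c") and (b"a", b"bc")
        # cannot produce the same byte stream.
        for field in (key, value):
            hasher.update(len(field).to_bytes(8, "big"))
            hasher.update(field)
    return [hasher.hexdigest() for hasher in hashers]


def differing_buckets(leader_digests, follower_digests):
    """Return the indexes of buckets whose digests do not match."""
    return [
        index
        for index, (lead, foll) in enumerate(zip(leader_digests, follower_digests))
        if lead != foll
    ]
```

      Each process would then only need to dump the keys that fall into a mismatching bucket, which is a much smaller set to examine by hand than the full dataset.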

            Assignee:
            Donald Anderson
            Reporter:
            Donald Anderson
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: