Layered cursor debug dump is a stub: no FAIL.pagedump files produced for layered tables


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: WT12.0.0, 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: Btree
    • None
    • Storage Engines - Transactions
    • 112.548

      Summary

      __wt_debug_layered_cursor_page (src/cursor/cur_layered.c) and __wt_debug_layered_cursor_tree_hs are stubs that log "unsupported cursor type for debug dump" and return 0 without writing any output. They are reached from __wt_debug_cursor_page / __wt_debug_cursor_tree_hs whenever the cursor URI starts with table:, which is the case for layered tables. As a result, test/format's mirror-mismatch diagnostic path (cursor_dump_page calling __wt_debug_cursor_page) produces no FAIL.pagedump.* files for the disagg/layered configuration, which leaves triage of mirror data-mismatch failures with no page dumps to inspect.

      Observed example

      While reproducing WT-17304 (format-stress-test-disagg-switch data mismatch), the test stderr contained:

      mirror error: base cursor (table 1): dumping to .../RUNDIR.12/FAIL.pagedump.1
      mirror error: table cursor (table 2): dumping to .../RUNDIR.12/FAIL.pagedump.2
      mirror error: base key number 362848 in table 3: dumping to .../RUNDIR.12/FAIL.pagedump.3
      

      …but no FAIL.pagedump.* files were created under RUNDIR. The same applies to history-store dumps via __wt_debug_cursor_tree_hs.

      Root cause

      In src/cursor/cur_std.c, the dispatch helper __cursor_debug_dispatch routes table:* URIs to the layered debug functions. Those functions never write a file: they only emit a verbose-debug log line and return 0, so the caller sees success even though nothing was dumped.

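      /* Abridged excerpt: the declaration of cursor and the leading verbose arguments are elided. */
      int
      __wt_debug_layered_cursor_page(void *cursor_arg, const char *ofile) {
          WT_UNUSED(ofile); /* The output path is ignored, so no file is ever created. */
          __wt_verbose_debug1(..., "%s: unsupported cursor type for debug dump", cursor->uri);
          return (0); /* Success is reported although nothing was written. */
      }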
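
      The routing described above can be illustrated with a small standalone sketch. It is not WiredTiger code: debug_handler_t, has_prefix and debug_dispatch are hypothetical names, and sending every non-table: URI to a btree handler is an assumption made only to keep the example complete.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the debug handler a URI is routed to. */
typedef enum { DEBUG_HANDLER_BTREE, DEBUG_HANDLER_LAYERED } debug_handler_t;

/* Return nonzero if str begins with prefix. */
static int
has_prefix(const char *str, const char *prefix)
{
    return (strncmp(str, prefix, strlen(prefix)) == 0);
}

/*
 * Route "table:" URIs to the layered handler. Treating every other URI as a
 * btree cursor is an assumption of this sketch, not taken from the source.
 */
static debug_handler_t
debug_dispatch(const char *uri)
{
    assert(uri != NULL);
    return (has_prefix(uri, "table:") ? DEBUG_HANDLER_LAYERED : DEBUG_HANDLER_BTREE);
}
```

      Because layered tables are opened through table: URIs, every debug dump request for them ends up in the layered handler, which is why the stub affects all of them.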
      

      Proposed fix

      A layered cursor has two underlying btree cursors (ingest_cursor and stable_cursor). The debug entry points should walk each positioned constituent and emit a file per constituent:

      • <ofile>.ingest from ingest_cursor
      • <ofile>.stable from stable_cursor

      Both should delegate to the existing __wt_debug_btree_cursor_page / __wt_debug_btree_cursor_tree_hs. If neither constituent is positioned (cbt->ref == NULL for both), emit a single verbose-debug line and return 0 without creating a file.
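
      A minimal standalone sketch of this control flow follows. It is not the patch mentioned in this ticket: every sketch_* type and function is a hypothetical stand-in for the WiredTiger internals, the btree-level dump is replaced by a placeholder that writes one line, and the nfilesp out-parameter exists only so the example can report how many files were produced.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a btree cursor: a non-NULL ref means it is positioned. */
typedef struct {
    const void *ref;
} sketch_btree_cursor;

/* Hypothetical stand-in for a layered cursor and its two constituents. */
typedef struct {
    sketch_btree_cursor *ingest_cursor;
    sketch_btree_cursor *stable_cursor;
} sketch_layered_cursor;

/* Placeholder for the btree-level dump: creates the file and writes one line. */
static int
sketch_btree_dump(const sketch_btree_cursor *cbt, const char *path)
{
    FILE *fp;

    (void)cbt;
    if ((fp = fopen(path, "w")) == NULL)
        return (-1);
    fprintf(fp, "page dump placeholder\n");
    return (fclose(fp) == 0 ? 0 : -1);
}

/* Dump one constituent to "<ofile>.<suffix>" if it is positioned. */
static int
sketch_dump_constituent(
  const sketch_btree_cursor *cbt, const char *ofile, const char *suffix, int *nfilesp)
{
    char path[1024];
    int len;

    if (cbt == NULL || cbt->ref == NULL)
        return (0); /* Not positioned: nothing to dump, not an error. */
    len = snprintf(path, sizeof(path), "%s.%s", ofile, suffix);
    if (len < 0 || (size_t)len >= sizeof(path))
        return (-1); /* The suffixed path would not fit in the buffer. */
    if (sketch_btree_dump(cbt, path) != 0)
        return (-1);
    ++*nfilesp;
    return (0);
}

/* Dump both constituents; return 0 on success and set *nfilesp to the files written. */
static int
sketch_layered_dump(const sketch_layered_cursor *clayered, const char *ofile, int *nfilesp)
{
    assert(clayered != NULL && ofile != NULL && nfilesp != NULL);
    *nfilesp = 0;
    if (sketch_dump_constituent(clayered->ingest_cursor, ofile, "ingest", nfilesp) != 0)
        return (-1);
    if (sketch_dump_constituent(clayered->stable_cursor, ofile, "stable", nfilesp) != 0)
        return (-1);
    if (*nfilesp == 0) /* Stands in for the single verbose-debug line. */
        fprintf(stderr, "%s: no positioned constituent cursor, nothing dumped\n", ofile);
    return (0);
}
```

      Skipping an unpositioned constituent instead of failing keeps the common case useful, where only one of the two cursors is positioned, while the "neither positioned" case still returns 0 as described above.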

      test/format's cursor_dump_page message should also be updated to indicate that for layered cursors the output filenames are suffixed with the constituent name.

      A local patch is ready and has been tested manually; happy to attach or open a PR.

      Related

      • WT-17304: format-stress-test-disagg-switch data mismatch (this bug blocked triage of WT-17304's repro because no page dumps were available to inspect).

            Assignee:
            Chenhao Qu
            Reporter:
            Chenhao Qu