Disaggregated read corruption handler panics during verify (missing WT_BTREE_VERIFY check)

    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Block Manager
    • Storage Engines - Persistence
    • 128.314
    • SE Persistence backlog
    • None

      Background:
      The corrupt handler in __block_disagg_read_multiple (src/block_disagg/block_disagg_read.c) panicked the connection on a checksum, magic, or version failure unless a read-corrupt session flag was set. Unlike the file-based block managers, it did not check whether the btree handle was a verify handle.

      Problem:
      The file-based block managers guard the fatal-read panic with a verify check:

      • block_cache/block_io.c: if (!F_ISSET(btree, WT_BTREE_VERIFY) && !WT_SESSION_READ_CORRUPT_OK(session)) <panic>
      • block/block_read.c: if (block->verify || WT_SESSION_READ_CORRUPT_OK(session)) return (WT_ERROR);

      The disaggregated handler only checked WT_SESSION_READ_CORRUPT_OK(session). The WT_SESSION::verify API path (used by mongod validate) sets neither read-corrupt session flag: read_corrupt is tracked in vs->read_corrupt, and only the wt CLI sets WT_SESSION_READ_SKIP_CORRUPT. As a result, a corrupt disaggregated page encountered during API verify panicked the connection (SIGABRT) instead of returning a recoverable error, which prevented verify from continuing past the bad page.
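
      The difference between the two guard shapes can be modeled in isolation. The sketch below is not WiredTiger code: the MODEL_* bit values and both function names are hypothetical stand-ins, and the real handlers work on WT_BTREE and WT_SESSION_IMPL structures rather than bare integers. It only illustrates why a verify handle with no read-corrupt session flag panics under the session-only check and does not under a check shaped like the one quoted from block_cache/block_io.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in bits; the real flag values live in the WiredTiger headers. */
#define MODEL_BTREE_VERIFY 0x1u            /* models WT_BTREE_VERIFY set on the btree handle */
#define MODEL_SESSION_READ_CORRUPT_OK 0x1u /* models WT_SESSION_READ_CORRUPT_OK(session) being true */

/*
 * Models the disaggregated guard as described in this ticket: only the session
 * is consulted, so a verify handle gets no special treatment.
 */
static bool
model_disagg_panics_session_only(uint32_t btree_flags, uint32_t session_flags)
{
    (void)btree_flags; /* The verify flag is ignored, which is the gap. */
    return ((session_flags & MODEL_SESSION_READ_CORRUPT_OK) == 0);
}

/*
 * Models a guard in the shape quoted from block_cache/block_io.c: panic only
 * when the handle is not a verify handle and the session does not tolerate
 * corrupt reads.
 */
static bool
model_panics_with_verify_check(uint32_t btree_flags, uint32_t session_flags)
{
    return ((btree_flags & MODEL_BTREE_VERIFY) == 0 &&
      (session_flags & MODEL_SESSION_READ_CORRUPT_OK) == 0);
}
```

      In this model, the API verify case corresponds to btree_flags = MODEL_BTREE_VERIFY and session_flags = 0: the session-only predicate reports a panic, while the predicate with the verify check does not.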

      The gap predates WT-17348. That change only swapped WT_SESSION_QUIET_CORRUPT_FILE -> WT_SESSION_READ_CORRUPT_OK: it preserved the verify guard in the two file-based block managers and did not add one to disagg.

      Definition of Done:

      • The disagg corrupt-handler panic guard honors WT_BTREE_VERIFY, consistent with the file-based block managers.

      References: src/block_disagg/block_disagg_read.c:303; src/block_cache/block_io.c:37; src/block/block_read.c:279.

            Assignee:
            Etienne Petrel
            Reporter:
            Etienne Petrel