- Type: Task
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Block Manager
- Storage Engines - Persistence
- 128.314
- SE Persistence backlog
- None
Background:
The corrupt handler in __block_disagg_read_multiple (src/block_disagg/block_disagg_read.c) panics the connection on a checksum/magic/version failure unless a read-corrupt session flag is set. Unlike the file-based block managers, it did not check whether the btree handle is a verify handle.
Problem:
The file-based block managers guard the fatal-read panic with a verify check:
- block_cache/block_io.c: if (!F_ISSET(btree, WT_BTREE_VERIFY) && !WT_SESSION_READ_CORRUPT_OK(session)) <panic>
- block/block_read.c: if (block->verify || WT_SESSION_READ_CORRUPT_OK(session)) return (WT_ERROR);
The disaggregated handler only checked WT_SESSION_READ_CORRUPT_OK(session). The WT_SESSION::verify API path (used by mongod validate) sets neither read-corrupt session flag: read_corrupt is tracked in vs->read_corrupt, and only the wt CLI sets WT_SESSION_READ_SKIP_CORRUPT. As a result, a corrupt disaggregated page encountered during API verify panicked the connection (SIGABRT) instead of returning a recoverable error, which prevented verify from continuing past the bad page.
The gap predates WT-17348. That change only replaced WT_SESSION_QUIET_CORRUPT_FILE with WT_SESSION_READ_CORRUPT_OK; it kept the verify guard in the other two block managers, but disagg was left without one.
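The difference between the two guard shapes can be sketched as a small standalone model. This is not WiredTiger code: the SKETCH_* flags, the sketch_action enum and both functions are hypothetical stand-ins, and the flag values are illustrative only. It only models the decision described above (panic versus return an error).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real flags; the values are illustrative only. */
#define SKETCH_BTREE_VERIFY 0x1u            /* models WT_BTREE_VERIFY on the btree handle */
#define SKETCH_SESSION_READ_CORRUPT_OK 0x1u /* models WT_SESSION_READ_CORRUPT_OK(session) */

/* What the corrupt handler does once a checksum/magic/version failure is seen. */
typedef enum { SKETCH_RETURN_ERROR, SKETCH_PANIC } sketch_action;

/* Guard as described for the disaggregated handler: only the session flag is consulted. */
static sketch_action
sketch_guard_session_only(uint32_t btree_flags, uint32_t session_flags)
{
    (void)btree_flags; /* The verify state of the handle is ignored. */
    if ((session_flags & SKETCH_SESSION_READ_CORRUPT_OK) != 0)
        return (SKETCH_RETURN_ERROR);
    return (SKETCH_PANIC);
}

/* Guard shaped like the block_cache/block_io.c check: a verify handle also suppresses the panic. */
static sketch_action
sketch_guard_with_verify(uint32_t btree_flags, uint32_t session_flags)
{
    bool verify = (btree_flags & SKETCH_BTREE_VERIFY) != 0;
    bool corrupt_ok = (session_flags & SKETCH_SESSION_READ_CORRUPT_OK) != 0;

    if (!verify && !corrupt_ok)
        return (SKETCH_PANIC);
    return (SKETCH_RETURN_ERROR);
}
```

In this model the API verify case corresponds to the verify flag set and the session flag clear: the session-only guard yields SKETCH_PANIC, while the guard with the verify check yields SKETCH_RETURN_ERROR. With neither flag set, both guards still panic.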
Definition of Done:
- The disagg corrupt-handler panic guard honors WT_BTREE_VERIFY, consistent with the file-based block managers.
References: src/block_disagg/block_disagg_read.c:303; src/block_cache/block_io.c:37; src/block/block_read.c:279.
- is related to: WT-17348 Generalise verify read_corrupt config to all modes in wt util (Closed)