Type: Sub-task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT12.0.0, 9.0.0-rc0
Affects Version/s: None
Component/s: Tools
Security Level: Public (Available to anyone on the web)
Labels: None
Assigned Teams: Storage Engines - Persistence
Total Hours with Assigned Team: 2,115.479
Sprint: SE Persistence backlog
Story Points: None

The wt verify subcommand accepts a read_corrupt option to read past corrupt pages. We should extend this to dump and other subcommands that are useful for diagnostic purposes.

Background

wt verify accepts a -c flag that passes read_corrupt=true to session->verify(). This controls two behaviours in the verify path:

Btree traversal continuation (bt_vrfy.c): when __wt_page_in fails on a child ref, verify logs the error and skips the subtree instead of aborting.
Block I/O suppression: the verify path sets WT_BTREE_VERIFY on the btree, which causes block_io.c and related block readers to return an error code rather than panicking on checksum/decryption failures.

Other diagnostic commands (dump, read, stat, list, page, printlog) have no equivalent. When any of these encounters a corrupt page, the cursor iteration fails and the command aborts. This makes the wt CLI less useful for inspecting partially corrupt or disaggregated databases.

Goal

All read-oriented wt subcommands should be able to continue past corrupt pages, printing or skipping what they can recover. Subcommands in scope: dump, read, stat, list, page, printlog.

Implementation Considerations

There are two independent layers to address:

Layer 1 — Block I/O: WT_SESSION_QUIET_CORRUPT_FILE is the session flag checked in block_read.c, block_io.c, and block_disagg_read.c to suppress panics and return an error code instead. Setting this flag on the session before running a diagnostic command prevents crashes on corrupt block reads.
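
The Layer 1 behaviour can be illustrated with a minimal, self-contained sketch. It does not use WiredTiger's headers: every name prefixed with sketch_ or SKETCH_ is a hypothetical stand-in, the checksum is a toy byte sum rather than the real block checksum, and the "panic" is modelled as a distinct return code rather than an actual process-level panic. It only shows the shape of the check: on a checksum mismatch, a session carrying the quiet-corrupt flag gets an ordinary error back, while any other session takes the fatal path.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not WiredTiger's real definitions. */
#define SKETCH_SESSION_QUIET_CORRUPT_FILE 0x1u
#define SKETCH_ERROR (-1) /* ordinary error the caller can handle */
#define SKETCH_PANIC (-2) /* models the fatal panic path */

struct sketch_session {
    uint32_t flags;
};

/* Toy checksum (sum of bytes); a real block manager uses a stronger one. */
static uint32_t
sketch_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return (sum);
}

/*
 * Validate a block that has already been read into memory. On a mismatch,
 * return an error if the session asked for quiet handling of corruption,
 * otherwise report loudly and take the fatal path.
 */
static int
sketch_block_read(
  struct sketch_session *session, const uint8_t *buf, size_t len, uint32_t expected)
{
    if (sketch_checksum(buf, len) == expected)
        return (0);

    if (session->flags & SKETCH_SESSION_QUIET_CORRUPT_FILE)
        return (SKETCH_ERROR);

    fprintf(stderr, "block checksum mismatch: fatal\n");
    return (SKETCH_PANIC);
}
```

With the flag clear, a mismatching block yields SKETCH_PANIC; after setting the flag on the same session, the same block yields SKETCH_ERROR, which is what allows a diagnostic command to keep going.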

Layer 2 — Cursor iteration: Even with Layer 1 in place, a corrupt page causes cursor->next() / cursor->prev() to return an error, which current iteration loops (e.g. dump_all_records in util_dump.c) treat as fatal. Each command's iteration loop needs to distinguish between WT_NOTFOUND (end of data), WT_ERROR/EIO in read-corrupt mode (skip and continue), and other errors (abort). This mirrors the pattern in bt_vrfy.c where verify's traversal catches __wt_page_in errors and continues.
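
A rough sketch of the three-way classification described above, written as standalone C so it compiles without WiredTiger. The SKETCH_NOTFOUND and SKETCH_ERROR values are placeholders for WT_NOTFOUND and WT_ERROR, the array of return codes stands in for successive cursor->next() results, and all sketch_ names are hypothetical. How a real cursor is repositioned after a failed next() is not covered by this issue and is deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values standing in for WT_NOTFOUND and WT_ERROR. */
#define SKETCH_NOTFOUND (-31803)
#define SKETCH_ERROR (-31802)

enum sketch_action { SKETCH_EMIT, SKETCH_STOP_OK, SKETCH_SKIP, SKETCH_ABORT };

/* Decide what an iteration loop should do with one return code. */
static enum sketch_action
sketch_classify(int ret, bool read_corrupt)
{
    if (ret == 0)
        return (SKETCH_EMIT); /* record available: print it */
    if (ret == SKETCH_NOTFOUND)
        return (SKETCH_STOP_OK); /* end of data: normal termination */
    if (read_corrupt && (ret == SKETCH_ERROR || ret == EIO))
        return (SKETCH_SKIP); /* corrupt page in read-corrupt mode */
    return (SKETCH_ABORT); /* anything else stays fatal */
}

struct sketch_result {
    size_t emitted; /* records that would have been printed */
    size_t skipped; /* corrupt reads that were stepped over */
    int ret;        /* 0, or the error that aborted the loop */
};

/* Walk a scripted sequence of return codes as if calling cursor->next(). */
static struct sketch_result
sketch_iterate(const int *rets, size_t n, bool read_corrupt)
{
    struct sketch_result r = {0, 0, 0};

    for (size_t i = 0; i < n; i++) {
        switch (sketch_classify(rets[i], read_corrupt)) {
        case SKETCH_EMIT:
            r.emitted++;
            break;
        case SKETCH_SKIP:
            r.skipped++;
            break;
        case SKETCH_STOP_OK:
            return (r);
        case SKETCH_ABORT:
            r.ret = rets[i];
            return (r);
        }
    }
    return (r);
}
```

A caller could treat skipped > 0 as a reason to exit non-zero after printing partial output, which matches the behaviour the test plan asks for.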

Implementation Options

Option A — Global CLI flag

Add a flag to util_main.c (e.g. a new free letter at the global level). After the session is opened, if the flag is set, call F_SET((WT_SESSION_IMPL *)session, WT_SESSION_QUIET_CORRUPT_FILE) before dispatching to the subcommand. Each targeted subcommand still needs its iteration loop updated to continue past errors (Layer 2).

Pro: single point of change for the session flag; automatically applies to any current or future subcommand.
Con: -c is already a per-command flag in wt verify, so a different letter must be chosen at the global level. Does not remove the need to update each command's iteration loop.
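
As a rough illustration of Option A, the sketch below scans only the options that precede the subcommand name and sets the session flag once, before dispatch. It is standalone C with hypothetical names throughout: the flag letter is passed in as a parameter because the issue does not choose one, SKETCH_F_SET imitates the F_SET call mentioned above, and the scanner ignores global options that take arguments, so it is not a model of how util_main.c actually parses its command line.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not WiredTiger's real definitions. */
#define SKETCH_SESSION_QUIET_CORRUPT_FILE 0x1u
#define SKETCH_F_SET(p, mask) ((p)->flags |= (mask))

struct sketch_session {
    uint32_t flags;
};

/*
 * Scan argument-less global options that appear before the subcommand.
 * Sets *read_corrupt if -<letter> is present and returns the index of the
 * subcommand name (or argc if there is none). Options after the subcommand
 * are left for the subcommand itself to parse.
 */
static int
sketch_parse_global(int argc, char *const argv[], char letter, bool *read_corrupt)
{
    int i;

    *read_corrupt = false;
    for (i = 1; i < argc && argv[i][0] == '-'; i++)
        if (argv[i][1] == letter && argv[i][2] == '\0')
            *read_corrupt = true;
    return (i);
}

/* Apply the global choice to the session once, before dispatching. */
static void
sketch_prepare_session(struct sketch_session *session, bool read_corrupt)
{
    if (read_corrupt)
        SKETCH_F_SET(session, SKETCH_SESSION_QUIET_CORRUPT_FILE);
}
```

Because scanning stops at the subcommand name, a -c that follows verify is never seen by the global parser, which is why the two flags can coexist as long as the global letter differs.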

Option B — Per-subcommand flag (matches verify's existing `-c` pattern)

Add -c to each targeted subcommand. Each command sets WT_SESSION_QUIET_CORRUPT_FILE on the session and updates its iteration loop to continue past errors.

Pro: consistent with wt verify -c; each command is self-contained.
Con: changes required in 6+ files (util_dump.c, util_read.c, util_stat.c, util_list.c, util_page.c, util_printlog.c).

Test Plan

Write a Python suite test that creates a table, corrupts a page on disk, then verifies that wt dump -c, wt read -c, and wt stat -c produce partial output and exit non-zero rather than crashing or producing no output.
Confirm wt verify -c behaviour is unchanged.
Confirm that without -c, commands still abort on the first corrupt page.

related to

WT-17946 Disaggregated read corruption handler panics during verify (missing WT_BTREE_VERIFY check) - Closed
WT-17947 Disaggregated block read corruption logging ignores read_corrupt/verify modes - Closed

Assignee: Dylan Liang
Reporter: Sean Watt
Votes: 0
Watchers: 2

Created: Apr 29 2026 04:38:28 AM UTC
Updated: Jun 30 2026 07:57:23 AM UTC
Resolved: Jun 26 2026 05:35:39 AM UTC
