-
Type: Task
-
Resolution: Done
-
Priority: Minor - P4
-
Affects Version/s: None
-
Component/s: Test Format
-
None
-
Storage Engines - Transactions
-
115.952
-
SE Transactions - 2026-06-05
-
1
Problem
When a mirror mismatch is detected in test/format, cursor_dump_page calls __wt_debug_cursor_page directly without holding page locks. In disagg mode=switch, the cursor's ref can be evicted between cursor positioning and when the debug function dereferences it, causing a SIGSEGV. The process dies before writing anything to the FAIL.pagedump.* file, so no diagnostic information is captured.
The failure in WT-17709 shows this: the log prints "dumping to FAIL.pagedump.1" (the message is printed before the actual I/O begins), then the process exits with FORMAT_FAILED_TO_KILL_PARENT_THREAD (exit 117) and no dump file is left behind. The second cursor dump is never attempted either.
Fix
Install SIGSEGV/SIGBUS handlers around the debug call using sigaction; use sigsetjmp/siglongjmp to recover if the dump crashes, allowing the caller to continue collecting the second cursor dump and other diagnostics. The dump file uses line-buffered I/O so all complete lines written before the crash are available for triage.
The change also replaces sys/wait.h with setjmp.h in format.h.