- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Checkpoints, Reconciliation
- None
- Storage Engines - Persistence
- 63.806
- SE Persistence backlog
- None
Issue Summary
A SIGABRT was observed during the WiredTiger format stress test, specifically in the disagg_switch_roles test. The process terminated with the message "reconciliation failed after building the disk image". According to the stack trace, the failure occurs during a checkpoint operation.
Context
- The failure log can be found here: Spruce Task Log
- The stack trace shows the abort is triggered in __wt_abort due to a panic raised in __reconcile at line 452 of rec_write.c (frames #3 and #4 below):
#3 0x00007ff8d6cb2390 in __wt_panic_func (session=session@entry=0x13f2bfeb26f0, error=error@entry=22, func=func@entry=0x7ff8d6dc3320 <__PRETTY_FUNCTION__.66> "__reconcile", line=line@entry=452, category=category@entry=WT_VERB_DEFAULT, fmt=fmt@entry=0x7ff8d6d68a58 "reconciliation failed after building the disk image") at /data/mci/65f95453e6ee28814da89265fd51ba40/wiredtiger/src/support/err.c:633
#4 0x00007ff8d6c5129c in __reconcile (session=session@entry=0x13f2bfeb26f0, ref=ref@entry=0x13f2b9bfe780, salvage=salvage@entry=0x0, flags=flags@entry=132, page_lockedp=page_lockedp@entry=0x7ffd75cf73df) at /data/mci/65f95453e6ee28814da89265fd51ba40/wiredtiger/src/reconcile/rec_write.c:452
- etienne.petrel@mongodb.com noted that this error has occurred in the past, but perhaps not recently.
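For triage, the two frames quoted in this ticket can be picked apart mechanically. The sketch below is only an illustration, not part of WiredTiger or its test tooling: `FRAME_RE` and `parse_frames` are hypothetical names, and the regular expression assumes the GDB frame layout seen here (`#N 0xADDR in FUNC (ARGS) at PATH:LINE`). It also assumes the `error` argument is a standard system errno value, in which case Python's `errno` module can name it.

```python
import errno
import re

# The two frames from this ticket, copied verbatim.
TRACE = '''
#3 0x00007ff8d6cb2390 in __wt_panic_func (session=session@entry=0x13f2bfeb26f0, error=error@entry=22, func=func@entry=0x7ff8d6dc3320 <__PRETTY_FUNCTION__.66> "__reconcile", line=line@entry=452, category=category@entry=WT_VERB_DEFAULT, fmt=fmt@entry=0x7ff8d6d68a58 "reconciliation failed after building the disk image") at /data/mci/65f95453e6ee28814da89265fd51ba40/wiredtiger/src/support/err.c:633
#4 0x00007ff8d6c5129c in __reconcile (session=session@entry=0x13f2bfeb26f0, ref=ref@entry=0x13f2b9bfe780, salvage=salvage@entry=0x0, flags=flags@entry=132, page_lockedp=page_lockedp@entry=0x7ffd75cf73df) at /data/mci/65f95453e6ee28814da89265fd51ba40/wiredtiger/src/reconcile/rec_write.c:452
'''

# Hypothetical pattern for one GDB frame: "#N 0xADDR in FUNC (ARGS) at PATH:LINE".
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x[0-9a-f]+\s+in\s+(?P<func>\w+)\s+"
    r"\((?P<args>.*?)\)\s+at\s+(?P<path>\S+):(?P<line>\d+)"
)


def parse_frames(trace):
    """Return one dict per frame with function, file name, line, and errno name."""
    frames = []
    for m in FRAME_RE.finditer(trace):
        # Look for an "error=" argument, with or without GDB's "@entry" marker.
        err = re.search(r"\berror=(?:error@entry=)?(-?\d+)", m["args"])
        code = int(err.group(1)) if err else None
        frames.append({
            "index": int(m["index"]),
            "func": m["func"],
            "file": m["path"].rsplit("/", 1)[-1],
            "line": int(m["line"]),
            # Name the code only if it is a known system errno value.
            "errno": errno.errorcode.get(code) if code is not None else None,
        })
    return frames


for frame in parse_frames(TRACE):
    print(frame)
```

Run against the trace above, this reports `__wt_panic_func` at err.c:633 and `__reconcile` at rec_write.c:452, and names the panic's `error=22` argument as EINVAL. That only identifies the error code passed to the panic; it does not by itself explain why reconciliation failed.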
Proposed Solution
- Investigate the root cause of the reconciliation failure in __reconcile during the checkpoint operation in the format stress test.
- Review recent changes to reconciliation and checkpoint code paths for potential regressions.
- If this is a known intermittent issue, document the frequency and any known workarounds or fixes.
Original Slack thread: Slack Thread
This ticket was generated by AI from a Slack thread.