- Type: Bug
- Resolution: Unresolved
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: None
- Storage Engines, Storage Engines - Persistence
- SE Persistence - 2025-07-04
We just spotted a failure in __wt_page_inmem:410, which resulted in a panic while updating the history store WiredTigerHS.wt:
__wt_page_inmem:410:encountered an illegal file format or internal value: 0x0
Stack trace:
src/mongo/util/assert_util.cpp:76:56: mongo::(anonymous namespace)::callAbort()
src/mongo/util/assert_util.cpp:222:14: mongo::fassertFailedWithLocation(int, char const*, unsigned int)
src/mongo/util/assert_util.h:344:34: mongo::fassertWithLocation(int, bool, char const*, unsigned int)
src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:741:9: mongo::(anonymous namespace)::mdb_handle_error_with_startup_suppression(__wt_event_handler*, __wt_session*, int, char const*) (.cold)
src/third_party/wiredtiger/src/support/err.c:450:15: __eventv
src/third_party/wiredtiger/src/support/err.c:552:5: __wt_panic_func
src/third_party/wiredtiger/src/btree/bt_page.c:410:17: __wt_page_inmem
src/third_party/wiredtiger/src/btree/bt_read.c:197:5: __page_read
src/third_party/wiredtiger/src/btree/bt_read.c:307:13: __wt_page_in_func
src/third_party/wiredtiger/src/include/btree_inline.h:2238:11: __wt_page_swap_func.part.0
src/third_party/wiredtiger/src/btree/bt_walk.c:460:19: __wt_page_swap_func
src/third_party/wiredtiger/src/btree/bt_walk.c:460:19: __tree_walk_internal
src/third_party/wiredtiger/src/btree/bt_curnext.c:930:13: __wt_btcur_next
src/third_party/wiredtiger/src/cursor/cur_file.c:188:5: __curfile_next
src/third_party/wiredtiger/src/cursor/cur_hs.c:130:5: __curhs_file_cursor_next
src/third_party/wiredtiger/src/cursor/cur_hs.c:245:5: __curhs_next
src/third_party/wiredtiger/src/history/hs_rec.c:191:13: __hs_insert_record
src/third_party/wiredtiger/src/history/hs_rec.c:704:17: __wt_hs_insert_updates
src/third_party/wiredtiger/src/reconcile/rec_write.c:2693:13: __rec_hs_wrapup
src/third_party/wiredtiger/src/reconcile/rec_write.c:2428:15: __rec_write_wrapup
src/third_party/wiredtiger/src/reconcile/rec_write.c:322:5: __reconcile
src/third_party/wiredtiger/src/reconcile/rec_write.c:95:11: __wt_reconcile
src/third_party/wiredtiger/src/evict/evict_page.c:888:9: __evict_reconcile
src/third_party/wiredtiger/src/evict/evict_page.c:272:9: __wt_evict
src/third_party/wiredtiger/src/evict/evict_lru.c:2403:5: __evict_page
src/third_party/wiredtiger/src/evict/evict_lru.c:1163:20: __evict_lru_pages
src/third_party/wiredtiger/src/evict/evict_lru.c:340:9: __wt_evict_thread_run
src/third_party/wiredtiger/src/support/thread_group.c:31:9: __thread_run
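For context, the panic fires when the page loader encounters a dsk->type it does not recognize (0x0 here). The following is a simplified, hypothetical sketch of that kind of dispatch, with made-up type names and struct fields; it is not the actual WiredTiger source:
{code:c}
/*
 * Simplified, hypothetical sketch of a page-type dispatch that rejects an
 * unrecognized on-disk type such as 0x0. Names and values are illustrative
 * and are not the real WiredTiger definitions.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_INVALID 0x0 /* never a legal on-disk page type */
#define PAGE_COL_VAR 0x1
#define PAGE_ROW_INT 0x2
#define PAGE_ROW_LEAF 0x3

struct disk_header {
    uint8_t type; /* page type stored in the disk image */
};

static int
page_inmem(const struct disk_header *dsk)
{
    switch (dsk->type) {
    case PAGE_COL_VAR:
    case PAGE_ROW_INT:
    case PAGE_ROW_LEAF:
        /* ... build the in-memory page from the disk image ... */
        return (0);
    default:
        /* Unknown type: the real code panics here with the message seen above. */
        fprintf(stderr,
            "encountered an illegal file format or internal value: 0x%x\n",
            (unsigned)dsk->type);
        return (-1);
    }
}

int
main(void)
{
    struct disk_header bad = { .type = PAGE_INVALID };
    return (page_inmem(&bad) == 0 ? 0 : 1);
}
{code}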
This happened on MongoDB 7.0.20. Please refer to the linked ticket for more details about this cluster.
It seems that WT read a page with an invalid dsk->type in the disk image, even though the disk image itself must have passed checksum validation. It is therefore possible that WT wrote the page incorrectly in the first place. Another possibility is memory corruption, e.g., something overwrote the disk image either just before it was written out, or after it was read back in and had passed checksum validation.
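To make that timing argument concrete: a checksum only covers the bytes as they existed when it was computed or verified, so a block can verify cleanly at read time and still present a bogus type field later if something scribbles on the buffer in between (or, symmetrically, the block can be corrupted before the checksum was ever computed at write time). A hypothetical sketch with a toy checksum and made-up structure names:
{code:c}
/*
 * Hypothetical illustration (toy checksum, made-up names; not WiredTiger
 * code): a block whose checksum verifies at read time can still show an
 * invalid type later if memory is overwritten after verification.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct block {
    uint8_t type;       /* must be non-zero for a valid page */
    uint8_t payload[7];
    uint32_t checksum;  /* covers type + payload only */
};

static uint32_t
sum_bytes(const void *p, size_t len)
{
    const uint8_t *b = p;
    uint32_t s = 0;
    while (len-- > 0)
        s = s * 31 + *b++;
    return (s);
}

int
main(void)
{
    struct block blk = { .type = 0x3 /* some valid type */ };
    blk.checksum = sum_bytes(&blk, offsetof(struct block, checksum));

    /* Read path: verification passes on the intact image. */
    assert(blk.checksum == sum_bytes(&blk, offsetof(struct block, checksum)));

    /* A stray write lands on the header *after* verification... */
    blk.type = 0x0;

    /* ...so the later type check sees 0x0 despite the valid checksum. */
    if (blk.type == 0x0)
        printf("illegal page type 0x%x despite a verified checksum\n",
            (unsigned)blk.type);
    return (0);
}
{code}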
Raising this to P2 for visibility until we can get it triaged, as it could be indicative of data corruption.