Failure in __wt_page_inmem: "encountered an illegal file format or internal value: 0x0"


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical - P2
    • Affects Version/s: None
    • Component/s: None
    • Storage Engines, Storage Engines - Persistence
    • SE Persistence - 2025-07-04

      We just spotted a failure in __wt_page_inmem (bt_page.c:410) that caused a panic while updating the history store, WiredTigerHS.wt:

      __wt_page_inmem:410:encountered an illegal file format or internal value: 0x0
      

      Stack trace:

      src/mongo/util/assert_util.cpp:76:56: mongo::(anonymous namespace)::callAbort()
      src/mongo/util/assert_util.cpp:222:14: mongo::fassertFailedWithLocation(int, char const*, unsigned int)
      src/mongo/util/assert_util.h:344:34: mongo::fassertWithLocation(int, bool, char const*, unsigned int)
      src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:741:9: mongo::(anonymous namespace)::mdb_handle_error_with_startup_suppression(__wt_event_handler*, __wt_session*, int, char const*) (.cold)
      src/third_party/wiredtiger/src/support/err.c:450:15: __eventv
      src/third_party/wiredtiger/src/support/err.c:552:5: __wt_panic_func
      src/third_party/wiredtiger/src/btree/bt_page.c:410:17: __wt_page_inmem
      src/third_party/wiredtiger/src/btree/bt_read.c:197:5: __page_read
      src/third_party/wiredtiger/src/btree/bt_read.c:307:13: __wt_page_in_func
      src/third_party/wiredtiger/src/include/btree_inline.h:2238:11: __wt_page_swap_func.part.0
      src/third_party/wiredtiger/src/btree/bt_walk.c:460:19: __wt_page_swap_func
      src/third_party/wiredtiger/src/btree/bt_walk.c:460:19: __tree_walk_internal
      src/third_party/wiredtiger/src/btree/bt_curnext.c:930:13: __wt_btcur_next
      src/third_party/wiredtiger/src/cursor/cur_file.c:188:5: __curfile_next
      src/third_party/wiredtiger/src/cursor/cur_hs.c:130:5: __curhs_file_cursor_next
      src/third_party/wiredtiger/src/cursor/cur_hs.c:245:5: __curhs_next
      src/third_party/wiredtiger/src/history/hs_rec.c:191:13: __hs_insert_record
      src/third_party/wiredtiger/src/history/hs_rec.c:704:17: __wt_hs_insert_updates
      src/third_party/wiredtiger/src/reconcile/rec_write.c:2693:13: __rec_hs_wrapup
      src/third_party/wiredtiger/src/reconcile/rec_write.c:2428:15: __rec_write_wrapup
      src/third_party/wiredtiger/src/reconcile/rec_write.c:322:5: __reconcile
      src/third_party/wiredtiger/src/reconcile/rec_write.c:95:11: __wt_reconcile
      src/third_party/wiredtiger/src/evict/evict_page.c:888:9: __evict_reconcile
      src/third_party/wiredtiger/src/evict/evict_page.c:272:9: __wt_evict
      src/third_party/wiredtiger/src/evict/evict_lru.c:2403:5: __evict_page
      src/third_party/wiredtiger/src/evict/evict_lru.c:1163:20: __evict_lru_pages
      src/third_party/wiredtiger/src/evict/evict_lru.c:340:9: __wt_evict_thread_run
      src/third_party/wiredtiger/src/support/thread_group.c:31:9: __thread_run
      

      This happened on MongoDB 7.0.20. Please refer to the linked ticket for more details about this cluster.

      WT appears to have read a page whose disk image contains an invalid dsk->type, even though the disk image must have passed checksum validation. One possibility is that WT wrote the page incorrectly in the first place. Another is memory corruption: something may have overwritten the disk image just before it was written, or after it was read and had already passed checksum validation.
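
The reasoning above can be illustrated with a minimal sketch. It uses a hypothetical toy block layout (one type byte, a 4-byte checksum, then the payload) and zlib.crc32 as a stand-in checksum; neither is WiredTiger's actual page header or checksum routine, and the type codes are made up. It only shows why a bad type byte combined with a passing checksum points at a bad write or at in-memory corruption rather than at on-disk bit rot.

```python
import struct
import zlib

# Hypothetical toy layout, NOT WiredTiger's real page header:
#   1 byte page type | 4 bytes little-endian checksum | payload
VALID_TYPES = {1, 2, 3}          # made-up type codes; 0 is treated as illegal
HEADER = struct.Struct("<BI")


def _checksum(page_type, payload):
    # zlib.crc32 is only a stand-in for the engine's real checksum.
    return zlib.crc32(bytes([page_type]) + payload) & 0xFFFFFFFF


def write_block(page_type, payload):
    """Serialize a block; the checksum covers whatever bytes we are handed."""
    return HEADER.pack(page_type, _checksum(page_type, payload)) + payload


def read_block(block):
    """Return (checksum_ok, type_ok, page_type) for a serialized block."""
    page_type, stored = HEADER.unpack_from(block)
    payload = block[HEADER.size:]
    checksum_ok = stored == _checksum(page_type, payload)
    return checksum_ok, page_type in VALID_TYPES, page_type


payload = b"history store records"

# Case 1: the type byte is already 0 when the checksum is computed
# (incorrect write, or memory overwritten just before the write).
bad_at_write = read_block(write_block(0, payload))

# Case 2: the block is written and verified correctly, then the in-memory
# copy is overwritten after checksum validation.
buf = bytearray(write_block(2, payload))
verified = read_block(bytes(buf))[0]
buf[0] = 0                       # corruption after verification
type_ok_after = buf[0] in VALID_TYPES

# Case 3: the stored bytes change after a correct write (bit rot).
rotted = bytearray(write_block(2, payload))
rotted[0] = 0
bit_rot = read_block(bytes(rotted))

print("bad at write:", bad_at_write)
print("corrupted after verify:", (verified, type_ok_after))
print("bit rot:", bit_rot)
```

In cases 1 and 2 the checksum passes while the type check fails, matching the two hypotheses in this ticket. In case 3 the checksum itself fails, which is why plain on-disk corruption is a less likely explanation for this particular panic.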

      Raising this to P2 for visibility until it is triaged, since it could indicate data corruption.

            Assignee:
            Yury Ershov
            Reporter:
            Peter Macko
            Votes:
            0
            Watchers:
            6