WiredTiger / WT-2204

core dump during checkpoint

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • WT2.7.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None

      I just saw a core dump in format that concerns me; I've attached the CONFIG.

      Some caveats: this was in the wt-2182-split-race branch, with various edits, so this may not be real. I've also been running this particular CONFIG for most of a week and this is the first time I've seen it trigger, so feel free to discard this ticket; it may not be worth chasing.

      Here's the information:

      #0  0x00000000004de394 in __sync_file (session=0x80351ea00, syncop=1)
          at src/btree/bt_sync.c:155
      #1  0x00000000004dde81 in __wt_cache_op (session=0x80351ea00, 
          ckptbase=0x80cd49c00, op=1) at src/btree/bt_sync.c:277
      #2  0x00000000004a1de5 in __checkpoint_worker (session=0x80351ea00, 
          cfg=0x7ffffddeedc0, is_checkpoint=true, need_tracking=true)
          at src/txn/txn_ckpt.c:1100
      #3  0x00000000004a0f3e in __wt_checkpoint (session=0x80351ea00, 
          cfg=0x7ffffddeedc0) at src/txn/txn_ckpt.c:1191
      #4  0x00000000004a35cc in __checkpoint_apply (session=0x80351ea00, 
          cfg=0x7ffffddeedc0, op=0x4a0e80 <__wt_checkpoint>)
          at src/txn/txn_ckpt.c:184
      #5  0x00000000004a065d in __txn_checkpoint (session=0x80351ea00, 
          cfg=0x7ffffddeedc0) at src/txn/txn_ckpt.c:507
      #6  0x000000000049f893 in __wt_txn_checkpoint (session=0x80351ea00, 
          cfg=0x7ffffddeedc0) at src/txn/txn_ckpt.c:668
      #7  0x000000000048b921 in __session_checkpoint (wt_session=0x80351ea00, 
          config=0x0) at src/session/session_api.c:1067
      #8  0x0000000000408334 in ops (arg=0x80347cdc0) at ops.c:367
      
      (gdb) l
      151				 * Mark the tree dirty: the checkpoint marked it clean
      152				 * and we can't skip future checkpoints until this page
      153				 * is written.
      154				 */
      155				if (!WT_PAGE_IS_INTERNAL(page) &&
      156				    F_ISSET(txn, WT_TXN_HAS_SNAPSHOT) &&
      157				    WT_TXNID_LT(txn->snap_max, mod->first_dirty_txn)) {
      158					__wt_page_modify_set(session, page);
      159					continue;
      (gdb) p page->type
      $6 = 4 '\004'
      (gdb) p/x txn->flags
      $7 = 0x4c
      (gdb) p mod
      $8 = (WT_PAGE_MODIFY *) 0x0
      (gdb) p page->modify
      $9 = (WT_PAGE_MODIFY *) 0x8101c3740
      (gdb) p txn->snap_max
      $10 = 880303
      (gdb) p page->modify->first_dirty_txn
      $11 = 882711
      
      (gdb) l 131,137
      131				page = walk->page;
      132				mod = page->modify;
      133	
      134				/* Skip clean pages. */
      135				if (!__wt_page_is_modified(page))
      136					continue;
      

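      So, what happened is that we took a copy of page->modify at line 132 while it was still NULL, then called __wt_page_is_modified at line 135, and between those two instructions the page was modified. The check therefore passed, and the test that starts at line 155 dereferenced the stale NULL copy in mod->first_dirty_txn (mod is 0x0 in the gdb output above, while page->modify is now set).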

      Obviously, we could take the copy of page->modify after calling __wt_page_is_modified, but I wanted to be sure nothing else is going on.
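
To make the two orderings concrete, here is a minimal single-threaded sketch. It is not WiredTiger code: the struct layouts, `page_is_modified`, `racing_writer` and the `race_at` parameter are hypothetical stand-ins, and the concurrent modification is simulated by calling `racing_writer` at a chosen step. It also assumes that page->modify is never reset to NULL once it has been set, and it ignores the memory-ordering concerns a real multi-threaded fix would have to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for WT_PAGE_MODIFY and WT_PAGE. */
struct page_modify {
    uint64_t first_dirty_txn;
};

struct page {
    struct page_modify *modify; /* NULL until the page is first modified */
};

/* Simplified dirty test: in this model any modify structure means dirty. */
static bool
page_is_modified(const struct page *page)
{
    return (page->modify != NULL);
}

/* Simulates the page being modified by someone else at a chosen step. */
static struct page_modify racing_mod = { 882711 };

static void
racing_writer(struct page *page)
{
    page->modify = &racing_mod;
}

/*
 * Ordering from the listing: copy first, check second. race_at selects when
 * the simulated modification lands: 0 = before both steps, 1 = between them,
 * 2 = never. Returns true if the caller would go on to dereference *modp.
 */
static bool
copy_then_check(struct page *page, int race_at, struct page_modify **modp)
{
    if (race_at == 0)
        racing_writer(page);
    *modp = page->modify; /* corresponds to line 132 */
    if (race_at == 1)
        racing_writer(page);
    return (page_is_modified(page)); /* corresponds to line 135 */
}

/* Proposed ordering: check first, copy second. */
static bool
check_then_copy(struct page *page, int race_at, struct page_modify **modp)
{
    bool modified;

    *modp = NULL;
    if (race_at == 0)
        racing_writer(page);
    modified = page_is_modified(page);
    if (race_at == 1)
        racing_writer(page);
    if (!modified)
        return (false); /* skip the page; the copy is never used */
    *modp = page->modify;
    return (true);
}

/* Self-check: only the copy-first ordering can proceed with a NULL copy. */
void
check_orderings(void)
{
    for (int race_at = 0; race_at <= 2; ++race_at) {
        struct page racy = { NULL }, fixed = { NULL };
        struct page_modify *mod;

        bool proceed = copy_then_check(&racy, race_at, &mod);
        assert((proceed && mod == NULL) == (race_at == 1));

        if (check_then_copy(&fixed, race_at, &mod))
            assert(mod != NULL);
    }
}
```

In this model, the copy-first ordering proceeds with a NULL copy only when the modification lands between the two steps, which matches the state seen in the core. With the check-first ordering, a modification in that window just means the page is skipped on this pass.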

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            keith.bostic@mongodb.com Keith Bostic (Inactive)