Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-2195

Hang closing file after truncate

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • WT2.7.0
    • Affects Version/s: None
    • Component/s: None
    • None

      The Jenkins python long unit test hung:
      http://build.wiredtiger.com:8080/job/wiredtiger-test-unit-long/904/

      The call stack is:

      (gdb) where
      #0  __wt_tree_walk (session=0x4c478d0, refp=0x7ffd900e2608, walkcntp=0x0,
          flags=13) at ../src/btree/bt_walk.c:364
      #1  0x00007f3e539c1fda in __wt_evict_file (session=0x4c478d0, syncop=2)
          at ../src/evict/evict_file.c:34
      #2  0x00007f3e5397238a in __wt_cache_op (session=0x4c478d0,
          ckptbase=0x131935a0, op=2) at ../src/btree/bt_sync.c:262
      #3  0x00007f3e53a32272 in __checkpoint_worker (session=0x4c478d0, cfg=0x0,
          is_checkpoint=false, need_tracking=false) at ../src/txn/txn_ckpt.c:1102
      #4  0x00007f3e53a3287b in __wt_checkpoint_close (session=0x4c478d0,
          final=false) at ../src/txn/txn_ckpt.c:1271
      #5  0x00007f3e539931c5 in __wt_conn_btree_sync_and_close (session=0x4c478d0,
          final=false, force=false) at ../src/conn/conn_dhandle.c:199
      #6  0x00007f3e5399400c in __wt_conn_dhandle_close_all (session=0x4c478d0,
          uri=0x7f3e50dbc08c "file:test_truncate", force=false)
          at ../src/conn/conn_dhandle.c:570
      #7  0x00007f3e53a0d5ba in __drop_file (session=0x4c478d0,
          uri=0x7f3e50dbc08c "file:test_truncate", force=false, cfg=0x7ffd900e29f0)
          at ../src/schema/schema_drop.c:32
      #8  0x00007f3e53a0dc2c in __wt_schema_drop (session=0x4c478d0,
          uri=0x7f3e50dbc08c "file:test_truncate", cfg=0x7ffd900e29f0)
          at ../src/schema/schema_drop.c:180
      #9  0x00007f3e53a1cc32 in __wt_session_drop (session=0x4c478d0,
          uri=0x7f3e50dbc08c "file:test_truncate", cfg=0x7ffd900e29f0)
          at ../src/session/session_api.c:587
      #10 0x00007f3e53a1ce1f in __session_drop (wt_session=0x4c478d0,
          uri=0x7f3e50dbc08c "file:test_truncate", config=0x0)
          at ../src/session/session_api.c:609
      #11 0x00007f3e53c87880 in _wrap_Session_drop (self=<optimized out>,
          args=<optimized out>) at wiredtiger_wrap.c:6355
      

      There are no other active threads. The issue is that the __wt_tree_walk code is constantly re-starting it's search due to the root page containing a number of refs in the WT_REF_SPLIT state:

      (gdb) p *$14
      $15 = {entries = 11, deleted_entries = 2, index = 0x2ea6960}
      (gdb) p *$14->index[0]
      $18 = {page = 0x0, home = 0x290b0d0, pindex_hint = 0, state = 5, addr = 0x0,
        key = {recno = 4294967379, ikey = 0x100000053}, page_del = 0x0}
      (gdb) p *$14->index[1]
      $20 = {page = 0x0, home = 0x290b0d0, pindex_hint = 1, state = 5,
        addr = 0x4cb5525, key = {recno = 4294967401, ikey = 0x100000069},
        page_del = 0x13177a30}
      (gdb) p *$14->index[2]
      $21 = {page = 0x0, home = 0x290b0d0, pindex_hint = 2, state = 5,
        addr = 0x4cb5530, key = {recno = 4294967423, ikey = 0x10000007f},
        page_del = 0x22ba750}
      (gdb) p *$14->index[3]
      $22 = {page = 0x0, home = 0x290b0d0, pindex_hint = 3, state = 5,
        addr = 0x4cb553b, key = {recno = 4294967445, ikey = 0x100000095},
        page_del = 0x2184220}
      (gdb) p *$14->index[4]
      $23 = {page = 0x0, home = 0x290b0d0, pindex_hint = 4, state = 5,
        addr = 0x4cb5546, key = {recno = 4294967467, ikey = 0x1000000ab},
        page_del = 0x1f451c0}
      (gdb) p *$14->index[5]
      $24 = {page = 0x0, home = 0x290b0d0, pindex_hint = 5, state = 5,
        addr = 0x4cb5551, key = {recno = 4294967489, ikey = 0x1000000c1},
        page_del = 0x43e8d30}
      (gdb) p *$14->index[6]
      $25 = {page = 0x0, home = 0x290b0d0, pindex_hint = 6, state = 5,
        addr = 0x4cb555c, key = {recno = 4294967511, ikey = 0x1000000d7},
        page_del = 0x2083e20}
      (gdb) p *$14->index[7]
      $26 = {page = 0x0, home = 0x290b0d0, pindex_hint = 7, state = 5,
        addr = 0x4cb5568, key = {recno = 8589934829, ikey = 0x2000000ed},
        page_del = 0x1eee330}
      (gdb) p *$14->index[8]
      $27 = {page = 0x0, home = 0x290b0d0, pindex_hint = 8, state = 5,
        addr = 0x4cb5574, key = {recno = 8589934853, ikey = 0x200000105},
        page_del = 0x2e36f50}
      (gdb) p *$14->index[9]
      $28 = {page = 0x0, home = 0x290b0d0, pindex_hint = 9, state = 5,
        addr = 0x4cb557f, key = {recno = 8589934875, ikey = 0x20000011b},
        page_del = 0x1d597d0}
      (gdb) p *$14->index[10]
      $29 = {page = 0x0, home = 0x290b0d0, pindex_hint = 10, state = 2, addr = 0x0,
        key = {recno = 8589934899, ikey = 0x200000133}, page_del = 0x0}
      

      I haven't attempted to reproduce yet. I'll try that next.

            Assignee:
            Unassigned Unassigned
            Reporter:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: