  1. WiredTiger
  2. WT-4376

Fix a bug where table index open can race

    • Storage Engines 2018-11-05

      Hi!

       

      After upgrading from WT 2.9.1 to WT 3.1.0, one of our applications started to crash with the following stacks:

      #0  0x00007f0b2c800cb2 in __wt_row_search (session=0x7f0b28998d80, srch_key=0x7f0b1de34a00, leaf=0x0, cbt=0x7f0b1de34900, insert=true, restore=false)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/row_srch.c:226
      226             btree = S2BT(session);
      #0  0x00007f0b2c800cb2 in __wt_row_search (session=0x7f0b28998d80, srch_key=0x7f0b1de34a00, leaf=0x0, cbt=0x7f0b1de34900, insert=true, restore=false)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/row_srch.c:226
      #1  0x00007f0b2c7c905e in __wt_btcur_insert (cbt=0x7f0b1de34900) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/bt_cursor.c:385
      #2  0x00007f0b2c83c75d in __curfile_insert (cursor=0x7f0b1de34900) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_file.c:265
      #3  0x00007f0b2c85c149 in __curextract_insert (cursor=0x7f0b1f9f82e0) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:73
      #4  0x00007f0b2ca914bc in tbricks::storage::StorageWTBackend::SecIndex::s_extract (extractor=0x7f0b288bf700, session=<optimized out>, key=<optimized out>, value=<optimized out>, result_cursor=0x7f0b1f9f82e0)
          at src/StorageWTBackend.cpp:2945
      #5  0x00007f0b2c85bdde in __wt_apply_single_idx (session=0x7f0b28999600, idx=0x7f0b1b04b140, cur=<optimized out>, ctable=0x7f0b1d238000, f=<optimized out>)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:121
      #6  0x00007f0b2c860fec in __apply_idx (ctable=0x7f0b1d238000, func_off=120, skip_immutable=false) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:162
      #7  __curtable_insert (cursor=0x7f0b1d238000) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:541
      #8  0x00007f0b2ca9ec91 in insert (this=<optimized out>) at src/StorageWTBackend.cpp:370
      #9  put_i (this=<optimized out>, cursor=..., key=0x7f0b1d20d010, keySize=16, data=0x7f0b1d20f000, dataSize=225) at src/StorageWTBackend.cpp:3685
      #10 tbricks::storage::StorageWTBackend::put (this=0x7f0b28816300, key=0x7f0b1d20d010, keySize=16, data=0x7f0b1d20f000, dataSize=225, overwrite=<optimized out>, txnMode=tbricks::storage::Default) at src/StorageWTBackend.cpp:3824
      <application-specific frames skipped>
      

      or

      Program terminated with signal 11, Segmentation fault.
      #0  0x00007f4f104f31b5 in __wt_hazard_set (session=0x7f4f0c599a80, ref=0x7f4efee6ddc0, busyp=0x7f4f035f7f18) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/support/hazard.c:80
      80              if (F_ISSET(S2BT(session), WT_BTREE_IN_MEMORY))
      (gdb) bt
      #0  0x00007f4f104f31b5 in __wt_hazard_set (session=0x7f4f0c599a80, ref=0x7f4efee6ddc0, busyp=0x7f4f035f7f18) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/support/hazard.c:80
      #1  0x00007f4f103e11b8 in __wt_page_in_func (session=0x7f4f0c599a80, ref=0x7f4efee6ddc0, flags=1024) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/bt_read.c:686
      #2  0x00007f4f10408f0d in __wt_page_swap_func (prev_race=false, flags=1024, session=<optimized out>, held=<optimized out>, want=<optimized out>)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/include/btree.i:1751
      #3  __wt_row_search (session=0x7f4f0c599a80, srch_key=0x7f4efe834d00, leaf=<optimized out>, cbt=<optimized out>, insert=true, restore=<optimized out>)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/row_srch.c:446
      #4  0x00007f4f103d005e in __wt_btcur_insert (cbt=0x7f4efe834c00) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/btree/bt_cursor.c:385
      #5  0x00007f4f1044375d in __curfile_insert (cursor=0x7f4efe834c00) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_file.c:265
      #6  0x00007f4f10463149 in __curextract_insert (cursor=0x7f4f035f82e0) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:73
      #7  0x00007f4f106984bc in tbricks::storage::StorageWTBackend::SecIndex::s_extract (extractor=0x7f4f0c4bf690, session=<optimized out>, key=<optimized out>, value=<optimized out>, result_cursor=0x7f4f035f82e0)
          at src/StorageWTBackend.cpp:2945
      #8  0x00007f4f10462dde in __wt_apply_single_idx (session=0x7f4f0c598540, idx=0x7f4eff04b140, cur=<optimized out>, ctable=0x7f4eff238000, f=<optimized out>)
          at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:121
      #9  0x00007f4f10467fec in __apply_idx (ctable=0x7f4eff238000, func_off=120, skip_immutable=false) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:162
      #10 __curtable_insert (cursor=0x7f4eff238000) at /tb/builds/thd/sbn/development/src/thirdparty/wiredtiger/3.1.0/src/src/cursor/cur_table.c:541
      #11 0x00007f4f106a5c91 in insert (this=<optimized out>) at src/StorageWTBackend.cpp:370
      #12 put_i (this=<optimized out>, cursor=..., key=0x7f4eff20d010, keySize=16, data=0x7f4eff20f000, dataSize=225) at src/StorageWTBackend.cpp:3685
      #13 tbricks::storage::StorageWTBackend::put (this=0x7f4f0c416300, key=0x7f4eff20d010, keySize=16, data=0x7f4eff20f000, dataSize=225, overwrite=<optimized out>, txnMode=tbricks::storage::Default) at src/StorageWTBackend.cpp:3824
      <application-specific frames skipped>
      

      These stacks are actually from a small reproducer that simply inserts records into a single table from 10 threads.
      The table has 10 indices, however, and all the crashes we saw occurred while updating an index table.

      The crash does not reproduce often, and it occurs only in optimized builds.
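
      For readers without the attachment, the shape of the workload can be sketched as below. This is not put_idx_crash.c and it does not use the WiredTiger API: `struct table`, `struct index_handle`, `table_insert` and `run_workload` are hypothetical stand-ins. The report does not say where the race is, so the lazily opened, mutex-guarded index handles are only an assumption made to illustrate the class of bug named in the title (several threads reaching a table's indices for the first time at once).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define N_THREADS 10 /* the reproducer inserts from 10 threads */
#define N_INDICES 10 /* the table in the reproducer has 10 indices */

/* Hypothetical stand-in for an index handle; not a WiredTiger type. */
struct index_handle {
    bool opened;
    long entries;
};

/* Hypothetical stand-in for a table whose indices are opened on first use. */
struct table {
    pthread_mutex_t lock;
    bool indices_open;
    int open_calls; /* number of index opens actually performed */
    struct index_handle idx[N_INDICES];
};

struct worker_arg {
    struct table *t;
    int n_records;
};

void table_init(struct table *t) {
    pthread_mutex_init(&t->lock, NULL);
    t->indices_open = false;
    t->open_calls = 0;
    for (int i = 0; i < N_INDICES; i++) {
        t->idx[i].opened = false;
        t->idx[i].entries = 0;
    }
}

/* Open every index exactly once. The caller must hold t->lock. */
static void open_indices_locked(struct table *t) {
    if (t->indices_open)
        return;
    for (int i = 0; i < N_INDICES; i++) {
        t->idx[i].opened = true;
        t->open_calls++;
    }
    t->indices_open = true;
}

/* Insert one record: update every index (assumed model of the workload). */
void table_insert(struct table *t) {
    /* One coarse lock covers both the lazy open and the update, for clarity. */
    pthread_mutex_lock(&t->lock);
    open_indices_locked(t);
    for (int i = 0; i < N_INDICES; i++) {
        assert(t->idx[i].opened); /* no thread may see a half-opened index */
        t->idx[i].entries++;
    }
    pthread_mutex_unlock(&t->lock);
}

static void *worker(void *p) {
    struct worker_arg *arg = p;
    for (int i = 0; i < arg->n_records; i++)
        table_insert(arg->t);
    return NULL;
}

/* Start N_THREADS inserters and wait for them; returns 0 on success. */
int run_workload(struct table *t, int n_records) {
    pthread_t tid[N_THREADS];
    struct worker_arg arg = { t, n_records };
    int started = 0;

    for (int i = 0; i < N_THREADS; i++) {
        if (pthread_create(&tid[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == N_THREADS ? 0 : -1;
}
```

      Calling `table_init` and then `run_workload` from a `main` (built with `-pthread`) runs the ten inserters. If the lock around `open_indices_locked` were dropped, two threads could both observe `indices_open == false` and open the same handles concurrently, which is the kind of check-then-open race this sketch is meant to show.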

        1. put_idx_crash.c
          3 kB
        2. wt-4376.diff
          3 kB

            Assignee: keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter: Dmitri Shubin
            Votes: 0
            Watchers: 10
