WiredTiger / WT-4119

Avoid restarts updating / removing during a column store scan

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.9, 4.0.3, 4.1.3, WT3.2.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Sprint: Storage Non-NYC 2018-06-18, Storage Non-NYC 2018-07-02, Storage Non-NYC 2018-07-16, Storage Engines 2018-07-30, Storage Engines 2018-08-13, Storage Engines 2018-08-27

      format timeout with the cache stuck; one thread is stuck in truncate.


      The test failure was on PPC, but it reproduces for me on x86. It's a column-store failure with a large number of worker threads (29) and a relatively small cache (44), but at the time of the failure only one thread is still running, and it's processing a truncate. All of the other threads have exited their worker mode.

      Thread 9 (Thread 0x7f3e5e7e4700 (LWP 15364)):
      #0  0x00007f3e75b1a3c7 in _int_malloc () from /lib64/libc.so.6
      #1  0x00007f3e75b1db64 in calloc () from /lib64/libc.so.6
      #2  0x0000000000452719 in __wt_calloc (session=0x7f3e77304520, number=1,
          size=160, retp=0x7f3e5e7e0f70) at src/os_common/os_alloc.c:52
      #3  0x000000000050279c in __wt_col_modify (session=0x7f3e77304520,
          cbt=0x7f3e5e7e1040, recno=578343, value=0x0, upd_arg=0x7f3e0c2ec0e0,
          modify_type=0, exclusive=true) at src/btree/col_modify.c:134
      #4  0x00000000004f0feb in __split_multi_inmem (session=0x7f3e77304520,
          orig=0x7f3e0c33dd10, multi=0x7f3e0c381810, ref=0x7f3e0c182f40)
          at src/btree/bt_split.c:1506
      #5  0x00000000004f2ebd in __wt_split_rewrite (session=0x7f3e77304520,
          ref=0xf56270, multi=0x7f3e0c381810) at src/btree/bt_split.c:2297
      #6  0x0000000000442727 in __evict_page_dirty_update (session=0x7f3e77304520,
          ref=0xf56270, closing=false) at src/evict/evict_page.c:375
      #7  0x0000000000441fc7 in __wt_evict (session=0x7f3e77304520, ref=0xf56270,
          closing=false) at src/evict/evict_page.c:214
      #8  0x000000000044191f in __wt_page_release_evict (session=0x7f3e77304520,
          ref=0xf56270) at src/evict/evict_page.c:85
      #9  0x00000000004df4af in __wt_page_in_func (session=0x7f3e77304520,
          ref=0xf56270, flags=1024,
          file=0x5cb080 <__func__.16268> "__wt_col_search", line=195)
          at src/btree/bt_read.c:731
      #10 0x0000000000503cb2 in __wt_page_swap_func (session=0x7f3e77304520,
          held=0xafc0f8, want=0xf56270, prev_race=false, flags=1024,
          file=0x5cb080 <__func__.16268> "__wt_col_search", line=195)
          at ./src/include/btree.i:1751
      #11 0x0000000000504895 in __wt_col_search (session=0x7f3e77304520,
          search_recno=580895, leaf=0x0, cbt=0x7f3e0c0518a0, restore=false)
          at src/btree/col_srch.c:194
      #12 0x00000000005908bb in __cursor_col_search (session=0x7f3e77304520,
          cbt=0x7f3e0c0518a0, leaf=0x0) at src/btree/bt_cursor.c:370
      #13 0x0000000000590fd6 in __wt_btcur_search (cbt=0x7f3e0c0518a0)
          at src/btree/bt_cursor.c:530
      #14 0x0000000000593bc9 in __cursor_truncate (session=0x7f3e77304520,
          start=0x7f3e0c0518a0, stop=0x7f3e0c05c2b0,
          rmfunc=0x590a10 <__cursor_col_modify>) at src/btree/bt_cursor.c:1681
      #15 0x0000000000593f7b in __wt_btcur_range_truncate (start=0x7f3e0c0518a0,
          stop=0x7f3e0c05c2b0) at src/btree/bt_cursor.c:1798
      #16 0x000000000057b4d4 in __wt_schema_range_truncate (session=0x7f3e77304520,
          start=0x7f3e0c0518a0, stop=0x7f3e0c05c2b0)
          at src/schema/schema_truncate.c:145
      #17 0x000000000048c09a in __wt_session_range_truncate (session=0x7f3e77304520,
          uri=0x0, start=0x7f3e0c0518a0, stop=0x7f3e0c05c2b0)
          at src/session/session_api.c:1433
      #18 0x000000000048c9c9 in __session_truncate (wt_session=0x7f3e77304520,
          uri=0x0, start=0x7f3e0c0518a0, stop=0x7f3e0c05c2b0, config=0x0)
          at src/session/session_api.c:1507
      #19 0x000000000040b4a5 in col_truncate (tinfo=0xf576b0, cursor=0x7f3e0c0518a0)
          at ops.c:1761
      #20 0x0000000000409210 in ops (arg=0xf576b0) at ops.c:1031
      #21 0x00007f3e76721e25 in start_thread () from /lib64/libpthread.so.0
      #22 0x00007f3e75b9534d in clone () from /lib64/libc.so.6

            michael.cahill@mongodb.com Michael Cahill (Inactive)
            keith.bostic@mongodb.com Keith Bostic (Inactive)