WiredTiger / WT-1915

Segfault running wtperf LSM workload

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: WT2.6.0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      There was a Jenkins test failure running the medium-multi-lsm wtperf workload:

      http://build.wiredtiger.com:8080/job/wiredtiger-perf-med-multi-lsm/941/console

      The failure doesn't reproduce immediately. The error output is:

      ../../../bench/wtperf/runners/wtperf_run.sh: line 147: 16623 Segmentation fault      (core dumped) LD_PRELOAD=/usr/lib64/libjemalloc.so.1 LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib ./wtperf -O $wttest
      

      The stack trace of the crashing thread is:

      #0  0x0000000000419084 in __wt_cursor_init ()
      #1  0x000000000048e9ce in __wt_curfile_create ()
      #2  0x000000000048ebc9 in __wt_curfile_open ()
      #3  0x000000000044c568 in __wt_open_cursor ()
      #4  0x000000000049c04e in __clsm_open_cursors ()
      #5  0x00000000004a1b9f in __wt_clsm_init_merge ()
      #6  0x00000000004a2633 in __wt_lsm_merge ()
      #7  0x0000000000427877 in __lsm_worker ()
      #8  0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #9  0x00007f821232eb2d in clone () from /lib64/libc.so.6
      

      There are a number of other threads active concurrently:

      Thread 14 (Thread 0x7f820bffb700 (LWP 16634)):
      #0  0x00007f82123292c7 in ftruncate64 () from /lib64/libc.so.6
      #1  0x000000000042b4be in __wt_ftruncate ()
      #2  0x000000000045e831 in __wt_block_truncate ()
      #3  0x00000000004ccdc8 in __wt_block_checkpoint_unload ()
      #4  0x00000000004b6403 in __bm_checkpoint_unload ()
      #5  0x0000000000461972 in __wt_btree_close () at ../src/btree/bt_handle.c:147
      #6  0x00000000004848d0 in __wt_conn_btree_sync_and_close ()
      #7  0x000000000044de0c in __wt_session_release_btree ()
      #8  0x000000000048ce28 in __curfile_close ()
      #9  0x00000000004a2ac5 in __wt_lsm_merge ()
      #10 0x0000000000427877 in __lsm_worker ()
      #11 0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #12 0x00007f821232eb2d in clone () from /lib64/libc.so.6
      
      Thread 13 (Thread 0x7f820effa700 (LWP 16629)):
      #0  0x00007f8212322fe7 in unlink () from /lib64/libc.so.6
      #1  0x00007f82122b1259 in remove () from /lib64/libc.so.6
      #2  0x000000000042c7f6 in __wt_remove ()
      #3  0x00000000004a3fa8 in __lsm_drop_file ()
      #4  0x00000000004a4b1d in __wt_lsm_free_chunks ()
      #5  0x0000000000427959 in __lsm_worker ()
      #6  0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #7  0x00007f821232eb2d in clone () from /lib64/libc.so.6
      
      
      Thread 11 (Thread 0x7f820dfff700 (LWP 16630)):
      #0  0x00007f82125ff265 in __lll_lock_wait () from /lib64/libpthread.so.0
      #1  0x00007f82125fadc1 in _L_lock_816 () from /lib64/libpthread.so.0
      #2  0x00007f82125facc7 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #3  0x000000000044e5b7 in __wt_session_get_btree ()
      #4  0x000000000044e732 in __wt_session_get_btree_ckpt ()
      #5  0x000000000048ecbd in __wt_curfile_open ()
      #6  0x000000000044c568 in __wt_open_cursor ()
      #7  0x00000000004b8f5b in __wt_bloom_hash_get ()
      #8  0x00000000004b8fc8 in __wt_bloom_get ()
      #9  0x00000000004a2bad in __wt_lsm_merge ()
      #10 0x0000000000427877 in __lsm_worker ()
      #11 0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #12 0x00007f821232eb2d in clone () from /lib64/libc.so.6
      
      Thread 8 (Thread 0x7f820d7fe700 (LWP 16631)):
      #0  0x00007f82125ff265 in __lll_lock_wait () from /lib64/libpthread.so.0
      #1  0x00007f82125fadc1 in _L_lock_816 () from /lib64/libpthread.so.0
      #2  0x00007f82125facc7 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #3  0x000000000044b04f in __session_create ()
      #4  0x00000000004b8cdb in __wt_bloom_finalize ()
      #5  0x00000000004a456f in __wt_lsm_work_bloom ()
      #6  0x0000000000427995 in __lsm_worker ()
      #7  0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #8  0x00007f821232eb2d in clone () from /lib64/libc.so.6
      
      Thread 7 (Thread 0x7f820c7fc700 (LWP 16633)):
      #0  0x00007f8212f248df in bitmap_sfu (arena=0x7f8211c51ac0, tbin=0x7f8202806088, binind=3, prof_accumbytes=<value optimized out>)
          at include/jemalloc/internal/bitmap.h:137
      #1  arena_run_reg_alloc (arena=0x7f8211c51ac0, tbin=0x7f8202806088, binind=3, prof_accumbytes=<value optimized out>) at src/arena.c:325
      #2  arena_tcache_fill_small (arena=0x7f8211c51ac0, tbin=0x7f8202806088, binind=3, prof_accumbytes=<value optimized out>) at src/arena.c:1348
      #3  0x00007f8212f3d6ff in tcache_alloc_small_hard (tcache=<value optimized out>, tbin=0x7f8202806088, binind=<value optimized out>) at src/tcache.c:72
      #4  0x00007f8212f1d85a in tcache_alloc_small (num=<value optimized out>, size=<value optimized out>) at include/jemalloc/internal/tcache.h:302
      #5  arena_malloc (num=<value optimized out>, size=<value optimized out>) at include/jemalloc/internal/arena.h:916
      #6  icallocx (num=<value optimized out>, size=<value optimized out>) at include/jemalloc/internal/jemalloc_internal.h:800
      #7  icalloc (num=<value optimized out>, size=<value optimized out>) at include/jemalloc/internal/jemalloc_internal.h:809
      #8  calloc (num=<value optimized out>, size=<value optimized out>) at src/jemalloc.c:1079
      #9  0x0000000000429d60 in __wt_calloc ()
      #10 0x0000000000465eb3 in __wt_page_alloc ()
      #11 0x0000000000465fec in __wt_page_inmem ()
      #12 0x0000000000468568 in __wt_cache_read ()
      #13 0x0000000000465933 in __wt_page_in_func ()
      #14 0x000000000047866d in __wt_tree_walk ()
      #15 0x00000000004b9a86 in __wt_btcur_next ()
      #16 0x000000000048c9fa in __curfile_next ()
      #17 0x000000000049db36 in __clsm_next ()
      #18 0x00000000004a4542 in __wt_lsm_work_bloom ()
      #19 0x0000000000427995 in __lsm_worker ()
      #20 0x00007f82125f8f18 in start_thread () from /lib64/libpthread.so.0
      #21 0x00007f821232eb2d in clone () from /lib64/libc.so.6
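
To see at a glance which LSM work units were running concurrently, the thread dump can be grouped by the function that `__lsm_worker` called in each thread. The following is a minimal sketch, not part of WiredTiger or wtperf: the names `DUMP`, `THREAD_RE`, `FRAME_RE` and `work_units` are hypothetical helpers, and `DUMP` is an abbreviated copy of the traces above that keeps only the innermost frame and the frames around `__lsm_worker`.

```python
import re
from collections import defaultdict

# Abbreviated copy of the thread dump from this ticket (assumption: only the
# frames needed to identify each thread's LSM work unit are kept).
DUMP = """\
Thread 14 (Thread 0x7f820bffb700 (LWP 16634)):
#0  0x00007f82123292c7 in ftruncate64 () from /lib64/libc.so.6
#9  0x00000000004a2ac5 in __wt_lsm_merge ()
#10 0x0000000000427877 in __lsm_worker ()
Thread 13 (Thread 0x7f820effa700 (LWP 16629)):
#0  0x00007f8212322fe7 in unlink () from /lib64/libc.so.6
#4  0x00000000004a4b1d in __wt_lsm_free_chunks ()
#5  0x0000000000427959 in __lsm_worker ()
Thread 11 (Thread 0x7f820dfff700 (LWP 16630)):
#0  0x00007f82125ff265 in __lll_lock_wait () from /lib64/libpthread.so.0
#9  0x00000000004a2bad in __wt_lsm_merge ()
#10 0x0000000000427877 in __lsm_worker ()
Thread 8 (Thread 0x7f820d7fe700 (LWP 16631)):
#0  0x00007f82125ff265 in __lll_lock_wait () from /lib64/libpthread.so.0
#5  0x00000000004a456f in __wt_lsm_work_bloom ()
#6  0x0000000000427995 in __lsm_worker ()
Thread 7 (Thread 0x7f820c7fc700 (LWP 16633)):
#0  0x00007f8212f248df in bitmap_sfu (arena=0x7f8211c51ac0)
#18 0x00000000004a4542 in __wt_lsm_work_bloom ()
#19 0x0000000000427995 in __lsm_worker ()
"""

# "Thread 14 (Thread 0x... (LWP 16634)):" header lines.
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# Frame lines, with or without an address: "#3  0x... in func (" or "#1  func (".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")


def work_units(dump):
    """Map each function called directly by __lsm_worker to its thread numbers."""
    frames = defaultdict(list)  # thread number -> function names, innermost first
    current = None
    for line in dump.splitlines():
        line = line.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            frames[current].append(frame.group(2))

    grouped = defaultdict(list)
    for thread, names in frames.items():
        if "__lsm_worker" in names:
            idx = names.index("__lsm_worker")
            if idx > 0:
                # The frame listed just before __lsm_worker is its callee.
                grouped[names[idx - 1]].append(thread)
    return dict(grouped)


if __name__ == "__main__":
    for function, threads in work_units(DUMP).items():
        print(function, threads)
```

On this abbreviated input, threads 14 and 11 group under `__wt_lsm_merge`, thread 13 under `__wt_lsm_free_chunks`, and threads 8 and 7 under `__wt_lsm_work_bloom`. The crashing thread, whose thread number is not shown in the ticket, was also inside `__wt_lsm_merge`.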
      

      The failure happened towards the end of a load phase. The final three lines of WT_TEST/test.stat:

      579597 populate inserts (44372348 of 50000000) in 5 secs (180 total secs)
      379201 populate inserts (44751549 of 50000000) in 5 secs (185 total secs)
      984072 populate inserts (45735621 of 50000000) in 5 secs (190 total secs)
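
To quantify how far the load had progressed, the three lines above can be parsed directly. This is a small sketch that assumes the line format seen in this ticket; `STAT_LINES`, `LINE_RE` and `parse_populate` are hypothetical names, not wtperf APIs.

```python
import re

# The final three lines of WT_TEST/test.stat, copied from this ticket.
STAT_LINES = """\
579597 populate inserts (44372348 of 50000000) in 5 secs (180 total secs)
379201 populate inserts (44751549 of 50000000) in 5 secs (185 total secs)
984072 populate inserts (45735621 of 50000000) in 5 secs (190 total secs)
"""

LINE_RE = re.compile(
    r"(\d+) populate inserts \((\d+) of (\d+)\) in (\d+) secs \((\d+) total secs\)"
)


def parse_populate(text):
    """Return one dict per populate line with progress and per-interval rate."""
    rows = []
    for match in LINE_RE.finditer(text):
        inserts, done, target, interval, total = map(int, match.groups())
        rows.append({
            "inserts": inserts,           # inserts during this interval
            "done": done,                 # cumulative inserts so far
            "target": target,             # total inserts the load aims for
            "total_secs": total,          # elapsed time at this sample
            "pct": 100.0 * done / target, # percentage of the load completed
            "rate": inserts / interval,   # inserts per second in this interval
        })
    return rows


if __name__ == "__main__":
    for row in parse_populate(STAT_LINES):
        print(f"{row['total_secs']}s: {row['pct']:.2f}% done, "
              f"{row['rate']:.1f} inserts/sec")
```

The last sample puts the load at roughly 91.5% complete (45735621 of 50000000 inserts) after 190 seconds, with 4264379 inserts still outstanding when the crash occurred.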
      

            Assignee: Alexander Gorrod (alexander.gorrod@mongodb.com)
            Reporter: Alexander Gorrod (alexander.gorrod@mongodb.com)
            Votes: 0
            Watchers: 4
