Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-3009

Test format hung/aborted with small cache (sub 10mB)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.9.0, 3.2.11, 3.4.0-rc3
    • Labels:
      None
    • Sprint:
      Storage 2016-11-21

      Description

      Test format has become hung on zSeries.

      Config is as follows

      ############################################
      #  RUN PARAMETERS
      ############################################
      abort=0
      auto_throttle=1
      backups=0
      bitcnt=8
      bloom=1
      bloom_bit_count=17
      bloom_hash_count=16
      bloom_oldest=0
      cache=3
      checkpoints=1
      checksum=uncompressed
      chunk_size=7
      compaction=0
      compression=zlib
      data_extend=0
      data_source=table
      delete_pct=43
      dictionary=0
      direct_io=0
      encryption=rotn-7
      evict_max=4
      file_type=row-store
      firstfit=0
      huffman_key=0
      huffman_value=0
      in_memory=0
      insert_pct=29
      internal_key_truncation=1
      internal_page_max=14
      isolation=random
      key_gap=8
      key_max=77
      key_min=18
      leaf_page_max=9
      leak_memory=0
      logging=0
      logging_archive=0
      logging_compression=none
      logging_prealloc=1
      long_running_txn=0
      lsm_worker_threads=4
      merge_max=12
      mmap=1
      ops=100000
      prefix_compression=1
      prefix_compression_min=7
      quiet=1
      repeat_data_pct=58
      reverse=0
      rows=100000
      runs=100
      rebalance=1
      salvage=1
      split_pct=49
      statistics=0
      statistics_server=0
      threads=3
      timer=20
      transaction-frequency=91
      value_max=2664
      value_min=9
      verify=1
      wiredtiger_config=
      write_pct=28
      ############################################
      

      Dump of cache shows that cache is 100% full

      ==========
      cache dump
      file:wt.wt(<live>):
      	internal: 32 pages, 1MB, 0/32 clean/dirty pages, 0/1 clean/dirty MB, 0MB max page, 0MB max dirty page
      	leaf: 748 pages, 1MB, 1/747 clean/dirty pages, 0/1 clean/dirty MB, 0MB max page, 0MB max dirty page
      file:WiredTigerLAS.wt(<live>):
      	internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      file:WiredTiger.wt(<live>):
      	internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      	leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      cache dump: total found = 2MB vs tracked inuse 2MB
      total dirty bytes = 2MB
      ==========
      

      Full stack from GDB

      (gdb) thread apply all bt
       
      Thread 9 (Thread 0x3ff43fff910 (LWP 48146)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae1626d0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff43ffed8f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae1626d0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003a6b0 in __evict_lru_pages (session=0xae1626d0, is_server=false) at ../src/evict/evict_lru.c:863
      #4  0x0000000080038988 in __wt_evict_thread_run (session=0xae1626d0, thread=0xae1c85b0) at ../src/evict/evict_lru.c:265
      #5  0x00000000800bd83c in __wt_thread_run (arg=0xae1c85b0) at ../src/support/thread_group.c:25
      #6  0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #7  0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 8 (Thread 0x3ff8e5ff910 (LWP 48147)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae1636d0, cond=0xae1c8a10, usecs=10000000, signalled=0x3ff8e5fef0f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x00000000800292dc in __wt_cond_wait (session=0xae1636d0, cond=0xae1c8a10, usecs=10000000) at ../src/include/misc.i:18
      #3  0x0000000080029ed0 in __sweep_server (arg=0xae1636d0) at ../src/conn/conn_sweep.c:272
      #4  0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #5  0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 7 (Thread 0x3ff8ddff910 (LWP 48148)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae162ad0, cond=0xae161030, usecs=1000, signalled=0x3ff8ddfec3f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae162ad0, cond=0xae161030, usecs=1000) at ../src/include/misc.i:18
      #3  0x0000000080039b18 in __evict_pass (session=0xae162ad0) at ../src/evict/evict_lru.c:651
      #4  0x0000000080038cf6 in __evict_server (session=0xae162ad0, did_work=0x3ff8ddfef0b) at ../src/evict/evict_lru.c:318
      #5  0x00000000800388b6 in __wt_evict_thread_run (session=0xae162ad0, thread=0xae1c8490) at ../src/evict/evict_lru.c:254
      #6  0x00000000800bd83c in __wt_thread_run (arg=0xae1c8490) at ../src/support/thread_group.c:25
      #7  0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #8  0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 6 (Thread 0x3ff8cdff910 (LWP 48149)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae162ed0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff8cdfed8f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae162ed0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003a6b0 in __evict_lru_pages (session=0xae162ed0, is_server=false) at ../src/evict/evict_lru.c:863
      #4  0x0000000080038988 in __wt_evict_thread_run (session=0xae162ed0, thread=0xae1cf430) at ../src/evict/evict_lru.c:265
      #5  0x00000000800bd83c in __wt_thread_run (arg=0xae1cf430) at ../src/support/thread_group.c:25
      #6  0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #7  0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 5 (Thread 0x3ff757ff910 (LWP 48150)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae1632d0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff757fed8f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae1632d0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003a6b0 in __evict_lru_pages (session=0xae1632d0, is_server=false) at ../src/evict/evict_lru.c:863
      #4  0x0000000080038988 in __wt_evict_thread_run (session=0xae1632d0, thread=0xae1b52c0) at ../src/evict/evict_lru.c:265
      #5  0x00000000800bd83c in __wt_thread_run (arg=0xae1b52c0) at ../src/support/thread_group.c:25
      #6  0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #7  0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 4 (Thread 0x3ff8fdff910 (LWP 48151)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae163ad0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff8fdfd6d7) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae163ad0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003d54e in __wt_cache_eviction_worker (session=0xae163ad0, busy=true, pct_full=104) at ../src/evict/evict_lru.c:1862
      #4  0x000000008019528c in __wt_cache_eviction_check (session=0xae163ad0, busy=true, didworkp=0x0) at ../src/include/cache.i:393
      #5  0x000000008019693a in __cursor_enter (session=0xae163ad0) at ../src/include/cursor.i:56
      #6  0x0000000080196a6e in __curfile_enter (cbt=0x3ff68dcf210) at ../src/include/cursor.i:90
      #7  0x0000000080196cae in __cursor_func_init (cbt=0x3ff68dcf210, reenter=true) at ../src/include/cursor.i:263
      #8  0x0000000080199700 in __wt_btcur_update (cbt=0x3ff68dcf210) at ../src/btree/bt_cursor.c:773
      #9  0x000000008013df6a in __curfile_update (cursor=0x3ff68dcf210) at ../src/cursor/cur_file.c:284
      #10 0x000000008000a9c8 in row_update (cursor=0x3ff68dcf210, key=0x3ff8fdfdf10, value=0x3ff8fdfdee8, keyno=50579) at ../../../test/format/ops.c:1053
      #11 0x0000000080009ad2 in ops (arg=0xae2c2980) at ../../../test/format/ops.c:666
      #12 0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #13 0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 3 (Thread 0x3ff8f5ff910 (LWP 48152)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae163ed0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff8f5fd6d7) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae163ed0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003d54e in __wt_cache_eviction_worker (session=0xae163ed0, busy=false, pct_full=104) at ../src/evict/evict_lru.c:1862
      #4  0x000000008019528c in __wt_cache_eviction_check (session=0xae163ed0, busy=false, didworkp=0x0) at ../src/include/cache.i:393
      #5  0x000000008019693a in __cursor_enter (session=0xae163ed0) at ../src/include/cursor.i:56
      #6  0x0000000080196a6e in __curfile_enter (cbt=0x3ff68d82190) at ../src/include/cursor.i:90
      #7  0x0000000080196cae in __cursor_func_init (cbt=0x3ff68d82190, reenter=true) at ../src/include/cursor.i:263
      #8  0x0000000080199700 in __wt_btcur_update (cbt=0x3ff68d82190) at ../src/btree/bt_cursor.c:773
      #9  0x000000008013df6a in __curfile_update (cursor=0x3ff68d82190) at ../src/cursor/cur_file.c:284
      #10 0x000000008000a9c8 in row_update (cursor=0x3ff68d82190, key=0x3ff8f5fdf10, value=0x3ff8f5fdee8, keyno=88337) at ../../../test/format/ops.c:1053
      #11 0x0000000080009ad2 in ops (arg=0xae2c2a80) at ../../../test/format/ops.c:666
      #12 0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #13 0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      ---Type <return> to continue, or q <return> to quit---
      Thread 2 (Thread 0x3ff8edff910 (LWP 48153)):
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae1642d0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff8edfd57f) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae1642d0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003d54e in __wt_cache_eviction_worker (session=0xae1642d0, busy=true, pct_full=104) at ../src/evict/evict_lru.c:1862
      #4  0x00000000800f12b4 in __wt_cache_eviction_check (session=0xae1642d0, busy=true, didworkp=0x0) at ../src/include/cache.i:393
      #5  0x00000000800f4982 in __wt_page_in_func (session=0xae1642d0, ref=0x3ff691b7f60, flags=780, file=0x801cb8fc "../src/btree/bt_walk.c", line=518) at ../src/btree/bt_read.c:521
      #6  0x0000000080117fbe in __wt_page_swap_func (session=0xae1642d0, held=0x3ff68c95760, want=0x3ff691b7f60, flags=780, file=0x801cb8fc "../src/btree/bt_walk.c", line=518) at ../src/include/btree.i:1392
      #7  0x0000000080118e20 in __tree_walk_internal (session=0xae1642d0, refp=0x3ff6927fb98, walkcntp=0x0, skipleafcntp=0x0, flags=520) at ../src/btree/bt_walk.c:517
      #8  0x00000000801191a0 in __wt_tree_walk (session=0xae1642d0, refp=0x3ff6927fb98, flags=512) at ../src/btree/bt_walk.c:665
      #9  0x000000008018e83c in __wt_btcur_next (cbt=0x3ff6927fa50, truncating=false) at ../src/btree/bt_curnext.c:671
      #10 0x000000008013c340 in __curfile_next (cursor=0x3ff6927fa50) at ../src/cursor/cur_file.c:113
      #11 0x000000008000a6c0 in nextprev (cursor=0x3ff6927fa50, next=1) at ../../../test/format/ops.c:931
      #12 0x0000000080009cee in ops (arg=0xae2c2b80) at ../../../test/format/ops.c:707
      #13 0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #14 0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
       
      Thread 1 (Thread 0x3ff90775710 (LWP 46408)):
      #0  0x000003ff901be5f0 in ?? () from /lib64/libc.so.6
      #1  0x000003ff901f4c7e in usleep () from /lib64/libc.so.6
      #2  0x000000008000817c in wts_ops (lastrun=0) at ../../../test/format/ops.c:172
      #3  0x000000008000c792 in main (argc=0, argv=0x3ffebc7f510) at ../../../test/format/t.c:216
      (gdb) thread 3
      [Switching to thread 3 (Thread 0x3ff8f5ff910 (LWP 48152))]
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      (gdb) frame 3
      #3  0x000000008003d54e in __wt_cache_eviction_worker (session=0xae163ed0, busy=false, pct_full=104) at ../src/evict/evict_lru.c:1862
      1862				__wt_cond_wait(
      (gdb) bt
      #0  0x000003ff9048c436 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x00000000800687c0 in __wt_cond_wait_signal (session=0xae163ed0, cond=0xae1bcb60, usecs=10000, signalled=0x3ff8f5fd6d7) at ../src/os_posix/os_mtx_cond.c:71
      #2  0x0000000080037f74 in __wt_cond_wait (session=0xae163ed0, cond=0xae1bcb60, usecs=10000) at ../src/include/misc.i:18
      #3  0x000000008003d54e in __wt_cache_eviction_worker (session=0xae163ed0, busy=false, pct_full=104) at ../src/evict/evict_lru.c:1862
      #4  0x000000008019528c in __wt_cache_eviction_check (session=0xae163ed0, busy=false, didworkp=0x0) at ../src/include/cache.i:393
      #5  0x000000008019693a in __cursor_enter (session=0xae163ed0) at ../src/include/cursor.i:56
      #6  0x0000000080196a6e in __curfile_enter (cbt=0x3ff68d82190) at ../src/include/cursor.i:90
      #7  0x0000000080196cae in __cursor_func_init (cbt=0x3ff68d82190, reenter=true) at ../src/include/cursor.i:263
      #8  0x0000000080199700 in __wt_btcur_update (cbt=0x3ff68d82190) at ../src/btree/bt_cursor.c:773
      #9  0x000000008013df6a in __curfile_update (cursor=0x3ff68d82190) at ../src/cursor/cur_file.c:284
      #10 0x000000008000a9c8 in row_update (cursor=0x3ff68d82190, key=0x3ff8f5fdf10, value=0x3ff8f5fdee8, keyno=88337) at ../../../test/format/ops.c:1053
      #11 0x0000000080009ad2 in ops (arg=0xae2c2a80) at ../../../test/format/ops.c:666
      #12 0x000003ff9048820a in start_thread () from /lib64/libpthread.so.0
      #13 0x000003ff901fda52 in thread_start () from /lib64/libc.so.6
      

        Attachments

          Activity

            People

            • Assignee:
              david.hows David Hows (Inactive)
              Reporter:
              david.hows David Hows (Inactive)
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: