WiredTiger / WT-3024

wtperf medium-lsm-compact test can hang

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • WT2.9.0, 3.2.12, 3.4.0-rc4
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Storage 2016-11-21

      Reproduce with:

      ./bench/wtperf/wtperf -O ../bench/wtperf/runners/medium-lsm-compact.wtperf -o verbose=2
      

      Output:

      Starting 1 populate thread(s) for 50000000 items
      7831255 populate inserts (7831255 of 50000000) in 5 secs (5 total secs)
      7864217 populate inserts (15695472 of 50000000) in 5 secs (10 total secs)
      7865863 populate inserts (23561335 of 50000000) in 5 secs (15 total secs)
      7914062 populate inserts (31475397 of 50000000) in 5 secs (20 total secs)
      7838969 populate inserts (39314366 of 50000000) in 5 secs (25 total secs)
      7811496 populate inserts (47125862 of 50000000) in 5 secs (30 total secs)
      Finished load of 50000000 items
      Load time: 32.05
      load ops/sec: 1560305
      Compact after populate
      Compact completed in 0 seconds
      

      From GDB, the main thread appears to be stuck in wtperf's close_reopen call: __conn_close has called __wt_connection_close, which is spinning in __wt_yield:

      Thread 1 (Thread 0x7f3559a82780 (LWP 32750)):
      #0  0x00007f35589f1b27 in sched_yield () from /lib64/libc.so.6
      #1  0x000000000043e3f5 in __wt_yield () at ../src/os_posix/os_yield.c:18
      #2  0x0000000000418150 in __wt_connection_close (conn=conn@entry=0x15c4000) at ../src/conn/conn_open.c:101
      #3  0x000000000040efa6 in __conn_close (wt_conn=0x15c4000, config=0x0) at ../src/conn/conn_api.c:1089
      #4  0x0000000000409051 in close_reopen (wtperf=0x7fff97ac4600) at ../../../bench/wtperf/wtperf.c:1605
      #5  start_run (wtperf=wtperf@entry=0x7fff97ac4600) at ../../../bench/wtperf/wtperf.c:2229
      #6  0x00000000004051c1 in start_all_runs (wtperf=0x7fff97ac4600) at ../../../bench/wtperf/wtperf.c:2109
      #7  main (argc=<optimized out>, argv=<optimized out>) at ../../../bench/wtperf/wtperf.c:2598
      

      The issue appears to be that txn_global->metadata_pinned (7 in my test) is below both txn_global->current and txn_global->oldest_id (each 9 in my test).
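
      The suspected condition can be sketched in a few lines of C. This is only an illustration: struct txn_global_model, metadata_pin_is_stale and wait_for_metadata_pin are hypothetical names, not WiredTiger code, and it is an assumption that the yield loop in the backtrace waits on a condition of this kind. The sketch bounds the number of spins so that a pin which never advances is reported instead of hanging.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the global transaction state. Only the three
 * fields named in this report are modeled; this is not WiredTiger's struct.
 */
struct txn_global_model {
    uint64_t current;         /* next transaction ID to allocate */
    uint64_t oldest_id;       /* oldest ID still considered running */
    uint64_t metadata_pinned; /* ID pinned on behalf of metadata */
};

/* Return true when the metadata pin is below both current and oldest_id. */
static bool
metadata_pin_is_stale(const struct txn_global_model *g)
{
    assert(g != NULL);
    return (g->metadata_pinned < g->oldest_id &&
        g->metadata_pinned < g->current);
}

/*
 * Yield until the pin is no longer stale, giving up after max_spins yields.
 * Returns true if the pin caught up, false if it was still stale at the end.
 * A real implementation would read shared state with atomic loads; that is
 * omitted here (assumption: another thread is expected to advance the pin).
 */
static bool
wait_for_metadata_pin(const struct txn_global_model *g, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; ++i) {
        if (!metadata_pin_is_stale(g))
            return (true);
        (void)sched_yield();
    }
    return (!metadata_pin_is_stale(g));
}
```

      With the values from this report (metadata_pinned = 7, current = 9, oldest_id = 9) and nothing advancing the pin, metadata_pin_is_stale returns true and wait_for_metadata_pin returns false once the spin bound is reached. An unbounded version of the same loop would yield forever, which would be consistent with the backtrace above.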

            Assignee:
            michael.cahill@mongodb.com Michael Cahill (Inactive)
            Reporter:
            david.hows David Hows
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: