WiredTiger / WT-2560

Stuck trying to update oldest transaction ID

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: WT2.9.0, 3.2.7, 3.3.8
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      There is a hang in test/format on Jenkins. The stack traces show that nearly all 34 threads are in __wt_txn_update_oldest, each called with force=true. A representative call stack:

      Thread 4 (Thread 0x3fff29fff1b0 (LWP 22019)):
      #0  0x00000000100b44a8 in __wt_atomic_subiv32 (vp=0x1000427ef38, v=1)
          at ../src/include/gcc.h:138
      #1  0x00000000100b576c in __wt_txn_update_oldest (session=0x1000428e120,
          force=true) at ../src/txn/txn.c:336
      #2  0x0000000010040a14 in __evict_review (session=0x1000428e120,
          ref=0x10005411740, inmem_splitp=0x3fff29ffd110, closing=false)
          at ../src/evict/evict_page.c:423
      #3  0x000000001003f7dc in __wt_evict (session=0x1000428e120,
          ref=0x10005411740, closing=false) at ../src/evict/evict_page.c:81
      #4  0x000000001003dd9c in __evict_page (session=0x1000428e120, is_server=false)
          at ../src/evict/evict_lru.c:1650
      #5  0x000000001003e0c0 in __wt_cache_eviction_worker (session=0x1000428e120,
          busy=false, pct_full=96) at ../src/evict/evict_lru.c:1729
      #6  0x0000000010184690 in __wt_cache_eviction_check (session=0x1000428e120,
          busy=false, didworkp=0x0) at ../src/include/cache.i:253
      #7  0x0000000010184c58 in __wt_txn_idle_cache_check (session=0x1000428e120)
          at ../src/include/txn.i:312
      #8  0x00000000101863d8 in __cursor_func_init (cbt=0x3fff6803d640, reenter=true)
          at ../src/include/cursor.i:263
      #9  0x0000000010187d8c in __wt_btcur_insert (cbt=0x3fff6803d640)
          at ../src/btree/bt_cursor.c:519
      #10 0x000000001012f0b8 in __curfile_insert (cursor=0x3fff6803d640)
          at ../src/cursor/cur_file.c:245
      

      Reviewing the code in __wt_txn_update_oldest: if scan_count never drops to zero, no thread can ever update the oldest transaction ID. We can therefore reach a state where all threads spin, trying and failing to update the oldest ID.
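      The failure mode described above can be illustrated with a small, single-threaded model. This is only a sketch and not the WiredTiger source: the helper names (scan_begin, try_publish, scan_end) are hypothetical, and it assumes a protocol in which each caller registers as a scanner by incrementing a shared counter and may publish a new oldest ID only while it is the sole scanner. Under that assumption, if scanners always overlap, no attempt to publish ever succeeds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared scan counter (not WiredTiger's struct). */
static atomic_int scan_count;

/* Register as a scanner: increment unless a publisher holds the counter (-1). */
static void scan_begin(void)
{
    int count;
    do {
        count = atomic_load(&scan_count);
    } while (count < 0 ||
        !atomic_compare_exchange_weak(&scan_count, &count, count + 1));
}

/* Assumed rule: publishing succeeds only for the sole scanner (1 -> -1). */
static bool try_publish(void)
{
    int expected = 1;
    if (!atomic_compare_exchange_strong(&scan_count, &expected, -1))
        return false;
    /* ... a real implementation would store the new oldest ID here ... */
    atomic_store(&scan_count, 0);
    return true;
}

/* Leave without publishing. */
static void scan_end(void)
{
    atomic_fetch_sub(&scan_count, 1);
}

/*
 * Deterministic interleaving of two scanners that always overlap: whenever
 * one tries to publish, the other is still registered, so the counter is 2.
 * Returns the number of successful publishes.
 */
int overlapped_publishes(int rounds)
{
    int published = 0;

    atomic_store(&scan_count, 0);
    scan_begin(); /* scanner A enters */
    scan_begin(); /* scanner B enters */
    for (int i = 0; i < rounds; i++) {
        if (try_publish()) /* fails: the other scanner is still in */
            published++;
        scan_end();   /* this scanner gives up ... */
        scan_begin(); /* ... and immediately retries */
    }
    scan_end();
    scan_end();
    return published;
}

/* A lone scanner, by contrast, can publish on its first attempt. */
bool solo_publish(void)
{
    atomic_store(&scan_count, 0);
    scan_begin();
    return try_publish();
}
```

      In this model overlapped_publishes() returns 0 for any number of rounds, while solo_publish() succeeds immediately. With many threads all forcing an update, the overlapping case is the likely one, which would be consistent with the hang seen here.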

            Assignee: michael.cahill@mongodb.com Michael Cahill (Inactive)
            Reporter: alexander.gorrod@mongodb.com Alexander Gorrod
