WiredTiger / WT-909

Pageout assertion failure

    • Type: Task
    • Resolution: Done
    • Fix Version/s: WT2.2
    • Affects Version/s: None
    • Component/s: None
    • Labels:

      There have been several identical failures with test/format in the last few days. The assertion failure is:

      [1394651873:827368][4023:008723da987f0000], t, file:wt.wt, cursor.update: ../src/btree/bt_discard.c, 45: !F_ISSET_ATOMIC(page, WT_PAGE_EVICT_LRU)

      In the Jenkins job wiredtiger-test-format-stress, runs 3577, 3583 and 3585 all failed with this assertion. I'm running the configs from 3585 and 3577 and so far the failure has not reproduced, although three failures in a few days suggests it happens fairly regularly.

      The stack is:

      (gdb) bt
      #0  0x0000003788a328a5 in raise () from /lib64/libc.so.6
      #1  0x0000003788a34085 in abort () from /lib64/libc.so.6
      #2  0x0000000000499ab3 in __wt_abort (session=0x1788ef0)
          at ../src/os_posix/os_abort.c:24
      #3  0x00000000004429c5 in __wt_assert (session=0x1788ef0, error=0,
          file_name=0x667452 "../src/btree/bt_discard.c", line_number=45,
          fmt=0x667414 "%s") at ../src/support/err.c:470
      #4  0x00000000004b1821 in __wt_page_out (session=0x1788ef0,
          pagep=0x7f98da2379d0) at ../src/btree/bt_discard.c:45
      #5  0x00000000004660f6 in __rec_discard_tree (session=0x1788ef0, page=0x0,
          exclusive=0) at ../src/btree/rec_evict.c:273
      #6  0x0000000000465d04 in __wt_rec_evict (session=0x1788ef0,
          page=0x7f98b8045ae0, exclusive=0) at ../src/btree/rec_evict.c:120
      #7  0x000000000044f94c in __wt_evict_page (session=0x1788ef0,
          page=0x7f98b8045ae0) at ../src/btree/bt_evict.c:404
      #8  0x0000000000450fb0 in __wt_evict_lru_page (session=0x1788ef0, is_app=1)
          at ../src/btree/bt_evict.c:1247
      #9  0x00000000004ac26c in __wt_cache_full_check (session=0x1788ef0)
          at ../src/include/cache.i:93
      #10 0x00000000004ac471 in __cursor_enter (session=0x1788ef0)
          at ../src/include/cursor.i:57
      #11 0x00000000004ac54a in __curfile_enter (cbt=0x7f986496c690)
          at ../src/include/cursor.i:94
      #12 0x00000000004ac672 in __cursor_func_init (cbt=0x7f986496c690, reenter=1)
          at ../src/include/cursor.i:141
      #13 0x00000000004ad859 in __wt_btcur_update (cbt=0x7f986496c690)
          at ../src/btree/bt_cursor.c:469
      #14 0x0000000000483e89 in __curfile_update (cursor=0x7f986496c690)
          at ../src/cursor/cur_file.c:262
      #15 0x0000000000410ffa in col_update (cursor=0x7f986496c690,
          key=0x7f98da237e10, value=0x7f98da237de0, keyno=1432)
          at ../../../test/format/ops.c:748
      #16 0x000000000040ffa5 in ops (arg=0x17b3d10) at ../../../test/format/ops.c:377
      #17 0x0000003789207851 in start_thread () from /lib64/libpthread.so.0
      #18 0x0000003788ae767d in clone () from /lib64/libc.so.6

      The page in question:

      (gdb) p/x *page
      $2 = {parent = 0x0, ref = 0x0, u = {intl = {recno = 0x1e15e, 
            t = 0x7f98b8045b38}, row = {d = 0x1e15e, ins = 0x7f98b8045b38, 
            upd = 0x0}, col_fix = {recno = 0x1e15e, bitf = 0x7f98b8045b38}, 
          col_var = {recno = 0x1e15e, d = 0x7f98b8045b38, repeats = 0x0, 
            nrepeats = 0x0}}, dsk = 0x0, modify = 0x7f98b8047380, read_gen = 0x4eb, 
        memory_footprint = 0x552, entries = 0x1b, type = 0x3, flags_atomic = 0x8}

      The WT_PAGE_EVICT_LRU flag is set only in __evict_init_candidate, called from __evict_walk_file, but the page is not in the cache->evict array. So the flag may be left over from an earlier walk, with some code path failing to clear it.
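
To make the suspected failure mode concrete, here is a minimal sketch of the invariant the assertion appears to protect: a page carries the "queued for eviction" flag if and only if it occupies a slot in the eviction array. Everything in it is hypothetical and for illustration only (the types `page_t` and `evict_queue_t`, the functions, the slot count and the flag's bit value); none of it is WiredTiger's actual code. The deliberately buggy removal path drops the page from the array without clearing the flag, which would produce the state described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT WiredTiger's real definitions. */
#define PAGE_EVICT_LRU 0x08u /* bit value chosen for illustration only */
#define EVICT_SLOTS 4

typedef struct {
    uint32_t flags;
} page_t;

typedef struct {
    page_t *slots[EVICT_SLOTS];
} evict_queue_t;

/* Queue a page as a candidate: record the slot and set the flag together. */
static int evict_queue_add(evict_queue_t *q, page_t *p)
{
    for (size_t i = 0; i < EVICT_SLOTS; i++) {
        if (q->slots[i] == NULL) {
            q->slots[i] = p;
            p->flags |= PAGE_EVICT_LRU;
            return 0;
        }
    }
    return -1; /* queue full: the flag is left untouched */
}

/* Correct removal: clear the slot and the flag together. */
static void evict_queue_remove(evict_queue_t *q, page_t *p)
{
    for (size_t i = 0; i < EVICT_SLOTS; i++)
        if (q->slots[i] == p)
            q->slots[i] = NULL;
    p->flags &= ~PAGE_EVICT_LRU;
}

/* Buggy removal: clears the slot but leaves the flag set on the page. */
static void evict_queue_remove_buggy(evict_queue_t *q, page_t *p)
{
    for (size_t i = 0; i < EVICT_SLOTS; i++)
        if (q->slots[i] == p)
            q->slots[i] = NULL;
}

/* Return 1 if the page currently occupies a slot in the queue. */
static int evict_queue_contains(const evict_queue_t *q, const page_t *p)
{
    for (size_t i = 0; i < EVICT_SLOTS; i++)
        if (q->slots[i] == p)
            return 1;
    return 0;
}

/* Discard-time check, written as a predicate instead of an abort. */
static int page_safe_to_discard(const page_t *p)
{
    return (p->flags & PAGE_EVICT_LRU) == 0;
}
```

After `evict_queue_remove_buggy`, `evict_queue_contains` returns 0 while `page_safe_to_discard` also returns 0: the page is absent from the array yet would still trip a discard-time assertion, matching the symptom in this ticket. Whether the real bug has this shape is an open question.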

            michael.cahill@mongodb.com Michael Cahill (Inactive)
            sue.loverso@mongodb.com Susan LoVerso