Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-1048

Attempt to read a freed `WT_REF`

    • Type: Icon: Task Task
    • Resolution: Done
    • WT2.2.1
    • Affects Version/s: None
    • Component/s: None
    • Labels:

      From http://mjc.homeunix.org:8180/job/wiredtiger-test-format-stress/4464/console

      The test/format CONFIG is:

      ############################################
      #  RUN PARAMETERS
      ############################################
      auto_throttle=0
      firstfit=1
      # bitcnt not applicable to this run
      bloom=1
      bloom_bit_count=32
      bloom_hash_count=8
      bloom_oldest=0
      cache=1
      checkpoints=1
      checksum=uncompressed
      chunk_size=2
      compaction=0
      compression=zlib
      data_extend=0
      data_source=table
      delete_pct=1
      dictionary=0
      file_type=row-store
      hot_backups=0
      huffman_key=0
      huffman_value=0
      insert_pct=25
      internal_key_truncation=1
      internal_page_max=10
      isolation=snapshot
      key_gap=15
      key_max=37
      key_min=30
      leaf_page_max=15
      logging=0
      merge_max=14
      merge_threads=2
      mmap=1
      ops=100000
      prefix_compression=1
      prefix_compression_min=8
      repeat_data_pct=60
      reverse=0
      rows=100000
      runs=100
      split_pct=76
      statistics=1
      threads=9
      value_max=3013
      value_min=1
      # wiredtiger_config not applicable to this run
      write_pct=32
      ############################################
      

      The recent change here is to run at snapshot isolation, but I don't immediately see why that would matter.

      The stack is:

      WT-3  0x0000000000446f54 in __wt_illegal_value (session=0x1755340, name=0x0) at ../src/support/err.c:497
      WT-4  0x0000000000460b01 in __wt_page_in_func (session=0x1755340, ref=0x7f3f4c060230, flags=0, file=0x672db0 "../src/btree/row_srch.c", line=284) at ../src/btree/bt_page.c:110
      WT-5  0x000000000048453c in __wt_page_swap_func (session=0x1755340, held=0x1783fb8, want=0x7f3f4c060230, flags=0, file=0x672db0 "../src/btree/row_srch.c", line=284)
          at ../src/include/btree.i:752
      WT-6  0x00000000004851f0 in __wt_row_search (session=0x1755340, srch_key=0x7f3f1c007728, leaf=0x0, cbt=0x7f3f1c007660, insert=0) at ../src/btree/row_srch.c:284
      WT-7  0x00000000004c1a18 in __cursor_row_search (session=0x1755340, cbt=0x7f3f1c007660, insert=0) at ../src/btree/bt_cursor.c:158
      WT-8  0x00000000004c1c7f in __wt_btcur_search (cbt=0x7f3f1c007660) at ../src/btree/bt_cursor.c:230
      WT-9  0x0000000000491265 in __curfile_search (cursor=0x7f3f1c007660) at ../src/cursor/cur_file.c:177
      WT-10 0x00000000004107eb in read_row (cursor=0x7f3f1c007660, key=0x7f3f48df9e10, keyno=14773) at ../../../test/format/ops.c:527
      

      And the ref in question has been freed:

      (gdb) print *ref
      $1 = {page = 0x7f3f4c0602a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
          ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
      

      At frame 5, in __wt_row_search, many of the entries in the index have been freed:

      (gdb) print base
      $5 = 55
      (gdb) print cmp
      $6 = 1
      (gdb) print *pindex
      $7 = {entries = 660, index = 0x7f3f1c098620}
      (gdb) print child
      $8 = (WT_REF *) 0x7f3f4c060230
      (gdb) print pindex->index[54]
      $9 = (WT_REF *) 0x7f3f4c060230
      (gdb) print *pindex->index[53]
      $10 = {page = 0x7f3f4c10a1a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
          ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
      (gdb) print *pindex->index[54]
      $11 = {page = 0x7f3f4c0602a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
          ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
      (gdb) print *pindex->index[55]
      $12 = {page = 0x7f3f4c060220, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
          ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
      

      The page has been split since the search started, because pindex is not equal to page->u.intl.__index. The reading session has a valid snapshot, and has published a snap_min, all of which looks correct.

      What I think must be going on is that *wt_session_fotxn_discard is freeing the split index too early. That is, for some reason the transaction ID we choose becomes visible to all readers during the *wt_row_search. I can't see why at the moment.

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            michael.cahill@mongodb.com Michael Cahill (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: