WiredTiger / WT-3890

core dump walking timestamp queue

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.3, 3.7.2, WT3.1.0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Sprint: Storage 2018-02-12

      format failure, test run #39208:

      http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress-zseries/39208/consoleFull

      It looks like we're attempting to remove a WT_TXN structure from the timestamp queue, but it's not currently linked onto the timestamp queue:

      (gdb) where
      #0  0x00000000800ea564 in __wt_txn_set_commit_timestamp (session=0x9cd65760)
          at ../src/txn/txn_timestamp.c:697
      #1  0x00000000800d272a in __wt_txn_commit (session=0x9cd65760, 
          cfg=0x3ff79ffcd70) at ../src/txn/txn.c:724
      #2  0x00000000800b0e06 in __session_commit_transaction (wt_session=0x9cd65760, 
          config=0x3ff79ffce7c "commit_timestamp=e7e0e")
          at ../src/session/session_api.c:1507
      #3  0x000000008000a34e in commit_transaction (tinfo=0x9dd0ba00, 
          session=0x9cd65760) at ../../../test/format/ops.c:525
      #4  0x000000008000b770 in ops (arg=0x9dd0ba00)
          at ../../../test/format/ops.c:915
      #5  0x000003ff844881f2 in start_thread () from /lib64/libpthread.so.0
      #6  0x000003ff842098da in thread_start () from /lib64/libc.so.6
      (gdb) frame 0
      #0  0x00000000800ea564 in __wt_txn_set_commit_timestamp (session=0x9cd65760)
          at ../src/txn/txn_timestamp.c:697
      697					TAILQ_REMOVE(&txn_global->commit_timestamph,
      (gdb) l
      692		} else {
      693			TAILQ_FOREACH_SAFE(qtxn, &txn_global->commit_timestamph,
      694			    commit_timestampq, txn_tmp) {
      695				if (qtxn->clear_ts_queue) {
      696					qtxn->clear_ts_queue = false;
      697					TAILQ_REMOVE(&txn_global->commit_timestamph,
      698					    qtxn, commit_timestampq);
      699					--txn_global->commit_timestampq_len;
      700					continue;
      701				}
      (gdb) p qtxn.commit_timestampq
      $23 = {tqe_next = 0x0, tqe_prev = 0x0}
      (gdb) p *qtxn.commit_timestampq.tqe_prev
      Cannot access memory at address 0x0
      

      With tqe_prev set to NULL, the *(elm)->field.tqe_prev = TAILQ_NEXT((elm), field); portion of TAILQ_REMOVE dereferences a NULL pointer.
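      The failure mode can be reproduced in isolation. The following is a minimal sketch, not the fix applied for this ticket: struct txn is a hypothetical stand-in for WT_TXN, and the on_queue flag, txn_enqueue, txn_dequeue, txn_sweep and demo are illustrative names that do not appear in the WiredTiger source. It assumes a <sys/queue.h> that provides the TAILQ macros, and it walks the list by hand because not every libc ships TAILQ_FOREACH_SAFE. The idea is that an entry records whether it is linked, so TAILQ_REMOVE is never applied to an element whose tqe_prev is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for WT_TXN: only the fields this sketch needs. */
struct txn {
	bool clear_ts_queue;	/* flagged for removal, as in the listing */
	bool on_queue;		/* assumption: true only while linked */
	TAILQ_ENTRY(txn) commit_timestampq;
};
TAILQ_HEAD(txn_qh, txn);

/* Link an entry at the tail and record that it is linked. */
static void
txn_enqueue(struct txn_qh *qh, struct txn *t)
{
	TAILQ_INSERT_TAIL(qh, t, commit_timestampq);
	t->on_queue = true;
}

/*
 * Unlink an entry only if it is linked. Without the guard, TAILQ_REMOVE
 * on an entry whose tqe_prev is NULL stores through a NULL pointer.
 */
static bool
txn_dequeue(struct txn_qh *qh, struct txn *t)
{
	if (!t->on_queue)
		return (false);
	TAILQ_REMOVE(qh, t, commit_timestampq);
	t->on_queue = false;
	return (true);
}

/* Remove every flagged entry; returns the number actually unlinked. */
static size_t
txn_sweep(struct txn_qh *qh)
{
	struct txn *t, *next;
	size_t removed = 0;

	for (t = TAILQ_FIRST(qh); t != NULL; t = next) {
		/* Save the successor before the entry may be unlinked. */
		next = TAILQ_NEXT(t, commit_timestampq);
		if (t->clear_ts_queue) {
			t->clear_ts_queue = false;
			if (txn_dequeue(qh, t))
				++removed;
		}
	}
	return (removed);
}

/* Usage: one real removal, then two guarded no-ops. */
size_t
demo(void)
{
	struct txn_qh qh;
	struct txn a = {0}, b = {0}, stray = {0};
	size_t removed;

	TAILQ_INIT(&qh);
	txn_enqueue(&qh, &a);
	txn_enqueue(&qh, &b);

	a.clear_ts_queue = true;
	removed = txn_sweep(&qh);		/* unlinks a, leaves b */

	/* stray was never linked: its link fields are NULL, as in $23. */
	if (txn_dequeue(&qh, &stray))
		++removed;
	/* a second removal of a is also refused. */
	if (txn_dequeue(&qh, &a))
		++removed;

	assert(TAILQ_FIRST(&qh) == &b);
	return (removed);
}
```

      A flag of this kind only helps if it is read and written under the same lock that protects the queue; otherwise the check and the removal can still race.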

            Assignee: Susan LoVerso (sue.loverso@mongodb.com)
            Reporter: Keith Bostic (keith.bostic@mongodb.com, inactive)