WiredTiger / WT-9486

Fix __tiered_server abort with WT_NOTFOUND

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      I've seen this error a few times when running on branches wt-7833-ctw2 and wt-7833-concurrent-tiered-writers. (These branches address, in two different ways, some known issues with running longer tiered tests.) The panic is raised at the end of __tiered_server, under its err: label, so the backtrace does not show which call actually failed, only that ret was -31803 (WT_NOTFOUND). Here's the backtrace:

      (gdb) bt
      #0  0x00007fdce531ef47 in raise () from /lib/x86_64-linux-gnu/libc.so.6
      #1  0x00007fdce53208b1 in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #2  0x00007fdce5adee6c in __wt_abort (session=0x556ee1c8ded0)
          at /home/ubuntu/wt/git/wt-7833-ctw2/src/os_common/os_abort.c:30
      #3  0x00007fdce5b5c01f in __wt_panic_func (session=0x556ee1c8ded0, error=-31803, 
          func=0x7fdce5bcc820 <__PRETTY_FUNCTION__.38437> "__tiered_server", line=626, 
          category=WT_VERB_DEFAULT, fmt=0x7fdce5bcc6cd "storage server error")
          at /home/ubuntu/wt/git/wt-7833-ctw2/src/support/err.c:550
      #4  0x00007fdce5a0c7d2 in __tiered_server (arg=0x556ee1c8ded0)
          at /home/ubuntu/wt/git/wt-7833-ctw2/src/conn/conn_tiered.c:626
      
      (gdb) up
      #4  0x00007fdce5a0c7d2 in __tiered_server (arg=0x556ee1c8ded0)
          at /home/ubuntu/wt/git/wt-7833-ctw2/src/conn/conn_tiered.c:626
      626             WT_IGNORE_RET(__wt_panic(session, ret, "storage server error"));
      (gdb) p ret
      $1 = -31803
      (gdb) list
      621             }
      622         }
      623
      624         if (0) {
      625     err:
      626             WT_IGNORE_RET(__wt_panic(session, ret, "storage server error"));
      627         }
      628         __wt_buf_free(session, &path);
      629         __wt_buf_free(session, &tmp);
      630         return (WT_THREAD_RET_VALUE);
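
      One way to narrow this down would be to record the call site that first set ret before jumping to err:. The sketch below is not WiredTiger code. It assumes the function reaches err: through goto-style error macros, and every name in it (DEMO_ERR, demo_server, demo_step_ok, demo_step_missing, DEMO_NOTFOUND, demo_err_file, demo_err_line) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the -31803 (WT_NOTFOUND) value seen in gdb. */
#define DEMO_NOTFOUND (-31803)

/* Call site of the first failure; a debugging aid, not a WiredTiger API. */
static const char *demo_err_file;
static int demo_err_line;

/*
 * Hypothetical "jump to err on failure" macro that also remembers where the
 * failing call was made, so the err: handler can report it.
 */
#define DEMO_ERR(call)                \
    do {                              \
        if ((ret = (call)) != 0) {    \
            demo_err_file = __FILE__; \
            demo_err_line = __LINE__; \
            goto err;                 \
        }                             \
    } while (0)

/* Hypothetical steps of a server loop: one succeeds, one reports "not found". */
static int
demo_step_ok(void)
{
    return (0);
}

static int
demo_step_missing(void)
{
    return (DEMO_NOTFOUND);
}

/* Mirrors the shape of the listing above: work, then an "if (0) { err: ... }" handler. */
static int
demo_server(void)
{
    int ret = 0;

    DEMO_ERR(demo_step_ok());
    DEMO_ERR(demo_step_missing()); /* This call fails, so its line is recorded. */
    DEMO_ERR(demo_step_ok());      /* Never reached. */

    if (0) {
err:
        /* Every jump to err goes through DEMO_ERR, so a site is always recorded. */
        assert(demo_err_line > 0);
        fprintf(stderr, "server error %d first seen at %s:%d\n", ret, demo_err_file,
          demo_err_line);
    }
    return (ret);
}
```

      Calling demo_server() returns DEMO_NOTFOUND and prints the file and line of the failing call to stderr. With something similar in place, the panic message (or a gdb print of the recorded line) would identify the failing call rather than only the err: label.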

      Here's how I trigger the error. It usually takes a few dozen iterations to hit the problem:

      builddir=`pwd`/build
      cd build/test/csuite/tiered_abort || exit 1
      i=0
      nruns=500
      while [ "$i" != "$nruns" ]; do
        echo
        echo '========================='
        echo Iteration "$i"
        ./test_tiered_abort -b ~/wt/git/wt-7833-concurrent-tiered-writers/build -h ./TEST_DIR || exit 1
        ./test_tiered_abort -b ~/wt/git/wt-7833-concurrent-tiered-writers/build -h ./TEST_DIR -t "$i" || exit 1
        i=`expr $i + 1`
      done

            Assignee: Susan LoVerso (sue.loverso@mongodb.com)
            Reporter: Donald Anderson (donald.anderson@mongodb.com)
            Votes: 0
            Watchers: 4
