Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-17378

WiredTiger's compact code can return 'Operation timed out' error (invariant failure)

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 3.0.1, 3.1.0
    • Affects Version/s: 3.0.0-rc10
    • Component/s: WiredTiger
    • Fully Compatible
    • ALL

      WiredTiger's compact code will return an 'Operation timed out' error if the compact operation exceeds 'max_time'. This timeout can seemingly occur under two conditions:

      1 - The WT compact operation takes longer than 1200 seconds (20 minutes)
      2 - The system clock goes backward during the compact operation

      When the timeout occurs, the error propagates up and leads to an invariant failure:

      Breakpoint 1, __session_compact_check_timeout (session=0x35b4c40, begin=...) at src/third_party/wiredtiger/src/session/session_compact.c:138
      (gdb) l
      133			return (0);
      134	
      135		WT_RET(__wt_epoch(session, &end));
      136		if (session->compact->max_time <
      137		    WT_TIMEDIFF(end, begin) / WT_BILLION)
      138			WT_RET(ETIMEDOUT);
      139		return (0);
      140	}
      141	
      142	/*
      
      ==> WT_TIMEDIFF(end, begin) / WT_BILLION) calculation when the clock goes backward
      (gdb) p (1000000000 * (uint64_t)((end).tv_sec - (begin).tv_sec) + (uint64_t)(end).tv_nsec - (uint64_t)(begin).tv_nsec) / 1000000000
      $4 = 18446744010
      (gdb) p session->compact->max_time
      $5 = 1200
      (gdb) c
      
      Invariant failure: ret resulted in status UnknownError 110: Connection timed out at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 730
      
      (gdb) bt
      #0  0x00007ffff7bcc20b in raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
      #1  0x00000000018f7874 in mongo::breakpoint () at src/mongo/util/debugger.cpp:63
      #2  0x00000000018ed771 in mongo::invariantOKFailed (expr=0x2062531 "ret", status=..., file=0x2062420 "src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp", line=730) at src/mongo/util/assert_util.cpp:157
      #3  0x000000000176f0f4 in mongo::WiredTigerRecordStore::compact (this=0x4a2cd00, txn=0x7ffff10f47d0, adaptor=0x0, options=0x7ffff10f2e40, stats=0x7ffff10f2ca0) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:730
      #4  0x0000000001282bc6 in mongo::Collection::compact (this=0x5177800, txn=0x7ffff10f47d0, compactOptions=0x7ffff10f2e40) at src/mongo/db/catalog/collection_compact.cpp:118
      #5  0x00000000012c916c in mongo::CompactCmd::run (this=0x28c0d40 <mongo::compactCmd>, txn=0x7ffff10f47d0, db=..., cmdObj=..., errmsg=..., result=..., fromRepl=false) at src/mongo/db/commands/compact.cpp:168
      #6  0x000000000135b619 in mongo::_execCommand (txn=0x7ffff10f47d0, c=0x28c0d40 <mongo::compactCmd>, dbname=..., cmdObj=..., queryOptions=0, errmsg=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1316
      #7  0x000000000135c5e7 in mongo::Command::execCommand (txn=0x7ffff10f47d0, c=0x28c0d40 <mongo::compactCmd>, queryOptions=0, cmdns=0x3449414 "db1.$cmd", cmdObj=..., result=..., fromRepl=false) at src/mongo/db/dbcommands.cpp:1532
      #8  0x000000000135ced8 in mongo::_runCommands (txn=0x7ffff10f47d0, ns=0x3449414 "db1.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1604
      #9  0x0000000001563e58 in mongo::runCommands (txn=0x7ffff10f47d0, ns=0x3449414 "db1.$cmd", jsobj=..., curop=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/query/find.cpp:137
      #10 0x0000000001565e8a in mongo::runQuery (txn=0x7ffff10f47d0, m=..., q=..., nss=..., curop=..., result=..., fromDBDirectClient=false) at src/mongo/db/query/find.cpp:606
      #11 0x000000000146a7d0 in mongo::receivedQuery (txn=0x7ffff10f47d0, c=..., dbresponse=..., m=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:220
      #12 0x000000000146b97d in mongo::assembleResponse (txn=0x7ffff10f47d0, m=..., dbresponse=..., remote=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:403
      #13 0x000000000115d604 in mongo::MyMessageHandler::process (this=0x310a400, m=..., port=0x344f3b0, le=0x50668a0) at src/mongo/db/db.cpp:206
      #14 0x0000000001915cb6 in mongo::PortMessageServer::handleIncomingMsg (arg=0x344f3b0) at src/mongo/util/net/message_server_port.cpp:229
      #15 0x00007ffff7bc4182 in start_thread (arg=0x7ffff10f5700) at pthread_create.c:312
      #16 0x00007ffff6cc500d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      
      

            Assignee:
            dan@mongodb.com Daniel Pasette (Inactive)
            Reporter:
            kamran.khan Kamran K.
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: