Core Server / SERVER-33800

Hang opening 64K cursors on a single table

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.4.9, 3.4.13, 3.6.3
    • Fix Version/s: None
    • Component/s: WiredTiger
    • Labels: None
    • Operating System: ALL

      Description

      The following script runs queries on 10 threads and quickly hangs after printing "starting queries":

      db=/ssd/db
      uri="mongodb://localhost:27017/test?replicaSet=rs"
       
      function clean {
          killall -9 -w mongod
          rm -rf $db
          mkdir -p $db/{r0,r1,r2}
      }
       
      function start {
          killall -w mongod
          opts=""
          mongod --dbpath $db/r0 --port 27017 --logpath $db/r0.log --logappend --replSet rs --fork $opts
          mongod --dbpath $db/r1 --port 27117 --logpath $db/r1.log --logappend --replSet rs --fork $opts
          mongod --dbpath $db/r2 --port 27217 --logpath $db/r2.log --logappend --replSet rs --fork $opts
      }
       
      function initrs {
          mongo --quiet --eval 'rs.initiate({_id: "rs", members: [{_id: 0, host: "127.0.0.1:27017"}]})'
          mongo --quiet --eval 'rs.add("127.0.0.1:27117")'
          mongo --quiet --eval 'rs.add("127.0.0.1:27217")'
      }
       
      function waitrs {
          mongo $uri --quiet --eval "db.version()"
      }
       
      function setup {
          mongo $uri --eval '
              db.c.drop()
              db.c.createIndex({_id: 1, x: 1})
              for (var i = 0; i < 10000; i++)
                  db.c.insert({_id: i, x: i})
          '
      }
       
      function run {
          threads=10
          for i in $(seq $threads); do
              mongo $uri --eval '
                  terms = []
                  for (var i = 0; i < 10000; i++)
                      terms.push({_id: i, x: i})
                  print("starting queries")
                  for (var i = 0; ; i++) {
                      count = db.c.aggregate([{$match: {$or: terms}}]).itcount()
                      print(new Date().toISOString(), i, count)
                  }
              ' &
          done
          wait
      }
       
      clean; start; initrs; waitrs
      setup
      run
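
      For scale, the loop bounds in the script can be compared with the "64K" in the title. This is only a back-of-the-envelope sketch: it assumes (not confirmed anywhere in this report) that each $or term holds its own cursor on the collection's table while a query is running. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Rough upper bound on concurrently open cursors in the repro.
# ASSUMPTION (not confirmed by the report): one cursor per $or term per thread.
terms=10000              # size of the $or array built in run()
threads=10               # number of mongo shells started by run()
limit=$((64 * 1024))     # "64K" from the issue title

total=$((terms * threads))
echo "per query: $terms, all threads: $total, 64K: $limit"
if [ "$total" -gt "$limit" ]; then
    echo "all threads together could exceed 64K cursors on one table"
fi
```

      Under that assumption a single query stays below 64K, while the 10 threads combined would not.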
      

      Stack traces are attached. It appears that all WiredTiger threads are either idle or hung in __wt_readlock:

        10 __wt_cond_wait_signal,__wt_readlock,__wt_session_lock_dhandle,__wt_session_get_btree,__wt_session_get_btree_ckpt,__wt_curfile_open,__session_open_cursor_int,__wt_curtable_open,__session_open_cursor_int,__session_open_cursor,mongo::WiredTigerSession::getCursor(std::__cxx11::basic_string<char,,mongo::WiredTigerCursor::WiredTigerCursor(std::__cxx11::basic_string<char,,mongo::WiredTigerRecordStore::getCursor(mongo::OperationContext*,,mongo::Collection::getCursor(mongo::OperationContext*,,mongo::FetchStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::OrStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::DocumentSourceCursor::loadBatch(),mongo::DocumentSourceCursor::getNext(),mongo::Pipeline::getNext(),mongo::PipelineProxyStage::getNextBson(),mongo::PipelineProxyStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::GetMoreCmd::generateBatch(mongo::ClientCursor*,,mongo::GetMoreCmd::runParsed(mongo::OperationContext*,,mongo::GetMoreCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::assembleResponse(mongo::OperationContext*,,mongo::ServiceEntryPointMongod::_sessionLoop(std::shared_ptr<mongo::transport::Session>,std::_Function_handler<void,mongo::(anonymous,start_thread,clone
         4 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__wt_cond_auto_wait,__wt_evict_thread_run,__wt_thread_run,start_thread,clone
         1 __wt_cond_wait_signal,__wt_readlock,__wt_session_lock_dhandle,__wt_session_get_btree,__conn_btree_apply_internal,__wt_conn_btree_apply,__checkpoint_apply_all,__checkpoint_prepare,__txn_checkpoint_wrapper,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
         1 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__wt_cond_auto_wait,__log_wrlsn_server,start_thread,clone
         1 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__log_server,start_thread,clone
         1 __wt_cond_wait_signal,__sweep_server,start_thread,clone
         1 __wt_cond_wait_signal,__log_file_server,start_thread,clone
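
      The grouped listing above has the shape you get by collapsing identical per-thread stacks and counting them. A minimal sketch of that step follows; it assumes a hypothetical input with one thread per line and frames joined by commas, and it is not necessarily the tooling used to produce the attached traces. The function name collapse_stacks and the sample frames are made up for illustration.

```shell
#!/usr/bin/env bash
# Collapse per-thread stacks into "count stack" lines, most common first.
# ASSUMPTION: stdin has one thread per line, frames already joined by commas.
collapse_stacks() {
    # sort groups identical stacks, uniq -c counts each group,
    # the final sort orders groups by descending count.
    sort | uniq -c | sort -rn
}

# Example with made-up frames (not taken from the attached traces):
printf '%s\n' \
    'frame_a,frame_b,clone' \
    'frame_c,clone' \
    'frame_a,frame_b,clone' | collapse_stacks
```

      Reading the grouped output this way makes it easy to see that the 10 query threads share a single stack ending in __wt_readlock.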
      

              People

              Assignee:
              michael.cahill Michael Cahill
              Reporter:
              bruce.lucas Bruce Lucas