  1. WiredTiger
  2. WT-2075

MongoDB/WiredTiger freeze running a combination of MongoDB master and WT Develop

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • WT2.7.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None

      The Jenkins mongodb-perf-ycsb-develop task is currently freezing during the run. This reproduces semi-reliably on the WT Jenkins instance.

      The task in question runs the stdcfg-largedocs entries found in the mongo-tests repo with YCSB. It performs an initial load and then runs each of 4 workloads sequentially; I have seen freezes in the 3rd and 4th workloads and in the load portion of these runs.

      PMP output is below, and I've attached a trace with line numbers from GDB as gdb-stack.txt.

      PMP Stack

      6 sched_yield,__wt_log_slot_join,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            3 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_worker,start_thread,clone
            1 select,mongo::Listener::initAndListen(),mongo::initAndListen(int),main
            1 sched_yield,__wt_log_slot_join,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerSizeStorer::syncCache(bool),mongo::WiredTigerKVEngine::syncSizeInfo(bool),mongo::WiredTigerKVEngine::haveDropsQueued(),mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*),mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit(),mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit(),mongo::OperationContextImpl::~OperationContextImpl(),mongo::TTLMonitor::doTTLPass(),mongo::TTLMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 sched_yield,__log_newfile,__wt_log_acquire,__wt_log_slot_new,__wt_log_slot_switch,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 recv,mongo::Socket::_recv(char*,,mongo::Socket::unsafe_recv(char*,,mongo::Socket::recv(char*,,mongo::MessagingPort::recv(mongo::Message&),mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 pthread_cond_wait@@GLIBC_2.3.2,std::condition_variable::wait(std::unique_lock<std::mutex>&),mongo::DeadlineMonitor<mongo::mozjs::MozJSImplScope>::deadlineMonitorThread(),??,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_wrlsn_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_file_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::RangeDeleter::doWork(),??,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::(anonymous,mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 nanosleep,mongo::sleepsecs(int),mongo::ClientCursorMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_force_write,__wt_log_ckpt_lsn,__wt_txn_checkpoint_log,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_force_write,__log_server,start_thread,clone
            1 do_sigwait,sigwait,mongo::(anonymous,??,start_thread,clone
            1
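
      For reading the PMP output above, here is a minimal sketch (not part of the original report) that groups threads by their innermost frame and by the first frame carrying a given prefix. The helper names `parse_pmp_line` and `summarize` and the `__wt_` prefix choice are assumptions for illustration, and the sample lines are shortened copies of stacks from the trace. It only tallies where threads are parked; it does not diagnose the hang.

```python
from collections import Counter


def parse_pmp_line(line):
    """Split one PMP line into (thread_count, frames); None if it has no stack."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    return int(parts[0]), parts[1].split(",")


def summarize(lines, prefix="__wt_"):
    """Count threads by (innermost frame, first frame starting with prefix)."""
    totals = Counter()
    for line in lines:
        parsed = parse_pmp_line(line)
        if parsed is None:
            continue  # skip blank or truncated lines such as a bare count
        count, frames = parsed
        first_match = next((f for f in frames if f.startswith(prefix)), None)
        totals[(frames[0], first_match)] += count
    return totals


# Shortened copies of stacks from the trace above (outer frames dropped).
sample = [
    "6 sched_yield,__wt_log_slot_join,__wt_log_write,__wt_txn_commit",
    "1 sched_yield,__log_newfile,__wt_log_acquire,__wt_log_slot_new",
    "1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_force_write,__log_server",
    "1",
]

for (innermost, first_match), threads in summarize(sample).most_common():
    print(threads, innermost, first_match)
```

      Splitting on commas also breaks up C++ signatures that contain commas, but those fragments never start with `__wt_`, so the prefix match is unaffected. Feeding the full output through `summarize` would show how many threads are spinning in `sched_yield` under the log slot code versus waiting on a mutex.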
      

        1. gdb-dump-withnumbers.txt
          107 kB
        2. gdb-stack.txt
          35 kB
        3. log_object_dump.txt
          46 kB

            Assignee:
            david.hows David Hows
            Reporter:
            david.hows David Hows
            Votes:
            0
            Watchers:
            6
