WT-2075: MongoDB/WiredTiger freeze when running MongoDB master with WiredTiger develop

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      35
    • Last comment by Customer:
      true

      Description

      The Jenkins mongodb-perf-ycsb-develop task currently freezes during the run. The freeze reproduces semi-reliably on the WT Jenkins instance.

      The task runs the stdcfg-largedocs entries from the mongo-tests repo with YCSB. After an initial load, it runs each of 4 workloads sequentially. I have seen freezes in the 3rd and 4th workloads and in the load portion of these runs.

      PMP output is below, and I've attached a trace with line numbers from GDB as gdb-stack.txt.

      PMP Stack

      6 sched_yield,__wt_log_slot_join,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            3 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_worker,start_thread,clone
            1 select,mongo::Listener::initAndListen(),mongo::initAndListen(int),main
            1 sched_yield,__wt_log_slot_join,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerSizeStorer::syncCache(bool),mongo::WiredTigerKVEngine::syncSizeInfo(bool),mongo::WiredTigerKVEngine::haveDropsQueued(),mongo::WiredTigerSessionCache::releaseSession(mongo::WiredTigerSession*),mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit(),mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit(),mongo::OperationContextImpl::~OperationContextImpl(),mongo::TTLMonitor::doTTLPass(),mongo::TTLMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 sched_yield,__log_newfile,__wt_log_acquire,__wt_log_slot_new,__wt_log_slot_switch,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 recv,mongo::Socket::_recv(char*,,mongo::Socket::unsafe_recv(char*,,mongo::Socket::recv(char*,,mongo::MessagingPort::recv(mongo::Message&),mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 pthread_cond_wait@@GLIBC_2.3.2,std::condition_variable::wait(std::unique_lock<std::mutex>&),mongo::DeadlineMonitor<mongo::mozjs::MozJSImplScope>::deadlineMonitorThread(),??,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_wrlsn_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_file_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_server,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::RangeDeleter::doWork(),??,start_thread,clone
            1 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::(anonymous,mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 nanosleep,mongo::sleepsecs(int),mongo::ClientCursorMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_write,__wt_txn_commit,__session_commit_transaction,mongo::WiredTigerRecoveryUnit::_txnClose(bool),mongo::WiredTigerRecoveryUnit::_commit(),mongo::WriteUnitOfWork::commit(),mongo::UpdateStage::transformAndUpdate(mongo::Snapshotted<mongo::BSONObj>,mongo::UpdateStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::PlanExecutor::executePlan(),mongo::WriteBatchExecutor::execUpdate(mongo::BatchItemRef,mongo::WriteBatchExecutor::bulkExecute(mongo::BatchedCommandRequest,mongo::WriteBatchExecutor::executeBatch(mongo::BatchedCommandRequest,mongo::WriteCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::receivedCommand,mongo::assembleResponse(mongo::OperationContext*,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_force_write,__wt_log_ckpt_lsn,__wt_txn_checkpoint_log,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
            1 __lll_lock_wait,_L_lock_816,pthread_mutex_lock,__wt_log_force_write,__log_server,start_thread,clone
            1 do_sigwait,sigwait,mongo::(anonymous,??,start_thread,clone
            1
      

      1. gdb-dump-withnumbers.txt
        107 kB
        David Hows
      2. gdb-stack.txt
        35 kB
        David Hows
      3. log_object_dump.txt
        46 kB
        David Hows

          Activity

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: WT-2075 Push the retry loop down into __wt_log_force_write, minor cleanup to other error handling.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/e5a1b182c1bac26b50dabfb686b50c3aa9bf1a34
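
          The commit message above describes moving a retry loop from the callers into the callee. The general pattern can be sketched as follows. This is an illustration only, not WiredTiger's C code: `BUSY`, `OK`, `force_write` and `try_switch_slot` are hypothetical names standing in for an EBUSY-style return code and a slot-switch attempt.

```python
import time

BUSY = "busy"  # hypothetical status meaning "slot not ready, try again"
OK = "ok"      # hypothetical success status


def force_write(try_switch_slot):
    """Retry the slot switch internally until it stops reporting BUSY.

    With the loop inside the callee, every caller makes a single call
    and no caller can forget to handle the BUSY case.
    Returns the final status and the number of attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        status = try_switch_slot()
        if status != BUSY:
            return status, attempts
        time.sleep(0)  # yield to other threads before retrying
```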

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: Merge pull request #2182 from wiredtiger/wt-2075

          WT-2075 When closing a slot, detect, return and handle different values.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/85187fcd233763e1bd2dff72ef8d58af23ea3bc7

          xgen-internal-githook Githook User added a comment -

          Author: David Hows (daveh86) <howsdav@gmail.com>

          Message: WT-2075 - Add test for log hang with large parallel workload
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/9bb48291c4b230516846ac504c83c9ee02a7af8b

          xgen-internal-githook Githook User added a comment -

          Author: David Hows (daveh86) <howsdav@gmail.com>

          Message: WT-2075 - rename test and mod comment
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/81b28e145a96b15be8529fc1b152dc5b9e8f0445

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: Merge pull request #2191 from wiredtiger/wt-2075-test

          WT-2075 - Add test for log hang with large parallel workload
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/fd72a091e76c72dc4c18d349446d9263c5fa71a3


            People

            • Votes: 0
              Watchers: 6

              Dates

              • Created:
                Updated:
                Resolved:
                Days since reply: 1 year, 27 weeks, 3 days ago
                Date of 1st Reply: