Core Server / SERVER-30972

Deadlock in WiredTigerOplogManager on shutdown

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.0-rc0
    • Affects Version/s: None
    • Component/s: Storage
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Storage 2017-10-02
    • 0

      The WiredTiger storage engine deletes the oplog manager on shutdown while holding the oplog manager mutex (_oplogManagerMutex).

      void WiredTigerKVEngine::deleteOplogManager() {
          stdx::unique_lock<stdx::mutex> lock(_oplogManagerMutex);
          invariant(_oplogManagerCount > 0);
          _oplogManagerCount--;
          if (_oplogManagerCount == 0)
              _oplogManager.reset();
      }
      

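      The oplog manager's destructor waits for _oplogJournalThread to join. However, the oplog journal thread may be setting the oldest timestamp at that moment, which also needs the oplog manager mutex. The shutdown thread holds the mutex while waiting for the join, and the journal thread cannot exit until it acquires the mutex, so the two deadlock.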

      Thread 22: "WTOplogJournalThread" (Thread 0x7f614ef05700 (LWP 28210))
      #0  0x00007f616447d334 in __lll_lock_wait () from /lib64/libpthread.so.0
      #1  0x00007f61644785d8 in _L_lock_854 () from /lib64/libpthread.so.0
      #2  0x00007f61644784a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #3  0x00007f6167d285fa in __gthread_mutex_lock (__mutex=0x7f616b274038) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/x86_64-mongodb-linux/bits/gthr-default.h:748
      #4  std::mutex::lock (this=0x7f616b274038) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:135
      #5  std::unique_lock<std::mutex>::lock (this=0x7f614ef044c0) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:485
      #6  std::unique_lock<std::mutex>::unique_lock (__m=..., this=0x7f614ef044c0) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:415
      #7  mongo::WiredTigerKVEngine::_setOldestTimestamp (this=0x7f616b274000, oldestTimestamp=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:1026
      #8  0x00007f61680cef91 in mongo::repl::ReplicationCoordinatorImpl::_setStableTimestampForStorage_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:3019
      #9  0x00007f61680d0b8c in mongo::repl::ReplicationCoordinatorImpl::_updateCommitPoint_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:3044
      #10 0x00007f61680d1068 in mongo::repl::ReplicationCoordinatorImpl::_updateLastCommittedOpTime_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:2951
      #11 0x00007f61680d1702 in mongo::repl::ReplicationCoordinatorImpl::_setMyLastDurableOpTime_inlock (this=this@entry=0x7f616af03680, opTime=..., isRollbackAllowed=isRollbackAllowed@entry=false) at src/mongo/db/repl/replication_coordinator_impl.cpp:1053
      #12 0x00007f61680d182b in mongo::repl::ReplicationCoordinatorImpl::setMyLastDurableOpTimeForward (this=0x7f616af03680, opTime=...) at src/mongo/db/repl/replication_coordinator_impl.cpp:978
      #13 0x00007f6167d3f222 in mongo::WiredTigerSessionCache::waitUntilDurable (this=this@entry=0x7f616b277f00, forceCheckpoint=forceCheckpoint@entry=false, stableCheckpoint=stableCheckpoint@entry=false) at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:268
      #14 0x00007f6167d2c775 in mongo::WiredTigerOplogManager::_oplogJournalThreadLoop (this=0x7f616ea27480, sessionCache=0x7f616b277f00, oplogRecordStore=0x7f616ea30300) at src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp:177
      #15 0x00007f6169285eb0 in std::execute_native_thread_routine (__p=<optimized out>) at ../../../../../gcc-5.4.0/libstdc++-v3/src/c++11/thread.cc:84
      #16 0x00007f6164476aa1 in start_thread () from /lib64/libpthread.so.0
      #17 0x00007f61641c3bcd in clone () from /lib64/libc.so.6
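
One way to break a cycle like the one described above is to detach the manager from the engine while holding the mutex, then run its destructor (and therefore the thread join) only after the mutex has been released. The following is a minimal, self-contained sketch of that pattern using the standard library only. The names Engine, Manager, startManager, deleteManager, hasManager and updates are hypothetical stand-ins, not the actual MongoDB classes, and this is not necessarily the change that was committed for this ticket.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>

class Engine;

// Hypothetical stand-in for the oplog manager: it owns a background thread
// that calls back into the engine, and its destructor joins that thread.
class Manager {
public:
    explicit Manager(Engine* engine);
    ~Manager();

private:
    void loop();

    Engine* _engine;
    std::atomic<bool> _shutdown{false};
    std::thread _thread;  // declared last so the other members are ready first
};

// Hypothetical stand-in for the KV engine that owns the manager.
class Engine {
public:
    void startManager() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_count++ == 0)
            _manager = std::make_unique<Manager>(this);
    }

    // Called from the background thread; needs the same mutex as deleteManager().
    void setOldestTimestamp() {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_updates;
    }

    void deleteManager() {
        std::unique_ptr<Manager> doomed;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            assert(_count > 0);
            if (--_count == 0)
                doomed = std::move(_manager);  // detach only; no destructor runs here
        }
        // The mutex is released, so the background thread can finish any
        // in-flight setOldestTimestamp() call before the destructor joins it.
        doomed.reset();
    }

    bool hasManager() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _manager != nullptr;
    }

    int updates() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _updates;
    }

private:
    std::mutex _mutex;
    int _count = 0;
    int _updates = 0;
    std::unique_ptr<Manager> _manager;  // destroyed before the mutex
};

Manager::Manager(Engine* engine) : _engine(engine), _thread([this] { loop(); }) {}

Manager::~Manager() {
    _shutdown.store(true);
    _thread.join();
}

void Manager::loop() {
    do {
        _engine->setOldestTimestamp();
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    } while (!_shutdown.load());
}
```

With this shape, calling startManager() and later deleteManager() on an Engine returns even if the background thread is inside setOldestTimestamp() at the time. The trade-off is that the manager briefly outlives its registration in the engine, so code running on the background thread must not assume the engine still holds a reference to it.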
      

      Attached are the debugger log and the lock dependency graph. The join wait is not shown in the graph.

        1. deadlock-wtoplog.png (24 kB)
        2. debugger_mongod_28171.log (40 kB)

            Assignee: Eric Milkie (milkie@mongodb.com)
            Reporter: Siyuan Zhou (siyuan.zhou@mongodb.com)
            Votes: 0
            Watchers: 4
