WT engine deletes the oplog manager on shutdown while holding the oplog manager mutex.
void WiredTigerKVEngine::deleteOplogManager() {
    stdx::unique_lock<stdx::mutex> lock(_oplogManagerMutex);
    invariant(_oplogManagerCount > 0);
    _oplogManagerCount--;
    if (_oplogManagerCount == 0)
        _oplogManager.reset();
}
The oplog manager's destructor waits for _oplogJournalThread to join. However, the oplog journal thread may be setting the oldest timestamp at that moment, which requires the oplog manager mutex that the shutdown thread already holds. Each thread then waits on the other, causing a deadlock.
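One possible way to avoid this pattern, shown here only as a minimal standalone sketch and not as the fix adopted in the server, is to move the owning pointer out while holding the mutex and to run the destructor (and therefore the join) after the mutex is released. The names Manager, Engine, startManager, deleteManager, hasManager and shutdownCompletes are hypothetical stand-ins for illustration; they are not MongoDB code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the oplog manager: it owns a background
// thread and joins that thread in its destructor.
class Manager {
public:
    explicit Manager(std::mutex& engineMutex)
        : _engineMutex(engineMutex), _thread([this] { loop(); }) {}

    ~Manager() {
        _shutdown.store(true);
        _thread.join();  // Blocks until loop() returns.
    }

private:
    void loop() {
        while (!_shutdown.load()) {
            {
                // Stand-in for the timestamp update: needs the engine mutex.
                std::lock_guard<std::mutex> lk(_engineMutex);
                ++_ticks;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::mutex& _engineMutex;
    std::atomic<bool> _shutdown{false};
    long _ticks = 0;
    std::thread _thread;  // Declared last so the other members exist first.
};

// Hypothetical stand-in for the storage engine that owns the manager.
class Engine {
public:
    void startManager() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_count++ == 0)
            _manager = std::make_unique<Manager>(_mutex);
    }

    void deleteManager() {
        std::unique_ptr<Manager> doomed;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            assert(_count > 0);
            if (--_count == 0)
                doomed = std::move(_manager);  // Only detach under the lock.
        }
        // The destructor joins the thread here, with the mutex released,
        // so the background thread can still acquire it and exit.
        doomed.reset();
    }

    bool hasManager() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _manager != nullptr;
    }

private:
    std::mutex _mutex;
    int _count = 0;
    std::unique_ptr<Manager> _manager;
};

// Starts a manager, lets its thread run briefly, then shuts it down.
bool shutdownCompletes() {
    Engine engine;
    engine.startManager();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    engine.deleteManager();
    return !engine.hasManager();
}
```

If deleteManager() instead reset the pointer inside the locked scope, the join could block forever whenever the background thread was waiting for the same mutex, which is the situation described above.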
Thread 22: "WTOplogJournalThread" (Thread 0x7f614ef05700 (LWP 28210))
#0 0x00007f616447d334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f61644785d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007f61644784a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f6167d285fa in __gthread_mutex_lock (__mutex=0x7f616b274038) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/x86_64-mongodb-linux/bits/gthr-default.h:748
#4 std::mutex::lock (this=0x7f616b274038) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:135
#5 std::unique_lock<std::mutex>::lock (this=0x7f614ef044c0) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:485
#6 std::unique_lock<std::mutex>::unique_lock (__m=..., this=0x7f614ef044c0) at /opt/mongodbtoolchain/v2/include/c++/5.4.0/mutex:415
#7 mongo::WiredTigerKVEngine::_setOldestTimestamp (this=0x7f616b274000, oldestTimestamp=...) at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp:1026
#8 0x00007f61680cef91 in mongo::repl::ReplicationCoordinatorImpl::_setStableTimestampForStorage_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:3019
#9 0x00007f61680d0b8c in mongo::repl::ReplicationCoordinatorImpl::_updateCommitPoint_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:3044
#10 0x00007f61680d1068 in mongo::repl::ReplicationCoordinatorImpl::_updateLastCommittedOpTime_inlock (this=this@entry=0x7f616af03680) at src/mongo/db/repl/replication_coordinator_impl.cpp:2951
#11 0x00007f61680d1702 in mongo::repl::ReplicationCoordinatorImpl::_setMyLastDurableOpTime_inlock (this=this@entry=0x7f616af03680, opTime=..., isRollbackAllowed=isRollbackAllowed@entry=false) at src/mongo/db/repl/replication_coordinator_impl.cpp:1053
#12 0x00007f61680d182b in mongo::repl::ReplicationCoordinatorImpl::setMyLastDurableOpTimeForward (this=0x7f616af03680, opTime=...) at src/mongo/db/repl/replication_coordinator_impl.cpp:978
#13 0x00007f6167d3f222 in mongo::WiredTigerSessionCache::waitUntilDurable (this=this@entry=0x7f616b277f00, forceCheckpoint=forceCheckpoint@entry=false, stableCheckpoint=stableCheckpoint@entry=false) at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:268
#14 0x00007f6167d2c775 in mongo::WiredTigerOplogManager::_oplogJournalThreadLoop (this=0x7f616ea27480, sessionCache=0x7f616b277f00, oplogRecordStore=0x7f616ea30300) at src/mongo/db/storage/wiredtiger/wiredtiger_oplog_manager.cpp:177
#15 0x00007f6169285eb0 in std::execute_native_thread_routine (__p=<optimized out>) at ../../../../../gcc-5.4.0/libstdc++-v3/src/c++11/thread.cc:84
#16 0x00007f6164476aa1 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f61641c3bcd in clone () from /lib64/libc.so.6
Attached are the debugger's log and the lock dependency graph. The graph does not show the join wait.