-
Type: Bug
-
Resolution: Done
-
Priority: Critical - P2
-
Affects Version/s: 2.7.8
-
Component/s: Storage, Testing Infrastructure
-
None
-
ALL
[2014/10/30 03:52:53.853] Thread 7 (Thread 0x41097940 (LWP 15153)): [2014/10/30 03:52:53.853] #0 0x00007fd9b2976150 in pthread_cond_timedwait@@GLIBC_2.3.2 () [2014/10/30 03:52:53.853] from /lib64/libpthread.so.0 [2014/10/30 03:52:53.853] #1 0x00000000009dd79a in boost::condition_variable::timed_wait(boost::unique_lock<boost::mutex>&, boost::posix_time::ptime const&) () [2014/10/30 03:52:53.853] #2 0x00000000009d609d in mongo::CondVarLockGrantNotification::wait(unsigned int) () [2014/10/30 03:52:53.853] #3 0x00000000009ddb73 in mongo::LockerImpl<true>::lock(mongo::ResourceId const&, mongo::LockMode, unsigned int, bool) () [2014/10/30 03:52:53.853] #4 0x00000000009d9190 in mongo::LockerImpl<true>::endWriteUnitOfWork() () [2014/10/30 03:52:53.853] #5 0x000000000092b08c in mongo::WriteUnitOfWork::commit() () [2014/10/30 03:52:53.853] #6 0x0000000000d6eff8 in mongo::MMAPV1DatabaseCatalogEntry::MMAPV1DatabaseCatalogEntry(mongo::OperationContext*, mongo::StringData const&, mongo::StringData const&, bool, bool) () [2014/10/30 03:52:53.853] #7 0x0000000000d731ef in mongo::MMAPV1Engine::getDatabaseCatalogEntry(mongo::OperationContext*, mongo::StringData const&) () [2014/10/30 03:52:53.853] #8 0x00000000009371a4 in mongo::DatabaseHolder::openDb(mongo::OperationContext*, mongo::StringData const&, bool*) () [2014/10/30 03:52:53.853] #9 0x0000000000943055 in mongo::Client::Context::_finishInit() () [2014/10/30 03:52:53.853] #10 0x0000000000943111 in mongo::Client::Context::Context(mongo::OperationContext*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) () [2014/10/30 03:52:53.853] #11 0x0000000000abea48 in mongo::receivedInsert(mongo::OperationContext*, mongo::Message&, mongo::CurOp&) () [2014/10/30 03:52:53.853] #12 0x0000000000ac09d0 in mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&, bool) () [2014/10/30 03:52:53.853] #13 0x000000000081b722 in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) () [2014/10/30 03:52:53.853] #14 0x0000000000f12641 in mongo::PortMessageServer::handleIncomingMsg(void*) [2014/10/30 03:52:53.853] () [2014/10/30 03:52:53.853] Thread 6 (Thread 0x47a15940 (LWP 15155)): [2014/10/30 03:52:53.853] #0 0x00007fd9b29784c4 in __lll_lock_wait () from /lib64/libpthread.so.0 [2014/10/30 03:52:53.853] #1 0x00007fd9b2973e1a in _L_lock_1034 () from /lib64/libpthread.so.0 [2014/10/30 03:52:53.853] #2 0x00007fd9b2973cdc in pthread_mutex_lock () from /lib64/libpthread.so.0 [2014/10/30 03:52:53.853] #3 0x00000000008a7480 in boost::unique_lock<boost::mutex>::lock() () [2014/10/30 03:52:53.853] #4 0x0000000000d730ff in mongo::MMAPV1Engine::getDatabaseCatalogEntry(mongo::OperationContext*, mongo::StringData const&) () [2014/10/30 03:52:53.853] #5 0x00000000009371a4 in mongo::DatabaseHolder::openDb(mongo::OperationContext*, mongo::StringData const&, bool*) () [2014/10/30 03:52:53.853] #6 0x0000000000943055 in mongo::Client::Context::_finishInit() () [2014/10/30 03:52:53.853] #7 0x0000000000943111 in mongo::Client::Context::Context(mongo::OperationContext*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) () [2014/10/30 03:52:53.853] #8 0x0000000000abea48 in mongo::receivedInsert(mongo::OperationContext*, mongo::Message&, mongo::CurOp&) () [2014/10/30 03:52:53.853] #9 0x0000000000ac09d0 in mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&, bool) () [2014/10/30 03:52:53.853] #10 0x000000000081b722 in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) () [2014/10/30 03:52:53.853] #11 0x0000000000f12641 in mongo::PortMessageServer::handleIncomingMsg(void*) [2014/10/30 03:52:53.853] () [2014/10/30 03:52:53.850] Thread 19 (Thread 0x42f85940 (LWP 13870)): [2014/10/30 03:52:53.850] #0 0x00007fd9b2976150 in pthread_cond_timedwait@@GLIBC_2.3.2 () [2014/10/30 03:52:53.850] from /lib64/libpthread.so.0 [2014/10/30 03:52:53.850] #1 0x00000000009dd79a in boost::condition_variable::timed_wait(boost::unique_lock<boost::mutex>&, boost::posix_time::ptime const&) () [2014/10/30 03:52:53.850] #2 0x00000000009d609d in mongo::CondVarLockGrantNotification::wait(unsigned int) () [2014/10/30 03:52:53.850] #3 0x00000000009ddb73 in mongo::LockerImpl<true>::lock(mongo::ResourceId const&, mongo::LockMode, unsigned int, bool) () [2014/10/30 03:52:53.850] #4 0x00000000009d588c in mongo::AutoAcquireFlushLockForMMAPV1Commit::AutoAcquireFlushLockForMMAPV1Commit(mongo::Locker*) () [2014/10/30 03:52:53.850] #5 0x0000000000d4b4aa in ?? () [2014/10/30 03:52:53.850] #6 0x0000000000f9fda4 in ?? ()
Thread 6: Holds Global (IX), Flush (IX), DB1 (X) and calls into MMAPV1Engine::getDatabaseCatalogEntry. This blocks on the DB catalog entry mutex, which is held by Thread 7.
Flush (Thread 19): Tries to acquire Flush (S) and blocks behind Thread 6.
Thread 7: Holds Global(IX), DB2 (X), DB catalog entry mutex and tries to commit. Commit tries to acquire Flush (IX) back and blocks, because the flush thread is ahead of it in the queue for the Flush lock.
This is a deadlock, which is not discoverable by the deadlock detector, because the DB catalog entry mutex is not a managed lock resource.
We need to ensure that commits are not happening under this lock or that the end of write unit of work does not try to acquire the flush lock again (which might be problematic to do).
- depends on
-
SERVER-15888 Implicit DB creations on insert and update code paths happen under DB IX lock
- Closed