[SERVER-23461] Secondary stays in STARTUP2 state because of deadlock Created: 01/Apr/16  Updated: 06/Dec/22  Resolved: 01/Apr/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Zhang Youdong Assignee: Backlog - Replication Team
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File stack    
Issue Links:
Duplicate
duplicates SERVER-23394 AuthorizationManager may deadlock whi... Closed
Assigned Teams:
Replication
Operating System: ALL
Sprint: Repl 13 (04/22/16)
Participants:

 Description   

In my production environment, a secondary stays in the STARTUP2 state indefinitely because of a deadlock between two threads that acquire the same two locks in opposite order.

Thread1 (Thread 120 in the stack traces below)

After syncing from the primary, the secondary calls AuthzManagerExternalStateLocal::initialize to reload the authorization data. It holds the AuthzManagerExternalStateLocal::_roleGraphMutex lock and then tries to acquire a read lock on the admin database.

Thread2 (Thread 121 in the stack traces below)

A request leads to the creation of admin.system.profile. This thread holds the write lock on the admin database and then tries to acquire the AuthzManagerExternalStateLocal::_roleGraphMutex lock.
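
The lock-order inversion described above can be reproduced in miniature. The following is only a sketch, not MongoDB code: role_graph_mutex and admin_db_lock are hypothetical stand-ins for _roleGraphMutex and the admin database lock (the read and write modes are collapsed into one exclusive lock), and the second acquire uses a timeout so the demonstration ends instead of hanging the way the real threads do.

```python
import threading

# Hypothetical stand-ins for the two resources named in the report.
role_graph_mutex = threading.Lock()  # models _roleGraphMutex
admin_db_lock = threading.Lock()     # models the admin database lock


def run_pair(order_a, order_b, force_overlap, timeout=0.2):
    """Run two threads that each take two locks in the given order.

    Returns {thread name: True if the second lock was acquired}.
    force_overlap makes both threads hold their first lock before
    either one tries for its second, which is the bad interleaving.
    """
    results = {}
    barrier = threading.Barrier(2)

    def worker(name, first, second):
        with first:
            if force_overlap:
                barrier.wait()  # both threads now hold their first lock
            got = second.acquire(timeout=timeout)
            results[name] = got
            if force_overlap:
                barrier.wait()  # keep the first lock until both attempts finish
            if got:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("thread1", *order_a)),
        threading.Thread(target=worker, args=("thread2", *order_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Opposite order, as in the report: neither thread gets its second lock.
inverted = run_pair(
    (role_graph_mutex, admin_db_lock),
    (admin_db_lock, role_graph_mutex),
    force_overlap=True,
)

# Same order in both threads: both complete.
consistent = run_pair(
    (admin_db_lock, role_graph_mutex),
    (admin_db_lock, role_graph_mutex),
    force_overlap=False,
)
```

With the opposite ordering both attempts time out. With a single agreed ordering the second thread simply waits for the first to finish. This only illustrates why the ordering matters; it makes no claim about how the fix tracked in SERVER-23394 is implemented.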

 
Thread 121 (Thread 0x2b6270a78700 (LWP 1303)):
#0  0x0000003e5ac0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003e5ac09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003e5ac09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000a7019a in mongo::AuthzManagerExternalStateLocal::AuthzManagerLogOpHandler::commit() ()
#4  0x000000000107e529 in mongo::WiredTigerRecoveryUnit::_commit() ()
#5  0x0000000000aba7f0 in mongo::WriteUnitOfWork::commit() ()
#6  0x0000000000cb75c1 in mongo::createProfileCollection(mongo::OperationContext*, mongo::Database*) ()
#7  0x0000000000cb7ea2 in mongo::profile(mongo::OperationContext*, mongo::NetworkOp) ()
#8  0x0000000000cb41db in mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) ()
#9  0x000000000098b85c in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*) ()
#10 0x00000000012b4d67 in mongo::PortMessageServer::handleIncomingMsg(void*) ()
#11 0x0000003e5ac07851 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003e5a8e767d in clone () from /lib64/libc.so.6

 
Thread 120 (Thread 0x2b6270c79700 (LWP 1304)):
#0  0x0000003e5ac0b7bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000000000b932a9 in mongo::CondVarLockGrantNotification::wait(unsigned int) ()
#2  0x0000000000b96d77 in mongo::LockerImpl<false>::lockComplete(mongo::ResourceId, mongo::LockMode, unsigned int, bool) ()
#3  0x0000000000b8d2bf in mongo::Lock::DBLock::DBLock(mongo::Locker*, mongo::StringData, mongo::LockMode) ()
#4  0x0000000000ba2d60 in mongo::AutoGetDb::AutoGetDb(mongo::OperationContext*, mongo::StringData, mongo::LockMode) ()
#5  0x0000000000ba2ddd in mongo::AutoGetCollection::AutoGetCollection(mongo::OperationContext*, mongo::NamespaceString const&, mongo::LockMode) ()
#6  0x0000000000ba331d in mongo::AutoGetCollectionForRead::AutoGetCollectionForRead(mongo::OperationContext*, mongo::NamespaceString const&) ()
#7  0x0000000000dd0ad5 in mongo::runQuery(mongo::OperationContext*, mongo::QueryMessage&, mongo::NamespaceString const&, mongo::Message&) ()
#8  0x0000000000cb4130 in mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) ()
#9  0x0000000000baca33 in mongo::DBDirectClient::call(mongo::Message&, mongo::Message&, bool, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
#10 0x0000000000a0256b in mongo::DBClientCursor::init() ()
#11 0x00000000009e4309 in mongo::DBClientBase::query(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::Query, int, int, mongo::BSONObj const*, int, int) ()
#12 0x0000000000bac686 in mongo::DBDirectClient::query(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::Query, int, int, mongo::BSONObj const*, int, int) ()
#13 0x00000000009e660d in mongo::DBClientBase::query(std::function<void ()(mongo::DBClientCursorBatchIterator&)>, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::Query, mongo::BSONObj const*, int) ()
#14 0x00000000009e6492 in mongo::DBClientBase::query(std::function<void ()(mongo::BSONObj const&)>, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::Query, mongo::BSONObj const*, int) ()
#15 0x0000000000a6908e in mongo::AuthzManagerExternalStateMongod::query(mongo::OperationContext*, mongo::NamespaceString const&, mongo::BSONObj const&, mongo::BSONObj const&, std::function<void ()(mongo::BSONObj const&)> const&) ()
#16 0x0000000000a6baeb in mongo::AuthzManagerExternalStateLocal::_initializeRoleGraph(mongo::OperationContext*) ()
#17 0x0000000000a6bec4 in mongo::AuthzManagerExternalStateLocal::initialize(mongo::OperationContext*) ()
#18 0x0000000000a5df40 in mongo::AuthorizationManager::initialize(mongo::OperationContext*) ()
#19 0x0000000000f1b954 in ?? ()
#20 0x0000000000f1c1df in mongo::repl::syncDoInitialSync() ()
#21 0x0000000000f26710 in mongo::repl::runSyncThread() ()
#22 0x0000000001b0b1e0 in execute_native_thread_routine ()
#23 0x0000003e5ac07851 in start_thread () from /lib64/libpthread.so.0
#24 0x0000003e5a8e767d in clone () from /lib64/libc.so.6
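
One way to check that the two traces really form a deadlock is to write down which thread holds which resource and which resource it waits on, then look for a cycle. The sketch below assumes the ownership stated in the description (the traces show only the waits, not the locks already held). The function name find_deadlock and the resource labels are made up for illustration, and every lock is treated as exclusively owned.

```python
def find_deadlock(holds, waits):
    """Return the threads that form a wait-for cycle, or [] if none.

    holds: thread -> set of resources it currently owns
    waits: thread -> the single resource it is blocked on
    """
    owner = {res: thread for thread, held in holds.items() for res in held}
    for start in waits:
        path, current = [], start
        # Follow "waits on a resource owned by" edges until the chain
        # ends (no cycle from here) or revisits a thread (cycle found).
        while current is not None and current not in path:
            path.append(current)
            current = owner.get(waits.get(current))
        if current is not None:
            return path[path.index(current):]
    return []


# Ownership as described in the report (assumed, not visible in the traces).
holds = {
    "Thread 120": {"_roleGraphMutex"},
    "Thread 121": {"admin DB lock"},
}
# Waits as shown by frame #2 and frame #3 of the respective traces.
waits = {
    "Thread 120": "admin DB lock",
    "Thread 121": "_roleGraphMutex",
}

cycle = find_deadlock(holds, waits)
```

Here cycle contains both threads, matching the report: Thread 120 waits on the database lock owned by Thread 121, which in turn waits on the mutex owned by Thread 120.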



 Comments   
Comment by Scott Hernandez (Inactive) [ 01/Apr/16 ]

We are working on this issue under SERVER-23394.

Once we fix it we will backport to 3.2, so please watch SERVER-23394 for progress and the version that has the fix in it.
