[SERVER-25210] Deadlock in Master/Slave Startup on Windows 2008 R2
Created: 22/Jul/16
Updated: 18/Apr/17
Resolved: 25/Jul/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.3.10
Fix Version/s: 3.2.13, 3.3.11

Type: Bug
Priority: Major - P3
Reporter: Mark Benvenuto
Assignee: Mark Benvenuto
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Platforms 18 (08/05/16)
Participants:

 Description   

Due to changes in FTDC, startup can take longer to reach the point where the listener is started. If that delay grows too much, master/slave replication deadlocks: the replication thread holds the database lock and waits for the listener to start, while the main thread waits for the database lock (in logStartup) before it tries to start the listener.

I have only seen this on debug builds on Windows 2008 R2 and later.

Main Thread

0:000> kc
 # Call Site
00 ntdll!ZwWaitForKeyedEvent
01 ntdll!RtlSleepConditionVariableSRW
02 kernel32!SleepConditionVariableSRW
03 mongod!__crtSleepConditionVariableSRW
04 mongod!Concurrency::details::stl_condition_variable_win7::wait_for
05 mongod!do_wait
06 mongod!_Cnd_timedwait
07 mongod!std::_Cnd_timedwaitX
08 mongod!std::condition_variable::wait_until
09 mongod!std::condition_variable::wait_for
0a mongod!mongo::CondVarLockGrantNotification::wait
0b mongod!mongo::LockerImpl<1>::lockComplete
0c mongod!mongo::LockerImpl<1>::lockGlobalComplete
0d mongod!mongo::Lock::GlobalLock::waitForLock
0e mongod!mongo::Lock::GlobalLock::GlobalLock
0f mongod!mongo::Lock::GlobalWrite::{ctor}
10 mongod!mongo::logStartup
11 mongod!mongo::_initAndListen
12 mongod!mongo::initAndListen
13 mongod!mongoDbMain
14 mongod!wmain
15 mongod!invoke_main
16 mongod!__scrt_common_main_seh
17 mongod!__scrt_common_main
18 mongod!wmainCRTStartup
19 kernel32!BaseThreadInitThunk
1a ntdll!RtlUserThreadStart

Replication Thread

  10  Id: 143c.9d8 Suspend: 0 Teb: 000007ff`fffa6000 Unfrozen
 # Call Site
00 ntdll!ZwWaitForKeyedEvent
01 ntdll!RtlSleepConditionVariableSRW
02 kernel32!SleepConditionVariableSRW
03 mongod!__crtSleepConditionVariableSRW
04 mongod!Concurrency::details::stl_condition_variable_win7::wait_for
05 mongod!Concurrency::details::stl_condition_variable_win7::wait
06 mongod!do_wait
07 mongod!_Cnd_wait
08 mongod!std::_Cnd_waitX
09 mongod!std::condition_variable::wait
0a mongod!mongo::Listener::waitUntilListening
0b mongod!mongo::repl::isSelf
0c mongod!mongo::Cloner::copyDb
0d mongod!mongo::repl::ReplSource::resync
0e mongod!mongo::repl::ReplSource::_sync_pullOpLog_applyOperation
0f mongod!mongo::repl::ReplSource::_sync_pullOpLog
10 mongod!mongo::repl::ReplSource::sync
11 mongod!mongo::repl::_replMain
12 mongod!mongo::repl::replMain
13 mongod!mongo::repl::replSlaveThread
14 mongod!std::_Invoker_functor::_Call
15 mongod!std::invoke
16 mongod!std::_LaunchPad<std::unique_ptr<std::tuple<void (__cdecl*)(void)>,std::default_delete<std::tuple<void (__cdecl*)(void)> > > >::_Execute
17 mongod!std::_LaunchPad<std::unique_ptr<std::tuple<void (__cdecl*)(void)>,std::default_delete<std::tuple<void (__cdecl*)(void)> > > >::_Run
18 mongod!std::_Pad::_Call_func
19 mongod!invoke_thread_procedure
1a mongod!thread_start<unsigned int (__cdecl*)(void * __ptr64)>
1b kernel32!BaseThreadInitThunk
1c ntdll!RtlUserThreadStart
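
The two stacks above form a wait cycle: the main thread is blocked in Lock::GlobalWrite (called from logStartup) and the replication thread is blocked in Listener::waitUntilListening (called from isSelf during Cloner::copyDb). The following is a minimal Python sketch of that cycle, not the mongod code and not the change made in the fix commit. The names simulate_startup, global_lock, listener_ready and repl_has_lock are hypothetical, and the timeouts only stand in for the real indefinite waits so that the sketch terminates.

```python
import threading


def simulate_startup(hold_lock_while_waiting: bool, timeout: float = 0.2) -> dict:
    """Model the two startup threads; timeouts replace the real indefinite waits."""
    global_lock = threading.Lock()      # stands in for the global database lock
    listener_ready = threading.Event()  # stands in for "the listener is listening"
    repl_has_lock = threading.Event()   # forces the ordering seen in the stacks
    result = {"main_got_lock": False, "repl_saw_listener": False}

    def replication_thread() -> None:
        # The replication thread waits longer than the main thread so that the
        # main thread's lock attempt always expires first in the deadlock case.
        if hold_lock_while_waiting:
            with global_lock:
                repl_has_lock.set()
                result["repl_saw_listener"] = listener_ready.wait(2 * timeout)
        else:
            repl_has_lock.set()
            result["repl_saw_listener"] = listener_ready.wait(2 * timeout)

    worker = threading.Thread(target=replication_thread)
    worker.start()
    repl_has_lock.wait()

    # Main thread: it needs the lock first and only then starts the listener.
    if global_lock.acquire(timeout=timeout):
        result["main_got_lock"] = True
        global_lock.release()
        listener_ready.set()

    worker.join()
    return result


if __name__ == "__main__":
    print(simulate_startup(hold_lock_while_waiting=True))
    print(simulate_startup(hold_lock_while_waiting=False))
```

With hold_lock_while_waiting=True neither thread makes progress and both flags stay False, which mirrors the hang. With hold_lock_while_waiting=False the main thread gets the lock, signals the listener, and both flags become True. This only illustrates that removing either edge of the cycle lets startup proceed.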



 Comments   
Comment by Githook User [ 17/Apr/17 ]

Author: Mark Benvenuto (markbenvenuto) <mark.benvenuto@mongodb.com>

Message: SERVER-25210 Deadlock in Master/Slave Startup

(cherry picked from commit 34aca01d24361e1a71c0888ba5bbce451df7ce05)
Branch: v3.2
https://github.com/mongodb/mongo/commit/01f3ea709b47de751f2927e0e6a0b9a919ad3b09

Comment by Githook User [ 25/Jul/16 ]

Author: Mark Benvenuto (markbenvenuto) <mark.benvenuto@mongodb.com>

Message: SERVER-25210 Deadlock in Master/Slave Startup
Branch: master
https://github.com/mongodb/mongo/commit/34aca01d24361e1a71c0888ba5bbce451df7ce05
