SERVER-22169 (Core Server)

Deadlock during CatalogManager swap from SCCC -> CSRS

    • Fully Compatible
    • ALL
    • Sharding F (01/29/16)

      The pinger thread is trying to ping and is blocked on the ForwardingCatalogManager mutex:

       [2016/01/12 23:49:28.087] 00000000`0357bca8 000007fe`fd4610dc ntdll!ZwWaitForSingleObject+0xa
       [2016/01/12 23:49:28.087] 00000000`0357bcb0 00000001`3fc90439 KERNELBASE!WaitForSingleObjectEx+0x9c
       [2016/01/12 23:49:28.087] 00000000`0357bd50 00000001`3fc3b9ae mongos!Concurrency::details::ExternalContextBase::Block+0xc9
       [2016/01/12 23:49:28.087] 00000000`0357bd90 00000001`3fc0b708 mongos!Concurrency::Context::Block+0x1e
       [2016/01/12 23:49:28.087] 00000000`0357bdd0 00000001`3fc0c10e mongos!Concurrency::details::LockQueueNode::Block+0x178
       [2016/01/12 23:49:28.087] 00000000`0357be40 00000001`3fc0cd74 mongos!Concurrency::critical_section::_Acquire_lock+0xde
       [2016/01/12 23:49:28.087] 00000000`0357beb0 00000001`3fbae7f2 mongos!Concurrency::critical_section::lock+0x34
       [2016/01/12 23:49:28.087] 00000000`0357bf10 00000001`3fbaec55 mongos!mtx_do_lock+0xf2
       [2016/01/12 23:49:28.087] 00000000`0357bf90 00000001`3f8d1cc8 mongos!_Mtx_lock+0x15
       [2016/01/12 23:49:28.090] 00000000`0357bfc0 00000001`3f993463 mongos!mongo::ForwardingCatalogManager::scheduleReplaceCatalogManagerIfNeeded+0x68
       [2016/01/12 23:49:28.090] 00000000`0357c2e0 00000001`3fa1b062 mongos!mongo::ShardingNetworkConnectionHook::validateHostImpl+0x553
       [2016/01/12 23:49:28.090] 00000000`0357c5b0 00000001`3fa1a9e6 mongos!<lambda_e7fd035657d63ee1e352dc2700c55463>::operator()+0x22
       [2016/01/12 23:49:28.090] 00000000`0357c5f0 00000001`3fa1b0da mongos!std::_Callable_obj<<lambda_e7fd035657d63ee1e352dc2700c55463>,0>::_ApplyX<mongo::Status,mongo::HostAndPort const & __ptr64,mongo::executor::RemoteCommandResponse const & __ptr64>+0x16
       [2016/01/12 23:49:28.090] 00000000`0357c630 00000001`3f57bca0 mongos!std::_Func_impl<std::_Callable_obj<<lambda_e7fd035657d63ee1e352dc2700c55463>,0>,std::allocator<std::_Func_class<mongo::Status,mongo::HostAndPort const & __ptr64,mongo::executor::RemoteCommandResponse const & __ptr64> >,mongo::Status,mongo::HostAndPort const & __ptr64,mongo::executor::RemoteCommandResponse const & __ptr64>::_Do_call+0x1a
       [2016/01/12 23:49:28.092] 00000000`0357c670 00000001`3f5e7194 mongos!std::_Func_class<mongo::Status,mongo::executor::RemoteCommandResponse const & __ptr64>::operator()+0x20
       [2016/01/12 23:49:28.092] 00000000`0357c6b0 00000001`3f5e5de6 mongos!<lambda_4283abedef6970da99f157f752d96f21>::operator()+0x24
       [2016/01/12 23:49:28.092] 00000000`0357c6f0 00000001`3f5e734a mongos!std::_Callable_obj<<lambda_4283abedef6970da99f157f752d96f21>,0>::_ApplyX<mongo::Status,mongo::executor::RemoteCommandResponse const & __ptr64>+0x16
       [2016/01/12 23:49:28.092] 00000000`0357c730 00000001`3f57bca0 mongos!std::_Func_impl<std::_Callable_obj<<lambda_4283abedef6970da99f157f752d96f21>,0>,std::allocator<std::_Func_class<mongo::Status,mongo::executor::RemoteCommandResponse const & __ptr64> >,mongo::Status,mongo::executor::RemoteCommandResponse const & __ptr64>::_Do_call+0x1a
       [2016/01/12 23:49:28.092] 00000000`0357c770 00000001`3f57eb1d mongos!std::_Func_class<mongo::Status,mongo::executor::RemoteCommandResponse const & __ptr64>::operator()+0x20
       [2016/01/12 23:49:28.092] 00000000`0357c7b0 00000001`3f57d2c5 mongos!mongo::DBClientConnection::connect+0x2ad
       [2016/01/12 23:49:28.096] 00000000`0357c8c0 00000001`3f57e40f mongos!mongo::DBClientConnection::_checkConnection+0x295
       [2016/01/12 23:49:28.096] 00000000`0357cfd0 00000001`3f5892c1 mongos!mongo::DBClientConnection::call+0x4f
       [2016/01/12 23:49:28.096] 00000000`0357d0a0 00000001`3f588eae mongos!mongo::DBClientWithCommands::runCommandWithMetadata+0x301
       [2016/01/12 23:49:28.096] 00000000`0357d6f0 00000001`3f588ca0 mongos!mongo::DBClientWithCommands::runCommand+0x19e
       [2016/01/12 23:49:28.096] 00000000`0357d960 00000001`3f58a07c mongos!mongo::DBClientConnection::runCommand+0x20
       [2016/01/12 23:49:28.096] 00000000`0357d9c0 00000001`3f5ee207 mongos!mongo::DBClientWithCommands::simpleCommand+0x13c
       [2016/01/12 23:49:28.096] 00000000`0357dab0 00000001`3f5f0118 mongos!mongo::SyncClusterConnection::prepare+0x227
       [2016/01/12 23:49:28.096] 00000000`0357dc30 00000001`3f58a88e mongos!mongo::SyncClusterConnection::update+0x108
       [2016/01/12 23:49:28.096] 00000000`0357dfd0 00000001`3f906944 mongos!mongo::DBClientBase::update+0x9e
       [2016/01/12 23:49:28.096] 00000000`0357e040 00000001`3f908a29 mongos!mongo::LegacyDistLockPinger::_distLockPingThread+0x674
       [2016/01/12 23:49:28.096] 00000000`0357f520 00000001`3f906088 mongos!mongo::LegacyDistLockPinger::distLockPingThread+0x99
       [2016/01/12 23:49:28.096] 00000000`0357f640 00000001`3f904b66 mongos!std::_Pmf_wrap<void (__cdecl mongo::LegacyDistLockPinger::*)(mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> >) __ptr64,void,mongo::LegacyDistLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> > >::operator()+0xa8
       [2016/01/12 23:49:28.097] 00000000`0357f750 00000001`3f9061b2 mongos!std::_Bind<1,void,std::_Pmf_wrap<void (__cdecl mongo::LegacyDistLockPinger::*)(mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> >) __ptr64,void,mongo::LegacyDistLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> > >,mongo::LegacyDistLockPinger * __ptr64 const,mongo::ConnectionString const & __ptr64,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> > & __ptr64>::_Do_call<,0,1,2,3,4>+0x86
       [2016/01/12 23:49:28.098] 00000000`0357f810 00000001`3fbafbf9 mongos!std::_LaunchPad<std::_Bind<0,void,std::_Bind<1,void,std::_Pmf_wrap<void (__cdecl mongo::LegacyDistLockPinger::*)(mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> >) __ptr64,void,mongo::LegacyDistLockPinger,mongo::ConnectionString,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> > >,mongo::LegacyDistLockPinger * __ptr64 const,mongo::ConnectionString const & __ptr64,__int64,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const & __ptr64,std::chrono::duration<__int64,std::ratio<1,1000> > & __ptr64> > >::_Run+0x72
       [2016/01/12 23:49:28.098] 00000000`0357f930 00000001`3fc02ad5 mongos!_Call_func+0x29
       [2016/01/12 23:49:28.098] 00000000`0357f980 00000001`3fc02d27 mongos!_callthreadstartex+0x25
       [2016/01/12 23:49:28.098] 00000000`0357f9d0 00000000`773159dd mongos!_threadstartex+0xe7
       [2016/01/12 23:49:28.098] 00000000`0357fa10 00000000`7744a631 kernel32!BaseThreadInitThunk+0xd
       [2016/01/12 23:49:28.102] 00000000`0357fa40 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       [2016/01/12 23:49:28.102]   11  Id: 1f20.4dc Suspend: 1 Teb: 000007ff`fffa0000 Unfrozen
      

      Another thread, which holds the ForwardingCatalogManager mutex, is swapping the catalog manager and joining the pinger thread, which can never exit because it is waiting for that same mutex:

       [2016/01/12 23:49:29.884] 00000000`01b3eb38 000007fe`fd4610dc ntdll!ZwWaitForSingleObject+0xa
       [2016/01/12 23:49:29.884] 00000000`01b3eb40 00000001`3fbaf651 KERNELBASE!WaitForSingleObjectEx+0x9c
       [2016/01/12 23:49:29.884] 00000000`01b3ebe0 00000001`3f908fbe mongos!_Thrd_join+0x21
       [2016/01/12 23:49:29.884] 00000000`01b3ec20 00000001`3f903f19 mongos!mongo::LegacyDistLockPinger::shutdown+0x20e
       [2016/01/12 23:49:29.884] 00000000`01b3ece0 00000001`3f8ebec1 mongos!mongo::LegacyDistLockManager::shutDown+0x99
       [2016/01/12 23:49:29.884] 00000000`01b3ed30 00000001`3f8d0451 mongos!mongo::CatalogManagerLegacy::shutDown+0x1c1
       [2016/01/12 23:49:29.884] 00000000`01b3ee60 00000001`3f881a61 mongos!mongo::ForwardingCatalogManager::_replaceCatalogManager+0xe1
       [2016/01/12 23:49:29.884] 00000000`01b3f000 00000001`3f87ee1c mongos!mongo::executor::ThreadPoolTaskExecutor::runCallback+0x1c1
       [2016/01/12 23:49:29.884] 00000000`01b3f150 00000001`3f87a1c2 mongos!std::_Func_impl<std::_Callable_obj<<lambda_fe181a3c479b1e45079f5979bb60381d>,0>,std::allocator<std::_Func_class<void> >,void>::_Do_call+0x3c
       [2016/01/12 23:49:29.884] 00000000`01b3f190 00000001`3f87966e mongos!mongo::executor::NetworkInterfaceThreadPool::consumeTasks+0x372
       [2016/01/12 23:49:29.888] 00000000`01b3f320 00000001`3f869489 mongos!std::_Callable_obj<<lambda_549362b12c6070a89a43df20670b117a>,0>::_ApplyX<void>+0x5e
       [2016/01/12 23:49:29.888] 00000000`01b3f380 00000001`3f866526 mongos!<lambda_d2058e60f28439f6439d4cbf1e88bbb1>::operator()+0x49
       [2016/01/12 23:49:29.888] 00000000`01b3f470 00000001`3f86c19d mongos!asio::asio_handler_invoke<asio::detail::binder1<<lambda_d2058e60f28439f6439d4cbf1e88bbb1>,std::error_code> >+0x26
       [2016/01/12 23:49:29.888] 00000000`01b3f4b0 00000001`3fadba1e mongos!asio::detail::wait_handler<<lambda_d2058e60f28439f6439d4cbf1e88bbb1> >::do_complete+0x9d
       [2016/01/12 23:49:29.888] 00000000`01b3f560 00000001`3fae06cb mongos!asio::detail::win_iocp_io_service::do_one+0x2be
       [2016/01/12 23:49:29.888] 00000000`01b3f640 00000001`3fae01e2 mongos!asio::detail::win_iocp_io_service::run+0xbb
       [2016/01/12 23:49:29.888] 00000000`01b3f690 00000001`3f869401 mongos!asio::io_service::run+0x32
       [2016/01/12 23:49:29.888] 00000000`01b3f730 00000001`3f86a00c mongos!<lambda_c44aeeec9ed6b3fe32a46d4f069876e0>::operator()+0x361
       [2016/01/12 23:49:29.888] 00000000`01b3f970 00000001`3fbafbf9 mongos!std::_LaunchPad<std::_Bind<0,void,<lambda_c44aeeec9ed6b3fe32a46d4f069876e0> > >::_Go+0x1c
       [2016/01/12 23:49:29.891] 00000000`01b3f9c0 00000001`3fc02ad5 mongos!_Call_func+0x29
       [2016/01/12 23:49:29.891] 00000000`01b3fa10 00000001`3fc02d27 mongos!_callthreadstartex+0x25
       [2016/01/12 23:49:29.891] 00000000`01b3fa60 00000000`773159dd mongos!_threadstartex+0xe7
       [2016/01/12 23:49:29.891] 00000000`01b3faa0 00000000`7744a631 kernel32!BaseThreadInitThunk+0xd
       [2016/01/12 23:49:29.891] 00000000`01b3fad0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
       [2016/01/12 23:49:29.891]    5  Id: 1cc0.1acc Suspend: 1 Teb: 000007ff`fffd3000 Unfrozen
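
      The two traces show a lock-then-join cycle: one thread holds a mutex and joins a worker, while the worker waits for that same mutex. The following is a minimal sketch of that pattern and of one common way to avoid it (signal under the lock, release it, then join). The names CatalogSwapSketch, managerMutex, pingLoop, swapAndShutdown and runSwapSketch are hypothetical stand-ins for illustration, not MongoDB code, and this is not necessarily how the issue was resolved.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the components named in the traces.
struct CatalogSwapSketch {
    std::mutex managerMutex;              // plays the role of the manager mutex
    std::atomic<bool> stopPinger{false};  // shutdown signal for the pinger
    std::atomic<int> pings{0};
    std::thread pinger;

    // Each ping briefly needs the manager mutex, as the connection hook
    // does in the first trace.
    void pingLoop() {
        while (!stopPinger.load()) {
            {
                std::lock_guard<std::mutex> lk(managerMutex);
                ++pings;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    void start() {
        pinger = std::thread([this] { pingLoop(); });
    }

    // Deadlock-prone ordering (the shape seen in the traces), left as a
    // comment so that this sketch always terminates:
    //
    //   std::lock_guard<std::mutex> lk(managerMutex);
    //   stopPinger.store(true);
    //   pinger.join();  // the pinger may be blocked on managerMutex forever
    //
    // Safer ordering: signal while holding the lock, release it, then join.
    void swapAndShutdown() {
        {
            std::lock_guard<std::mutex> lk(managerMutex);
            stopPinger.store(true);
        }  // mutex released here, so a blocked pinger can make progress
        if (pinger.joinable()) {
            pinger.join();
        }
        assert(!pinger.joinable());
    }
};

// Runs the pinger briefly, then performs the swap with the safer ordering.
// Returns true once the pinger has been joined.
bool runSwapSketch() {
    CatalogSwapSketch s;
    s.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    s.swapAndShutdown();
    return s.stopPinger.load() && !s.pinger.joinable();
}
```

      Calling runSwapSketch() from a test or from main exercises the safer ordering. Moving the join inside the locked scope reproduces the hang whenever the pinger is already waiting on the mutex.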
      

            Assignee: spencer@mongodb.com Spencer Brody (Inactive)
            Reporter: randolph@mongodb.com Randolph Tan