Core Server / SERVER-20021

Deadlock between SyncSourceFeedback and ReplicationCoordinator mutexes

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1.7
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: RPL 8 08/31/15

      Hit this in a patch build.

      Relevant part of the logs:
      One thread:

       [2015/08/18 18:41:23.795] Thread 3126 (Thread 0x2af323cfd940 (LWP 6630)):
       [2015/08/18 18:41:23.795] #0  0x00002af319388654 in __lll_lock_wait () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.795] #1  0x00002af319383f4a in _L_lock_1034 () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.795] #2  0x00002af319383e0c in pthread_mutex_lock () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.796] #3  0x0000000000e515ce in mongo::repl::SyncSourceFeedback::setKeepAliveInterval(std::chrono::duration<long, std::ratio<1l, 1000l> >) ()
       [2015/08/18 18:41:23.796] #4  0x0000000000e12466 in mongo::repl::ReplicationCoordinatorImpl::_setCurrentRSConfig_inlock(mongo::repl::ReplicaSetConfig const&, int) ()
       [2015/08/18 18:41:23.796] #5  0x0000000000e1fda1 in mongo::repl::ReplicationCoordinatorImpl::_heartbeatReconfigFinish(mongo::executor::TaskExecutor::CallbackArgs const&, mongo::repl::ReplicaSetConfig const&, mongo::StatusWith<int>) ()
       [2015/08/18 18:41:23.796] #6  0x0000000000e2401d in std::_Function_handler<void (mongo::executor::TaskExecutor::CallbackArgs const&), std::_Bind<std::_Mem_fn<void (mongo::repl::ReplicationCoordinatorImpl::*)(mongo::executor::TaskExecutor::CallbackArgs const&, mongo::repl::ReplicaSetConfig const&, mongo::StatusWith<int>)> (mongo::repl::ReplicationCoordinatorImpl*, std::_Placeholder<1>, mongo::repl::ReplicaSetConfig, mongo::StatusWith<int>)> >::_M_invoke(std::_Any_data const&, mongo::executor::TaskExecutor::CallbackArgs const&) ()
       [2015/08/18 18:41:23.796] #7  0x0000000000e259d9 in mongo::repl::(anonymous namespace)::callNoExcept(std::function<void ()> const&) ()
       [2015/08/18 18:41:23.796] #8  0x0000000000e2ab20 in mongo::repl::ReplicationExecutor::run() ()
       [2015/08/18 18:41:23.796] #9  0x0000000001924120 in execute_native_thread_routine ()
       [2015/08/18 18:41:23.796] #10 0x00002af31938183d in start_thread () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.796] #11 0x00002af31966cfcd in clone () from /lib64/libc.so.6
       [2015/08/18 18:41:23.796] #12 0x0000000000000000 in ?? ()
      

      Another thread:

       [2015/08/18 18:41:23.809] Thread 3114 (Thread 0x2af32be3d940 (LWP 6687)):
       [2015/08/18 18:41:23.809] #0  0x00002af319388654 in __lll_lock_wait () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.809] #1  0x00002af319383f4a in _L_lock_1034 () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.809] #2  0x00002af319383e0c in pthread_mutex_lock () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.809] #3  0x000000000095f283 in std::mutex::lock() ()
       [2015/08/18 18:41:23.809] #4  0x0000000000e035a7 in mongo::repl::ReplicationCoordinatorImpl::prepareReplSetUpdatePositionCommand(mongo::BSONObjBuilder*) ()
       [2015/08/18 18:41:23.809] #5  0x0000000000e516bb in mongo::repl::SyncSourceFeedback::updateUpstream(mongo::OperationContext*) ()
       [2015/08/18 18:41:23.809] #6  0x0000000000e521fb in mongo::repl::SyncSourceFeedback::run() ()
       [2015/08/18 18:41:23.810] #7  0x0000000001924120 in execute_native_thread_routine ()
       [2015/08/18 18:41:23.810] #8  0x00002af31938183d in start_thread () from /lib64/libpthread.so.0
       [2015/08/18 18:41:23.810] #9  0x00002af31966cfcd in clone () from /lib64/libc.so.6
       [2015/08/18 18:41:23.810] #10 0x0000000000000000 in ?? ()
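
      The two traces look like a lock-order inversion. Thread 3126 is inside an `_inlock` function, which suggests it already holds the ReplicationCoordinatorImpl mutex while it waits for the SyncSourceFeedback mutex. Thread 3114 is waiting for the ReplicationCoordinatorImpl mutex from inside SyncSourceFeedback::updateUpstream, presumably while holding the SyncSourceFeedback mutex (this is inferred from the ticket title; the traces do not show which locks are held). A minimal sketch of one common way to break such a cycle is to never call into the other component while holding your own mutex. The `Feedback` and `Coordinator` classes below are hypothetical stand-ins, not the real MongoDB classes, and this is not necessarily the fix that was committed for this ticket.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for illustration only; not the MongoDB classes.
class Coordinator;

class Feedback {
public:
    void setKeepAliveInterval(std::chrono::milliseconds interval) {
        std::lock_guard<std::mutex> lk(_mtx);
        _interval = interval;
    }

    std::chrono::milliseconds keepAliveInterval() const {
        std::lock_guard<std::mutex> lk(_mtx);
        return _interval;
    }

    // Defined below, once Coordinator is a complete type.
    int updateUpstream(Coordinator& coord);

private:
    mutable std::mutex _mtx;
    std::chrono::milliseconds _interval{0};
    int _updates = 0;
};

class Coordinator {
public:
    explicit Coordinator(Feedback& fb) : _fb(fb) {}

    // Update own state under _mtx, release it, then call into Feedback.
    void setConfig(int version, std::chrono::milliseconds keepAlive) {
        {
            std::lock_guard<std::mutex> lk(_mtx);
            _version = version;
        }
        _fb.setKeepAliveInterval(keepAlive);  // no Coordinator lock held here
    }

    int preparePosition() {
        std::lock_guard<std::mutex> lk(_mtx);
        return _version;
    }

private:
    std::mutex _mtx;
    Feedback& _fb;
    int _version = 0;
};

// Update own state under _mtx, release it, then call into Coordinator.
inline int Feedback::updateUpstream(Coordinator& coord) {
    {
        std::lock_guard<std::mutex> lk(_mtx);
        ++_updates;
    }
    return coord.preparePosition();  // no Feedback lock held here
}

// Runs both call paths concurrently and returns the final config version.
inline int runBothPaths(int iterations) {
    Feedback fb;
    Coordinator coord(fb);
    std::thread reconfig([&] {
        for (int i = 1; i <= iterations; ++i)
            coord.setConfig(i, std::chrono::milliseconds(i));
    });
    std::thread feedback([&] {
        for (int i = 0; i < iterations; ++i)
            fb.updateUpstream(coord);
    });
    reconfig.join();
    feedback.join();
    return coord.preparePosition();
}
```

      Because neither thread ever holds both mutexes at once, the wait cycle from the traces cannot form. The trade-off is that the two updates in `setConfig` are no longer atomic with respect to each other. When both locks really must be held together, acquiring them in one fixed order everywhere (or together via `std::scoped_lock`) is the usual alternative.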
      

            Assignee: Eric Milkie (milkie@mongodb.com)
            Reporter: Spencer Brody (spencer@mongodb.com, inactive)