  1. Core Server
  2. SERVER-28840

replSetSyncFrom causes InitialSyncer and ReplicationCoordinator to acquire each other's mutexes in opposite orders

    Details

    • Operating System:
      ALL

      Description

      This issue was originally discovered by the Coverity Static Analysis tool.

      Consider the following lock acquisitions in InitialSyncer and ReplicationCoordinatorImpl:

      ReplicationCoordinatorImpl::processReplSetSyncFrom
      1. Acquire ReplicationCoordinatorImpl::_mutex
      2. Acquire InitialSyncer::_mutex

      InitialSyncer::_multiApplierCallback
      1. Acquire InitialSyncer::_mutex
      2. Acquire ReplicationCoordinatorImpl::_mutex
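
      The two acquisition orders listed above can be modelled in a few lines. This is only an illustrative sketch, not MongoDB code: the two global mutexes, the function names syncFromPath and applierCallbackPath, and the isOrderReversal helper are all hypothetical stand-ins. Each path is meant to be run on its own here; running the two concurrently is exactly the case that can deadlock.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-ins for the two mutexes named in the ticket.
std::mutex coordinatorMutex;    // models ReplicationCoordinatorImpl::_mutex
std::mutex initialSyncerMutex;  // models InitialSyncer::_mutex

// (first lock taken, second lock taken)
using LockOrder = std::pair<std::string, std::string>;

// Models processReplSetSyncFrom: coordinator mutex first, then syncer mutex.
LockOrder syncFromPath() {
    std::lock_guard<std::mutex> outer(coordinatorMutex);
    std::lock_guard<std::mutex> inner(initialSyncerMutex);
    return {"coordinator", "initialSyncer"};
}

// Models _multiApplierCallback: syncer mutex first, then coordinator mutex.
LockOrder applierCallbackPath() {
    std::lock_guard<std::mutex> outer(initialSyncerMutex);
    std::lock_guard<std::mutex> inner(coordinatorMutex);
    return {"initialSyncer", "coordinator"};
}

// True when two paths take the same pair of locks in opposite orders.
bool isOrderReversal(const LockOrder& a, const LockOrder& b) {
    return a.first == b.second && a.second == b.first;
}
```

      If one thread is inside syncFromPath holding only the outer lock while another is at the same point in applierCallbackPath, each waits forever for the lock the other holds.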

      Because these two functions acquire the same two locks in opposite orders, a deadlock is possible if they run concurrently. One way to fix this would be to stop InitialSyncer from updating the optime of the ReplicationCoordinator on every batch. Alternatively, _multiApplierCallback could call _opts.setLastOpTime without holding its own mutex, since it does not seem necessary to synchronize access to InitialSyncer::_lastApplied after it has been written to in that function.
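
      The second option might look roughly like the sketch below. It is a simplified model under stated assumptions, not the actual InitialSyncer: the class SyncerModel, the OpTime alias, and the onBatchApplied method are hypothetical, and the injected callback stands in for _opts.setLastOpTime. The point is only the shape: write _lastApplied under the lock, release the lock, then call out.

```cpp
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the real optime type.
using OpTime = long long;

// Simplified model of the syncer side; not the real InitialSyncer.
class SyncerModel {
public:
    // setLastOpTime stands in for _opts.setLastOpTime, which may take the
    // coordinator's mutex internally.
    explicit SyncerModel(std::function<void(OpTime)> setLastOpTime)
        : _setLastOpTime(std::move(setLastOpTime)) {}

    // Update _lastApplied while holding _mutex, then invoke the callback
    // only after _mutex has been released.
    void onBatchApplied(OpTime lastApplied) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _lastApplied = lastApplied;
        }  // _mutex is released here
        _setLastOpTime(lastApplied);  // uses the local copy, no lock held
    }

    OpTime lastApplied() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _lastApplied;
    }

private:
    std::mutex _mutex;
    OpTime _lastApplied = 0;
    std::function<void(OpTime)> _setLastOpTime;
};
```

      Because the callback runs with no syncer lock held, this path never holds both mutexes at once, so it cannot take part in a lock-order cycle with processReplSetSyncFrom. The cost is that the coordinator's view may briefly lag behind _lastApplied.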

      The same lock-order reversal also occurs in InitialSyncer::_getNextApplierBatchCallback, which acquires the InitialSyncer mutex and then tries to acquire the ReplicationCoordinator's mutex when calling _opts.getSlaveDelay().
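
      The ticket does not propose a specific fix for this path. One possible approach, shown here purely as an assumption, is to fetch the delay before taking the syncer's own mutex. The class BatchCallbackModel, the Seconds alias, and the refreshDelay method are hypothetical; the injected function stands in for _opts.getSlaveDelay.

```cpp
#include <chrono>
#include <functional>
#include <mutex>
#include <utility>

using Seconds = std::chrono::seconds;

// Hypothetical, simplified model of a batch-scheduling path; not MongoDB code.
class BatchCallbackModel {
public:
    // getSlaveDelay stands in for _opts.getSlaveDelay, which may take the
    // coordinator's mutex internally.
    explicit BatchCallbackModel(std::function<Seconds()> getSlaveDelay)
        : _getSlaveDelay(std::move(getSlaveDelay)) {}

    // Fetch the delay first, and only then take this object's own mutex
    // to store and use the value.
    Seconds refreshDelay() {
        const Seconds delay = _getSlaveDelay();  // _mutex is not held here
        std::lock_guard<std::mutex> lk(_mutex);
        _cachedDelay = delay;
        return _cachedDelay;
    }

private:
    std::mutex _mutex;
    Seconds _cachedDelay{0};
    std::function<Seconds()> _getSlaveDelay;
};
```

      The value read this way can be slightly stale by the time it is used under the lock, so whether this is acceptable depends on how the real callback uses the delay.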


      Original Coverity Report Message:

      Defect 100780 (STATIC_C)
      Checker ORDER_REVERSAL (subcategory none)
      File: /src/mongo/db/repl/replication_coordinator_impl.cpp
      Function mongo::repl::ReplicationCoordinatorImpl::processReplSetSyncFrom(mongo::OperationContext *, const mongo::HostAndPort &, mongo::BSONObjBuilder *)

              People

              Assignee:
              backlog-server-repl Backlog - Replication Team
              Reporter:
              xgen-internal-coverity Coverity Collector User