  1. Core Server
  2. SERVER-91875

Lock order inversion between FileCopyBasedInitialSyncer::_mutex and ReplicationCoordinatorImpl::_mutex

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Assigned Teams: Replication
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Repl 2024-07-08, Repl 2024-07-22, Repl 2024-08-05

      FileCopyBasedInitialSyncer::_createOplogIfNeeded can be called while the FCBIS mutex is held. In turn, this function can acquire the ReplicationCoordinatorImpl mutex as part of calling getMemberState when creating the oplog collection.

      ReplicationCoordinatorImpl::_startInitialSync can be called while holding the RCI mutex, and can result in destroying an FCBIS and calling its shutdown method, which will acquire the FCBIS mutex.

      This is a lock order inversion and could result in a deadlock without external synchronization. If these code paths are only exercisable on initial sync startup and shutdown, it's possible that external synchronization is provided. Nonetheless, TSAN can't see that synchronization, so it requires a suppression. We should confirm whether this is a real potential deadlock.
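
To illustrate the two acquisition orders described above, here is a minimal sketch. It is not the server code and not how TSAN is implemented. Every name in it (LockOrderGraph, TrackedMutex, createOplogPath, startInitialSyncPath, demoDetectsInversion) is hypothetical. The two paths run one after the other on a single thread, so nothing can actually deadlock; the recorder only notes that the two mutexes were taken in both orders, which is roughly the kind of evidence a lock-order checker reports without knowing about any external synchronization.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recorder of "A was held while B was acquired" edges.
// Single-threaded sketch: a real tool would track held locks per thread.
class LockOrderGraph {
public:
    void onAcquire(const std::string& name) {
        for (const auto& held : _held) {
            _edges.emplace(held, name);  // edge: held -> name
        }
        _held.push_back(name);
    }

    void onRelease(const std::string& name) {
        // This sketch assumes locks are released in reverse acquisition order.
        assert(!_held.empty() && _held.back() == name);
        _held.pop_back();
    }

    // True if some pair of locks was acquired in both orders.
    // Only two-lock cycles are checked, which is enough for this example.
    bool hasInversion() const {
        for (const auto& edge : _edges) {
            if (_edges.count(std::make_pair(edge.second, edge.first)) > 0) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::string> _held;
    std::set<std::pair<std::string, std::string>> _edges;
};

// Hypothetical mutex wrapper that reports acquisitions to the graph.
// It provides lock()/unlock(), so it works with std::lock_guard.
class TrackedMutex {
public:
    TrackedMutex(std::string name, LockOrderGraph& graph)
        : _name(std::move(name)), _graph(graph) {}

    void lock() {
        _mutex.lock();
        _graph.onAcquire(_name);
    }

    void unlock() {
        _graph.onRelease(_name);
        _mutex.unlock();
    }

private:
    std::string _name;
    LockOrderGraph& _graph;
    std::mutex _mutex;
};

// Stand-in for the _createOplogIfNeeded path: FCBIS mutex, then RCI mutex.
void createOplogPath(TrackedMutex& syncerMutex, TrackedMutex& coordMutex) {
    std::lock_guard<TrackedMutex> syncerLock(syncerMutex);
    std::lock_guard<TrackedMutex> coordLock(coordMutex);  // e.g. getMemberState
}

// Stand-in for the _startInitialSync path: RCI mutex, then FCBIS mutex.
void startInitialSyncPath(TrackedMutex& syncerMutex, TrackedMutex& coordMutex) {
    std::lock_guard<TrackedMutex> coordLock(coordMutex);
    std::lock_guard<TrackedMutex> syncerLock(syncerMutex);  // e.g. shutdown
}

// Runs both paths sequentially and reports whether an inversion was recorded
// only after the second path has executed.
bool demoDetectsInversion() {
    LockOrderGraph graph;
    TrackedMutex syncerMutex("FCBIS::_mutex", graph);
    TrackedMutex coordMutex("RCI::_mutex", graph);

    createOplogPath(syncerMutex, coordMutex);
    const bool inversionAfterFirstPath = graph.hasInversion();

    startInitialSyncPath(syncerMutex, coordMutex);
    return !inversionAfterFirstPath && graph.hasInversion();
}
```

In this sketch the inversion is recorded even though the two paths never overlap in time. That matches the situation in the description: the order conflict exists in the code, and whether it can become a real deadlock depends on whether the two paths can ever run concurrently.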

            Assignee:
            ali.mir@mongodb.com Ali Mir
            Reporter:
            george.wangensteen@mongodb.com George Wangensteen (Inactive)
            Votes:
            0
            Watchers:
            10

              Created:
              Updated:
              Resolved: