Type: Bug
Resolution: Gone away
Priority: Major - P3
Affects Version/s: None
Component/s: Replication
Operating System: ALL
This issue was originally discovered by the Coverity Static Analysis tool.
Consider the following lock acquisitions in InitialSyncer and ReplicationCoordinatorImpl:
ReplicationCoordinatorImpl::processReplSetSyncFrom
InitialSyncer::_multiApplierCallback
Since these two functions acquire the same two locks but in reverse order, there is the potential for a deadlock if they run concurrently. One way to fix this would be to stop the InitialSyncer from updating the ReplicationCoordinator's optime on every batch. Alternatively, _multiApplierCallback could call _opts.setLastOpTime without holding its own mutex, since it doesn't seem necessary to synchronize access to InitialSyncer::_lastApplied after it has been written in that function.
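A minimal sketch of the order reversal, using plain std::mutex and illustrative names rather than the actual MongoDB types: one function mirrors processReplSetSyncFrom (ReplicationCoordinator mutex first, then InitialSyncer mutex), the other mirrors _multiApplierCallback (the reverse). If the two interleave, each thread ends up waiting for the lock the other holds.

#include <mutex>

// Stand-ins for ReplicationCoordinatorImpl's and InitialSyncer's mutexes (illustrative only).
std::mutex replCoordMutex;
std::mutex initialSyncMutex;

void processReplSetSyncFromLike() {
    std::lock_guard<std::mutex> lk1(replCoordMutex);    // ReplicationCoordinator mutex first...
    std::lock_guard<std::mutex> lk2(initialSyncMutex);  // ...then the InitialSyncer mutex
    // work that touches both components
}

void multiApplierCallbackLike() {
    std::lock_guard<std::mutex> lk1(initialSyncMutex);  // InitialSyncer mutex first...
    std::lock_guard<std::mutex> lk2(replCoordMutex);    // ...then the ReplicationCoordinator mutex (reversed order)
    // work such as updating the last applied optime
}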
This issue also occurs in InitialSyncer::_getNextApplierBatchCallback, which acquires the InitialSyncer mutex and then tries to acquire the ReplicationCoordinator's mutex when calling _opts.getSlaveDelay().
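A hedged sketch of the second suggestion above, with hypothetical names (OpTimeLike, InitialSyncerLike, onBatchApplied, _setLastOpTime are not the real MongoDB identifiers): record the new last-applied optime under the InitialSyncer-side mutex, then release it before calling into the ReplicationCoordinator, so the two mutexes are never held at the same time.

#include <functional>
#include <mutex>
#include <utility>

struct OpTimeLike { long long term = 0; long long ts = 0; };

class InitialSyncerLike {
public:
    explicit InitialSyncerLike(std::function<void(OpTimeLike)> setLastOpTime)
        : _setLastOpTime(std::move(setLastOpTime)) {}

    void onBatchApplied(OpTimeLike newLastApplied) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _lastApplied = newLastApplied;  // state update stays under this component's mutex
        }
        // The callback runs after _mutex is released, so acquiring the
        // ReplicationCoordinator's mutex inside it cannot invert the lock order.
        _setLastOpTime(newLastApplied);
    }

private:
    std::mutex _mutex;
    OpTimeLike _lastApplied;
    std::function<void(OpTimeLike)> _setLastOpTime;
};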
Original Coverity Report Message:
Defect 100780 (STATIC_C)
Checker ORDER_REVERSAL (subcategory none)
File: /src/mongo/db/repl/replication_coordinator_impl.cpp
Function mongo::repl::ReplicationCoordinatorImpl::processReplSetSyncFrom(mongo::OperationContext *, const mongo::HostAndPort &, mongo::BSONObjBuilder *)
is duplicated by:
- SERVER-28859 Coverity analysis defect 101487: Thread deadlock (Closed)
- SERVER-28886 Coverity analysis defect 101486: Thread deadlock (Closed)

is related to:
- SERVER-34758 replSetGetStatus can deadlock with initialSyncer (Closed)
- SERVER-35372 replSetSyncFrom can cause deadlock between ReplicationCoordinator and InitialSyncer (Closed)

related to:
- SERVER-31487 Replace replSetSyncFrom resync option with initialSyncSource server parameter (Closed)