Type: Bug
Resolution: Fixed
Priority: Minor - P4
Affects Version/s: None
Component/s: None
Fully Compatible
Repl 2020-01-13, Repl 2020-01-27
As a result of analysis from our hierarchical locking project, we have found what appears to be a circular lock dependency between ReplicationCoordinatorImpl and InitialSyncer. ReplicationCoordinatorImpl can call InitialSyncer::isActive() to check state while holding its own mutex, but otherwise the InitialSyncer is expected to have control over the components of the ReplicationCoordinatorImpl. This can be trivially resolved by copying _initialSyncer to the stack under lock and invoking isActive() on that copy out of lock. I don't think this is an especially worrisome cycle.
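A minimal sketch of the proposed fix, not the actual patch. `Coordinator` and `Syncer` are hypothetical stand-ins for ReplicationCoordinatorImpl and InitialSyncer, and the sketch assumes the member is held through a `std::shared_ptr` so that a stack copy keeps the object alive after the coordinator mutex is released.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for InitialSyncer: its state is guarded by its own mutex.
class Syncer {
public:
    bool isActive() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _active;
    }

    void setActive(bool active) {
        std::lock_guard<std::mutex> lk(_mutex);
        _active = active;
    }

private:
    mutable std::mutex _mutex;
    bool _active = false;
};

// Hypothetical stand-in for ReplicationCoordinatorImpl.
class Coordinator {
public:
    void setSyncer(std::shared_ptr<Syncer> syncer) {
        std::lock_guard<std::mutex> lk(_mutex);
        _syncer = std::move(syncer);
    }

    bool syncerIsActive() const {
        // Copy the pointer to the stack while holding the coordinator mutex...
        std::shared_ptr<Syncer> syncerCopy;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            syncerCopy = _syncer;
        }
        // ...then take the syncer mutex only after the coordinator mutex is
        // released, so the two mutexes are never held at the same time here.
        return syncerCopy && syncerCopy->isActive();
    }

private:
    mutable std::mutex _mutex;
    std::shared_ptr<Syncer> _syncer;
};
```

With this shape the coordinator never acquires the syncer mutex while holding its own, so only the "syncer mutex, then coordinator mutex" order remains and the cycle disappears.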
These two stacks show the two underlying mutexes taken in opposing orders:
"InitialSyncer has priority"
...
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/assert_util.cpp:169:15: mongo::fassertFailedWithStatusWithLocation(int, mongo::Status const&, char const*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/assert_util.h:289:44: mongo::fassertWithLocation(int, mongo::Status const&, char const*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/latch_analyzer.cpp:198:13: mongo::LatchAnalyzer::onAcquire(mongo::latch_detail::Identity const&) (.cold.925)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/platform/mutex.cpp:98:30: mongo::Mutex::_onQuickLock()
/opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.p5v/include/c++/8.2.0/bits/std_mutex.h:162:9: std::lock_guard<mongo::Latch>::lock_guard(mongo::Latch&)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_coordinator_impl.cpp:1237:40: mongo::repl::ReplicationCoordinatorImpl::getMyLastAppliedOpTimeAndWallTime() const
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_coordinator_external_state_impl.cpp:958:91: mongo::repl::ReplicationCoordinatorExternalStateImpl::getToken()
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp:303:63: mongo::WiredTigerSessionCache::waitUntilDurable(mongo::OperationContext*, bool, bool)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp:252:36: mongo::WiredTigerRecoveryUnit::waitUntilDurable(mongo::OperationContext*)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_consistency_markers_impl.cpp:142:44: mongo::repl::ReplicationConsistencyMarkersImpl::setInitialSyncFlag(mongo::OperationContext*)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/initial_syncer.cpp:420:69: mongo::repl::InitialSyncer::_setUp_inlock(mongo::OperationContext*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/initial_syncer.cpp:251:18: mongo::repl::InitialSyncer::startup(mongo::OperationContext*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_coordinator_impl.cpp:741:9: mongo::repl::ReplicationCoordinatorImpl::_startDataReplication(mongo::OperationContext*, std::function<void ()>)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_coordinator_impl_heartbeat.cpp:566:30: mongo::repl::ReplicationCoordinatorImpl::_heartbeatReconfigStore(mongo::executor::TaskExecutor::CallbackArgs const&, mongo::repl::ReplSetConfig const&)
...
"ReplicationCoordinatorImpl has priority"
...
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/assert_util.cpp:169:15: mongo::fassertFailedWithStatusWithLocation(int, mongo::Status const&, char const*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/assert_util.h:289:44: mongo::fassertWithLocation(int, mongo::Status const&, char const*, unsigned int)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/util/latch_analyzer.cpp:198:13: mongo::LatchAnalyzer::onAcquire(mongo::latch_detail::Identity const&) (.cold.925)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/platform/mutex.cpp:98:30: mongo::Mutex::_onQuickLock()
/opt/mongodbtoolchain/revisions/94dac13bc8c0b50beff286acac77adeb2e81761e/stow/gcc-v3.p5v/include/c++/8.2.0/bits/std_mutex.h:162:9: std::lock_guard<mongo::Latch>::lock_guard(mongo::Latch&)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/initial_syncer.cpp:225:40: mongo::repl::InitialSyncer::isActive() const
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/replication_coordinator_impl.cpp:2570:79: mongo::repl::ReplicationCoordinatorImpl::processReplSetSyncFrom(mongo::OperationContext*, mongo::HostAndPort const&, mongo::BSONObjBuilder*)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/repl/repl_set_commands.cpp:591:9: mongo::repl::CmdReplSetSyncFrom::run(mongo::OperationContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj const&, mongo::BSONObjBuilder&)
/home/ben/git/mongodb/mongo/worktrees/ben_idfl/src/mongo/db/commands.cpp:629:32: mongo::BasicCommand::Invocation::run(mongo::OperationContext*, mongo::rpc::ReplyBuilderInterface*)
...