|
After looking at the logs and hang analyzer output, it looks like this is a specific situation that can only happen with a node that isn't required for the commit quorum: the primary is able to write the commitIndexBuild oplog entry before the secondary finishes the majority wait it performs before starting the index build. The primary was recently restarted, so it is still in the SECONDARY state when the non-voting secondary syncs from it, which means the commit point is not advancing at this time. (A small sketch of the quorum check follows the log excerpts below.)
[ReplicaSetFixture:job10:primary] 2020-08-28T02:09:37.137+0000 | 2020-08-28T02:09:37.137+00:00 I STORAGE 3856201 [conn40] "Index build: commit quorum satisfied","attr":{"indexBuildEntry":{"_id":{"$uuid":"33679498-1022-419b-8575-f1d5711463d5"},"collectionUUID":{"$uuid":"70d3c4dd-26cc-447b-bf33-7e2cb80a71b4"},"commitQuorum":"votingMembers","indexNames":["point_2d"],"commitReadyMembers":["localhost:22500"]}}
|
...
|
[ReplicaSetFixture:job10:primary] 2020-08-28T02:09:37.139+0000 | 2020-08-28T02:09:37.139+00:00 I INDEX 20447 [conn36] "Index build: completed","attr":{"buildUUID":{"uuid":{"$uuid":"33679498-1022-419b-8575-f1d5711463d5"}}}
|
...
|
[ReplicaSetFixture:job10:primary] 2020-08-28T02:10:12.350+0000 | 2020-08-28T02:10:12.350+00:00 I CONTROL 23377 [SignalHandler] "Received signal","attr":{"signal":15,"error":"Terminated"}
|
...
|
[ReplicaSetFixture:job10:primary] 2020-08-28T02:10:26.718+0000 | 2020-08-28T02:10:26.718+00:00 I REPL 21358 [OplogApplier-0] "Replica set state transition","attr":{"newState":"SECONDARY","oldState":"RECOVERING"}
|
...
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:27.501+0000 | 2020-08-28T02:10:27.499+00:00 I REPL 21092 [BackgroundSync] "Scheduling fetcher to read remote oplog","attr":{"syncSource":"localhost:22500","lastOpTimeFetched":{"ts":{"$timestamp":{"t":1598580577,"i":4}},"t":1}}
|
...
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:27.522+0000 | 2020-08-28T02:10:27.521+00:00 I REPL 21832 [ReplCoordExtern-0] "Choosing new sync source. Our current sync source is not primary and does not have a sync source, so we require that it is ahead of us","attr":{"syncSource":"localhost:22500","lastOpTimeFetched":{"ts":{"$timestamp":{"t":1598580612,"i":1}},"t":1},"syncSourceLatestOplogOpTime":{"ts":{"$timestamp":{"t":1598580612,"i":1}},"t":1},"isPrimary":false}
|
...
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:27.543+0000 | 2020-08-28T02:10:27.533+00:00 D2 REPL_HB 4615620 [ReplCoord-0] "Received response to heartbeat","attr":{"requestId":8,"target":"localhost:22500","response":{"ok":1,"state":2,"v":2,"configTerm":1,"set":"rs","term":1,"durableOpTime":{"ts":{"$timestamp":{"t":1598580612,"i":1}},"t":1},"durableWallTime":{"$date":"2020-08-28T02:10:12.174Z"},"opTime":{"ts":{"$timestamp":{"t":1598580612,"i":1}},"t":1},"wallTime":{"$date":"2020-08-28T02:10:12.174Z"},"$replData":{"term":1,"lastOpCommitted":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"lastCommittedWall":{"$date":"1970-01-01T00:00:00.000Z"},"lastOpVisible":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"configVersion":2,"configTerm":1,"replicaSetId":{"$oid":"5f48675ed265de5f69496b2f"},"syncSourceIndex":-1,"isPrimary":false},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1598580612,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"operationTime":{"$timestamp":{"t":1598580612,"i":1}}}}
|
...
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:27.546+0000 | 2020-08-28T02:10:27.545+00:00 I STORAGE 4847600 [IndexBuildsCoordinatorMongod-0] "Index build: waiting for last optime before interceptors to be majority committed","attr":{"buildUUID":{"uuid":{"$uuid":"33679498-1022-419b-8575-f1d5711463d5"}},"deadline":{"$date":{"$numberLong":"9223372036854775807"}},"timeoutMillis":9223372036854776000,"lastOpTime":{"ts":{"$timestamp":{"t":1598580577,"i":8}},"t":1}}
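To illustrate why the non-voting secondary's readiness is never needed, here is a minimal, self-contained sketch of a "votingMembers" commit quorum check. The types, the function names, and the secondary's address are illustrative only, not MongoDB's actual implementation; the point is simply that only voting members are consulted.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Illustrative model only; not MongoDB's commit-quorum code.
struct MemberState {
    std::string host;
    bool isVoter;
    bool commitReady;  // has this member signaled its index build is ready to commit?
};

// With commitQuorum "votingMembers", only voting members are counted.
bool votingMembersQuorumSatisfied(const std::vector<MemberState>& members) {
    return std::all_of(members.begin(), members.end(), [](const MemberState& m) {
        return !m.isVoter || m.commitReady;
    });
}

int main() {
    // Mirrors the log above: only the primary (localhost:22500) appears in
    // commitReadyMembers, yet the quorum is already satisfied because the other
    // node is a non-voter that hasn't even started its build.
    std::vector<MemberState> members = {
        {"localhost:22500", /*isVoter=*/true, /*commitReady=*/true},
        {"non-voting secondary (address hypothetical)", /*isVoter=*/false, /*commitReady=*/false},
    };
    std::cout << std::boolalpha << votingMembersQuorumSatisfied(members) << "\n";  // prints "true"
}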
|
At this point, the former primary runs for election again and wins, incrementing the term. The automatic reconfig triggered by the step-up increments the config term as well. The secondary eventually learns of the new config via heartbeat and tries to persist it to disk (a small sketch of the config comparison follows the log excerpts below).
[ReplicaSetFixture:job10:primary] 2020-08-28T02:10:32.328+0000 | 2020-08-28T02:10:32.326+00:00 I ELECTION 21450 [ReplCoord-0] "Election succeeded, assuming primary role","attr":{"term":2}
|
...
|
[ReplicaSetFixture:job10:primary] 2020-08-28T02:10:32.329+0000 | 2020-08-28T02:10:32.328+00:00 I REPL 4508103 [OplogApplier-0] "Increment the config term via reconfig"
|
...
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:32.333+0000 | 2020-08-28T02:10:32.331+00:00 D1 REPL_HB 4615622 [ReplCoord-0] "Received new config via heartbeat","attr":{"newConfigVersionAndTerm":"{version: 2, term: 2}"}
|
[ReplicaSetFixture:job10:secondary] 2020-08-28T02:10:32.333+0000 | 2020-08-28T02:10:32.332+00:00 D2 REPL_HB 4615626 [ReplCoord-2] "Config validated for reconfig; persisting to disk","attr":{"newConfigVersionAndTerm":"{version: 2, term: 2}"}
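For context, this is roughly how the secondary decides the heartbeat-delivered config is newer than its installed one: configs are ordered by (term, version), with term taking precedence. The sketch below is simplified and uses illustrative names; the real implementation also special-cases configs with an uninitialized term.

#include <cstdint>
#include <iostream>
#include <tuple>

// Illustrative only; not the server's actual config-comparison code.
struct ConfigVersionAndTerm {
    std::int64_t version;
    std::int64_t term;
};

bool isNewerConfig(const ConfigVersionAndTerm& incoming, const ConfigVersionAndTerm& installed) {
    // Term takes precedence; version breaks ties within a term.
    return std::tie(incoming.term, incoming.version) > std::tie(installed.term, installed.version);
}

int main() {
    ConfigVersionAndTerm installed{/*version=*/2, /*term=*/1};  // config from before the step-up
    ConfigVersionAndTerm incoming{/*version=*/2, /*term=*/2};   // {version: 2, term: 2} from the heartbeat
    std::cout << std::boolalpha << isNewerConfig(incoming, installed) << "\n";  // true: validate and persist to disk
}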
|
From the hang analyzer output, we see that the secondary is stuck persisting the new config because it cannot acquire the DBLock on the "local" database. The oplog applier is holding the PBWM (ParallelBatchWriterMode) lock in X mode while trying to apply the commitIndexBuild oplog entry, which is itself stuck waiting for the majority wait to unblock the index build. (A simplified model of this hang follows the stack traces below.)
Thread 28 (Thread 0x7f5ed5077700 (LWP 62791)):
|
...
|
#12 mongo::Interruptible::waitForConditionOrInterruptFor<std::unique_lock<mongo::latch_detail::Latch>, mongo::CondVarLockGrantNotification::wait(mongo::OperationContext*, mongo::Milliseconds)::<lambda()> > (waitTimer=0x0, pred=..., ms=..., m=..., cv=..., this=0x7f5eff786600) at src/mongo/util/interruptible.h:480
|
#13 mongo::CondVarLockGrantNotification::wait (this=0x7f5efdbf4728, opCtx=0x7f5eff786600, timeout=...) at src/mongo/db/concurrency/lock_state.cpp:249
|
#14 0x00007f5ef13869ee in mongo::LockerImpl::_lockComplete (this=0x7f5efdbf4680, opCtx=0x7f5eff786600, resId=..., mode=mongo::MODE_IS, deadline=...) at src/mongo/db/concurrency/lock_state.cpp:939
|
#15 0x00007f5ef136771b in mongo::Lock::ResourceLock::lock (this=0x7f5ed5075588, opCtx=<optimized out>, mode=<optimized out>, deadline=...) at src/mongo/db/concurrency/d_concurrency.cpp:306
|
#16 0x00007f5ef13678fb in mongo::Lock::GlobalLock::GlobalLock (this=0x7f5ed5075578, opCtx=0x7f5eff786600, lockMode=mongo::MODE_IX, deadline=..., behavior=<optimized out>) at src/mongo/db/concurrency/d_concurrency.cpp:152
|
#17 0x00007f5ef13679eb in mongo::Lock::DBLock::DBLock (this=0x7f5ed5075560, opCtx=0x7f5eff786600, db="local", mode=<optimized out>, deadline=...) at src/mongo/db/concurrency/lock_manager_defs.h:106
|
#18 0x00007f5ef44dc569 in mongo::repl::ReplicationCoordinatorExternalStateImpl::<lambda()>::operator()(void) const (__closure=0x7f5ed50756a0) at src/mongo/base/string_data.h:82
|
#19 0x00007f5ef44dc9b7 in mongo::writeConflictRetry<mongo::repl::ReplicationCoordinatorExternalStateImpl::storeLocalConfigDocument(mongo::OperationContext*, const mongo::BSONObj&, bool)::<lambda()> > (f=..., ns=..., opStr=..., opCtx=0x7f5eff786600) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.Fno/include/c++/8.2.0/bits/unique_ptr.h:342
|
#20 mongo::repl::ReplicationCoordinatorExternalStateImpl::storeLocalConfigDocument (this=this@entry=0x7f5ef90cb600, opCtx=<optimized out>, config=owned BSONObj 669 bytes @ 0x7f5efdc47d88 - unprintable or invalid, writeOplog=<optimized out>, writeOplog@entry=false) at src/mongo/db/repl/replication_coordinator_external_state_impl.cpp:585
|
#21 0x00007f5ef41ed15b in mongo::repl::ReplicationCoordinatorImpl::_heartbeatReconfigStore (this=0x7f5ef9147800, cbd=..., newConfig=...) at /opt/mongodbtoolchain/revisions/e5348beb43e147b74a40f4ca5fb05a330ea646cf/stow/gcc-v3.Fno/include/c++/8.2.0/bits/unique_ptr.h:342
|
|
Thread 52 (Thread 0x7f5ec4c5d700 (LWP 62832)):
|
...
|
#10 mongo::Interruptible::waitForConditionOrInterrupt<std::unique_lock<mongo::latch_detail::Latch>, mongo::future_details::SharedStateBase::wait(mongo::Interruptible*)::{lambda()#1}>(mongo::stdx::condition_variable&, std::unique_lock<mongo::latch_detail::Latch>&, mongo::future_details::SharedStateBase::wait(mongo::Interruptible*)::{lambda()#1}, mongo::AtomicWord<long>*) (waitTimer=0x0, pred=..., m=..., cv=..., this=0x7f5efe5ee980) at src/mongo/util/interruptible.h:466
|
#11 mongo::future_details::SharedStateBase::wait (this=<optimized out>, interruptible=0x7f5efe5ee980) at src/mongo/util/future_impl.h:402
|
#12 0x00007f5ef1d592a5 in mongo::future_details::SharedStateHolder<mongo::ReplIndexBuildState::IndexCatalogStats>::waitNoThrow (interruptible=0x7f5efe5ee980, this=<synthetic pointer>) at src/mongo/util/future_impl.h:652
|
#13 mongo::SharedSemiFuture<mongo::ReplIndexBuildState::IndexCatalogStats>::waitNoThrow (interruptible=0x7f5efe5ee980, this=<synthetic pointer>) at src/mongo/util/future.h:938
|
#14 mongo::IndexBuildsCoordinator::applyCommitIndexBuild (this=<optimized out>, opCtx=opCtx@entry=0x7f5efe5ee980, oplogEntry=...) at src/mongo/db/index_builds_coordinator.cpp:976
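To make the interaction between these two threads concrete, here is a minimal, self-contained model of the hang using plain std::thread primitives. This is not MongoDB's lock manager; a std::timed_mutex stands in for the PBWM resource and a condition variable stands in for the majority-commit notification, and the waits are bounded only so the demo terminates.

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;

// Illustrative stand-ins only.
std::timed_mutex pbwm;                  // models the ParallelBatchWriterMode lock
std::mutex cvMutex;
std::condition_variable commitPointCv;  // models the majority-commit notification
bool majorityCommitted = false;         // never becomes true in this scenario

void oplogApplierThread() {
    // Thread 52: the applier holds the PBWM in X mode for the batch containing
    // commitIndexBuild...
    std::lock_guard<std::timed_mutex> batchLock(pbwm);
    std::unique_lock<std::mutex> lk(cvMutex);
    // ...and then blocks (via the index builder's future) on a commit point
    // advance that never happens. Bounded here so the example finishes.
    commitPointCv.wait_for(lk, 2s, [] { return majorityCommitted; });
    std::cout << "oplog applier: still waiting for majority commit, PBWM still held\n";
}

void heartbeatReconfigThread() {
    std::this_thread::sleep_for(100ms);  // heartbeat reconfig arrives after the batch started
    // Thread 28: storeLocalConfigDocument needs a DBLock on "local", which conflicts
    // with the PBWM held above, so the new config is never persisted.
    if (!pbwm.try_lock_for(1s)) {
        std::cout << "heartbeat reconfig: blocked behind the oplog applier's PBWM lock\n";
    } else {
        pbwm.unlock();
    }
}

int main() {
    std::thread a(oplogApplierThread);
    std::thread b(heartbeatReconfigThread);
    a.join();
    b.join();
}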
|
Since the replica set now has a primary, we'd expect the commit point to advance and propagate to the secondary, unblocking the index build. However, because the secondary is stuck mid-batch applying oplog entries, its lastApplied optime is still from the previous term, which means it cannot advance its commit point via heartbeats (see the sketch below). The node also doesn't choose a new sync source for the rest of the test (the other way the commit point would propagate), although I'm still not sure why; I suspect something is making the primary an ineligible sync source.
|
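For reference, this is roughly the guard that keeps the heartbeat path from advancing the commit point here: a commit point whose term differs from the node's lastApplied term is ignored unless it comes from the sync source. The sketch below is a simplified paraphrase with illustrative names and made-up optime values, not the exact server code.

#include <cstdint>
#include <iostream>

// Illustrative sketch of the guard described above.
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

// A commit point learned from a heartbeat is only applied when its term matches the
// node's lastApplied term; the sync-source path is the other way it can advance.
bool canAdvanceCommitPointFromHeartbeat(const OpTime& candidateCommitPoint, const OpTime& lastApplied) {
    return candidateCommitPoint.term == lastApplied.term;
}

int main() {
    OpTime lastApplied{/*term=*/1, /*timestamp=*/100};           // secondary stuck mid-batch in term 1
    OpTime heartbeatCommitPoint{/*term=*/2, /*timestamp=*/200};  // new primary's commit point in term 2
    std::cout << std::boolalpha
              << canAdvanceCommitPointFromHeartbeat(heartbeatCommitPoint, lastApplied) << "\n";  // false
}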