- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Fully Compatible
Reproduced with jstests/sharding/delete_range_deletion_tasks_on_stepup_after_drop_collection.js.
The test steps the catalog shard down and back up while a chunk migration that is expected to fail is in progress. After step-up, the former primary becomes primary again, ReplicationCoordinatorImpl::signalDrainComplete() is invoked, and it never completes until the test ends.
The side effect is that _makeHelloResponse() keeps answering "i am secondary", so the consumer of the Hello reply drops it.
There is a logical deadlock in the chunk migration logic that resumes on step-up:
1. ReplicationCoordinatorExternalStateImpl::onTransitionToPrimary
2. ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook()
3. ShardingStateRecovery::recover()
4. // Need to fetch the latest uptime from the config server, so do a logging write
ShardingLogging::get(opCtx)->logChangeChecked(..., kMajorityWriteConcern)
At this point the problem is already visible: a majority write concern during step-up, before writes are allowed, can never be satisfied...
5. ShardingLogging::_log()
6. Grid::get(opCtx)->catalogClient()->insertConfigDocument(..., kMajorityWriteConcern)
7. ShardLocal::_runCommand()
In my opinion, the write of the recovery document in step 4 (done to fetch the latest uptime) cannot use majority write concern. A local write is sufficient when the node is already primary.