- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 3.4.2, 3.5.2
- Component/s: Sharding
- None
- ALL
- Sharding 2017-03-27, Sharding 2017-04-17
- 0
The ShardRegistry::reload call spawns a thread to refresh the list of shards from the config server. Because this thread runs with its own OperationContext, it ends up calling ReplicationCoordinatorImpl::waitUntilOpTimeForRead without any timeout.
As a result, shutdown hangs: replication cannot advance the opTime because the server is shutting down, and the reload operation cannot proceed because it is waiting for the opTime to advance.
The underlying cause is that replication is the last entry in the shutdown sequence. In the scenario above, shutdown never reaches that entry, so the replication shutdown step is never invoked and waitUntilOpTimeForRead stays blocked permanently.
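The hang described above can be illustrated with a small standalone sketch. This is not MongoDB code: `OpTimeWaiter`, `interruptAll`, and `shutdownWhileWaiting` are hypothetical names, and the opTime is modeled as a plain integer. It only shows the general idea that a wait on an advancing value must also observe an interrupt signal (or a deadline), otherwise a shutdown that stops the value from advancing leaves the waiter blocked.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state a reader waits on; not a MongoDB API.
class OpTimeWaiter {
public:
    enum class Result { Reached, Interrupted, TimedOut };

    // Blocks until the opTime reaches `target`, the waiter is interrupted,
    // or `timeout` elapses, whichever happens first.
    Result waitUntilOpTime(long long target, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait_for(lk, timeout, [&] { return _interrupted || _opTime >= target; });
        if (_opTime >= target) return Result::Reached;
        if (_interrupted) return Result::Interrupted;
        return Result::TimedOut;
    }

    // Called when the modeled opTime moves forward.
    void advanceTo(long long opTime) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _opTime = opTime;
        }
        _cv.notify_all();
    }

    // Called early in shutdown so that blocked waiters wake up
    // even though the opTime will never advance again.
    void interruptAll() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _interrupted = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    long long _opTime = 0;
    bool _interrupted = false;
};

// Models the scenario in this ticket: a background "reload" thread waits for
// an opTime that will never arrive, and shutdown interrupts it instead of
// waiting for it to finish on its own.
OpTimeWaiter::Result shutdownWhileWaiting() {
    OpTimeWaiter waiter;
    OpTimeWaiter::Result result = OpTimeWaiter::Result::TimedOut;
    std::thread reload([&] {
        result = waiter.waitUntilOpTime(42, std::chrono::seconds(30));
    });
    waiter.interruptAll();  // without this call, join() would block for the full timeout
    reload.join();
    return result;
}
```

In this sketch the waiter returns `Interrupted` as soon as `interruptAll` runs, regardless of whether the interrupt arrives before or after the wait begins. If the predicate checked only the opTime and no timeout were passed, the `join()` would never return, which is the shape of the hang reported here.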
- duplicates
- SERVER-27691 ServiceContext::setKillAllOperations should be replaced with an operation that interrupts running operations (Closed)