- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 8.0.12, 8.2.0
- Component/s: None
- Catalog and Routing
- ALL
- 3
- 🟩 Routing and Topology
When all hosts in a shard are replaced, a node's ShardRegistry might be unable to learn the new connection string, leaving the node unable to communicate with the shard.
This can only happen when the complete replica set reconfig takes place while the node is unable to communicate with any member of the replica set.
Details:
The ShardRegistry keeps track of the shards in the cluster and their connection strings. Since SERVER-91121, the ShardRegistry only refreshes from the config server when it detects that the `topologyTime` has changed. When a replica set is reconfigured, the shard updates the `hosts` attribute of the corresponding `config.shards` entry (see SERVER-21185). This update does not advance the `topologyTime`, so ShardRegistries do not pick up the new hosts through their periodic refreshes. Instead, they typically learn about the reconfig through a different mechanism: the ReplicaSetMonitor notifies the ShardRegistry when it learns about the reconfig from a replica set member it already knew.
However, if the node is unable to communicate with any known member, and all of those members are then replaced, the node won't learn about the new members until it restarts.
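The failure mode described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the server's implementation: `ConfigServer`, `Node`, `periodic_refresh`, `monitor_poll`, and `simulate` are hypothetical names invented for this example, and the host names are made up. It models a `topologyTime`-gated refresh plus a monitor that can only learn new membership from a host it already knows.

```python
from dataclasses import dataclass


@dataclass
class ConfigServer:
    """Toy stand-in for one shard's config.shards entry (hypothetical model)."""
    topology_time: int = 1
    hosts: frozenset = frozenset({"a1", "a2", "a3"})

    def reconfig(self, new_hosts):
        # The shard rewrites `hosts`, but topologyTime is NOT advanced.
        self.hosts = frozenset(new_hosts)


@dataclass
class Node:
    """Toy stand-in for a node's ShardRegistry and ReplicaSetMonitor."""
    known_topology_time: int = 1
    known_hosts: frozenset = frozenset({"a1", "a2", "a3"})

    def periodic_refresh(self, config):
        # Refresh from the config server only when topologyTime has changed.
        if config.topology_time != self.known_topology_time:
            self.known_topology_time = config.topology_time
            self.known_hosts = config.hosts

    def monitor_poll(self, live_hosts, reachable):
        # The monitor learns the current membership only from a host that it
        # already knows, that is still a member, and that it can reach.
        if reachable and self.known_hosts & live_hosts:
            self.known_hosts = frozenset(live_hosts)

    def can_reach_shard(self, live_hosts):
        return bool(self.known_hosts & live_hosts)


def simulate(partitioned_during_reconfig):
    """Replace every shard host in two steps; report whether the node recovers."""
    config = ConfigServer()
    node = Node()
    # Old and new members overlap after the first step, none after the second.
    for hosts in ({"a3", "b1", "b2"}, {"b1", "b2", "b3"}):
        config.reconfig(hosts)
        node.periodic_refresh(config)
        node.monitor_poll(config.hosts,
                          reachable=not partitioned_during_reconfig)
    # The partition heals and the node keeps refreshing and polling.
    node.periodic_refresh(config)
    node.monitor_poll(config.hosts, reachable=True)
    return node.can_reach_shard(config.hosts)
```

In this model, `simulate(False)` returns `True` because the monitor follows the membership change step by step, while `simulate(True)` returns `False`: once connectivity returns, none of the hosts the node knows is still a member, and the unchanged `topologyTime` keeps the periodic refresh from fetching the new `hosts`.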
- is related to
  - SERVER-110329 Add command to force a ShardRegistry refresh (Backlog)
- SERVER-21185 Make shard primary responsible for updating config server's knowledge of shard replica set members (Closed)