Currently the ARS uses the non-causally consistent ShardRegistry::getShardNoReload() function to retrieve the target shard.
This function doesn't guarantee that:
- the ShardRegistry has been properly initialized
- the ShardRegistry contains the information related to the last gossiped topology time
The ARS should use the causally consistent counterpart function ShardRegistry::getShard().
A first attempt to fix this showed that it is not feasible to call this blocking function using the internally stored operation context. As an alternative, we could provide an async version of ShardRegistry::getShard() to be used in the ARS future chain that implements remote request scheduling.
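To illustrate the idea, here is a minimal, self-contained sketch of an async lookup feeding a request-scheduling step. It is a toy model built on the standard library only: ToyShardRegistry, getShardAsync, scheduleRemoteRequest and the host string are hypothetical names invented for this example and are not the server's actual classes or signatures. std::future has no continuations, so a helper thread stands in for the chaining that the ARS future chain would do without blocking.

```cpp
// Toy model only: hypothetical stand-ins, not the MongoDB classes.
#include <cassert>
#include <future>
#include <map>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <string>

class ToyShardRegistry {
public:
    // Non-causally consistent lookup: reads the local cache and never
    // refreshes it, so it can miss a shard the cache has not loaded yet.
    std::optional<std::string> getShardNoReload(const std::string& shardId) const {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _cache.find(shardId);
        if (it == _cache.end()) return std::nullopt;
        return it->second;
    }

    // Hypothetical async lookup: refreshes the cache on another thread and
    // resolves the future with the shard, or with an error if it is unknown.
    std::future<std::string> getShardAsync(const std::string& shardId) {
        return std::async(std::launch::async, [this, shardId] {
            reload();
            auto shard = getShardNoReload(shardId);
            if (!shard) throw std::runtime_error("ShardNotFound: " + shardId);
            return *shard;
        });
    }

private:
    // Stand-in for a refresh from the config server (assumed data).
    void reload() {
        std::lock_guard<std::mutex> lk(_mutex);
        _cache["shard0"] = "shard0/host0:27017";
    }

    mutable std::mutex _mutex;
    std::map<std::string, std::string> _cache;
};

// The caller never blocks on the lookup: target resolution and the
// (simulated) dispatch both run off the calling thread.
std::future<std::string> scheduleRemoteRequest(ToyShardRegistry& registry,
                                               const std::string& shardId) {
    return std::async(std::launch::async, [&registry, shardId] {
        std::string target = registry.getShardAsync(shardId).get();
        return "sent to " + target;
    });
}
```

In this sketch a lookup through getShardNoReload() on a fresh registry finds nothing, while the request scheduled through the async path resolves the target after the refresh. A lookup failure surfaces as an exception from the returned future instead of being thrown on the scheduling thread.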
Related to:
- SERVER-60916 CPS Restores failed with a snapshot with documents in reshardingOperation
- SERVER-61003 ReadConcernMajorityNotAvailableYet errors from ShardRegistry must be retried