[SERVER-48201] ShardRegistry::reload() competes with updates to the ShardRegistry through the RSM
Created: 13/May/20  Updated: 27/Oct/23  Resolved: 09/Oct/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Matthew Saltz (Inactive)
Assignee: Lamont Nelson
Resolution: Gone away
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-48996 Race between isMaster response connec... Closed
is related to SERVER-35252 All config server metadata commands t... Backlog
is related to SERVER-46202 Implement ShardRegistry on top of Rea... Closed
Operating System: ALL
Sprint: Sharding 2020-09-21, Sharding 2020-10-05, Sharding 2020-10-19
Participants:
Linked BF Score: 13

 Description   

After addShard, we call ShardRegistry::reload() on the router so that the next request from the same client can target the new shard without receiving a ShardNotFound error. However, with the streamable replica set monitor (RSM), calls to onPossibleSet can concurrently overwrite the host data in the ShardRegistry, so a subsequent request can still fail with ShardNotFound. Shard add/remove operations do not appear to have ever been guaranteed to be causally consistent with CRUD ops in any meaningful way, but this race breaks tests that relied on the shard being available after addShard.
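
The interleaving described above can be replayed in a toy model. This is only a sketch under stated assumptions, not the server's code: ToyShardRegistry, publishHostUpdate, publishHostUpdateIfCurrent and the host strings are all hypothetical, and the model assumes the monitor-driven update copies a previously read snapshot and publishes it wholesale. The compare-before-publish variant is included for contrast only and is not the change made in the linked tickets.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the registry's table: shard name -> connection string.
using ShardMap = std::map<std::string, std::string>;

class ToyShardRegistry {
public:
    ToyShardRegistry() : _data(std::make_shared<ShardMap>()) {}

    // Models reload(): replace the whole table with the config server's view.
    void reload(const ShardMap& fromConfig) {
        _data = std::make_shared<ShardMap>(fromConfig);
    }

    // Current immutable snapshot of the table.
    std::shared_ptr<const ShardMap> snapshot() const { return _data; }

    // Models a monitor-driven host update: copy `base`, change one entry,
    // and publish the copy unconditionally, even if `base` is stale.
    void publishHostUpdate(const std::shared_ptr<const ShardMap>& base,
                           const std::string& shard, const std::string& hosts) {
        assert(base);
        auto next = std::make_shared<ShardMap>(*base);
        auto it = next->find(shard);
        if (it != next->end()) it->second = hosts;
        _data = std::move(next);
    }

    // Contrast variant: publish only if `base` is still the current snapshot.
    bool publishHostUpdateIfCurrent(const std::shared_ptr<const ShardMap>& base,
                                    const std::string& shard, const std::string& hosts) {
        if (base != _data) return false;  // a reload happened in between
        publishHostUpdate(base, shard, hosts);
        return true;
    }

    bool hasShard(const std::string& shard) const { return _data->count(shard) > 0; }

private:
    std::shared_ptr<const ShardMap> _data;
};

// Replays the interleaving sequentially so the outcome is deterministic:
// the monitor reads a snapshot, addShard + reload() runs, then the monitor
// publishes its update built from the now-stale snapshot.
bool newShardVisibleAfterStaleUpdate(bool checkCurrent) {
    ShardMap before;
    before["shard0"] = "rs0/a:27017";
    ShardMap after = before;
    after["shard1"] = "rs1/c:27017";  // the shard added by addShard

    ToyShardRegistry registry;
    registry.reload(before);
    auto stale = registry.snapshot();  // monitor reads here
    registry.reload(after);            // router reloads after addShard

    if (checkCurrent) {
        registry.publishHostUpdateIfCurrent(stale, "shard0", "rs0/a:27017,b:27017");
    } else {
        registry.publishHostUpdate(stale, "shard0", "rs0/a:27017,b:27017");
    }
    return registry.hasShard("shard1");
}
```

In this model, newShardVisibleAfterStaleUpdate(false) returns false because the stale copy replaces the reloaded table, which corresponds to the ShardNotFound symptom; with the check enabled the stale update is dropped and the new shard stays visible.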



 Comments   
Comment by Lamont Nelson [ 09/Oct/20 ]

This is fixed by SERVER-47359 and SERVER-51257.

Comment by Kaloian Manassiev [ 18/Sep/20 ]

I believe this should no longer be the case (in master) after the changes in SERVER-46202.
CC kevin.pulo.
