- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Sharding
- Catalog and Routing
- ALL
- CAR Team 2024-11-11, CAR Team 2024-11-25
When a shard needs to target itself using the router role, it retrieves the routing information from the CatalogCache of the Grid to send the request. If this operation occurs within a transaction, the shard must first yield its resources, send the request over the network (via the AsyncRequestSender, ARS), and then unyield.
If the remote request, which targets the shard itself, raises a StaleConfig exception, there are two scenarios to consider:
- Code in Master: The shard role in the "remote" node is responsible for comparing the received version with the wanted version. If the wanted version is <0,0>, it triggers a refresh of the filtering metadata, which in turn leads to a refresh of the routing information. This refresh implicitly updates the routing information used by the initial router role, which sent the request to itself over the network. All other scenarios also converge because the shard is the same and shares the same catalog cache: if an operation alters the filtering, it also changes the routing used by the router role. After a retry, this pattern successfully converges.
- Code after SERVER-84243 (dedicate a catalog cache for the shard role and router role): In this case, the shard role in the "remote" node refreshes the filtering without implying a refresh of the routing. The shard role then returns the StaleConfig to the sender. There, the response from the "remote" request is overridden by an unyielding error, which is propagated up to the router role. The router role does not refresh because the exception is not classified as a stale error. Consequently, the initial shard role does not refresh either. After retrying, this pattern fails to converge.
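The two scenarios above can be illustrated with a toy model. This is only a sketch, not server code: `Cache`, `shard_role`, `router_attempt`, and `converges` are hypothetical names, versions are plain integers, and the model assumes that the unyielding error masks the StaleConfig in both scenarios.

```python
from dataclasses import dataclass


class StaleConfig(Exception):
    """Toy stand-in for the server's StaleConfig error."""


class UnyieldError(Exception):
    """Toy stand-in for the error raised while unyielding."""


@dataclass
class Cache:
    version: int  # collection version known to this cache


def shard_role(received: int, filtering: Cache, actual: int) -> str:
    # The "remote" shard role checks the version attached to the request.
    if received != actual:
        filtering.version = actual  # refresh the filtering metadata only
        raise StaleConfig()
    return "ok"


def router_attempt(routing: Cache, filtering: Cache, actual: int) -> str:
    # Yield, send the request to ourselves, then unyield.
    try:
        return shard_role(routing.version, filtering, actual)
    except StaleConfig:
        # Assumption: the unyielding error overrides the remote response.
        raise UnyieldError()


def converges(shared_cache: bool, retries: int = 3) -> bool:
    actual = 2
    routing = Cache(version=1)
    # Master: one cache for both roles. After SERVER-84243: one per role.
    filtering = routing if shared_cache else Cache(version=1)
    for _ in range(retries):
        try:
            router_attempt(routing, filtering, actual)
            return True
        except UnyieldError:
            continue  # not a stale error, so the router role does not refresh
    return False
```

With a shared cache, refreshing the filtering also updates the routing, so the retry succeeds. With dedicated caches, the routing version never changes and every retry hits the same error.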
The goal of this ticket is to modify the code to ensure that the second scenario converges by invalidating the routing information entry without altering the behavior of the unyielding error.
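One possible shape of that change, sketched as a toy model under stated assumptions: `RoutingEntry`, `send_to_self`, and `attempts_until_success` are hypothetical names rather than server APIs, and versions are plain integers. The sender invalidates its routing entry when the remote response was a StaleConfig, then raises the unyielding error as before.

```python
class StaleConfig(Exception):
    """Toy stand-in for the server's StaleConfig error."""


class UnyieldError(Exception):
    """Toy stand-in for the error raised while unyielding."""


class RoutingEntry:
    """Hypothetical routing cache entry for one collection."""

    def __init__(self, version: int) -> None:
        self.version = version
        self.valid = True

    def invalidate(self) -> None:
        self.valid = False

    def get(self, latest: int) -> int:
        # An invalidated entry is reloaded on the next lookup.
        if not self.valid:
            self.version, self.valid = latest, True
        return self.version


def send_to_self(entry: RoutingEntry, shard_version: int) -> str:
    sent = entry.get(latest=shard_version)
    try:
        if sent != shard_version:
            raise StaleConfig()
        return "ok"
    except StaleConfig:
        # Sketch of the change: drop the stale routing entry first...
        entry.invalidate()
        # ...then surface the unyielding error exactly as before.
        raise UnyieldError()


def attempts_until_success(max_attempts: int = 3) -> int:
    entry = RoutingEntry(version=1)
    for attempt in range(1, max_attempts + 1):
        try:
            send_to_self(entry, shard_version=2)
            return attempt
        except UnyieldError:
            continue  # the router role still does not treat this as stale
    return -1
```

In this model the caller still only ever sees the unyielding error, but the retry reloads the invalidated entry and succeeds on the second attempt.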
- is depended on by
  - SERVER-95393 Use a ConfigServerCatalogCacheLoader for the router-role and a ShardServerCatalogCacheLoader for the shard-role (In Code Review)
- is related to
  - SERVER-97256 The router role should be responsible for yielding/unyielding TransactionParticipant resources, rather than the AsyncRequestSender (Needs Scheduling)
- related to
  - SERVER-84243 Dedicate a catalog cache and loader to the shard role (In Progress)