[SERVER-55336] Catalog cache refreshes are not being retried on SnapshotError when going through shard_local
Created: 19/Mar/21  Updated: 29/Oct/23  Resolved: 22/Mar/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.9.0
Fix Version/s: 4.9.0

Type: Bug
Priority: Major - P3
Reporter: Jordi Serra Torrens
Assignee: Jordi Serra Torrens
Resolution: Fixed
Votes: 0
Labels: PM-1965-Milestone-0-Metadata-Format
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-55472 Make DBClientCursor::fromAggregationR... Closed
Related
is related to SERVER-55472 Make DBClientCursor::fromAggregationR... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2021-04-05
Participants:
Linked BF Score: 53

 Description   

CatalogCache retries a refresh that fails with a SnapshotError. However, when the refresh goes through shard_local::runAggregation (i.e. when the config server refreshes its own cache), the SnapshotError is not propagated back to the CatalogCache. Instead, the CatalogCache sees a CommandFailed error and does not retry. This happens because, on this path, the SnapshotError is masked and converted to CommandFailed.

To address this bug, the code that masks the error could instead return the original status:

return getStatusFromCommandResult(ret);
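
To illustrate why the masking matters, here is a minimal, self-contained sketch of the two behaviours. It is not the server code: `Code`, `Status`, `isSnapshotError`, `maskedResult`, `propagatedResult` and `refreshWithRetry` are hypothetical stand-ins, and the retry loop is a simplified assumption about how a caller that retries only on snapshot errors would behave.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins for the server's error types (not the real definitions).
enum class Code { OK, CommandFailed, SnapshotUnavailable };

struct Status {
    Code code;
    std::string reason;
    bool isOK() const { return code == Code::OK; }
};

// Stand-in for a "is this a snapshot error?" category check.
bool isSnapshotError(const Status& s) {
    return s.code == Code::SnapshotUnavailable;
}

// Buggy shape: every failure collapses into CommandFailed, so the
// original error code never reaches the caller.
Status maskedResult(const Status& commandStatus) {
    if (commandStatus.isOK()) {
        return commandStatus;
    }
    return Status{Code::CommandFailed, "aggregate failed: " + commandStatus.reason};
}

// Fixed shape: hand the original status back unchanged.
Status propagatedResult(const Status& commandStatus) {
    return commandStatus;
}

// Simplified refresh loop (an assumption): retry only while the failure is a
// snapshot error. Returns the number of attempts that were made.
int refreshWithRetry(const std::function<Status()>& runAggregation, int maxAttempts) {
    assert(maxAttempts > 0);
    int attempts = 0;
    while (attempts < maxAttempts) {
        ++attempts;
        Status s = runAggregation();
        if (s.isOK() || !isSnapshotError(s)) {
            break;  // success, or an error the caller does not retry on
        }
    }
    return attempts;
}
```

In this sketch, an aggregation that fails once with a snapshot error and then succeeds is retried when its status is passed through `propagatedResult`, but the loop gives up after the first attempt when the status is passed through `maskedResult`, because the caller only ever sees `CommandFailed`.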



 Comments   
Comment by Githook User [ 22/Mar/21 ]

Author: Jordi Serra Torrens <jordi.serra-torrens@mongodb.com> (jordist)

Message: SERVER-55336: Catalog cache refreshes are not being retried on SnapshotError when going through shard_local
Branch: master
https://github.com/mongodb/mongo/commit/59843bd3639e9a580bef436124dfaf2bbdb702f4
