[SERVER-39205] Force the cache refresh after the shard is removed. Created: 25/Jan/19 Updated: 29/Oct/23 Resolved: 19/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Martin Neupauer | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 12 | ||||||||||||||||
| Comments |
| Comment by Githook User [ 19/Feb/19 ] |
|
Author: {'name': 'Martin Neupauer', 'username': 'MartinNeupauer', 'email': 'martin.neupauer@mongodb.com'}Message: |
| Comment by Esha Maharishi (Inactive) [ 13/Feb/19 ] |
|
The theory charlie.swanson and I came up with last Friday for why the test only fails sometimes without theĀ flushRouterConfig is that the $out, which is run in a parallel shell, races with the currentOp. If theĀ $out attempts to establish the cursor on the removed shard before the currentOp forces a refresh of the ShardRegistry, the $out succeeds in targeting the removed shard and gets back a StaleShardVersion from it. The StaleShardVersion causes the router's CatalogCache to be refreshed and the $out to be retried, and the $out retry with the fresh cache succeeds (and the test passes). If the currentOp forces the ShardRegistry refresh first, then the $out gets ShardNotFound on its first attempt to establish the cursor, and the $out fails (and test fails). We should confirm this theory by putting a several-second sleep just before the establishCursors line and verifying that the test always fails without the flushRouterConfig but always passes with the flushRouterConfig. We can then add the flushRouterConfig to the test, preferably with a comment explaining the race induced by the test. |