[SERVER-53444] Make tests that run removeShard in assert.soon to wait for the state to become "completed" not error on ShardNotFound Created: 18/Dec/20 Updated: 29/Oct/23 Resolved: 21/Dec/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0, 4.2.12, 4.4.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Cheahuychou Mao | Assignee: | Cheahuychou Mao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-csrs-stepdown-upkeep | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.4, v4.2 | ||||||||
| Sprint: | Sharding 2020-12-28 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 50 | ||||||||
| Description |
|
If the config server primary steps down right after removing the config.shards document for the shard but before responding with "state": "completed", the mongos retries the _configsvrRemoveShard command against the new config server primary. If the new primary has reloaded its ShardRegistry since the config.shards document was removed, it no longer finds the shard, and the command fails with ShardNotFound. Tests that run removeShard inside assert.soon to wait for "state": "completed" should therefore not treat ShardNotFound as an error. |
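The retry behavior described above can be sketched as a small polling loop. This is a minimal Python illustration under stated assumptions, not the actual test change: the real tests are written for the mongo shell and use assert.soon. The helper name `remove_shard_until_completed`, the `run_remove_shard` callable, and the `max_attempts` parameter are hypothetical. The reply shape (`ok`, `code`, `state`) mirrors a removeShard response, and 70 is assumed to be the numeric value of ShardNotFound.

```python
SHARD_NOT_FOUND = 70  # assumed numeric value of MongoDB's ShardNotFound error code


def remove_shard_until_completed(run_remove_shard, max_attempts=100):
    """Poll removeShard until the shard is gone.

    run_remove_shard is a hypothetical callable that issues the command once
    and returns the reply as a dict, e.g. {"ok": 1, "state": "ongoing"}.
    Returns True once removal is known to be complete, False if attempts run out.
    """
    for _ in range(max_attempts):
        reply = run_remove_shard()
        if not reply.get("ok") and reply.get("code") == SHARD_NOT_FOUND:
            # An earlier attempt removed the shard, but its "completed" reply
            # was lost to a config server stepdown. Treat this as success.
            return True
        if not reply.get("ok"):
            # Any other failure is still a real error.
            raise RuntimeError(f"removeShard failed: {reply}")
        if reply.get("state") == "completed":
            return True
    return False


# Canned replies simulating a stepdown before "completed" was returned.
replies = iter([
    {"ok": 1, "state": "started"},
    {"ok": 1, "state": "ongoing"},
    {"ok": 0, "code": SHARD_NOT_FOUND},
])
completed = remove_shard_until_completed(lambda: next(replies))
print(completed)  # prints True
```

One trade-off of this approach: a ShardNotFound on the very first attempt is also treated as success, so a mistyped shard name would go unnoticed. That may be acceptable in tests that have just added the shard themselves.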
| Comments |
| Comment by Ian Whalen (Inactive) [ 04/Jan/21 ] |
|
Author: Cheahuychou Mao <mao.cheahuychou@gmail.com> (username: evrg-bot-webhook) Message: (cherry picked from commit 03637b5614c1a29983cdac9a1f9ab2d3f7060f15) |
| Comment by Ian Whalen (Inactive) [ 04/Jan/21 ] |
|
Author: Cheahuychou Mao <mao.cheahuychou@gmail.com> (username: evrg-bot-webhook) Message: (cherry picked from commit 03637b5614c1a29983cdac9a1f9ab2d3f7060f15) |
| Comment by Githook User [ 21/Dec/20 ] |
|
Author: Cheahuychou Mao <mao.cheahuychou@gmail.com> (username: cheahuychou) Message: |