[SERVER-61127] Multi-writes may exhaust the number of retry attempts in the presence of ongoing chunk migrations Created: 30/Oct/21 Updated: 29/Oct/23 Resolved: 09/Jun/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 6.1.0-rc0, 6.0.8 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Jordi Serra Torrens |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v6.0 | ||||||||
| Sprint: | Sharding EMEA 2022-01-24, Sharding EMEA 2022-02-07, Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07, Sharding EMEA 2022-03-21, Sharding EMEA 2022-05-02, Sharding EMEA 2022-05-16, Sharding EMEA 2022-05-30, Sharding EMEA 2022-06-13 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 1 | ||||||||
| Description |
|
Multi-writes in a sharded cluster (updates with multi: true and deletes with justOne: false) do not perform version checking because they are broadcast to all shards in the cluster. Such operations attach the special value ChunkVersion::IGNORED to indicate that the operation comes from a router (as opposed to a direct connection to a shard) but that the shard must not perform version checking, under the assumption that the caller knows what they are doing. However, ChunkVersion::IGNORED still triggers a StaleShardVersion exception when the shardVersion is UNKNOWN or when the shard is in a critical section. The former is not a big problem, since it happens only once in the lifetime of a shard's mongod process, but the latter is problematic because it may exhaust the 10 retry attempts that the router allows. This ticket is to come up with a scheme so that StaleShardVersion exceptions from multi-writes are retried at the shard level and do not bubble all the way up to the router. |
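The failure mode described above can be illustrated with a small simulation. This is only a sketch and not the server's code: the `Shard` class, the `busy_checks` knob, and both function names are hypothetical, and the router budget is modelled as 10 total attempts, which is an assumption about how the "10 retry attempts" are counted.

```python
class StaleShardVersion(Exception):
    """Stand-in for the StaleShardVersion error named in the description."""


IGNORED = "IGNORED"      # stand-in for ChunkVersion::IGNORED
MAX_ROUTER_ATTEMPTS = 10  # assumed reading of the router's retry budget


class Shard:
    def __init__(self, busy_checks):
        # Hypothetical knob: how many upcoming checks will find the shard
        # inside a chunk-migration critical section.
        self.busy_checks = busy_checks

    def _in_critical_section(self):
        if self.busy_checks > 0:
            self.busy_checks -= 1
            return True
        return False

    def multi_write(self, version):
        # IGNORED skips the version comparison, but the critical-section
        # check still applies and surfaces as StaleShardVersion.
        if self._in_critical_section():
            raise StaleShardVersion("shard is in a critical section")
        return "ok"

    def multi_write_with_local_retry(self, version):
        # Direction proposed in the ticket: absorb the error on the shard.
        # A real shard would wait for the critical section to end rather
        # than spin; the loop here only models "retry locally".
        while True:
            try:
                return self.multi_write(version)
            except StaleShardVersion:
                continue


def router_multi_write(write):
    # The router gives up once its fixed attempt budget is spent.
    for _ in range(MAX_ROUTER_ATTEMPTS):
        try:
            return write(IGNORED)
        except StaleShardVersion:
            pass
    raise RuntimeError("router retry attempts exhausted")
```

With `Shard(12)`, `router_multi_write(shard.multi_write)` runs out of attempts, while `router_multi_write(shard.multi_write_with_local_retry)` succeeds on the router's first attempt because the error never leaves the shard.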
| Comments |
| Comment by Githook User [ 29/Jun/23 ] |
|
Author: {'name': 'Jordi Serra Torrens', 'email': 'jordi.serra-torrens@mongodb.com', 'username': 'jordist'} Message: (cherry picked from commit 824b9b7e608687ba0db7af2d5ccc5b6811a46720) |
| Comment by Githook User [ 09/Jun/22 ] |
|
Author: {'name': 'Jordi Serra Torrens', 'email': 'jordi.serra-torrens@mongodb.com', 'username': 'jordist'} Message: |
| Comment by Cris Insignares Cuello [ 02/Mar/22 ] |
|
na |