[SERVER-26078] Top-chunk auto-split unnecessarily moves chunks Created: 12/Sep/16 Updated: 06/Dec/22 Resolved: 02/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.0.12, 3.2.9, 3.3.12 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | ShardingRoughEdges | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Sharding |
| Operating System: | ALL |
| Participants: |
| Description |
|
The top-chunk auto-split optimization does not consult the global chunk distribution state and might cause chunks to move unnecessarily. Consider the case of 4 shards with a roughly even assignment of chunks (illustrated in the sketch below).
If inserts are happening to a chunk on shard 2 and they trigger a top-chunk auto-split with a suggested move, a chunk from shard 2 will be moved off to shard 3 or 4, even though this is completely unnecessary given the overall distribution. The problem is slightly exacerbated in 3.4 with the support for parallel migrations: there is a higher chance that the second migration will actually proceed, whereas in prior versions it would most likely have failed due to the restriction of a single migration per collection. |
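A minimal C++ sketch of the flawed decision path described above, assuming a hypothetical even chunk distribution and hypothetical helper names (`suggestMigrationForTopChunk` is not an actual server symbol; the real logic lives in the mongos write path and the balancer policy):

```cpp
#include <limits>
#include <map>
#include <optional>
#include <string>

using ShardId = std::string;

// Hypothetical cluster state for the scenario in the description:
// four shards, each owning the same number of chunks, so no migration
// is warranted anywhere.
const std::map<ShardId, int> kChunkCounts = {
    {"shard1", 100}, {"shard2", 100}, {"shard3", 100}, {"shard4", 100}};

// The flaw: after a top-chunk split on the donor, a recipient is picked
// by a purely local heuristic, without first checking whether the donor
// is actually overloaded relative to the rest of the cluster.
std::optional<ShardId> suggestMigrationForTopChunk(const ShardId& donor) {
    ShardId best;
    int fewest = std::numeric_limits<int>::max();
    for (const auto& [shard, count] : kChunkCounts) {
        if (shard != donor && count < fewest) {
            fewest = count;
            best = shard;
        }
    }
    // Missing check: when kChunkCounts.at(donor) is not meaningfully above
    // `fewest` (here they are all equal), the move is unnecessary, yet a
    // recipient is still suggested and the top chunk gets migrated.
    return best;
}
```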
| Comments |
| Comment by Sheeri Cabral (Inactive) [ 02/Jan/20 ] |
|
This has not been flagged as an issue that causes extreme wasted resources, and it has been an issue since at least 3.4 and earlier. We can re-open if we find out it's a problem. |
| Comment by Kaloian Manassiev [ 13/Aug/18 ] |
|
matthew.saltz, the problem here is not with the auto-splitter, but with the balancer itself. While both implementations of the auto-splitter do invoke the balancer, they end up invoking this code, which will move a chunk regardless of whether it is necessary given the overall distribution. I am reopening the ticket and booting it out of the auto-splitter-on-shards project. It is a fairly simple fix: most likely we need to make Balancer::rebalanceSingleChunk return an error code "RebalanceNotNeeded", or just return OK without actually moving a chunk when the move is not necessary (as in the example I gave in the description). |
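A rough sketch of the proposed guard, under the assumption of a heavily simplified signature and an illustrative imbalance threshold (the real Balancer::rebalanceSingleChunk operates on server types such as ChunkType and consults the chunk selection policy):

```cpp
#include <algorithm>
#include <limits>
#include <map>
#include <string>

using ShardId = std::string;

enum class RebalanceResult { kMoved, kNotNeeded };

// Simplified stand-in for the proposed change to
// Balancer::rebalanceSingleChunk: consult the global distribution first
// and report "not needed" (or simply OK) instead of always moving.
// The threshold and types here are illustrative assumptions.
RebalanceResult rebalanceSingleChunkSketch(
        const ShardId& donor, const std::map<ShardId, int>& chunkCounts) {
    constexpr int kImbalanceThreshold = 2;  // illustrative value

    int fewest = std::numeric_limits<int>::max();
    for (const auto& [shard, count] : chunkCounts)
        fewest = std::min(fewest, count);

    // Proposed "RebalanceNotNeeded" path: the donor is not meaningfully
    // more loaded than the emptiest shard, so do not move anything.
    if (chunkCounts.at(donor) - fewest < kImbalanceThreshold)
        return RebalanceResult::kNotNeeded;

    // ...otherwise fall through to the existing migration logic...
    return RebalanceResult::kMoved;
}
```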
| Comment by Matthew Saltz (Inactive) [ 08/Aug/18 ] |
|
It appears that we now already use the balancer to decide which shard to move to, rather than picking a specific shard: https://github.com/mongodb/mongo/blob/master/src/mongo/s/write_ops/cluster_write.cpp#L402 We will continue to do so once we move auto-splitting to the shard. |
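A compact sketch of the call-site shape this comment describes; all names here are illustrative assumptions, not the actual symbols in cluster_write.cpp:

```cpp
#include <iostream>
#include <string>

// Stand-in for the balancer: the write path delegates the destination
// decision to it rather than picking a shard itself.
struct BalancerStub {
    void rebalanceSingleChunk(const std::string& chunkId) {
        // In the server this consults the chunk selection policy and may
        // issue a moveChunk; here it only records the delegation.
        std::cout << "balancer decides destination for " << chunkId << "\n";
    }
};

void onTopChunkAutoSplit(BalancerStub& balancer, const std::string& topChunkId) {
    // Old shape (pre-change): the split path chose a specific shard, e.g.
    //   moveChunk(topChunkId, pickLeastLoadedShard());
    // Current shape: delegate both the decision and the move.
    balancer.rebalanceSingleChunk(topChunkId);
}
```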