[SERVER-10024] cluster can end up with large chunks that did not get split and will time out on migration Created: 25/Jun/13 Updated: 14/Apr/23 Resolved: 14/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Antoine Girbal | Assignee: | Pierlauro Sciarelli |
| Resolution: | Done | Votes: | 1 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Sprint: | Sharding EMEA 2023-04-17 |
| Participants: | |
| Description |
|
Consider the case where a cluster ends up with large chunks that did not get split: migrations of those chunks abort on timeout, the balancer keeps retrying the same chunks, and splits keep failing because the namespace is locked, so the chunks never shrink.
I think we need several server improvements:

A. Any chunk migration that aborts due to timeout should result in a split. If anything, the split won't hurt. Right now the split seems to happen only in one specific case.
B. Ideally the migration process would avoid retrying the same chunk over and over; this may need some randomization over the candidate chunks.
C. When mongos fails to split because the namespace is locked, it should mark the metadata as "needs split" for later. Ideally all "needs split" chunks should be cleared before the next migration is attempted.

This is all to avoid the bad catch-22 where large chunks end up clogging the whole system. |
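As context for the catch-22 described above, the following is a rough mongosh sketch (not part of the ticket's proposal) of how an operator can locate chunks the balancer has flagged as jumbo and split one by hand. It assumes the older metadata format in which `config.chunks` documents carry `ns` and `jumbo` fields (newer versions key chunks by collection UUID); the namespace `test.coll` and the split key are placeholders.

```javascript
// Rough operational sketch: list chunks flagged as jumbo, then split one manually.
const configDB = db.getSiblingDB("config");

// Chunks the balancer marked as too large to migrate.
configDB.chunks.find({ jumbo: true }).forEach(function (c) {
  print("ns: " + c.ns + "  shard: " + c.shard +
        "  range: " + tojson(c.min) + " -> " + tojson(c.max));
});

// Split the chunk containing this document at its median point
// so it becomes small enough to migrate again.
sh.splitFind("test.coll", { userId: 12345 });
```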
| Comments |
| Comment by Pierlauro Sciarelli [ 14/Apr/23 ] |
|
Closing this ticket as gone away because:
As a side note, the auto-splitter went away starting in v6.0, so the pre-splitting solution proposed by Nic is no longer viable. Quoting the 6.0 release notes:
|
| Comment by Nic Cottrell [ 27/Jan/20 ] |
|
Regarding problems with slow migration during special insert workloads, one good solution is to manually split chunks prior to starting the workload and allow the balancer to re-balance the new empty chunks. When you start the import workload, the inserts should then be distributed across all available shards rather than all being sent to a single "hot" shard and migrated in a subsequent step, as sketched below. |
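A rough illustration of this pre-splitting approach, using the standard `sh.*` shell helpers; the namespace `imports.events`, the shard key `userId`, and the split points are illustrative only.

```javascript
// Pre-split an empty sharded collection before a bulk import so inserts are
// spread across shards from the start.
sh.enableSharding("imports");
sh.shardCollection("imports.events", { userId: 1 });

// Create empty chunks up front; the balancer can move them while they are cheap.
for (let point = 100000; point <= 900000; point += 100000) {
  sh.splitAt("imports.events", { userId: point });
}

// Once the balancer has spread the empty chunks, start the import workload.
sh.startBalancer();
```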
| Comment by Kevin J. Rice [ 06/Aug/13 ] |
|
I'm seeing this behaviour when doing a mongorestore of a large database. I end up with a bunch of unbalanced shards (we have 48 shards) that are not splitting because mongorestore is eating all the I/O, so the balancer isn't running and the splitter sometimes fails. I end up having to stop the mongorestore, restart the daemons/mongos processes, and cycle the balancer (stopBalancer()/startBalancer(), or setBalancerState(false), wait two minutes, setBalancerState(true)) until the balancer decides to start working, wait for it to balance, and then start the mongorestore again with a properly split and balanced set of data. |
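For reference, the balancer toggling described above maps onto the standard shell helpers; the pause between stopping and restarting is just the commenter's rule of thumb, not a documented requirement.

```javascript
// Disable the balancer (same effect as sh.setBalancerState(false)) and confirm its state.
sh.stopBalancer();
print("balancer enabled: " + sh.getBalancerState());

// ... wait a couple of minutes for any in-flight migration to finish ...

// Re-enable the balancer (same effect as sh.setBalancerState(true)) and check for activity.
sh.startBalancer();
print("migration in progress: " + sh.isBalancerRunning());
```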