The data-size-aware balancing introduced in v6.0.3 and later can schedule migrations between two shards that are both above the optimal data size threshold.
Chunk-based balancing avoided this by only scheduling migrations to shards with a sufficiently low number of chunks.
Example
There are 400GB to spread across 4 shards (ideally 100GB per shard) currently distributed in the following way:
- Shard0: 130GB
- Shard1: 120GB
- Shard2: 110GB
- Shard3: 40GB
Next balancing round (currently):
- Shard0 donates to shard3
- Shard1 donates to shard2 (this is the bug: the migration is unnecessary because both shards are already above the 100GB target, so each needs to shed data rather than receive it)
Expected next balancing round after fixing this:
- Shard0 donates to shard3
(Problem/solution very similar to SERVER-75481)
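The example above can be reproduced with a small model of one balancing round. This is a minimal sketch, not the server's balancer code: the function `plan_round`, the flag `require_underloaded_recipient`, and the greedy donor/recipient pairing are hypothetical and only illustrate the difference between accepting any less-loaded recipient and accepting only recipients below the ideal size.

```python
def plan_round(sizes_gb, require_underloaded_recipient):
    """Plan one balancing round; each shard takes part in at most one migration.

    Hypothetical simplification: the ideal size is the plain average, and
    donors are paired greedily with the least-loaded available recipient.
    """
    ideal = sum(sizes_gb.values()) / len(sizes_gb)
    available = set(sizes_gb)
    migrations = []

    # Consider donors from most loaded to least loaded.
    for donor in sorted(sizes_gb, key=sizes_gb.get, reverse=True):
        if donor not in available or sizes_gb[donor] <= ideal:
            continue

        # Any available shard holding less data than the donor is a candidate.
        candidates = [
            s for s in available
            if s != donor and sizes_gb[s] < sizes_gb[donor]
        ]
        if require_underloaded_recipient:
            # Proposed behavior: never donate to a shard that is itself
            # at or above the ideal size.
            candidates = [s for s in candidates if sizes_gb[s] < ideal]
        if not candidates:
            continue

        recipient = min(candidates, key=sizes_gb.get)
        migrations.append((donor, recipient))
        available -= {donor, recipient}

    return migrations


sizes = {"shard0": 130, "shard1": 120, "shard2": 110, "shard3": 40}
current = plan_round(sizes, require_underloaded_recipient=False)
expected = plan_round(sizes, require_underloaded_recipient=True)
print(current)
print(expected)
```

Under these assumptions, the first call also schedules the shard1 to shard2 migration, while the second call schedules only the shard0 to shard3 migration.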
Caused by: SERVER-65816 Change balancer policy to balance on data size rather than number of chunks (Closed)