Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 6.0.3, 7.0.0, 8.0.0, 8.2.0
Component/s: None
Catalog and Routing
ALL
CAR Team 2026-01-05
Summary
The balancer makes no progress when the most loaded shard belongs to a zone that is already balanced. It keeps selecting that shard as the donor, even though all shards in its zone are balanced, and then fails to find a suitable recipient because the remaining underloaded shards belong to different zones.
Details
When the cluster has zones configured and the most overloaded shard (by data size) is in a zone that is already internally balanced, the balancer repeatedly tries to move chunks from that shard.
However, since the other shards in the same zone are already balanced, there are no valid chunk candidates that can be donated while still honoring the existing zone configuration. As a result:
- The balancer keeps choosing the most loaded shard as the donor
- No migrations are actually performed, so the overall balancing does not make progress
- Zones themselves are respected at all times; the issue is with donor selection and progress when the top candidate shard cannot actually donate any chunks
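The behavior described above can be illustrated with a small toy model. This is a minimal sketch and not the server's balancer code: the `Shard` class, the shard and zone names, the sizes, the `THRESHOLD_GB` value, and the rule that a chunk may only move to a shard in the donor's zone are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    name: str
    zone: str
    size_gb: int


# Hypothetical imbalance threshold; the real balancer's criteria differ.
THRESHOLD_GB = 2


def find_recipient(donor, shards):
    """Return the least loaded same-zone shard that is at least
    THRESHOLD_GB lighter than the donor, or None if there is none."""
    candidates = [
        s for s in shards
        if s is not donor
        and s.zone == donor.zone
        and donor.size_gb - s.size_gb >= THRESHOLD_GB
    ]
    return min(candidates, key=lambda s: s.size_gb, default=None)


def naive_round(shards):
    """Toy model of the reported behavior: only the globally most
    loaded shard is ever tried as the donor."""
    donor = max(shards, key=lambda s: s.size_gb)
    recipient = find_recipient(donor, shards)
    return (donor.name, recipient.name) if recipient else None


# Hypothetical cluster: zoneX is balanced, zoneY is not.
shards = [
    Shard("shardA", "zoneX", 100),
    Shard("shardB", "zoneX", 99),
    Shard("shardC", "zoneY", 60),
    Shard("shardD", "zoneY", 10),
]

# shardA is always picked as donor, no recipient qualifies, and the
# imbalance in zoneY is never examined.
print(naive_round(shards))
```

In this model every round returns no migration, because the state of the cluster never changes between rounds and the same donor is selected each time.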
Impact
Balancer rounds can appear to be “stuck” or not making progress, even though the system is correctly enforcing the configured zones.
This mainly affects situations where:
- One shard is globally the most loaded shard
- That shard is in a zone that is already locally balanced
- Other zones may remain unbalanced
Expected Behavior
If the most loaded shard cannot donate any further chunks without violating zone constraints (for example, because its zone is already balanced), the balancer should:
- Skip it as a donor candidate for that round, and
- Consider other shards/zones where valid migrations would still respect the zone configuration and effectively reduce imbalance.
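One way to picture the expected behavior is a donor loop that falls through to the next candidate instead of giving up. This is a hedged sketch under simplifying assumptions, not a proposed patch: `pick_migration`, the shard and zone names, the sizes, and the threshold are hypothetical, and the model assumes a chunk can only move to a shard in the donor's zone.

```python
def pick_migration(sizes, zone_of, threshold):
    """Try donors from most to least loaded and return the first
    (donor, recipient) pair that respects zones, or None.

    sizes:     dict of shard name -> data size
    zone_of:   dict of shard name -> zone name
    threshold: minimum size difference worth migrating for
    """
    for donor in sorted(sizes, key=sizes.get, reverse=True):
        # Only shards in the donor's zone are valid recipients here.
        lighter = [
            s for s in sizes
            if s != donor
            and zone_of[s] == zone_of[donor]
            and sizes[donor] - sizes[s] >= threshold
        ]
        if lighter:
            return donor, min(lighter, key=sizes.get)
        # No valid recipient: skip this donor for the round.
    return None


# Hypothetical cluster: zoneX is balanced, zoneY is not.
sizes = {"shardA": 100, "shardB": 99, "shardC": 60, "shardD": 10}
zone_of = {"shardA": "zoneX", "shardB": "zoneX",
           "shardC": "zoneY", "shardD": "zoneY"}

print(pick_migration(sizes, zone_of, threshold=2))
```

Here shardA and shardB are skipped because neither has a sufficiently lighter peer in zoneX, and the round settles on moving data from shardC to shardD, which reduces imbalance without crossing zone boundaries.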