[SERVER-16232] Active and Aggressive balancing of small databases Created: 19/Nov/14  Updated: 10/Dec/14  Resolved: 20/Nov/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: John Page Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Participants:

 Description   

If you start with a sharded cluster, do not pre-split manually, and then pour in data as fast as you can, the balancer struggles to keep up and you end up pouring all the data into one shard.

Whilst chunk migration is normally deliberately slow, we should change the way chunk splitting happens early on so that all shards come into play at the first possible opportunity, possibly by splitting and migrating 'tiny' chunks (at 256K rather than 64MB) and growing the chunk size later, once all shards are in play.
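To make the proposal concrete, here is a minimal sketch of such a policy. It is not MongoDB code: the function name `split_threshold` and its parameters are hypothetical, and the only values taken from this ticket are the 256K and 64MB sizes.

```python
KIB = 1024
MIB = 1024 * KIB


def split_threshold(shards_with_chunks, total_shards,
                    tiny=256 * KIB, full=64 * MIB):
    """Return the chunk size in bytes at which a chunk should be split.

    Hypothetical policy: keep splitting at the 'tiny' size until every
    shard owns at least one chunk, then fall back to the full chunk size.
    """
    if shards_with_chunks < total_shards:
        return tiny
    return full
```

With four shards and only one of them holding data, `split_threshold(1, 4)` returns the tiny size; once all four hold chunks, `split_threshold(4, 4)` returns the full size.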

We should also optionally prioritise chunk migration over insertions where there is a very large imbalance in chunk numbers, for example many empty shards.
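One way the "very large imbalance" test could look is sketched below. The function name `migration_has_priority` and the `max_spread` cutoff are assumptions made for illustration only; the ticket does not define what counts as a large imbalance.

```python
def migration_has_priority(chunks_per_shard, max_spread=8):
    """Decide whether migrations should be favoured over inserts.

    chunks_per_shard: list of chunk counts, one entry per shard.
    max_spread: hypothetical cutoff for the gap between the fullest
    and the emptiest shard.
    """
    if not chunks_per_shard:
        return False
    busiest = max(chunks_per_shard)
    emptiest = min(chunks_per_shard)
    # An empty shard next to a shard that has chunks to spare.
    if emptiest == 0 and busiest > 1:
        return True
    return busiest - emptiest > max_spread
```

For example, a cluster with chunk counts `[12, 0, 0]` would favour migration under this rule, while `[5, 4, 5]` would not.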



 Comments   
Comment by John Page [ 20/Nov/14 ]

The behaviour I see is that whatever we are currently doing doesn't work: if you hit an empty system hard (even with random shard key values), balancing is not fast enough to keep the system balanced, or even close.

If we do the split-ahead part that Asya says we do, why do I not see writes moving to another shard? It's not just about splitting, it's about splitting AND relocating that empty chunk. If this worked as I described, a monotonic shard key would write to all shards round-robin.
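The round-robin claim can be illustrated with a toy model. This is a sketch under strong assumptions, not a description of what the server does: the top chunk is split after a fixed number of documents, and the new empty top chunk is relocated to the next shard instantly. The function name `simulate_monotonic_inserts` and its parameters are hypothetical.

```python
from collections import Counter


def simulate_monotonic_inserts(num_docs, num_shards, docs_per_chunk):
    """Count writes per shard for a monotonically increasing shard key.

    Assumption: every insert lands in the top chunk; once that chunk
    holds docs_per_chunk documents it is split and the new, empty top
    chunk is moved to the next shard with no delay.
    """
    writes = Counter()
    top_shard = 0      # shard that currently owns the top chunk
    docs_in_top = 0    # documents written to the current top chunk
    for _ in range(num_docs):
        writes[top_shard] += 1
        docs_in_top += 1
        if docs_in_top == docs_per_chunk:
            # Split, then relocate the empty top chunk round-robin.
            top_shard = (top_shard + 1) % num_shards
            docs_in_top = 0
    return [writes[s] for s in range(num_shards)]
```

Under these assumptions, 1200 inserts over 3 shards with 100 documents per chunk give 400 writes to each shard. Without the relocation step, all 1200 writes would stay on the first shard.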

Comment by Greg Studer [ 20/Nov/14 ]

> normally we should change the way chunk splitting happens early on to get all shards into play at the first possible opportunity, possibly by splitting and migrating 'tiny' chunks - at 256K rather than 64MB

This is also the current autosplit behavior, though there have been bugs, in v2.7 in particular. When fewer than 20 chunks exist for a particular collection, our autosplit threshold is much smaller than the actual chunk size.

Generated at Thu Feb 08 03:40:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.