[SERVER-20140] shard balancer fails to split chunks with more than 250000 docs Created: 26/Aug/15 Updated: 28/Aug/15 Resolved: 26/Aug/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.0.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Derek Wilson | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | 1) create collection sharded on {_id: 1} 2) turn off balancer 3) insert ~100M small docs like {_id: "text", "c": 12345} (this is a collection of counts of strings if you care to know the real-world use case) 4) turn on balancer and wait until things stop moving 5) turn balancer off 6) manually find all jumbo chunks and run sh.splitFind() on them 7) go back to 4 forever (or at least it feels like it) |
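The reproduction steps above can be sketched in the mongo shell. The `test.counts` namespace and the `makeCountDoc` helper are illustrative, and everything in the comments assumes a live sharded cluster reached through a mongos:

```javascript
// Hypothetical doc builder matching the {_id: "text", "c": 12345} shape
// from the steps above (short string key, small integer count).
function makeCountDoc(word, count) {
  return { _id: word, c: count };
}

// In the mongo shell (requires a running sharded cluster):
// sh.shardCollection("test.counts", { _id: 1 });   // step 1
// sh.stopBalancer();                               // step 2
// var bulk = db.counts.initializeUnorderedBulkOp();
// for (var i = 0; i < 100000000; i++) {            // step 3: ~100M small docs
//   bulk.insert(makeCountDoc("w" + i, i));
// }
// bulk.execute();
// sh.startBalancer();                              // step 4
```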
| Participants: | |||||||||
| Description |
|
is it possible that a regression re-surfaced any of these issues?
i'm seeing very very similar behavior: mongodb 3.0.4, sharded collections with 64MB chunks, using wiredtiger. one collection with documents that average 2kB in size, and that collection is evenly distributed. running the balancer does not seem to automatically split chunks - it just marks them as jumbo. i can run pass after pass of sh.splitFind on each chunk until there are no jumbo chunks left, and then more things get balanced. except then when i run the balancer again i get more chunks marked as jumbo and i need to do splits again. basically, to get the cluster evenly distributed after an initial load i have to alternate splitting and balancing for days. |
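The manual split pass described above can be sketched as follows. The `jumboChunks` helper is hypothetical, and the commented shell usage assumes a mongos connection with an illustrative `test.counts` namespace:

```javascript
// Hypothetical helper: given an array of chunk-metadata documents shaped
// like those in the config.chunks collection, return the ones flagged jumbo.
function jumboChunks(chunks) {
  return chunks.filter(function (c) { return c.jumbo === true; });
}

// In the mongo shell, one manual split pass might look like:
// var chunks = db.getSiblingDB("config").chunks
//                .find({ ns: "test.counts" }).toArray();
// jumboChunks(chunks).forEach(function (c) {
//   sh.splitFind("test.counts", c.min);  // split at a point inside the chunk
// });
```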
| Comments |
| Comment by Derek Wilson [ 28/Aug/15 ] |
|
Thanks... but that workaround doesn't work for my use case |
| Comment by Ramon Fernandez Marina [ 28/Aug/15 ] |
|
underrun, this is to let you know that we've posted a workaround for this issue in the "Description" section of the linked issue. Regards, |
| Comment by Ramon Fernandez Marina [ 26/Aug/15 ] |
|
Thanks for your report underrun. This bug was reported earlier in the linked issue, so this ticket is being resolved as a duplicate. Thanks, |
| Comment by Derek Wilson [ 26/Aug/15 ] |
|
right - i forgot to mention that because i have collections with very large documents and collections with very small documents, making the chunk size small enough to avoid this issue would leave way too few documents per chunk in my larger-document collections. for instance, i would need to make the chunk size about 9.5MB to keep a collection with an average doc size of 40B under 250000 docs per chunk. but then chunks of a collection with a 4kB average doc size will only hold about 2.4k docs each, and with hundreds of millions of documents that means dozens (to hundreds) of thousands of chunks to manage, which could start to cause problems on the other end of the spectrum. |
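The arithmetic in the comment above can be checked directly; this is a sketch where 250000 is the per-chunk document-count limit from the issue title, and the 40B / 4kB averages are the figures quoted in the comment:

```javascript
// Back-of-envelope check of the chunk-size trade-off described above.
const maxDocsPerChunk = 250000;   // split limit from the issue title
const smallDocBytes = 40;         // avg doc size in the small-doc collection
const largeDocBytes = 4 * 1024;   // avg doc size in the large-doc collection

// Chunk size needed to keep the small-doc collection under the limit:
const chunkBytes = maxDocsPerChunk * smallDocBytes;  // 10,000,000 bytes
const chunkMB = chunkBytes / (1024 * 1024);          // ~9.5 MB

// Docs per chunk the 4kB collection would get at that chunk size:
const largeDocsPerChunk = Math.floor(chunkBytes / largeDocBytes);  // ~2.4k
```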