[SERVER-20140] shard balancer fails to split chunks with more than 250000 docs Created: 26/Aug/15  Updated: 28/Aug/15  Resolved: 26/Aug/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Derek Wilson Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-19919 Chunks that exceed 250000 docs but ar... Closed
Operating System: ALL
Steps To Reproduce:

1) create a collection sharded on {_id: 1}

2) turn off the balancer

3) insert ~100M small docs like {_id: "text", "c": 12345} (this is a collection of counts of strings, if you care to know the real-world use case)

4) turn the balancer on and wait until things stop moving

5) turn the balancer off

6) manually find all jumbo chunks and run sh.splitFind() on them

7) go back to step 4 forever (or at least it feels like it)
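The manual split pass in steps 5 and 6 can be sketched in the mongo shell roughly as below; the namespace "mydb.counts" is an assumed placeholder, and the jumbo flag lookup against config.chunks reflects how the balancer marks these chunks:

```javascript
// sketch only: assumes a sharded collection named "mydb.counts"
sh.stopBalancer();

// chunks the balancer could not move are flagged in config.chunks
db.getSiblingDB("config").chunks
  .find({ ns: "mydb.counts", jumbo: true })
  .forEach(function (chunk) {
    // split the chunk containing its lower bound at the median point
    sh.splitFind("mydb.counts", chunk.min);
  });

sh.startBalancer();
```

This needs a running sharded cluster, and each pass may surface new jumbo chunks, hence the loop back to the balancing step.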

Participants:

 Description   

is it possible a regression has re-surfaced any of these issues?

SERVER-9365 SERVER-9498 SERVER-9690 SERVER-9792 SERVER-10271

i'm seeing very very similar behavior:

mongodb 3.0.4 sharded collections with 64MB chunks using wiredtiger

one collection with documents that average 2kB in size
one collection with documents that average 40 bytes in size

the collection with 2kB docs is evenly distributed
the collection with 40B docs is nearly entirely jumbo chunks

running the balancer does not seem to automatically split chunks - it just marks them as jumbo.

i can run pass after pass of sh.splitFind() on each chunk until there are no jumbo chunks left, and then more things get balanced.

except then when i run the balancer again, more chunks get marked as jumbo and i need to do splits again.

basically to get the cluster evenly distributed after an initial load i have to alternate splitting and balancing for days.
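Some rough arithmetic (using the figures quoted above, with the 250000-doc movable-chunk limit from the linked ticket) shows why the small-doc collection hits this while the 2kB collection does not: with the default 64MB chunk size, the doc-count limit binds long before the size limit does for 40-byte documents.

```python
# Why 64 MB chunks of tiny docs get marked jumbo: the balancer
# refuses to move chunks holding more than 250000 documents.
CHUNK_SIZE = 64 * 1024 * 1024   # default 64 MB chunk size
DOC_LIMIT = 250_000             # max docs in a movable chunk

def docs_per_full_chunk(avg_doc_size):
    """Approximate doc count in a size-full chunk."""
    return CHUNK_SIZE // avg_doc_size

print(docs_per_full_chunk(2 * 1024))  # 2 kB docs -> 32768, under the limit
print(docs_per_full_chunk(40))        # 40 B docs -> 1677721, far over it
```

So every size-full chunk of the 40-byte collection is unmovable until it has been split many times over.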



 Comments   
Comment by Derek Wilson [ 28/Aug/15 ]

Thanks... but that workaround doesn't work for my use case

Comment by Ramon Fernandez Marina [ 28/Aug/15 ]

underrun, this is to let you know that we've posted a workaround for this issue in the "Description" section of SERVER-19919.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 26/Aug/15 ]

Thanks for your report, underrun. This bug was reported earlier in SERVER-19919, so I'm going to mark this ticket as a duplicate. Please watch SERVER-19919 for updates; we're investigating a workaround until this issue is fixed.

Thanks,
Ramón.

Comment by Derek Wilson [ 26/Aug/15 ]

right - i forgot to mention that because i have collections with very large documents and collections with very small documents, making the chunk size small enough to avoid this issue would leave me with way too few documents per chunk in my larger-document collections. for instance, i would need to make the chunk size about 9.5MB to keep a collection with an average doc size of 40B under 250000 docs. but then chunks of collections with a 4kB average doc size would only hold about 2.4k docs each, and with hundreds of millions of documents that means dozens (to hundreds) of thousands of chunks to manage, which could start to cause problems on the other end of the spectrum.
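The trade-off in that comment checks out numerically; the 40B and 4kB figures below are the ones quoted above:

```python
# Checking the chunk-size trade-off from the comment above.
DOC_LIMIT = 250_000

small_doc = 40          # bytes, the string-counts collection
large_doc = 4 * 1024    # bytes, the large-document collection

# chunk size needed to keep a 40 B-doc chunk under the 250000-doc limit
max_chunk_bytes = DOC_LIMIT * small_doc
print(max_chunk_bytes / 2**20)        # ~9.5 MB

# docs per chunk in the large-doc collection at that chunk size
print(max_chunk_bytes // large_doc)   # ~2.4k docs per chunk
```

At hundreds of millions of 4kB documents, ~2.4k docs per chunk works out to well over a hundred thousand chunks, which is the other end of the spectrum the reporter mentions.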
