[SERVER-44088] Autosplitter seems to ignore some fat chunks Created: 17/Oct/19  Updated: 20/Nov/19  Resolved: 20/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.10
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sergey Zagursky Assignee: Eric Sedor
Resolution: Incomplete Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-13806 Need better detection and reporting o... Closed
is related to SERVER-9287 Decision to split chunk should happen... Closed
is related to SERVER-10024 cluster can end up with large chunks ... Closed
Operating System: ALL
Participants:

 Description   

We have a somewhat large and hot MongoDB cluster; each shard is ~400 GiB. The autosplitter is configured to split chunks when they exceed 64 MB (the default value). However, I see a lot of much larger chunks, some as large as 400 MB. These chunks are in a hot collection that constantly receives inserts and updates. The distribution of hot and cold documents across chunks is even (we use hashed sharding on _id with auto-generated ObjectIds).

As far as I can see, the autosplitter is throttled by acquiring a token from a pool of 5 tokens. I suspect that on a large shard with unevenly hot collections, autosplitter activity is unfairly skewed toward the hotter collections, and the colder collections may never get the autosplitter's attention at all.

We hit this issue hard when we added new shards to our cluster and the balancer started moving chunks. When the balancer spots a chunk that is too big, it splits the chunk and moves the smaller chunk to a new shard. As a result, even though chunks are distributed evenly across the cluster, the older shards hold 3x as many documents as the newer shards.
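(For context on how an individual chunk's size can be checked: the dataSize command accepts a chunk's bounds. The namespace and bounds in the sketch below are placeholders, not values from our cluster.)

```javascript
// One way to measure a single chunk's size ("mydb.mycoll" and the hashed _id
// bounds are placeholders; the real bounds come from the chunk's config.chunks entry).
db.getSiblingDB("mydb").runCommand({
  dataSize: "mydb.mycoll",
  keyPattern: { _id: "hashed" },
  min: { _id: NumberLong("-4611686018427387902") },  // chunk.min (placeholder)
  max: { _id: NumberLong("-2305843009213693951") },  // chunk.max (placeholder)
  estimate: true   // faster; uses the average object size
});
// Returns { size: <bytes>, numObjects: <count>, ... } for that range.
```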



 Comments   
Comment by Eric Sedor [ 20/Nov/19 ]

That makes sense, sz.

There are a number of known reasons why splits may not occur. We're going to close this ticket, as unfortunately we would need more details to investigate this specific case. But we can re-open the ticket if you see a large chunk and are able to provide specifics about it.

Alternatively, please let us know if the need for your tool goes away in version 4.2. We would be particularly interested to hear if the issue persists after the changes to chunk splitting in 4.2.

Gratefully,
Eric

Comment by Sergey Zagursky [ 20/Nov/19 ]

@Kelly Lewis, unfortunately I can't provide the requested information now. We urgently needed to resolve the situation, so I split the chunks using the tool we wrote. The tool uses splitVector to choose split points, so I would not consider splitVector a suspect in my case.

Comment by Kelly Lewis [ 19/Nov/19 ]

Hi sz, can you please provide the information Eric requested for the specific chunk?

Comment by Eric Sedor [ 05/Nov/19 ]

Understood sz. To investigate a specific bug we think it makes sense to focus on whether or not splitVector is accurately selecting a good split point for a chunk being split.

For a specific chunk can you provide:

  • The entry from config.chunks for that chunk prior to a split
  • The entry from config.chunks for the resulting split chunks
  • The results of a count operation using $min and $max to target the range of documents in the resulting split chunks (https://docs.mongodb.com/manual/reference/operator/meta/max/index.html#use-with-min). Note that $min and $max allow you to query on the hashed ranges provided in sh.status() or in the chunk entry.
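For the last item above, a minimal sketch of such a count, assuming a hashed _id shard key; the namespace and bounds are placeholders to be replaced with the values from the chunk's config.chunks entry:

```javascript
// Count of documents in one of the resulting split chunks, targeted by the
// hashed _id range from config.chunks ("mydb.mycoll" and the bounds are placeholders).
db.getSiblingDB("mydb").mycoll
  .find()
  .min({ _id: NumberLong("-4611686018427387902") })  // chunk.min
  .max({ _id: NumberLong("-2305843009213693951") })  // chunk.max
  .hint({ _id: "hashed" })   // interpret the bounds against the hashed index
  .count();
```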
Comment by Sergey Zagursky [ 30/Oct/19 ]

Eric Sedor, I have identified many such individual chunks. While we do have unexpectedly large chunks in some collections due to a bad shard key choice, there are also many chunks that have grown beyond any reasonable size because the autosplitter is still too passive.

I made a tool that iterated over all the chunks and split those that were too large. It takes a chunk and applies the `splitVector` command to it. Then, if `splitVector` returns one or more split points, it uses the `split` command to split the chunk. As far as I can see, this is exactly how the autosplitter works. The tool helped the balancer equalize the document count across our shards.
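For illustration, a minimal sketch of that per-chunk step (not the actual tool): the namespace and bounds are placeholders, `splitVector` is sent to the primary of the shard that owns the chunk, and `split` goes through mongos.

```javascript
// Sketch for one oversized chunk; "mydb.mycoll" and the bounds are placeholders
// taken from the chunk's config.chunks entry.
var ns = "mydb.mycoll";
var chunkMin = { _id: NumberLong("-4611686018427387902") };
var chunkMax = { _id: NumberLong("-2305843009213693951") };

// 1) On the primary of the shard that owns the chunk: ask for split points.
var res = db.getSiblingDB("admin").runCommand({
  splitVector: ns,
  keyPattern: { _id: "hashed" },
  min: chunkMin,
  max: chunkMax,
  maxChunkSizeBytes: 64 * 1024 * 1024   // the configured 64 MB chunk size
});

// 2) Through mongos: split the chunk at each returned split point.
if (res.ok && res.splitKeys) {
  res.splitKeys.forEach(function (key) {
    db.getSiblingDB("admin").runCommand({ split: ns, middle: key });
  });
}
```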

Comment by Eric Sedor [ 29/Oct/19 ]

sz,

As dmitry.agranat mentions, there are many improvements, not only in 4.2 but also planned for the future, that will help with balance in sharded clusters. To investigate a specific issue as a bug, we would want to understand in detail what has happened to a specific chunk. Are you able to identify a specific chunk that has been split in an unequal way, or which has grown large while never being considered for a split?

Gratefully,
Eric

Comment by Peter Ivanov [ 25/Oct/19 ]

We're using a configuration where a mongos is set up side-by-side with every service that talks to the mongo cluster. So, depending on the instance sizes we use at a given moment, the number varies, but you can safely assume we're talking dozens or more here.

Comment by Eric Sedor [ 24/Oct/19 ]

Yes, petr.ivanov.s@gmail.com, I mean mongos instances. Sorry for being unclear!

Comment by Peter Ivanov [ 24/Oct/19 ]

Hi, Eric. 

By routers do you mean mongos instances?

Comment by Eric Sedor [ 23/Oct/19 ]

Hi petr.ivanov.s@gmail.com, I wanted to add a question: Can you please let us know how many routers you run in this cluster?

Comment by Peter Ivanov [ 23/Oct/19 ]

dmitry.agranat, does this imply that the auto-splitter will eventually process every oversized chunk, even if the thread pool in question was overloaded at the moment of the last write to that chunk?

Comment by Dmitry Agranat [ 22/Oct/19 ]

Hi Sergey,

In 4.2 we moved the auto-splitter to run on the shard primary (SERVER-9287). Instead of a fixed number of tickets, it uses a ThreadPool with 20 threads to schedule the split work, which means it will at least be fair between collections and won't completely neglect the colder ones.

Thanks,
Dima

Comment by Eric Sedor [ 21/Oct/19 ]

Hi sz, thanks for this submission. We are looking into it and will likely follow up with some questions. Thanks in advance for your patience.

Eric
