[SERVER-19495] Can't split hash sharded collection with empty chunks Created: 20/Jul/15 Updated: 25/Aug/15 Resolved: 25/Aug/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.6.9 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Aaron Westendorf | Assignee: | Randolph Tan |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Steps To Reproduce: | In a 2.6.9 cluster, start with a sharded collection that has about 580 documents with an average size of 320 bytes. Use hashed sharding and ensure that the collection starts with 1 chunk. The following Python is our code to split the chunks:
|
| Participants: |
| Description |
|
We have an existing collection with a small number of records that will soon see a lot more. It was already sharded on a hashed application guid. Our goal is equivalent to a pre-split of an empty collection; we want to match the number of chunks to the new expected use of this collection. We're using a splitting routine that lists the chunks in the config server, and for each chunk, calls split with the bounds. We were able to split this collection until we hit a point where it seems that there's an empty chunk which we can't split.
The only correlated log we can find across all mongo processes is this from the primary in that shard:
Our expectation is that, even if the chunk is empty, the key range can be split without a problem. |
| Comments |
| Comment by Ramon Fernandez Marina [ 10/Aug/15 ] |
|
aaron.westendorf, we haven't heard back from you for some time. If this is still an issue for you, can you please answer Randolph's questions above? Thanks |
| Comment by Randolph Tan [ 28/Jul/15 ] |
|
Hi, can you clarify what you mean by "it didn't do anything"? Did you get an error from it? I have also contacted the docs team to revise the "do not use middle" guidance for hashed shard keys.
That is not how split with bounds works right now. The shard key of the median document is selected as the split point. So, if there are no documents in the chunk range, there is no median to select. Thanks! |
| Comment by Aaron Westendorf [ 27/Jul/15 ] |
|
We tried middle; it didn't do anything on a hashed shard key. In the documentation you linked to:
Because a hashed chunk is actually defined by the range of hashed integers that will end up in the chunk, I expect that it should split that range, such that range(a, b) splits into range(a, a+(b-a)/2) and range(a+(b-a)/2, b). |
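The range-halving expectation described in this comment can be sketched in Python. This only illustrates the arithmetic the reporter has in mind; it is not how the server chooses split points. The function name `halve_hashed_range` is hypothetical, and the sketch assumes both chunk bounds are already plain 64-bit signed integers (that is, any MinKey/MaxKey bound has been replaced by the corresponding integer extreme).

```python
def halve_hashed_range(lo, hi):
    """Split the half-open integer range [lo, hi) into two halves.

    Hypothetical helper: it mirrors the reporter's formula
    range(a, a+(b-a)/2), range(a+(b-a)/2, b) using integer division.
    """
    if lo >= hi:
        raise ValueError("lower bound must be below upper bound")
    mid = lo + (hi - lo) // 2
    if mid == lo:
        # The range holds a single value, so it cannot be halved.
        raise ValueError("range is too small to split")
    return (lo, mid), (mid, hi)


# Example: halving the full signed 64-bit range.
left, right = halve_hashed_range(-2**63, 2**63 - 1)
```

Python integers do not overflow, so the midpoint is exact even when the bounds span the whole 64-bit range.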
| Comment by Randolph Tan [ 27/Jul/15 ] |
|
Hi, the "bounds" field is similar to the "find" field in that it needs some data in the chunk, because the split point is determined based on the number of documents in the chunk. If you want to perform a split regardless of whether a chunk is empty or not, you should use the "middle" field. Also see: http://docs.mongodb.org/manual/reference/command/split/#considerations |
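As a minimal sketch of the "middle" suggestion above, the following Python builds a `split` command document with an explicit split point. The namespace `mydb.mycoll`, the field name `guid`, and the helper `build_split_command` are assumptions made for illustration; none of them come from the ticket. The sketch also assumes that, for a hashed shard key, the split point is given as a value in the hashed integer space. The thread does not settle whether this works on hashed keys in 2.6.9, since the reporter later says that "middle" did nothing for them.

```python
def build_split_command(namespace, shard_key_field, split_value):
    """Build a `split` admin command that names an explicit split point.

    Hypothetical helper: `middle` carries the shard key value at which
    the chunk should be divided, so no documents are needed in the chunk.
    """
    # The command name must be the first key; dicts preserve insertion
    # order on Python 3.7+.
    return {
        "split": namespace,
        "middle": {shard_key_field: split_value},
    }


# Assumed namespace, field name, and split point, for illustration only.
cmd = build_split_command("mydb.mycoll", "guid", 0)

# With pymongo, the document could then be sent to a mongos, for example:
#   client.admin.command(cmd)
```

Building the document separately from sending it keeps the split point easy to log and inspect before it is applied to a live cluster.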