[SERVER-14298] Add support to create/define chunks in a given shard for empty collections Created: 18/Jun/14 Updated: 06/Dec/22 Resolved: 05/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Jose Luis Pedrosa | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Done | Votes: | 0 |
| Labels: | lamont-triage, nc | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Sharding
|
||||||||
| Participants: | |||||||||
| Description |
|
Hi all. When pre-splitting a large collection that requires a significant number of chunks, the procedure of splitting and then moving chunks to the desired shard can take a significant amount of time. In some cases this can be sped up by making a small initial split and move, and then splitting further with a seed in each chunk, but this is not always possible. As the config.chunks collection is not part of the "API", it would be nice to have a way to instruct the DB to create the chunks on a given shard. Possible options: , max : { key : value }, shard : name } on the shardCollection command; this only solves the initial creation. 2) Allow specifying the destination shard of a chunk in a split if the left or right side is empty. |
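To make the cost of the current workaround concrete, here is a minimal sketch that only builds the admin command documents for the split-then-move procedure and does not send them anywhere. The helper name `presplit_commands`, the namespace `test.events`, the shard names and the round-robin placement are all assumptions made for illustration; the `split` (with `middle`) and `moveChunk` (with `find` and `to`) command shapes are the existing admin commands.

```python
from typing import Any, Dict, List


def presplit_commands(namespace: str, key: str, split_points: List[Any],
                      shards: List[str]) -> List[Dict[str, Any]]:
    """Build the command documents for the split-then-move workaround.

    Hypothetical helper: it only returns the documents, it does not run them.
    """
    commands: List[Dict[str, Any]] = []
    for i, point in enumerate(split_points):
        # Make `point` a chunk boundary.
        commands.append({"split": namespace, "middle": {key: point}})
        # Move the chunk that now starts at `point` to its target shard
        # (round-robin placement is an assumption of this sketch).
        commands.append({
            "moveChunk": namespace,
            "find": {key: point},
            "to": shards[i % len(shards)],
        })
    return commands


cmds = presplit_commands("test.events", "day", [10, 20, 30],
                         ["shard0000", "shard0001"])
```

Every split point costs two commands, and each `moveChunk` goes through the full migration protocol even though the chunk is empty, which is where the time goes.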
| Comments |
| Comment by Greg Studer [ 20/Oct/14 ] | ||||||
|
See the comments here: This is definitely something that is planned; it just requires moving shardCollection to mongod to implement safely. Mongos isn't capable of holding the locks required to ensure that no data is written to the collection on the primary while we're creating chunks elsewhere. | ||||||
| Comment by Jose Luis Pedrosa [ 20/Jun/14 ] | ||||||
|
Hi Thomas, indeed you explained it much more clearly; it is exactly what I meant. To answer your question: the case where I think it's not possible to create the chunks without a lot of moves is when the desired chunk distribution is cyclic across the shards:
One example could be dates, where each day goes to one shard, so 2a would be useful for initial creation and 2b for maintenance. The range up to $maxKey will always be a single chunk; if you split that chunk, the new chunk still lives on the same shard, so this forces moving empty chunks. These options are what I could think of to improve shard/chunk management; feel free to change them to any other approach. Thanks! | ||||||
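The cyclic layout described above can be sketched as follows. This is an illustration only: `cyclic_day_layout`, the shard names and the start date are hypothetical, and the chunks are plain tuples rather than real config.chunks documents.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (inclusive lower bound, exclusive upper bound, owning shard)
Chunk = Tuple[date, date, str]


def cyclic_day_layout(start: date, days: int, shards: List[str]) -> List[Chunk]:
    """One chunk per day, assigned to the shards in rotation."""
    layout: List[Chunk] = []
    for i in range(days):
        lo = start + timedelta(days=i)
        layout.append((lo, lo + timedelta(days=1), shards[i % len(shards)]))
    return layout


def neighbours_on_different_shards(layout: List[Chunk]) -> int:
    """Count adjacent chunk pairs whose owning shards differ."""
    return sum(1 for a, b in zip(layout, layout[1:]) if a[2] != b[2])


layout = cyclic_day_layout(date(2014, 6, 1), 6, ["shardA", "shardB", "shardC"])
moves = neighbours_on_different_shards(layout)
```

With more than one shard, every pair of neighbouring chunks sits on different shards. Since a split leaves both halves on the original shard, each new day chunk needs its own move, so splitting locally after an initial distribution does not help.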
| Comment by Thomas Rueckstiess [ 19/Jun/14 ] | ||||||
|
Hi Jose, I'd like to better understand your feature request and need some more information. You say:
1. Can you explain under which circumstances it's not possible to create n chunks (for n shards), move each of them to a shard and then continue the splits locally?
2. Do I understand correctly that you want to enhance the shardCollection and split commands to do the following:
a) For the shardCollection command, provide the initial chunk distribution manually. This requires an additional parameter that takes a list of chunks (with min and max) for each shard. The initial chunks are created and placed according to the provided distribution. If the full range from minKey to maxKey is not covered or if there are overlaps, the command aborts with an error.
b) Enhance the split command so that a target shard can be provided for splits on a boundary. The empty chunk of the two new chunks will be placed directly on the target shard.
Both of these changes assume that there is no data, either in the collection (2a) or in one of the new chunks resulting from a split (2b), so the metadata can be written directly without having to migrate data. Is this what you had in mind? If not, can you please elaborate on the requested changes? Thanks, |
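The validation rule in 2a (full coverage from minKey to maxKey, no overlaps) could look roughly like the sketch below. It is not the server's implementation: `Chunk`, `validate_distribution` and the use of infinities as stand-ins for minKey and maxKey over a numeric shard key are assumptions made for illustration.

```python
import math
from typing import List, NamedTuple

# Stand-ins for minKey / maxKey on a numeric shard key (assumption).
MIN_KEY = -math.inf
MAX_KEY = math.inf


class Chunk(NamedTuple):
    lo: float   # inclusive lower bound
    hi: float   # exclusive upper bound
    shard: str  # shard that should own the chunk


def validate_distribution(chunks: List[Chunk]) -> List[Chunk]:
    """Return the chunks sorted by lower bound, or raise ValueError."""
    if not chunks:
        raise ValueError("no chunks given")
    ordered = sorted(chunks, key=lambda c: c.lo)
    if ordered[0].lo != MIN_KEY:
        raise ValueError("range does not start at minKey")
    if ordered[-1].hi != MAX_KEY:
        raise ValueError("range does not end at maxKey")
    for c in ordered:
        if not c.lo < c.hi:
            raise ValueError(f"empty or inverted chunk: {c}")
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.lo > prev.hi:
            raise ValueError(f"gap between {prev} and {cur}")
        if cur.lo < prev.hi:
            raise ValueError(f"overlap between {prev} and {cur}")
    return ordered


plan = validate_distribution([
    Chunk(100, MAX_KEY, "shard0001"),
    Chunk(MIN_KEY, 100, "shard0000"),
])
```

Because the collection is assumed to be empty, a plan that passes this check could be written as metadata directly, with no data migration involved.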