[SERVER-8996] Queue chunk migrations
Created: 15/Mar/13  Updated: 10/Dec/14  Resolved: 04/Jun/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.2.3
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Sean Melody
Assignee: Unassigned
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

As an adjunct to the ability to pre-split a collection, it would be helpful to be able to specify the migrations that will distribute the chunks to the shards on which they will reside.

The moveChunk command will move a single chunk, but a second moveChunk command will fail if issued before the first migration has finished. Moreover, if the chunk migration fails for any reason, it is not remembered by the cluster and so migrations must be monitored and failed migrations re-requested.
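As a rough illustration of the monitoring and re-requesting described above, a client-side retry wrapper might look like the sketch below. The helper name `move_chunk_with_retry` and the injected `run_move` callable are hypothetical and not part of MongoDB. With PyMongo connected to a mongos, `run_move` could plausibly wrap `client.admin.command("moveChunk", namespace, find=find, to=to_shard)`, but that wiring is an assumption here.

```python
import time
from typing import Any, Callable, Mapping


def move_chunk_with_retry(
    run_move: Callable[[str, Mapping[str, Any], str], None],
    namespace: str,
    find: Mapping[str, Any],
    to_shard: str,
    attempts: int = 5,
    delay_seconds: float = 1.0,
) -> int:
    """Call run_move until it succeeds; return the number of attempts used.

    run_move is a hypothetical callable that issues one moveChunk request
    and raises an exception if the request is rejected or the migration fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            run_move(namespace, find, to_shard)
            return attempt
        except Exception as exc:  # e.g. another migration is still running
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(
        f"moveChunk to {to_shard} failed after {attempts} attempts"
    ) from last_error
```

Every client that issues migrations would need a loop like this, which is the burden the request is trying to remove.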

This is particularly noticeable when used with pre-splitting, in which the "chunk migrations" are of empty chunks and therefore do not require data movement, merely changes to the chunks collection.

It would be better to be able to specify a queue of desired migrations, and then let MongoDB do the work to make sure that the migrations happen as requested.

An additional complication for this use case is that these queue requests may come from any client at any time, so the single cluster-wide queue should accept multiple requests and add them to the queue without regard for whether there are already queued requests.
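The requested behaviour can be approximated in application code. The following is a minimal sketch, not a MongoDB feature: the `Migration` and `MigrationQueue` names are hypothetical, and `run_move` is an assumed callable that performs one moveChunk and raises on failure. Any thread may submit at any time, a single worker issues migrations one at a time, and failed migrations are re-queued up to a limit.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable, List, Mapping


@dataclass(frozen=True)
class Migration:
    namespace: str            # e.g. "app.users"
    find: Mapping[str, Any]   # a shard-key value inside the chunk to move
    to_shard: str             # destination shard name


class MigrationQueue:
    """Single queue of desired migrations, drained by one worker thread."""

    def __init__(self, run_move: Callable[[Migration], None], max_attempts: int = 3):
        self._pending: "queue.Queue" = queue.Queue()
        self._run_move = run_move          # hypothetical: issues one moveChunk
        self._max_attempts = max_attempts
        self.failed: List[Migration] = []  # migrations that exhausted retries
        threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, migration: Migration) -> None:
        # Safe to call from any thread, whether or not work is already queued.
        self._pending.put((migration, 1))

    def _drain(self) -> None:
        while True:
            migration, attempt = self._pending.get()
            try:
                self._run_move(migration)
            except Exception:
                if attempt < self._max_attempts:
                    # Remember the failed request and try it again later.
                    self._pending.put((migration, attempt + 1))
                else:
                    self.failed.append(migration)
            finally:
                self._pending.task_done()

    def wait(self) -> None:
        """Block until every submitted migration has succeeded or given up."""
        self._pending.join()
```

This sketch lives in one process, so it does not give the cluster-wide queue the request asks for; that would need shared, persistent state.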

From the original request:

We are struggling with operations to provision and destroy (then re-provision) databases because of the current limitation in Mongo of moving one chunk at a time. We'd like move chunk operations (especially for empty chunks, which is what we are trying to do when provisioning a database) to be queued so that the move operation eventually completes. Right now, the fact that a second move fails (and fails silently) is causing us to consider adding a queuing mechanism to our application code for something that seems like it should be supported in the database.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 04/Jun/13 ]

Also, we did some work in 2.4 to make moving empty chunks almost free, which I think mostly resolves the issue.

Comment by Eliot Horowitz (Inactive) [ 04/Jun/13 ]

Don't think this is a great idea.
Moving chunks should really be thought of as an operational task that is highly dependent on state.
