[SERVER-9845] Command to unshard a collection Created: 01/Jun/13 Updated: 01/Jun/16 Resolved: 03/Jun/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.4.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Taha Jahangir | Assignee: | Steffan Mejia |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Participants: | |
| Description |
|
I think these commands can be used to unshard a collection by moving all of its chunks to the primary shard of the database.
Is this code correct? Did I miss something? |
| Comments |
| Comment by Scott Hernandez (Inactive) [ 03/Jun/13 ] |
|
Also, if you are going to one of our many conferences (http://www.10gen.com/events?type=event_mongodbday) that would be a good place to learn and talk about these kinds of things. Also, if you happen to find yourself in Palo Alto, London, or New York (and more) then you can drop by any week for office hours: http://www.10gen.com/office-hours |
| Comment by Scott Hernandez (Inactive) [ 03/Jun/13 ] |
|
This is not the best place to discuss or explain how the system works; this is where bugs are filed and features are requested. Please ask questions like this on the mongodb-dev or users forum: https://groups.google.com/group/mongodb-dev
As I mentioned, you will need to restart the shards to clear their metadata about sharded collections and the shard key information, which is very important. There are safety checks built into the sharding system that use version information in the config data, which may not be valid after manual changes like the ones you have made, so please do not make changes unless they are fully understood and tested. I'm glad what you did seems to have worked, but I suspect there are problems that you have missed. I would suggest you not do anything like this in production, with real data, until there is official support. |
| Comment by Taha Jahangir [ 02/Jun/13 ] |
|
I think metadata is cached only on the mongos servers, so why should we shut down all shards? Is shutting down all mongos instances not enough? Actually, we DID this procedure, and after a short downtime (for restarting all mongos servers) everything went OK. I am confused about metadata caching on the mongos servers.
P.S. I saw lots of warnings about differing content of the config servers while running the initial `shardCollection` command (after a message like `going to create 6672 chunk(s) for: ...`). Is this a normal warning? It disappeared after the `shardCollection` command returned successfully. |
| Comment by Scott Hernandez (Inactive) [ 02/Jun/13 ] |
|
You are correct, moving chunks is safe to do at runtime and is a background operation which clients will not even notice. That is not really the issue here though. There are some other constraints on sharded collections which make this a little challenging to do without direct support for it in the system.
Because of the way the configuration and sharding metadata are loaded, cached, and validated, there is currently no clean way to remove the knowledge that the collection is sharded, along with its shard key information, from all active nodes at once. This means that in order to transition from a sharded collection on shard key "1" to an unsharded collection, and then shard on key "2" (to reshard the collection on a new shard key), there can be many errors between those states as you manually edit the metadata. Since we neither support nor test this, I cannot provide you with information about all these errors, but I do know that they exist and will most likely occur on an active system.
The best suggestion I can offer is that you schedule a maintenance window where there are no writes to the system and you can restart the sharded cluster to ensure that all metadata changes are applied cleanly and consistently across all members (mongos instances as well as the shards which hold the data).
Additional Steps
In your steps, before doing the "unshard" (after you have moved all chunks to a single shard and are about to change the config db metadata manually), you should shut down all shards and all but one mongos, which you will use to edit the metadata. You will also need to restart that one mongos instance once you finish your changes and before starting the shards up again. You can also use that one mongos to validate and test that the collection is unsharded once you restart the shards but before you restart all mongos instances.
Warning
I am hesitant to even mention this much of a process, as it is very possible to make mistakes in manual changes like this, which will lead to data going to the wrong shards and being lost. As I said, this is all unsupported (and untested), so I would suggest testing someplace first to ensure you know each step, know how to handle any errors, and have a plan to revert to a backup of the config database in the worst case. And yes, backups are an important part of the process before you start making any changes. |
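A minimal sketch of the chunk-consolidation step mentioned above (moving all chunks to a single shard before any metadata change), written in Python for illustration only. It builds `moveChunk` command documents from chunk records and does not connect to a cluster or touch the config database. The record shape (`ns`, `min`, `shard`) is assumed to mirror `config.chunks`; the helper names `plan_moves` and `all_on_one_shard` are hypothetical and not part of any MongoDB driver.

```python
def plan_moves(chunks, namespace, target_shard):
    """Build moveChunk command documents for every chunk of `namespace`
    that is not already on `target_shard`.

    `chunks` is assumed to be a list of dicts shaped like config.chunks
    documents, with at least the keys "ns", "min" and "shard".
    """
    commands = []
    for chunk in chunks:
        if chunk["ns"] != namespace:
            continue  # chunk belongs to a different collection
        if chunk["shard"] == target_shard:
            continue  # chunk is already on the target shard
        commands.append({
            "moveChunk": namespace,   # command name must be the first key
            "find": chunk["min"],     # min is the chunk's inclusive lower bound
            "to": target_shard,
        })
    return commands


def all_on_one_shard(chunks, namespace, target_shard):
    """Return True once every chunk of `namespace` lives on `target_shard`."""
    return all(
        c["shard"] == target_shard for c in chunks if c["ns"] == namespace
    )
```

Each returned document would be run against the `admin` database through a mongos, and `all_on_one_shard` could serve as a check before the maintenance window begins. The manual metadata edit itself is deliberately left out, since it remains unsupported.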
| Comment by Taha Jahangir [ 01/Jun/13 ] |
|
The main reason is re-sharding with a different key, and we want to do it on a live system. I think (as mentioned in the docs) that `moveChunk` can handle data that changes while chunks are being moved, can't it? |
| Comment by Scott Hernandez (Inactive) [ 01/Jun/13 ] |
|
That is pretty close. Why are you unsharding the collection? Do you intend to do this on a live system while data is being added? |