[SERVER-17397] Dropping a Database or Collection in a Sharded Cluster may not fully succeed Created: 26/Feb/15 Updated: 25/Oct/23 Resolved: 09/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.0.16, 3.4.18, 3.6.9, 4.0.5 |
| Fix Version/s: | 5.0.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Peter Garafano (Inactive) | Assignee: | [DO NOT USE] Backlog - Sharding EMEA |
| Resolution: | Done | Votes: | 56 |
| Labels: | ShardingAutomationSupport, stop-orphaning-fallout |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Backwards Compatibility: | Fully Compatible |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Issue Status as of Sep 18, 2020

ISSUE SUMMARY
Dropping a database or collection in a sharded cluster may not fully succeed: the drop can leave data behind on individual shards and stale metadata behind in the config database and in the mongos routers' caches.

USER IMPACT
Leftover data and stale metadata can cause problems if the dropped namespace is later reused, and accumulated orphaned metadata can slow down operations such as sh.status().

WORKAROUNDS
To work around this issue one can follow the steps below to drop a database/collection in a sharded environment.

MongoDB 4.4: re-issue the drop of the database/collection; the re-drop removes any remaining data and metadata.

MongoDB 4.2: re-issue the drop of the database/collection, then run flushRouterConfig on every mongos.

MongoDB 4.0 and earlier:
1. Drop the database/collection through a mongos.
2. Connect to the primary of each shard and drop the database/collection there as well; this is what reclaims the disk space.
3. Remove the stale metadata for the dropped namespace from the config database (config.databases, config.collections, config.chunks and config.tags).
4. Run flushRouterConfig on every mongos.

A shell sketch of these steps appears below.
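A minimal shell sketch of the manual cleanup for MongoDB 4.0 and earlier, assuming the dropped database is named mydb (a placeholder; substitute your own database or collection namespace):

```js
// Step 1: connected to a mongos, drop the database.
db.getSiblingDB("mydb").dropDatabase()

// Step 2: connected directly to the primary of EACH shard, drop it again.
// This is what actually removes the data files and reclaims disk space.
db.getSiblingDB("mydb").dropDatabase()

// Step 3: back on a mongos, remove the stale metadata from the config database.
var configDB = db.getSiblingDB("config")
configDB.databases.remove({_id: "mydb"})
configDB.collections.remove({_id: /^mydb\./})
configDB.chunks.remove({ns: /^mydb\./})
configDB.tags.remove({ns: /^mydb\./})

// Step 4: connected to EVERY mongos in turn, clear the cached routing table.
db.adminCommand({flushRouterConfig: 1})
```

The regular expressions above match every collection in the dropped database; to clean up a single dropped collection instead, match its full namespace (e.g. "mydb.mycoll").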
|
| Comments |
| Comment by Tommaso Tocci [ 09/Jul/21 ] |
|
As part of a project to start using reliable coordinators for sharded DDL, we made both the drop database and drop collection operations resilient to crashes, stepdowns, and network partitions. The new implementation guarantees that if a drop database/collection operation returns successfully to the client, all the data and metadata associated with that db/collection have been correctly deleted and the namespace can be safely reused immediately. In other words, if a drop database/collection operation starts deleting any data, it will eventually delete all of the data and leave the cluster in a consistent state. |
| Comment by Githook User [ 01/Apr/20 ] |
|
Author: Oleg Pudeyev (username: p-mongo, email: 39304720+p-mongo@users.noreply.github.com)
Message: needed since we are now doing strict comparisons
Co-authored-by: Oleg Pudeyev <oleg@bsdpower.com> |
| Comment by Sheeri Cabral (Inactive) [ 12/Dec/19 ] |
|
Note: work has been done so that in 4.2, all that is needed is to re-drop the database and run flushRouterConfig on all the mongos. In 4.4, all that is needed is to re-drop the database. This issue remains open while we decide whether backporting to versions 4.0 and earlier is possible. There is a workaround for versions 4.0 and earlier, so those who are on 4.0 and below can recover if needed. |
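As a rough illustration of the 4.2 recovery described above, assuming the partially dropped database is named mydb (a placeholder), the sequence would look something like:

```js
// On a mongos: re-issue the drop of the partially dropped database.
db.getSiblingDB("mydb").dropDatabase()

// Then, connected to each mongos in turn, refresh its cached routing metadata.
db.adminCommand({flushRouterConfig: 1})
```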
| Comment by Kaloian Manassiev [ 16/Jan/18 ] |
|
Thank you for your question. Unfortunately, as it stands now, the zones (tags) will be left around after a collection drop. I have filed a separate ticket to track this. You are correct that the workaround steps should include a db.tags.remove({ns: 'DATABASE.COLLECTION'}); I am going to update the workaround steps above. In the meantime, feel free to monitor that ticket for progress. Best regards, |
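For instance, if the dropped sharded collection had been mydb.mycoll (a placeholder name), the leftover zone documents could be removed from a mongos with something like:

```js
// Remove leftover zone/tag metadata for the dropped namespace from the config database.
db.getSiblingDB("config").tags.remove({ns: "mydb.mycoll"})
```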
| Comment by Rajat Mishra [ 15/Jan/18 ] |
|
We have implemented tag-aware sharding and created tags for each shard. In the steps given for the workaround, do we also need to remove the documents from the tags collection present in the config database? |
| Comment by Andy Schwerin [ 11/May/17 ] |
|
Direct work on this problem is not scheduled at present, but enabling work on the shard and distributed catalogs is taking place during 3.6 development. |
| Comment by Clive Hill [ 09/May/17 ] |
|
What are the plans in terms of resolving this issue? Is work being scheduled on this? |
| Comment by Kelsey Schubert [ 27/Mar/17 ] |
|
Hi gauravps, It is not possible for the issue described by this ticket to affect non-sharded clusters. Please open a new SERVER ticket and supply additional details about this behavior (MongoDB version, storage engine, how you observe that the database has not been deleted), and we will be happy to investigate. Thank you, |
| Comment by Gaurav Shellikeri [ 24/Mar/17 ] |
|
@ramon.fernandez, is there any chance this might be affecting replicated setups? We have a cluster of three mongod nodes with one of them set to master. Roughly once every 24 hours, we see that a deleted database (deleted using dropDatabase from our regression runner scripts) does not actually get deleted. We are re-using the same database name so we can identify who that database belongs to. |
| Comment by James Blackburn [ 14/Oct/16 ] |
|
Would be good to have this fixed. It causes all sorts of problems with real-world workloads. |
| Comment by Henrik Hofmeister [ 27/May/16 ] |
|
Seeing the same issue in db version v3.0.11. This severely affects performance (we're creating and dropping dbs as part of an integration test process). |
| Comment by Ramon Fernandez Marina [ 27/Apr/16 ] |
|
andrewdoumaux@gmail.com, sorry to hear you're being affected by this issue. The biggest impact of this bug comes from attempting to reuse the namespace, which I'd recommend against. As described above, dropping collections may leave behind stale metadata; this should not have a significant impact on your cluster as long as the namespace is not reused. However, if sh.status() is impacted due to having a large number of collections, then the only workaround is, unfortunately, to clean up the orphaned metadata as described above. Regards, |
| Comment by Andrew Doumaux [ 27/Apr/16 ] |
|
My issue might be somewhat related, and I'm not sure when it's going to fully bite me. My use case is caching analytic output in MongoDB. Since we have no good way to know what data has changed between analytic runs, we load the data into a new sharded collection, and once the data has been fully loaded and replicated, we drop the old/previous collection and, via an aliasing process, the service layer starts reading from the new collection. However, in my current environment we are creating and dropping 50+ collections a day, so over the course of a year there will be ~20k documents in the "config.collections" collection. This does seem to impact sh.status(), since it does a find() across the config.collections collection. Are there any good means of cleaning up a dropped sharded collection? Or at this point is the workaround stated here the best option to clean up orphaned metadata? |
| Comment by Ramon Fernandez Marina [ 07/Apr/15 ] |
|
Thanks for the update, paulgpa; glad to hear you were able to make progress. Please note that while completing step (2) is sufficient to reclaim disk space, unless you also complete steps (3) and (4) it is highly likely that you'll run into trouble if you attempt to reuse the dropped namespace. |
| Comment by Pavlo Grinchenko [ 06/Apr/15 ] |
|
Thank you, Ramon. Doing step (2) did help us to remove what was left of the database that we wanted to remove. We didn't do the router drill. |
| Comment by Ramon Fernandez Marina [ 03/Apr/15 ] |
|
paulgpa, the data files are removed in step 2; I've amended the ticket's summary box to reflect that. Please see the documentation on the config database for more information. In step 3 you can use find() to locate all the references to the namespace that was not successfully removed, and remove() to delete the relevant documents. For example, if I had stale references for a database named test that I needed to drop, I would run commands along the lines of the sketch below.
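A sketch of the kind of cleanup commands being referred to, assuming the database to remove is named test and the shell is connected to a mongos (the find() calls are for inspection, the remove() calls delete the stale references):

```js
// Inspect the stale references left behind for the dropped database.
var configDB = db.getSiblingDB("config")
configDB.databases.find({_id: "test"})
configDB.collections.find({_id: /^test\./})
configDB.chunks.find({ns: /^test\./})

// Remove them.
configDB.databases.remove({_id: "test"})
configDB.collections.remove({_id: /^test\./})
configDB.chunks.remove({ns: /^test\./})
configDB.tags.remove({ns: /^test\./})

// Finally, refresh the routing table cache on every mongos (step 4).
db.adminCommand({flushRouterConfig: 1})
```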
If you need further assistance with the details, please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience; a question like this, involving more discussion, is best suited to those forums. Regards, |
| Comment by Pavlo Grinchenko [ 03/Apr/15 ] |
|
| Comment by Pavlo Grinchenko [ 03/Apr/15 ] |
|
We need some practical recommendations. We understand that this will be fixed some day, but we have a disk issue within 1 week. Can we simply remove the files for the (successfully) dropped databases from the shard hosts? |