[SERVER-4804] db.dropDatabase() through mongos while a migration is happening can lead to wrong "show dbs" output Created: 28/Jan/12  Updated: 06/Apr/23  Resolved: 27/Jun/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Daniel Crosta Assignee: Kaloian Manassiev
Resolution: Done Votes: 9
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

repro'd on OS X 64bit and CentOS 64bit


Issue Links:
Depends
Related
Operating System: ALL
Sprint: Sharding 2019-07-01
Participants:
Case:

 Description   

Steps to reproduce:

1. Bring up a sharded cluster with at least 2 shards
2. Insert a lot of data, such that the collection will be split and chunks migrated.
3. Run "show dbs" (I am using the "blah2" database):

mongos> show dbs
admin	(empty)
blah2	4.9052734375GB
config	0.1875GB
test	0.203125GB

4. Interrupt the data load script, or wait until it completes.
5. While a migration is happening, attempt "db.dropDatabase()" through a mongos
5.a. The first time I tried this, I got:

mongos> db.dropDatabase()
{
	"assertion" : "collection's metadata is undergoing changes. Please try again.",
	"assertionCode" : 13331,
	"errmsg" : "db assertion failure",
	"ok" : 0
}

5.b. I immediately retried and got:

mongos> db.dropDatabase()
{
	"assertion" : "shard state missing for blah2.foo",
	"assertionCode" : 10176,
	"errmsg" : "db assertion failure",
	"ok" : 0
}

5.c. The third try worked:

mongos> db.dropDatabase()
{ "dropped" : "blah2", "ok" : 1 }

6. Run "show dbs" again:

mongos> show dbs
admin	(empty)
blah2	5.951171875GB
config	0.1875GB
test	0.203125GB

7. Verify that data files are missing for all shards (in my case they were)
8. Insert new data with "db.bar.insert({hello: "world"})"
9. Run "show dbs" again

mongos> show dbs
admin	(empty)
blah2	6.154296875GB
config	0.1875GB
test	0.203125GB

Expected output:

  • In step 6, "blah2" database should not show up
  • In step 9, "blah2" database should not be so large (one would expect it to be 2GB on disk)
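The retry sequence in step 5 and the check in step 6 can be sketched as a small helper. This is a minimal sketch, not part of the original report: the function names and the maxAttempts parameter are hypothetical, and it assumes the handle passed in behaves like the mongo shell "db" object (dropDatabase() and adminCommand() are standard shell methods). It retries immediately, as the reporter did, and then asks listDatabases whether the name is still reported.

```javascript
// Hypothetical helper: call dropDatabase() until it reports ok: 1 or the
// attempt budget is used up. Returns the last result document (or null if
// maxAttempts is 0).
function dropDatabaseWithRetry(db, maxAttempts) {
    var last = null;
    for (var attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            last = db.dropDatabase();
        } catch (e) {
            // Some shells throw instead of returning { ok: 0 }.
            last = { ok: 0, errmsg: String(e) };
        }
        if (last.ok) {
            break;
        }
    }
    return last;
}

// Hypothetical helper: true if listDatabases still reports a database
// with the given name (the symptom described in step 6).
function isDatabaseListed(db, name) {
    var res = db.adminCommand({ listDatabases: 1 });
    return res.databases.some(function (d) {
        return d.name === name;
    });
}
```

In a mongos shell this might be used as dropDatabaseWithRetry(db, 5) followed by isDatabaseListed(db, "blah2"). A true result after a successful drop corresponds to the wrong "show dbs" output reported here.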


 Comments   
Comment by Kaloian Manassiev [ 27/Jun/19 ]

Starting with version 3.4, the dropDatabase operation properly drops all collections one-by-one and acquires the collection distributed lock, so it will serialize with chunk migration. This is different from the 3.2 implementation, which directly dropped the database entries from the config server metadata.

Please note that due to SERVER-17397, there are still edge cases where dropping a database or a collection may leave partial data behind.

Comment by Stefan Stark [ 26/Oct/16 ]

Interesting, this might be an authorization problem then. Because I can use the db.dropDatabase() command with the built-in clusterAdmin role, but I cannot listCollections or dropIndex. (https://docs.mongodb.com/manual/reference/built-in-roles/#cluster-administration-roles)
Does that mean I need to create a user for each database I want to drop? I have databases with >50 collections, which makes ~250 indices. So I will have to write a script that drops them for me.
So why exactly is this a minor issue that has not been solved in 4 years?

Comment by Dave Muysson [ 26/Oct/16 ]

We've worked out a process to give us the greatest chance of success for dropping databases. Since implementing it, we haven't had a failed/ghost database yet.
1. Stop the mongoS balancer.
2. Drop all indexes from each collection within the database
3. Drop the database.

One of our guys dug through the code and found that at some point the DB drop process issued a call to drop an index, but didn't wait/check for success. He thinks that a failure of this action could lead to a partial (and failed) drop. Once we started dropping the indexes ourselves (which sometimes fails, and then works upon retry), our DBs are dropped without issue.

Hope this helps others out there.
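The three steps above could be scripted roughly as follows. This is a sketch under stated assumptions rather than the commenter's actual script: the function names and the maxAttempts parameter are hypothetical, skipping "system." collections is an assumption, and the retry loop only mirrors the remark that dropping indexes sometimes fails and then works on retry. sh.stopBalancer(), getCollectionNames(), dropIndexes() and dropDatabase() are standard mongo shell helpers.

```javascript
// Hypothetical helper: retry dropIndexes() on one collection.
// Returns true once the command reports ok: 1, false if every attempt failed.
function dropIndexesWithRetry(coll, maxAttempts) {
    for (var attempt = 1; attempt <= maxAttempts; attempt++) {
        var res;
        try {
            res = coll.dropIndexes();
        } catch (e) {
            res = { ok: 0, errmsg: String(e) };
        }
        if (res.ok) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper implementing the three steps from the comment.
function dropDatabaseAfterIndexes(db, sh, maxAttempts) {
    // 1. Stop the balancer so no new migration starts.
    sh.stopBalancer();
    // 2. Drop all indexes from each collection within the database.
    db.getCollectionNames().forEach(function (name) {
        if (name.indexOf("system.") === 0) {
            return; // assumption: leave system collections alone
        }
        if (!dropIndexesWithRetry(db.getCollection(name), maxAttempts)) {
            throw new Error("could not drop indexes on " + name);
        }
    });
    // 3. Drop the database.
    return db.dropDatabase();
}
```

In a mongos shell, where "db" and "sh" are globals, this might be called as dropDatabaseAfterIndexes(db, sh, 3). The comment does not say when the balancer is re-enabled; sh.startBalancer() is the shell helper for that.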

Comment by Stefan Stark [ 26/Oct/16 ]

Still exists in 3.2.10 on Windows Server 2012.

Comment by Dave Muysson [ 18/Sep/15 ]

Same problem, Ubuntu 12.04 LTS, mongodb 2.4.14 and 2.6.10 (two different environments). Following the workaround mentioned above at stackoverflow.com did not work for us either.

The 2.4.14 cluster was fully restarted following the drop.

Comment by Ramon Orru [ 19/Feb/15 ]

We are experiencing this problem too.
We run an Ubuntu 14.04.1 LTS server, using a mongo 2.6.7 version.
The previously posted workaround does not seem to work for us.

Comment by Christian Tonhäuser [ 27/Aug/12 ]

We seem to be running into the same issue.
Since our software is routinely creating and deleting databases, we cannot use the workaround found here:
http://stackoverflow.com/questions/9407838/mongodb-dropdatabase-not-working

Comment by Scott Yancey [ 30/Jan/12 ]

FYI, the ticket lists "Environment: OS X 64bit" but we're reproducing this on 64 bit CentOS.

Comment by Daniel Crosta [ 30/Jan/12 ]

Additionally, sometimes dropDatabase will report success, but the database (and data files) still exist:

mongos> use blah2
switched to db blah2
mongos> db.getSisterDB("config").locks.find({_id: "balancer", state: 2})
{ "_id" : "balancer", "process" : "dcrosta.local:27018:1327959228:16807", "state" : 2, "ts" : ObjectId("4f2718fff8e76711e5957f11"), "when" : ISODate("2012-01-30T22:26:07.202Z"), "who" : "dcrosta.local:27018:1327959228:16807:Balancer:282475249", "why" : "doing balance round" }
mongos> db.dropDatabase()
{ "dropped" : "blah2", "ok" : 1 }
mongos> show dbs
admin	(empty)
blah2	5.9052734375GB
config	0.046875GB
test	0.203125GB
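The balancer lock query in the transcript above can be wrapped in a small check to run before attempting a drop. This is only a sketch: the function name is hypothetical, and it assumes a cluster version that records the balancer lock in config.locks with state 2 while a balance round is in progress, as in the output shown. getSiblingDB() is the current name of the getSisterDB() helper used above.

```javascript
// Hypothetical helper: true if config.locks shows the balancer lock as
// held (state 2), i.e. a balance round appears to be in progress.
function balancerLockHeld(db) {
    var lock = db.getSiblingDB("config")
                 .getCollection("locks")
                 .findOne({ _id: "balancer", state: 2 });
    return lock !== null;
}
```

Note that in the transcript the lock was held and dropDatabase() still reported success, so a true result here is a warning sign that the drop may not be trustworthy rather than a guarantee of failure.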

Comment by Daniel Crosta [ 30/Jan/12 ]

Trying to reproduce this again, this time I got a new error for step 5:

mongos> use blah2
switched to db blah2
mongos> db.dropDatabase()
{
	"errmsg" : "exception: DBClientBase::findN: transport error: localhost:30001 ns: blah2.$cmd query: { dropDatabase: 1 }",
	"code" : 10276,
	"ok" : 0
}

After which "show dbs" shows a partially-emptied db:

mongos> show dbs
admin	(empty)
config	0.046875GB
blah2	11.90234375GB
test	0.203125GB

And in this case, I can verify that for two of the three shards, data files still exist:

(master) dcrosta@dcrosta:~/dev/10gen/mongo-snippets/db ± ls shard_*/*blah*
shard_1/blah2.0  shard_1/blah2.2  shard_1/blah2.4  shard_1/blah2.6  shard_3/blah2.0  shard_3/blah2.2  shard_3/blah2.4  shard_3/blah2.6
shard_1/blah2.1  shard_1/blah2.3  shard_1/blah2.5  shard_1/blah2.ns shard_3/blah2.1  shard_3/blah2.3  shard_3/blah2.5  shard_3/blah2.ns

The "blah2" database does not show up in config.databases, but "blah2.foo" shows up in config.collections with dropped == true. Further attempts to drop the database through mongos do not remove the spare data files nor change the output of "show dbs". I can, however, connect directly to the shards and drop the database there. This then causes the output of "show dbs" to become correct.
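The manual cleanup described above (connecting directly to each shard and dropping the database there) could be scripted along these lines. This is a sketch, not the procedure the reporter actually ran: the function name and the connectToShard callback are hypothetical, and it assumes each document in config.shards carries "_id" and "host" fields that the callback knows how to connect to.

```javascript
// Hypothetical helper: drop dbName directly on every shard listed in
// config.shards. connectToShard(host) must return a connection object
// exposing getDB(name), such as the shell's Mongo connection.
// Returns a map from shard _id to that shard's dropDatabase() result.
function dropOnEachShard(configDb, dbName, connectToShard) {
    var results = {};
    configDb.getCollection("shards").find().forEach(function (shard) {
        var shardDb = connectToShard(shard.host).getDB(dbName);
        results[shard._id] = shardDb.dropDatabase();
    });
    return results;
}
```

In the legacy mongo shell this might be called as dropOnEachShard(db.getSiblingDB("config"), "blah2", function (host) { return new Mongo(host); }). Because it bypasses mongos, it is meant only for the leftover-files situation described above, after the database is already gone from config.databases.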

Comment by Daniel Crosta [ 28/Jan/12 ]

(cleaned up formatting)
