[SERVER-23034] Large number of collections causes extreme dropDatabase slowness Created: 08/Mar/16  Updated: 09/Jun/16  Resolved: 11/Apr/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.3
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Justin Lintz Assignee: Kelsey Schubert
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 12.04 Linux
EC2 - i2.4xlarge
Kernel 3.13.0-79-generic

Set of 3 in a replicaset


Attachments: diagnostic.tar.gz, mongo.conf, mongo.log, mongo_collection_creation.js
Operating System: Linux
 Description   

Create a database with a few thousand collections, then try to drop the database. For a database of only a few hundred megabytes, the drop took over half an hour, and "show dbs" hung during that time as well. If I create a database with just one collection and one document, I can drop it just fine.
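(A minimal sketch of this scenario in the mongo shell follows; the collection count and document shape are illustrative guesses, and the attached mongo_collection_creation.js is the actual repro script:)

// Sketch: create a few thousand small collections, then time the drop.
var N = 5000;
for (var i = 0; i < N; i++) {
    db.getCollection("coll_" + i).insert({ x: i });
}
var start = new Date();
db.dropDatabase();
print("dropDatabase took " + (new Date() - start) + " ms");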

I'm currently graphing all metrics from serverStatus, so let me know if there are any values you think are relevant to help debug the issue. I'll add more information to this ticket as I find it, since I'm still running tests, but I wanted to get the discussion going.
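(For example, one way to sample serverStatus from the shell once per second; the specific WiredTiger cache field below is just an illustrative choice, not necessarily what was graphed:)

// Hypothetical sampler: print one serverStatus value per second.
while (true) {
    var s = db.serverStatus();
    print(Date() + "  bytes in cache: " + s.wiredTiger.cache["bytes currently in the cache"]);
    sleep(1000);
}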



 Comments   
Comment by Kelsey Schubert [ 11/Apr/16 ]

Hi jlintz,

We haven't heard back from you for some time, so I'm going to resolve this ticket. If this is still an issue for you, please provide the requested iostat data and we will take another look.

Kind regards,
Thomas

Comment by Kelsey Schubert [ 18/Mar/16 ]

Hi jlintz,

Thank you for providing a repro script. Unfortunately, I have not been able to reproduce this issue: I increased the number of collections in your repro script to 25,000 and observed that the dropDatabase command took about 30 seconds to complete.

replset:PRIMARY> show dbs
local  0.002GB
test   0.192GB
replset:PRIMARY> db.stats()
{
        "db" : "test",
        "collections" : 25001,
        "objects" : 25000,
        "avgObjSize" : 41,
        "dataSize" : 1025000,
        "storageSize" : 103018496,
        "numExtents" : 0,
        "indexes" : 25001,
        "indexSize" : 102637568,
        "ok" : 1
}
replset:PRIMARY> db.dropDatabase()
{ "dropped" : "test", "ok" : 1 }

2016-03-18T13:50:01.988-0400 I COMMAND  [conn26] command local.oplog.rs command: dropDatabase { dropDatabase: 1.0 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:40 
locks:{ Global: { acquireCount: { r: 4, w: 3, W: 1 } }, Database: { acquireCount: { w: 2, W: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 2 } } } protocol:op_command 28999ms

To continue to investigate this issue, can you please execute the following shell script before dropping the database:

delay=1
iostat -k -t -x ${delay:?} >iostat.log &

This will collect iostat data each second, and will help us understand what is going on. After dropping the database, please upload both the iostat.log and diagnostic.data to this ticket.
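(Once the drop has finished, a sketch of how to stop the sampler and package the uploads, assuming the same shell session that backgrounded iostat and a placeholder dbpath:)

kill %1                                 # stop the backgrounded iostat
tar czf iostat-and-diagnostic.tar.gz iostat.log /path/to/dbpath/diagnostic.data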

Thank you,
Thomas

Comment by Justin Lintz [ 11/Mar/16 ]

Just confirmed the issue is present on a standalone instance as well.

Comment by Justin Lintz [ 11/Mar/16 ]

I've created a JavaScript file that reproduces the issue. I've attached the JS file, the diagnostic.data directory, and a copy of the config file to this ticket, along with a couple of relevant log lines. So far I've only tested on a replica set of 3 machines; I'll try to test on a standalone instance and get back to you.

Comment by Ramon Fernandez Marina [ 10/Mar/16 ]

jlintz, can you please upload the contents of the diagnostic.data directory from your dbpath?

Can you also provide details on the size of those collections and the average number of indexes on them? Any other information that may help us reproduce this issue on our end would be appreciated.

Thanks,
Ramón.

Comment by Susan LoVerso [ 09/Mar/16 ]

Hello jlintz. Could you create a JS script that demonstrates the problem? Does this also happen on a single standalone mongod, rather than only with replication?

It would be helpful to reproduce it here. Something that creates some number of databases, creates a large number of collections in each database, inserts the data, and then performs the drop would be ideal.

Also, if you want to attach a tar of the serverStatus FTDC data, we could look at that.
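(For instance, something along these lines should produce an attachable archive; /path/to/dbpath is a placeholder for your actual dbpath:)

tar czf diagnostic.tar.gz -C /path/to/dbpath diagnostic.data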

Comment by Justin Lintz [ 08/Mar/16 ]

I should also note that there is a large amount of disk write activity during the drop: around 125 MB/sec or more of constant writes.
