[SERVER-20638] Reading the profiling level shouldn't create databases that don't exist Created: 23/Sep/15  Updated: 28/Oct/15  Resolved: 30/Sep/15

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 3.0.7, 3.1.9

Type: Bug Priority: Major - P3
Reporter: Cailin Nelson Assignee: Mathias Stearn
Resolution: Done Votes: 0
Labels: mms-s
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongodb.log.2015-09-16T23-18-52.txt    
Issue Links:
Duplicate
duplicates SERVER-13212 ListDatabases implementation can caus... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Completed:
Sprint: QuInt A (10/12/15)
Participants:

 Description   

Currently the profile command always creates the database even if it doesn't exist. The plan is to keep that behavior when setting the profiling level, but to return the server-wide default level without creating the database when just reading the profiling level.

Original Description:

I am seeing hundreds of instances of failures to drop a database on a secondary in our cloud-dev environment. Here is one example.

Primary (WT)

mmsdev:PRIMARY>  db.runCommand({listDatabases:1}).databases.forEach(function (x) {if (x.name == "mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310"){print(x.name)}})
mmsdev:PRIMARY> 

ubuntu@ip-10-169-132-210:/data/mmsdev_4$ zgrep -i drop mongodb.log*  | grep "mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310"
 
mongodb.log.2015-09-16T19-01-14.gz:2015-09-16T15:16:29.180+0000 I COMMAND  [conn151320] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 starting
mongodb.log.2015-09-16T19-01-14.gz:2015-09-16T15:16:29.206+0000 I COMMAND  [conn151320] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 finished

Longer snippet.... note that it is dropped only once (but there are two collections here that have very similar names, dropped back to back).

2015-09-16T15:16:26.809+0000 I ACCESS   [conn152725] Successfully authenticated as principal __system on local
2015-09-16T15:16:28.825+0000 I ACCESS   [conn22793] Successfully authenticated as principal mms-monitoring-agent on admin
2015-09-16T15:16:28.936+0000 I COMMAND  [conn151317] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:33954 locks:{ Global: { acquireCount: { r: 842 } }, Database: { acquireCount: { r:
 421 } } } 191ms
2015-09-16T15:16:28.981+0000 I COMMAND  [conn151317] dropDatabase mmsdbrrd-raw-PT5S-PT2H-20150916T1310 starting
2015-09-16T15:16:29.002+0000 I COMMAND  [conn151317] dropDatabase mmsdbrrd-raw-PT5S-PT2H-20150916T1310 finished
2015-09-16T15:16:29.133+0000 I COMMAND  [conn151320] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:33869 locks:{ Global: { acquireCount: { r: 842 }, acquireWaitCount: { r: 1 }, time
AcquiringMicros: { r: 22655 } }, Database: { acquireCount: { r: 421 } } } 382ms
2015-09-16T15:16:29.180+0000 I COMMAND  [conn151320] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 starting
2015-09-16T15:16:29.206+0000 I COMMAND  [conn151320] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 finished
2015-09-16T15:16:29.235+0000 I COMMAND  [conn22793] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:33780 locks:{ Global: { acquireCount: { r: 842 }, acquireWaitCount: { r: 2 }, timeAcquiringMicros: { r: 50004 } }, Database: { acquireCount: { r: 421 } } } 389ms
2015-09-16T15:16:50.383+0000 I NETWORK  [conn152696] end connection 10.169.132.210:50055 (118 connections now open)
2015-09-16T15:16:50.383+0000 I NETWORK  [conn152697] end connection 10.169.132.210:50056 (118 connections now open)
2015-09-16T15:16:53.636+0000 I NETWORK  [initandlisten] connection accepted from 10.169.132.210:50106 #152726 (118 connections now open)
2015-09-16T15:16:53.650+0000 I ACCESS   [conn152726] Successfully authenticated as principal __system on local
2015-09-16T15:16:56.293+0000 I NETWORK  [conn152724] end connection 10.218.181.240:52421 (117 connections now open)

Secondary (3.0.6 mmapv1)

mmsdev:SECONDARY> db.runCommand({listDatabases:1}).databases.forEach(function (x) {if (x.name == "mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310"){print(x.name)}})
mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310

ubuntu@ip-10-218-181-240:/data/mmsdev_1$  zgrep -i drop mongodb.log*  | grep "mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310"
mongodb.log.2015-09-16T23-18-52.gz:2015-09-16T15:16:29.630+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 starting
mongodb.log.2015-09-16T23-18-52.gz:2015-09-16T15:16:29.660+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 finished

Longer snippet.... note that it is dropped only once (but there are two collections here that have very similar names, dropped back to back). After the drop, the collection is recreated.

2015-09-16T15:16:29.219+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-PT5S-PT2H-20150916T1310 starting
2015-09-16T15:16:29.241+0000 I JOURNAL  [repl writer worker 2] journalCleanup...
2015-09-16T15:16:29.241+0000 I JOURNAL  [repl writer worker 2] removeJournalFiles
2015-09-16T15:16:29.249+0000 I JOURNAL  [repl writer worker 2] journalCleanup...
2015-09-16T15:16:29.249+0000 I JOURNAL  [repl writer worker 2] removeJournalFiles
2015-09-16T15:16:29.253+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-PT5S-PT2H-20150916T1310 finished
2015-09-16T15:16:29.440+0000 I INDEX    [repl writer worker 2] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1530.ns, filling with zeroes...
2015-09-16T15:16:29.570+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1530.0, filling with zeroes...
2015-09-16T15:16:29.573+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1530.0, size: 64MB,  took 0.002 secs
2015-09-16T15:16:29.593+0000 I COMMAND  [conn81] command mmsdbrrd-raw-dbs-PT1H-PT1536H-20150828T0000.$cmd command: dbStats { dbstats: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:347 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 141186 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { R: 1 } }, Metadata: { acquireCount: { R: 1 } } } 154ms
2015-09-16T15:16:29.630+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 starting
2015-09-16T15:16:29.653+0000 I JOURNAL  [repl writer worker 2] journalCleanup...
2015-09-16T15:16:29.653+0000 I JOURNAL  [repl writer worker 2] removeJournalFiles
2015-09-16T15:16:29.657+0000 I JOURNAL  [repl writer worker 2] journalCleanup...
2015-09-16T15:16:29.657+0000 I JOURNAL  [repl writer worker 2] removeJournalFiles
2015-09-16T15:16:29.660+0000 I COMMAND  [repl writer worker 2] dropDatabase mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 finished
2015-09-16T15:16:29.718+0000 I COMMAND  [conn30216] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:33642 locks:{ Global: { acquireCount: { r: 832 }, acquireWaitCount: { r: 30 }, timeAcquiringMicros: { r: 727351 } }, MMAPV1Journal: { acquireCount: { r: 416 } }, Database: { acquireCount: { r: 416 } } } 846ms
2015-09-16T15:16:31.094+0000 I INDEX    [conn30216] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.ns, filling with zeroes...
2015-09-16T15:16:31.213+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.0, filling with zeroes...
2015-09-16T15:16:31.219+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.0, size: 64MB,  took 0.005 secs
2015-09-16T15:16:31.229+0000 I COMMAND  [conn30216] command mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.$cmd command: profile { profile: -1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:58 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, MMAPV1Journal: { acquireCount: { w: 5 } }, Database: { acquireCount: { W: 1 } }, Metadata: { acquireCount: { W: 1 } } } 137ms
2015-09-16T15:16:31.322+0000 I COMMAND  [conn81] command mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1420.$cmd command: dbStats { dbstats: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:344 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 112254 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { R: 1 } }, Metadata: { acquireCount: { R: 1 } } } 113ms

The listDatabases command here is from the monitoring agent, it's a complete coincidence that it arrives in the vicinity of these drops. It is not executed by the Java application code that is dropping databases.

ubuntu@ip-10-218-181-240:/data/mmsdev_1$ ls -lt | grep mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310 | tail
-rw------- 1 mongodb mongodb   67108864 Sep 16 15:16 mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.0
-rw------- 1 mongodb mongodb   16777216 Sep 16 15:16 mmsdbrrd-raw-dbs-PT5S-PT2H-20150916T1310.ns

Full secondary log: https://jira.mongodb.org/secure/attachment/91141/mongodb.log.2015-09-16T23-18-52.txt



 Comments   
Comment by Githook User [ 30/Sep/15 ]

Author:

{u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: SERVER-20638 Reading profiling level shouldn't create databases that don't exist

Setting the profiling level will still create the database.

(cherry picked from commit e5883822fd2fc777fd3e3e6b766f54e2caa8e594)
Branch: v3.0
https://github.com/mongodb/mongo/commit/59ae30a629db3ecb675037dc63e06cd4fcae4aaa

Comment by Githook User [ 30/Sep/15 ]

Author:

{u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: SERVER-20638 Reading profiling level shouldn't create databases that don't exist

Setting the profiling level will still create the database.
Branch: master
https://github.com/mongodb/mongo/commit/e5883822fd2fc777fd3e3e6b766f54e2caa8e594

Comment by Daniel Pasette (Inactive) [ 25/Sep/15 ]

In the long term, we should determine where we can change this auto-create "feature" from the rest of the users of OldClientContext. In the short term, we should change the profile cmd to not auto-create the db if it doesn't exist. This can be backported to v3.0

Comment by Cailin Nelson [ 25/Sep/15 ]

Interesting. Do you consider this to be a MongoDB bug?

With the current functionality, it is impossible for an application to ask for the profiling level and guarantee that it is not accidentally creating a database when doing so.

For example, what's going here is that there are two threads:

  • A Java application server, which executes the drop.
  • The Monitoring Agent which a) ask for a list of databases and then b) asks for the profile level on each

... and the drop is happening between the two Monitoring Agent actions. There is no way to coordinate the actions of these two applications.

Comment by Mathias Stearn [ 25/Sep/15 ]

Looks like the actual problem is that profile (along with a few other commands) creates the database if it does not exist. It looks like you are trying to query for the profiling level (from conn30216) which is causing the database to be resurrected.

Comment by Daniel Pasette (Inactive) [ 23/Sep/15 ]

Mathias,
This looks like a race in dropDatabase and listDatabases on the secondary.

Comment by Cailin Nelson [ 23/Sep/15 ]

Here is another example:

Does not exist on the primary:

mmsdev:PRIMARY> db.runCommand({listDatabases:1}).databases.forEach(function (x) {if (x.name == "mmsdbrrd-raw-PT1M-PT50H-20150909T2100"){print(x.name)}})

Does exist on the secondary:

mmsdev:SECONDARY>  db.runCommand({listDatabases:1}).databases.forEach(function (x) {if (x.name == "mmsdbrrd-raw-PT1M-PT50H-20150909T2100"){print(x.name)}})
mmsdbrrd-raw-PT1M-PT50H-20150909T2100

Secondary log for the drop suggests a recreation via the listDatabases command

2015-09-12T00:01:17.302+0000 I COMMAND  [repl writer worker 7] dropDatabase mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100 starting
2015-09-12T00:01:17.302+0000 I COMMAND  [conn15082] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:32362 locks:{ Global: { acquireCount: { r: 798 }, acquireWaitCount: { r: 4 }, timeA
cquiringMicros: { r: 209932 } }, MMAPV1Journal: { acquireCount: { r: 399 } }, Database: { acquireCount: { r: 399 } } } 238ms
2015-09-12T00:01:17.325+0000 I JOURNAL  [repl writer worker 7] journalCleanup...
2015-09-12T00:01:17.325+0000 I JOURNAL  [repl writer worker 7] removeJournalFiles
2015-09-12T00:01:17.328+0000 I JOURNAL  [repl writer worker 7] journalCleanup...
2015-09-12T00:01:17.328+0000 I JOURNAL  [repl writer worker 7] removeJournalFiles
2015-09-12T00:01:17.332+0000 I COMMAND  [repl writer worker 7] dropDatabase mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100 finished
2015-09-12T00:01:17.333+0000 I COMMAND  [repl writer worker 7] dropDatabase mmsdbrrd-raw-PT1M-PT50H-20150909T2100 starting
2015-09-12T00:01:17.346+0000 I JOURNAL  [repl writer worker 7] journalCleanup...
2015-09-12T00:01:17.346+0000 I JOURNAL  [repl writer worker 7] removeJournalFiles
2015-09-12T00:01:17.350+0000 I JOURNAL  [repl writer worker 7] journalCleanup...
2015-09-12T00:01:17.350+0000 I JOURNAL  [repl writer worker 7] removeJournalFiles
2015-09-12T00:01:17.354+0000 I COMMAND  [repl writer worker 7] dropDatabase mmsdbrrd-raw-PT1M-PT50H-20150909T2100 finished
2015-09-12T00:01:17.359+0000 I INDEX    [repl writer worker 7] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1H-PT1536H-20150919T1200.ns, filling with zeroes...
2015-09-12T00:01:17.443+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1H-PT1536H-20150919T1200.0, filling with zeroes...
2015-09-12T00:01:17.446+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1H-PT1536H-20150919T1200.0, size: 64MB,  took 0.002 secs
2015-09-12T00:01:17.456+0000 I INDEX    [repl writer worker 7] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-PT1H-PT1536H-20150919T1200.ns, filling with zeroes...
2015-09-12T00:01:17.600+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-PT1H-PT1536H-20150919T1200.0, filling with zeroes...
2015-09-12T00:01:17.603+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-PT1H-PT1536H-20150919T1200.0, size: 64MB,  took 0.002 secs
2015-09-12T00:01:17.612+0000 I COMMAND  [conn15082] command 53f63334e4b0bc85024f1855_A.$cmd command: profile { profile: -1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:58 locks:{ Global: { acquireCount: { r: 1, w: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 156732 } }, MMAPV1Journal: { acquireCount: { w: 1 } }, Database: { acquireCount: { W: 1 } } } 156ms
2015-09-12T00:01:17.732+0000 I INDEX    [conn15082] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-PT1M-PT50H-20150909T2100.ns, filling with zeroes...
2015-09-12T00:01:17.805+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-PT1M-PT50H-20150909T2100.0, filling with zeroes...
2015-09-12T00:01:17.808+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-PT1M-PT50H-20150909T2100.0, size: 64MB,  took 0.002 secs
2015-09-12T00:01:17.896+0000 I INDEX    [conn15082] allocating new ns file /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100.ns, filling with zeroes...
2015-09-12T00:01:17.994+0000 I STORAGE  [FileAllocator] allocating new datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100.0, filling with zeroes...
2015-09-12T00:01:17.996+0000 I STORAGE  [FileAllocator] done allocating datafile /data/mmsdev_1/mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100.0, size: 64MB,  took 0.002 secs
2015-09-12T00:01:18.022+0000 I COMMAND  [conn15082] command mmsdbrrd-raw-dbs-PT1M-PT50H-20150909T2100.$cmd command: profile { profile: -1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:58 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, MMAPV1Journal: { acquireCount: { w: 5 } }, Database: { acquireCount: { W: 1 } }, Metadata: { acquireCount: { W: 1 } } } 127ms
2015-09-12T00:01:20.391+0000 I NETWORK  [conn21511] end connection 10.167.150.115:33529 (53 connections now open)

Generated at Thu Feb 08 03:54:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.