[SERVER-14651] listDatabases command errors under high database create/delete load Created: 22/Jul/14  Updated: 15/May/15  Resolved: 15/May/15

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.2.7, 2.4.10, 2.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Golden Assignee: Unassigned
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-15377 Mongo.getDatabaseNames() occasionally... Closed
Related
Operating System: ALL
Steps To Reproduce:

[PERL-387] has example code that replicates the problem.

In short: have several threads/processes repeatedly and rapidly create a database, add a document to a collection, and then drop the database. In another thread/process, call "listDatabases" once a second until the error occurs.

Participants:

 Description   

If many databases are rapidly created and deleted, the listDatabases command can fail with error 15927 ("can't open database in a read lock..."), which apparently is from database_holder.cpp.

I think listDatabases needs to catch that error and either retry or just omit the offending database in the list it returns. Otherwise, a transient error from a single database prevents the rest of the valid results from being returned.



 Comments   
Comment by Daniel Pasette (Inactive) [ 15/May/15 ]

fixed in 3.0+; won't backport

Comment by David Golden [ 11/Aug/14 ]

I tested it on 2.7.4 and did not see the error. I suspect it was "fixed" somewhere in the refactoring for pluggable storage engines.

Comment by Ramon Fernandez Marina [ 11/Aug/14 ]

Thanks for the repro eric.daniels@10gen.com, I'm able to reproduce the problem in 2.6.4.

However I'm not able to reproduce on 2.7.4. david.golden, have you tested 2.7.4 with the Perl reproducer?

Comment by Eric Daniels (Inactive) [ 11/Aug/14 ]

Here's a reproduction in python using pymongo:

from os import fork, getpgrp, killpg
from signal import SIGTERM
from time import sleep
 
import pymongo
 
if __name__ == '__main__':
 
    client = pymongo.MongoClient()
 
    for i in range(0, 10):
 
        if fork() != 0:
 
            db_name = 'testdb_{0}'.format(i)
            db = client[db_name]
 
            while True:
 
                db['porquoi'].insert({'a_doc': 'por quoi'})
                client.drop_database(db_name)
 
    while True:
 
        print 'sleeping...'
 
        try:
            client['admin'].command('listDatabases')
 
        except pymongo.errors.OperationFailure as e:
            print e
            killpg(getpgrp(), SIGTERM)
        sleep(1)
 

Comment by David Golden [ 08/Aug/14 ]

Were you using the latest release of the Perl driver? It has a workaround so you won't see it. You need to use v0.704.2.0 or earlier.

I tried it again on 2.6.4-rc1 (from perl driver commit 8f961fb right before the fix) and got the same exception:

2014-08-08T16:28:53.485-0400 [conn11] opening db:  testdb_4
2014-08-08T16:28:53.485-0400 [conn11] Assertion: 15927:can't open database in a read lock. if db was just closed, consider retrying the query. might otherwise indicate an internal error
2014-08-08T16:28:53.492-0400 [conn11] 0x1006c012b 0x1006774e2 0x100667e9f 0x1000efd30 0x100107d54 0x100107fa5 0x100108441 0x1001c1921 0x1001bb3a5 0x1001bbfe2 0x1001bd28c 0x1003d5c1f 0x1
002a56bb 0x100006b34 0x100684d91 0x1006f4575 0x7fff8882d899 0x7fff8882d72a 0x7fff88831fc9
 0   mongod                              0x00000001006c012b _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x00000001006774e2 _ZN5mongo10logContextEPKc + 114
 2   mongod                              0x0000000100667e9f _ZN5mongo11msgassertedEiPKc + 255
 3   mongod                              0x00000001000efd30 _ZN5mongo14DatabaseHolder11getOrCreateERKSsS2_Rb + 674
 4   mongod                              0x0000000100107d54 _ZN5mongo6Client7Context11_finishInitEv + 78
 5   mongod                              0x0000000100107fa5 _ZN5mongo6Client7ContextC2ERKSsS3_b + 117
 6   mongod                              0x0000000100108441 _ZN5mongo6Client11ReadContextC2ERKSsS3_ + 731
 7   mongod                              0x00000001001c1921 _ZN5mongo16CmdListDatabases3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 629
 8   mongod                              0x00000001001bb3a5 _ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 37
 9   mongod                              0x00000001001bbfe2 _ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb + 2566
 10  mongod                              0x00000001001bd28c _ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 1388
 11  mongod                              0x00000001003d5c1f _ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_ + 1615
 12  mongod                              0x00000001002a56bb _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 2443
 13  mongod                              0x0000000100006b34 _ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 308
 14  mongod                              0x0000000100684d91 _ZN5mongo17PortMessageServer17handleIncomingMsgEPv + 1681
 15  mongod                              0x00000001006f4575 thread_proxy + 229
 16  libsystem_pthread.dylib             0x00007fff8882d899 _pthread_body + 138
 17  libsystem_pthread.dylib             0x00007fff8882d72a _pthread_struct_init + 0
 18  libsystem_pthread.dylib             0x00007fff88831fc9 thread_start + 13
2014-08-08T16:28:53.492-0400 [conn11] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 numYields:0 locks(micros) W:270 r:7686 reslen:201 3475ms

You can download http://www.cpan.org/authors/id/M/MO/MONGODB/MongoDB-v0.704.2.0.tar.gz and install it. Or if you have all prereqs installed (including Proc::Fork), you can probably repo from the perl repository directory with something like this:

git checkout 8f961fb
perl Makefile.PL
make -j9
perl -Mblib path/to/repo.pl

Comment by Eric Daniels (Inactive) [ 22/Jul/14 ]

If it helps, this is the associated stack trace:

 0   mongod                              0x00000001006b90cb _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x0000000100670be2 _ZN5mongo10logContextEPKc + 114
 2   mongod                              0x000000010066167f _ZN5mongo11msgassertedEiPKc + 255
 3   mongod                              0x00000001000ed220 _ZN5mongo14DatabaseHolder11getOrCreateERKSsS2_Rb + 674
 4   mongod                              0x0000000100105114 _ZN5mongo6Client7Context11_finishInitEv + 78
 5   mongod                              0x0000000100105365 _ZN5mongo6Client7ContextC2ERKSsS3_b + 117
 6   mongod                              0x0000000100105801 _ZN5mongo6Client11ReadContextC2ERKSsS3_ + 731
 7   mongod                              0x00000001001be3c1 _ZN5mongo16CmdListDatabases3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 629
 8   mongod                              0x00000001001b7f85 _ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 37
 9   mongod                              0x00000001001b8bbe _ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb + 2562
 10  mongod                              0x00000001001b9d2c _ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 1388
 11  mongod                              0x00000001003d1eef _ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_ + 1615
 12  mongod                              0x00000001002a1f00 _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 1968
 13  mongod                              0x00000001000065f4 _ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 308
 14  mongod                              0x000000010067e491 _ZN5mongo17PortMessageServer17handleIncomingMsgEPv + 1681
 15  mongod                              0x00000001006ed515 thread_proxy + 229
 16  libsystem_pthread.dylib             0x00007fff8d2ea899 _pthread_body + 138
 17  libsystem_pthread.dylib             0x00007fff8d2ea72a _pthread_struct_init + 0
 18  libsystem_pthread.dylib             0x00007fff8d2eefc9 thread_start + 13

Comment by David Golden [ 22/Jul/14 ]

I replicated this under 2.6.3, 2.4.10 and 2.2.7. I did not see the error on 2.0.9 in ~ 20 seconds of observation. (The others fail within seconds.)

UPDATE:

  • The errors were observed against a standalone mongod and against a replica set
  • I could NOT replicate the errors against mongos for 2.6.3, 2.4.10 or 2.2.7.
Generated at Thu Feb 08 03:35:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.