[SERVER-5101] Test failure in closeall.js Created: 26/Feb/12  Updated: 11/Jul/16  Resolved: 17/Jul/12

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: None
Fix Version/s: 2.1.2, 2.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Ian Whalen (Inactive) Assignee: Aaron Staple
Resolution: Done Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OS X 10.5 64-bit DUR OFF


Issue Links:
Related
related to SERVER-5724 dbtemprelease's destructor can fail t... Closed
related to SERVER-5183 Get rid of setNoJournal Closed
Operating System: ALL
Participants:

 Description   

 m30001| Sun Feb 26 04:48:59 [conn2] DatabaseHolder::closeAll path:/data/db/closeall
 m30001| Sun Feb 26 04:48:59 [conn3] ClientCursor::find(): cursor not found in map 8678329115160354575 (ok after a drop)
 m30001| opening db:  local
 m30001| Sun Feb 26 04:48:59 [conn3] Assertion: 15927:can't open database in a read lock. if db was just closed, consider retrying the query. might otherwise indicate an internal error
 m30001| 0x10008e4ab 0x10034f55f 0x100250d21 0x10035b6b9 0x100355fff 0x100356b7d 0x100356c91 0x100393a60 0x100346c9e 0x100348302 0x1000ad6af 0x100601893 0x1015e91f9 0x7fff8a34efd6 0x7fff8a34ee89 
 m30001|  0   mongod                              0x000000010008e4ab _ZN5mongo11msgassertedEiPKc + 363
 m30001|  1   mongod                              0x000000010034f55f _ZN5mongo14DatabaseHolder11getOrCreateERKSsS2_Rb + 2543
 m30001|  2   mongod                              0x0000000100250d21 _ZN5mongo6Client7Context11_finishInitEb + 113
 m30001|  3   mongod                              0x000000010035b6b9 _ZN5mongo17dbtempreleasecondD1Ev + 137
 m30001|  4   mongod                              0x0000000100355fff _ZN5mongo12ClientCursor11staticYieldEiRKNS_10StringDataEPNS_6RecordE + 495
 m30001|  5   mongod                              0x0000000100356b7d _ZN5mongo12ClientCursor5yieldEiPNS_6RecordE + 109
 m30001|  6   mongod                              0x0000000100356c91 _ZN5mongo12ClientCursor14yieldSometimesENS0_11RecordNeedsEPb + 209
 m30001|  7   mongod                              0x0000000100393a60 _ZN5mongo14processGetMoreEPKcixRNS_5CurOpEiRb + 1712
 m30001|  8   mongod                              0x0000000100346c9e _ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE + 814
 m30001|  9   mongod                              0x0000000100348302 _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 3234
 m30001|  10  mongod                              0x00000001000ad6af _ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 287
 m30001|  11  mongod                              0x0000000100601893 _ZN5mongo3pms9threadRunEPNS_13MessagingPortE + 707
 m30001|  12  libboost_thread-mt.dylib            0x00000001015e91f9 thread_proxy + 137
 m30001|  13  libSystem.B.dylib                   0x00007fff8a34efd6 _pthread_start + 331
 m30001|  14  libSystem.B.dylib                   0x00007fff8a34ee89 thread_start + 13
 m30001| Sun Feb 26 04:48:59 [conn3]  local.oplog.$main Assertion failure _c->_pinValue >= 100 src/mongo/db/ops/../clientcursor.h 91
 m30001| 0x10008a5ca 0x10008ded7 0x10039437f 0x100346c9e 0x100348302 0x1000ad6af 0x100601893 0x1015e91f9 0x7fff8a34efd6 0x7fff8a34ee89 
 m30001|  0   mongod                              0x000000010008a5ca _ZN5mongo12sayDbContextEPKc + 154
 m30001|  1   mongod                              0x000000010008ded7 _ZN5mongo8assertedEPKcS1_j + 327
 m30001|  2   mongod                              0x000000010039437f _ZN5mongo14processGetMoreEPKcixRNS_5CurOpEiRb + 4047
 m30001|  3   mongod                              0x0000000100346c9e _ZN5mongo15receivedGetMoreERNS_10DbResponseERNS_7MessageERNS_5CurOpE + 814
 m30001|  4   mongod                              0x0000000100348302 _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 3234
 m30001|  5   mongod                              0x00000001000ad6af _ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 287
 m30001|  6   mongod                              0x0000000100601893 _ZN5mongo3pms9threadRunEPNS_13MessagingPortE + 707
 m30001|  7   libboost_thread-mt.dylib            0x00000001015e91f9 thread_proxy + 137
 m30001|  8   libSystem.B.dylib                   0x00007fff8a34efd6 _pthread_start + 331
 m30001|  9   libSystem.B.dylib                   0x00007fff8a34ee89 thread_start + 13
 m30001| Sun Feb 26 04:48:59 [conn3] 
 m30001| 
 m30001| ***aborting after assert() failure as this is a debug/test build
 m30001| 
 m30001| 
 m30001| Sun Feb 26 04:48:59 Got signal: 6 (Abort trap).
 m30001| 
 m30001| Sun Feb 26 04:48:59 Backtrace:
 m30001| 0x100095a5f 0x7fff8a3761ba 0 0x100097ead 0x100099db2 0x10009a24d 0x1000a26df 0x1000951c8 
 m30001|  0   mongod                              0x0000000100095a5f _ZN5mongo10abruptQuitEi + 1071
 m30001|  1   libSystem.B.dylib                   0x00007fff8a3761ba _sigtramp + 26
 m30001|  2   ???                                 0x0000000000000000 0x0 + 0
 m30001|  3   mongod                              0x0000000100097ead _ZN5mongo6listenEi + 477
 m30001|  4   mongod                              0x0000000100099db2 _ZN5mongo14_initAndListenEi + 1298
 m30001|  5   mongod                              0x000000010009a24d _ZN5mongo13initAndListenEi + 29
 m30001|  6   mongod                              0x00000001000a26df main + 33407
 m30001|  7   mongod                              0x00000001000951c8 start + 52
 m30001| 
 m30001| Logstream::get called in uninitialized state
 m30001| Sun Feb 26 04:48:59 ERROR: Client::shutdown not called: initandlisten
 m30002| Sun Feb 26 04:48:59 [replslave] repl: AssertionException dbclient error communicating with server: localhost:30001
 m30002| repl: sleep 2 sec before next pass
Sun Feb 26 04:48:59 DBClientCursor::init call() failed
Sun Feb 26 04:48:59 query failed : admin.$cmd { closeAllDatabases: 1.0 } to: 127.0.0.1:30001
 m30002| Sun Feb 26 04:49:01 [replslave] repl: syncing from host:localhost:30001
 m30002| Sun Feb 26 04:49:02 [replslave] repl: couldn't connect to server localhost:30001
 m30002| Sun Feb 26 04:49:02 [replslave] repl: sleep 3 sec before next pass
 
 
 
FAIL closeall.js closeAllDatabases command invocation threw an exception. i:600
getlasterror:
Sun Feb 26 04:49:05 trying reconnect to 127.0.0.1:30001
 m30002| Sun Feb 26 04:49:05 [replslave] repl: syncing from host:localhost:30001
Sun Feb 26 04:49:05 reconnect 127.0.0.1:30001 failed couldn't connect to server 127.0.0.1:30001
 m30002| Sun Feb 26 04:49:05 [replslave] repl: couldn't connect to server localhost:30001
 m30002| Sun Feb 26 04:49:05 [replslave] repl: sleep 3 sec before next pass
got another exception : Error: error doing query: unknown

http://buildbot.mongodb.org/builders/OS%20X%2010.5%2064-bit%20DUR%20OFF/builds/644/steps/test_5/logs/stdio



 Comments   
Comment by auto [ 05/May/12 ]

Author:

{u'login': u'astaple', u'name': u'Aaron', u'email': u'aaron@10gen.com'}

Message: SERVER-5101 Make ClientCursor::Pin exception safe and remove its manual release mechanism for failed yield recovery.
Branch: master
https://github.com/mongodb/mongo/commit/11491b034fd9d65356e548ba85659fef8e83d2f3

Comment by Aaron Staple [ 30/Apr/12 ]

I'm going to split this into two issues:

1) ClientCursor::Pointer is not exception safe when a cursor is destroyed. (I'm going to let the current ticket SERVER-5101 represent this.)
2) dbtemprelease's destructor can fail to restore state and assert on failure. (I filed SERVER-5724 to represent this.)

Comment by Eric Milkie [ 06/Apr/12 ]

Test hung again:
http://buildbot.mongodb.org/builders/Linux%2064-bit%20debug%20dur%20off/builds/1066/steps/test_8/logs/stdio

I think Aaron's diagnosis above is correct. closeall is freeing a cursor out from underneath a consumer.

Comment by auto [ 23/Mar/12 ]

Author:

{u'login': u'milkie', u'name': u'Eric Milkie', u'email': u'milkie@10gen.com'}

Message: SERVER-5101 better test output so we know exactly which variant we're on
Branch: master
https://github.com/mongodb/mongo/commit/90a52eea10d4010b2721007fb459f7cf9ea721cf

Comment by auto [ 01/Mar/12 ]

Author:

{u'login': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: Continue holding lock while finding error location SERVER-5101

This has no impact on the normal case since we return anyway. Current
test fails are indicating that the problem was written with
writeNoJournal, but that looks correct. This should confirm or rule that
out.
Branch: master
https://github.com/mongodb/mongo/commit/2e4f02012cff3e775a8e8ff432b034cbec885817

Comment by Eric Milkie [ 29/Feb/12 ]

also test failures on redhat:
http://buildbot.mongodb.org/builders/Linux%20RHEL%2032-bit/builds/99/steps/test_3/logs/stdio
http://buildbot.mongodb.org/builders/Linux%20RHEL%2064-bit/builds/78/steps/test_3/logs/stdio

Comment by Eric Milkie [ 29/Feb/12 ]

closeall hung on OS X:
http://buildbot.mongodb.org/builders/OS%20X%2010.5%2032-bit/builds/3414/steps/test_3/logs/stdio

Comment by auto [ 28/Feb/12 ]

Author:

{u'login': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: Try to fix closeAllDatabases SERVER-5101 SERVER-4947 SERVER-5072
Branch: master
https://github.com/mongodb/mongo/commit/9cf0cc2233478ff41b8d32949fa58108071b0af2

Comment by auto [ 28/Feb/12 ]

Author:

{u'login': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: Run all variants of closeall.js everytime SERVER-5101 SERVER-4947 SERVER-5072
Branch: master
https://github.com/mongodb/mongo/commit/59885c8e6978316d59fb6224d41548808bdc7e40

Generated at Thu Feb 08 03:07:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.