[SERVER-10200] aggregation/testbugs.js (server5932.js) failed on Nightly Windows 64-bit DEBUG, Linux 32-bit Created: 13/Jul/13  Updated: 11/Jul/16  Resolved: 17/Jul/13

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 2.5.1

Type: Bug Priority: Critical - P2
Reporter: Matt Kangas Assignee: Mathias Stearn
Resolution: Done Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

buildbot Nightly Windows 64-bit DEBUG
buildbot Linux 32-bit debug


Operating System: ALL
Participants:

 Description   

Mathias - memory allocation failure during testbugs.js. Should we handle memory pressure more gracefully, or is crashing the right thing to do in this case?

Nightly Windows 64-bit DEBUG Build #394 July 12 (rev 42b8033f86de)

http://buildbot.mongodb.org/builders/Nightly%20Windows%2064-bit%20DEBUG/builds/394/steps/test_15/logs/stdio
http://buildlogs.mongodb.org/Nightly%20Windows%2064-bit%20DEBUG/builds/394/test/aggregation/testbugs.js

	
Sat Jul 13 08:37:39.998 [clientcursormon] mem (MB) res:65 virt:821
Sat Jul 13 08:37:39.998 [clientcursormon]  mapped (incl journal view):640
Sat Jul 13 08:37:39.998 [clientcursormon]  connections:1
2013-07-13 08:39:18 EDT	
Sat Jul 13 08:39:18.076 [conn6] 
Sat Jul 13 08:39:18.076 DBClientCursor::init call() failed
---- query BSON
{ "noSuchField" : { "$exists" : true } }
----
Sat Jul 13 08:39:18.076 [conn6] 
---- sort BSON
{}
----
Sat Jul 13 08:39:18.076 [conn6] 
---- fullName
qa.server5932
----
Sat Jul 13 08:39:18.076 [conn6] ERROR: Uncaught std::exception: bad allocation, terminating
dbexit: Sat Jul 13 08:39:18.076 [conn6] shutdown: going to close listening sockets...
Sat Jul 13 08:39:18.076 [conn6] closing listening socket: 460
Sat Jul 13 08:39:18.076 [conn6] closing listening socket: 536
Sat Jul 13 08:39:18.076 [conn6] shutdown: going to flush diaglog...
Sat Jul 13 08:39:18.076 [conn6] shutdown: going to close sockets...
Sat Jul 13 08:39:18.076 [conn6] shutdown: waiting for fs preallocator...
Sat Jul 13 08:39:18.076 [conn6] shutdown: lock for final commit...
Sat Jul 13 08:39:18.076 [conn6] shutdown: final commit...
Sat Jul 13 08:39:18.092 [initandlisten] now exiting
dbexit: ; exiting immediatelySat Jul 13 08:39:18.092 [initandlisten] thread initandlisten stack usage was 38472 bytes,  which is the most so far
Sat Jul 13 08:39:18.092 Error: error doing query: failed at src/mongo/shell/collection.js:54
Sat Jul 13 08:39:18.107 Error: error loading js file: jstests/aggregation/bugs/server5932.js at D:\slave\Windows_64bit_DEBUG_Nightly\mongo\jstests\aggregation\testbugs.js:16
failed to load: D:\slave\Windows_64bit_DEBUG_Nightly\mongo\jstests\aggregation\testbugs.js



 Comments   
Comment by auto [ 15/Jul/13 ]

Author:

{u'username': u'RedBeard0531', u'name': u'Mathias Stearn', u'email': u'mathias@10gen.com'}

Message: SERVER-5932 Fix use after free

BuildBot ticket: SERVER-10200
Branch: master
https://github.com/mongodb/mongo/commit/8c0ff93cc3f7088794f1d28322fe29702998165d

Comment by Tad Marshall [ 14/Jul/13 ]

Here's the diff for the patch I made that avoided the bad_alloc:

@@ -78,6 +78,7 @@ namespace mongo {
         fassert(16960, !cursor->c()->requiresLock());
 
         try {
+            string cursorNs(cursor->ns());
             // can't use result BSONObjBuilder directly since it won't handle exceptions correctly.
             BSONArrayBuilder resultsArray;
             const int byteLimit = MaxBytesToReturnToClientAtOnce;
@@ -100,7 +101,7 @@ namespace mongo {
 
             BSONObjBuilder cursorObj(result.subobjStart("cursor"));
             cursorObj.append("id", id);
-            cursorObj.append("ns", cursor->ns());
+            cursorObj.append("ns", cursorNs);
             cursorObj.append("firstBatch", resultsArray.arr());
             cursorObj.done();
         }

Comment by Tad Marshall [ 14/Jul/13 ]

Here's the call stack from the Visual Studio debugger when the bad_alloc is thrown in the 64-bit debug Windows version:

>Debug.ListCallStack
 Index  Function
--------------------------------------------------------------------------------
 1      KernelBase.dll!RaiseException() 
 2      mongod.exe!_CxxThrowException(void * pExceptionObject=0x00000000078ab748, const _s__ThrowInfo * pThrowInfo=0x0000000141f193d8) 
*3      mongod.exe!operator new(unsigned __int64 size=0xfeeefeeefeeefef0) 
 4      mongod.exe!std::_Allocate<char>(unsigned __int64 _Count=0xfeeefeeefeeefef0, char * __formal=0x0000000000000000) 
 5      mongod.exe!std::allocator<char>::allocate(unsigned __int64 _Count=0xfeeefeeefeeefef0) 
 6      mongod.exe!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy(unsigned __int64 _Newsize=0xfeeefeeefeeefeee, unsigned __int64 _Oldlen=0x0000000000000000) 
 7      mongod.exe!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Grow(unsigned __int64 _Newsize=0xfeeefeeefeeefeee, bool _Trim=false) 
 8      mongod.exe!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & _Right=<Bad Ptr>, unsigned __int64 _Roff=0x0000000000000000, unsigned __int64 _Count=0xffffffffffffffff) 
 9      mongod.exe!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & _Right=<Bad Ptr>) 
 10     mongod.exe!mongo::ClientCursor::ns() 
 11     mongod.exe!mongo::handleCursorCommand(__int64 id=0x0000000000000000, mongo::BSONObj & cmdObj={...}, mongo::BSONObjBuilder & result={...}) 
 12     mongod.exe!mongo::PipelineCommand::run(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & db="test", mongo::BSONObj & cmdObj={...}, int options=0x00000000, std::basic_string<char,std::char_traits<char>,std::allocator<char> > & errmsg="", mongo::BSONObjBuilder & result={...}, bool fromRepl=false) 
 13     mongod.exe!mongo::_execCommand(mongo::Command * c=0x0000000141f58498, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & dbname="test", mongo::BSONObj & cmdObj={...}, int queryOptions=0x00000000, std::basic_string<char,std::char_traits<char>,std::allocator<char> > & errmsg="", mongo::BSONObjBuilder & result={...}, bool fromRepl=false) 
 14     mongod.exe!mongo::Command::execCommand(mongo::Command * c=0x0000000141f58498, mongo::Client & client={...}, int queryOptions=0x00000000, const char * cmdns=0x0000000002006404, mongo::BSONObj & cmdObj={...}, mongo::BSONObjBuilder & result={...}, bool fromRepl=false) 
 15     mongod.exe!mongo::_runCommands(const char * ns=0x0000000002006404, mongo::BSONObj & _cmdobj={...}, mongo::_BufBuilder<mongo::TrivialAllocator> & b={...}, mongo::BSONObjBuilder & anObjBuilder={...}, bool fromRepl=false, int queryOptions=0x00000000) 
 16     mongod.exe!mongo::runCommands(const char * ns=0x0000000002006404, mongo::BSONObj & jsobj={...}, mongo::CurOp & curop={...}, mongo::_BufBuilder<mongo::TrivialAllocator> & b={...}, mongo::BSONObjBuilder & anObjBuilder={...}, bool fromRepl=false, int queryOptions=0x00000000) 
 17     mongod.exe!mongo::runQuery(mongo::Message & m={...}, mongo::QueryMessage & q={...}, mongo::CurOp & curop={...}, mongo::Message & result={...}) 
 18     mongod.exe!mongo::receivedQuery(mongo::Client & c={...}, mongo::DbResponse & dbresponse={...}, mongo::Message & m={...}) 
 19     mongod.exe!mongo::assembleResponse(mongo::Message & m={...}, mongo::DbResponse & dbresponse={...}, const mongo::HostAndPort & remote={...}) 
 20     mongod.exe!mongo::MyMessageHandler::process(mongo::Message & m={...}, mongo::AbstractMessagingPort * port=0x0000000001fc3190, mongo::LastError * le=0x0000000001ebf150) 
 21     mongod.exe!mongo::PortMessageServer::handleIncomingMsg(void * arg=0x0000000001fc3650) 
 22     mongod.exe!boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam * __ptr64> >::operator()<void * __ptr64,void * __ptr64 (__cdecl*)(void * __ptr64),boost::_bi::list0>(boost::_bi::type<void *> __formal={...}, void * (void *)* & f=0x000000013fbe378a, boost::_bi::list0 & a={...}, boost::_bi::type<void *> __formal={...}) 
 23     mongod.exe!boost::_bi::bind_t<void * __ptr64,void * __ptr64 (__cdecl*)(void * __ptr64),boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam * __ptr64> > >::operator()() 
 24     mongod.exe!boost::detail::thread_data<boost::_bi::bind_t<void * __ptr64,void * __ptr64 (__cdecl*)(void * __ptr64),boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam * __ptr64> > > >::run() 
 25     mongod.exe!boost::`anonymous namespace'::thread_start_function(void * param=0x00000000004097d0) 
 26     mongod.exe!_callthreadstartex() 
 27     mongod.exe!_threadstartex(void * ptd=0x0000000001ebee20) 
 28     kernel32.dll!BaseThreadInitThunk() 
 29     ntdll.dll!RtlUserThreadStart() 
 
>

Note the _Newsize=0xfeeefeeefeeefeee in the call to Grow() in line 7. The value 0xfeee is written to freed blocks in the debug heap, so it looks like we're trying to use freed memory.

In this code (src/mongo/db/commands/pipeline_command.cpp lines 92 to 105)

            // The initial ok() on a cursor may be very expensive so we don't do it when batchSize
            // is 0 since that indicates a desire for a fast return.
            if (batchSize != 0 && !cursor->ok()) {
                // There is no more data. Kill the cursor.
                pin.release();
                ClientCursor::erase(id);
                id = 0;
            }
 
            BSONObjBuilder cursorObj(result.subobjStart("cursor"));
            cursorObj.append("id", id);
            cursorObj.append("ns", cursor->ns());
            cursorObj.append("firstBatch", resultsArray.arr());
            cursorObj.done();
        }

in the call to cursor->ns() [line 103 in the file, line 12 above], cursor has been deleted, causing cursor._ns to have a "size" of 0xfeeefeeefeeefeee.

I verified that making a local copy of the cursor's namespace before the cursor is deleted and using it instead of the call to cursor->ns() avoids the bad_alloc and makes the test pass.

In case there are other issues revealed in looking at this code, I'll leave it to Mathias to decide how to fix it.

Comment by Tad Marshall [ 14/Jul/13 ]

Similar failure in Nightly Linux RHEL 32-bit; a segfault this time:
http://buildbot.mongodb.org/builders/Nightly%20Linux%20RHEL%2032-bit/builds/580/steps/test_13/logs/stdio
http://buildlogs.mongodb.org/Nightly%20Linux%20RHEL%2032-bit/builds/580/test/aggregation/testbugs.js

 *******************************************
         Test : jstests/aggregation/bugs/server5932.js ...
Sun Jul 14 04:41:15.599 [conn6] CMD: drop aggdb.server5932
Sun Jul 14 04:41:15.602 [conn6] build index aggdb.server5932 { _id: 1 }
Sun Jul 14 04:41:15.602 [conn6] build index done.  scanned 0 total records. 0 secs
Sun Jul 14 04:41:17.566 [clientcursormon] mem (MB) res:176 virt:500
Sun Jul 14 04:41:17.566 [clientcursormon]  mapped:352
Sun Jul 14 04:41:17.566 [clientcursormon]  connections:1
Invalid access at address: 0xfffffffc from thread: conn6
Sun Jul 14 04:41:23.645 DBClientCursor::init call() failed
Got signal: 11 (Segmentation fault).
Backtrace:
0x8b532bd 0x82ddb23 0x82de6fb 0x877440 0x80f62f 0x84f3139 0x84f58fe 0x853d683 0x8540c6d 0x85424b9 0x874ffcc 0x8755ebf 0x86b11d5 0x86b4ba3 0x8300d75 0x8b3f3e9 0x3a3869 0x30d4de 
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo15printStackTraceERSo+0x2d) [0x8b532bd]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo10abruptQuitEi+0x3c3) [0x82ddb23]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x29b) [0x82de6fb]
 [0x877440]
 /usr/lib/libstdc++.so.6(_ZNSsC1ERKSs+0x1f) [0x80f62f]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod [0x84f3139]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo15PipelineCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x63e) [0x84f58fe]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x53) [0x853d683]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xbcd) [0x8540c6d]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x669) [0x85424b9]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x4c) [0x874ffcc]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x102f) [0x8755ebf]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod [0x86b11d5]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x5c3) [0x86b4ba3]
 /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/mongod(_ZN5mongo16MyMessageHandler7proce
Sun Jul 14 04:41:23.647 Error: error doing query: failed at src/mongo/shell/query.js:81
Sun Jul 14 04:41:23.648 Error: error loading js file: jstests/aggregation/bugs/server5932.js at /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/jstests/aggregation/testbugs.js:16
failed to load: /data/buildslaves/Linux_RHEL_32bit_Nightly/mongo/jstests/aggregation/testbugs.js

Comment by Tad Marshall [ 13/Jul/13 ]

This repros for me locally in current code (commit 458cfbf633d94357212d03587e6544d296df58d5) [debug Windows 64-bit build tested on Windows 7 using command line "buildscripts\smoke.py jstests\aggregation\testbugs.js"].

It would be nice if the code printed a stack trace before terminating, but since the bad_alloc exception was "uncaught" it might not be very helpful.

Generated at Thu Feb 08 03:22:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.