Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14257

"remove" command can cause process termination by throwing unhandled exception if profiling is enabled

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 2.6.4
    • Affects Version/s: 2.6.1
    • Component/s: Write Ops
    • Labels:
    • ALL
    • 0

      Issue Status as of Jul 22, 2014

      The new remove command introduced in 2.6 as part of the write commands can generate an unhandled exception when profiling is enabled. This causes termination of the mongod process.

      Users running with the profiler enabled may experience unexpected process termination when removing documents via the write commands. In replica sets, this can affect quorum and may lead to unavailability of the replica set.

      Disable the profiler via db.setProfilingLevel(0) or use the "remove" legacy write operation instead of the "remove" command.

      Production release versions from 2.6.0 to 2.6.3 are affected by this issue.

      The fix is included in the 2.6.4 production release.

      The remove command was corrected to only throw PageFaultException when a page fault is expected on data or index accesses for the documents to be removed.

      Original description

      When profiling is enabled, the "remove" command incorrectly permits PageFaultException to be thrown during the call to the profiler. If PageFaultException is thrown during this call, the exception is caught by the handler in PortMessageServer::handleIncomingMessage(), which proceeds by calling dbexit().

      Workarounds available for this issue include:

      • Disable the profiler (db.setProfilingLevel(0)).
      • Use the "remove" legacy write op instead of the "remove" command.

      Reproduce by running the following at the shell.

      for (var i=0; i<10000; ++i) {

      Upon encountering this condition, the server outputs the following and exits:

      2014-06-14T15:14:10.104-0400 [conn1] ERROR: Uncaught exception, terminating
      2014-06-14T15:14:10.104-0400 [conn1] dbexit:
      2014-06-14T15:14:10.104-0400 [conn1] shutdown: going to close listening sockets...
      2014-06-14T15:14:10.104-0400 [conn1] closing listening socket: 7
      2014-06-14T15:14:10.104-0400 [conn1] closing listening socket: 8
      2014-06-14T15:14:10.104-0400 [conn1] removing socket file: /tmp/mongodb-27017.sock
      2014-06-14T15:14:10.104-0400 [initandlisten] now exiting
      2014-06-14T15:14:10.104-0400 [conn1] shutdown: going to flush diaglog...
      2014-06-14T15:14:10.104-0400 [conn1] shutdown: going to close sockets...
      2014-06-14T15:14:10.104-0400 [initandlisten] dbexit: ; exiting immediately
      2014-06-14T15:14:10.104-0400 [conn1] shutdown: waiting for fs preallocator...
      2014-06-14T15:14:10.104-0400 [conn1] shutdown: lock for final commit...
      2014-06-14T15:14:10.105-0400 [initandlisten] thread initandlisten stack usage was 28048 bytes,  which is the most so far

      Stack trace for the throw point:

      Catchpoint 1 (exception caught), throw location record.cpp:571, catch location unknown, exception type mongo::PageFaultException
      0x0000000103f87abd in __cxa_throw ()
      (gdb) bt
      #0  0x0000000103f87abd in __cxa_throw ()
      #1  0x0000000100a10044 in mongo::Record::_accessing (this=0x119135318) at record.cpp:571
      #2  0x0000000100344080 in mongo::Record::data (this=0x119135318) at record.h:67
      #3  0x0000000100a54bc3 in mongo::BSONObj::make (r=0x119135318) at pdfile.h:90
      #4  0x00000001001c87b8 in mongo::Collection::docFor (this=0x106b088c0, loc=@0x113129bb0) at collection.cpp:157
      #5  0x00000001001c894b in mongo::Collection::deleteDocument (this=0x106b088c0, loc=@0x113129bb0, cappedOK=true, noWarn=false, deletedId=0x0) at collection.cpp:266
      #6  0x0000000100a43bc8 in mongo::NamespaceDetails::cappedAlloc (this=0x114c8ebec, collection=0x106b088c0, ns=@0x113129ec0, len=308) at cap.cpp:285
      #7  0x0000000100a4afa5 in mongo::NamespaceDetails::_alloc (this=0x114c8ebec, collection=0x106b088c0, ns=@0x113129ec0, len=308) at namespace_details.cpp:375
      #8  0x0000000100a4b393 in mongo::NamespaceDetails::alloc (this=0x114c8ebec, collection=0x106b088c0, ns=@0x113129ec0, lenToAlloc=308) at namespace_details.cpp:189
      #9  0x0000000100a58906 in mongo::CappedRecordStoreV1::allocRecord (this=0x106a1f770, lengthWithHeaders=306, quotaMax=0) at record_store.cpp:264
      #10 0x0000000100a578b7 in mongo::RecordStoreV1Base::insertRecord (this=0x106a1f770, data=0x106b0cc00 "\"\001", len=290, quotaMax=0) at record_store.cpp:94
      #11 0x00000001001c6288 in mongo::Collection::_insertDocument (this=0x106b088c0, docToInsert=@0x11312a418, enforceQuota=false, preGen=0x0) at collection.cpp:232
      #12 0x00000001001c83ce in mongo::Collection::insertDocument (this=0x106b088c0, docToInsert=@0x11312a418, enforceQuota=false, preGen=0x0) at collection.cpp:189
      #13 0x000000010058a557 in _profile (c=@0x106a9c750, currentOp=@0x106bb3800, profileBufBuilder=@0x11312a750) at introspect.cpp:125
      #14 0x000000010058aa8f in mongo::profile (c=@0x106a9c750, op=2006, currentOp=@0x106bb3800) at introspect.cpp:140
      #15 0x000000010034e450 in finishCurrentOp (client=0x106a9c750, currentOp=0x106bb3800, opError=0x0) at batch_executor.cpp:603
      #16 0x0000000100350cd8 in mongo::WriteBatchExecutor::execRemove (this=0x11312b320, removeItem=@0x11312ac60, error=0x11312abe8) at batch_executor.cpp:899
      #17 0x00000001003515e3 in mongo::WriteBatchExecutor::bulkExecute (this=0x11312b320, request=@0x11312b400, upsertedIds=0x11312af20, errors=0x11312af40) at batch_executor.cpp:741
      #18 0x0000000100352bdb in mongo::WriteBatchExecutor::executeBatch (this=0x11312b320, request=@0x11312b400, response=0x11312b370) at batch_executor.cpp:242
      #19 0x000000010035b87d in mongo::WriteCmd::run (this=0x106a4f820, dbName=@0x11312bce0, cmdObj=@0x11312c360, options=0, errMsg=@0x11312bc80, result=@0x11312cca8, fromRepl=false) at write_commands.cpp:144
      #20 0x0000000100374bd6 in mongo::_execCommand (c=0x106a4f820, dbname=@0x11312bce0, cmdObj=@0x11312c360, queryOptions=0, errmsg=@0x11312bc80, result=@0x11312cca8, fromRepl=false) at dbcommands.cpp:1357
      #21 0x00000001003766f5 in mongo::Command::execCommand (c=0x106a4f820, client=@0x106a9c750, queryOptions=0, cmdns=0x106b0c414 "test.$cmd", cmdObj=@0x11312c360, result=@0x11312cca8, fromRepl=false) at dbcommands.cpp:1545
      #22 0x0000000100378bdf in mongo::_runCommands (ns=0x106b0c414 "test.$cmd", _cmdobj=@0x11312daf8, b=@0x11312cd08, anObjBuilder=@0x11312cca8, fromRepl=false, queryOptions=0) at dbcommands.cpp:1675
      #23 0x00000001007cf0b5 in runCommands (ns=0x106b0c414 "test.$cmd", jsobj=@0x11312daf8, curop=@0x106b06800, b=@0x11312cd08, anObjBuilder=@0x11312cca8, fromRepl=false, queryOptions=0) at new_find.cpp:120
      #24 0x00000001007d064a in mongo::newRunQuery (m=@0x11312e990, q=@0x11312dae0, curop=@0x106b06800, result=@0x106a6e370) at new_find.cpp:423
      #25 0x000000010057c3fb in receivedQuery (c=@0x106a9c750, dbresponse=@0x11312e640, m=@0x11312e990) at instance.cpp:269
      #26 0x00000001005804f7 in mongo::assembleResponse (m=@0x11312e990, dbresponse=@0x11312e640, remote=@0x11312e690) at instance.cpp:434
      #27 0x000000010000bec9 in mongo::MyMessageHandler::process (this=0x106a10100, m=@0x11312e990, port=0x106a1f900, le=0x106a1f180) at db.cpp:202
      #28 0x0000000100d8d9de in mongo::PortMessageServer::handleIncomingMsg (arg=0x106ae98d0) at message_server_port.cpp:209
      #29 0x0000000100d8bdd1 in boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> >::operator()<void*, void* (*)(void*), boost::_bi::list0> (this=0x106a9ebf0, f=@0x106a9ebe8, a=@0x11312ee10, unnamed_arg=0) at bind.hpp:243
      #30 0x0000000100d8be36 in boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > >::operator() (this=0x106a9ebe8) at bind_template.hpp:20
      #31 0x0000000100d8be71 in boost::detail::thread_data<boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > > >::run (this=0x106a9ea00) at thread.hpp:62
      #32 0x0000000100e85369 in thread_proxy (param=0x106a9ea00) at thread.cpp:121
      #33 0x000000010409c782 in _pthread_start ()
      #34 0x00000001040891c1 in thread_start ()

      The following patch against 2.6.1 helps reproduce the issue (forces PageFaultException to be thrown at each call to Record::_accessing()):

      diff --git a/src/mongo/db/storage/record.cpp b/src/mongo/db/storage/record.cpp
      index d321223..51ed4b8 100644
      --- a/src/mongo/db/storage/record.cpp
      +++ b/src/mongo/db/storage/record.cpp
      @@ -544,8 +544,8 @@ namespace mongo {
           void Record::_accessing() const {
      -        if ( likelyInPhysicalMemory() )
      -            return;
               const Client& client = cc();
               Database* db = client.database();

      Original report from mongodb-user: <https://groups.google.com/forum/?#!topic/mongodb-user/bYAiNoPxE3c>.

            rassi J Rassi
            rassi J Rassi
            0 Vote for this issue
            5 Start watching this issue