Core Server / SERVER-11038

Interrupting renameCollection via killOp can cause silent data loss

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.5.5
    • Affects Version/s: 2.5.2
    • Component/s: Admin, Internal Code
    • Labels: None
    • Operating System: ALL

      In the cross-database case, the renameCollection command opens a cursor on the source collection and inserts each document one by one into the target collection. The cursor is produced by a DBDirectClient query and is assumed to return every document without error. However, if the DBDirectClient sub-operation is interrupted via killOp, the cursor stops early and the target collection never receives the remaining documents. The command still returns success, so the user sees no error, and an oplog entry is generated for the command, so the other replica set members silently fall out of sync.
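
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and not the server code: `SourceCursor`, `CopyResult`, `copyTrusting`, and `copyChecked` are hypothetical names invented for illustration. It assumes a cursor whose `more()` returns false both on exhaustion and on interruption, so a copy loop that trusts `more()` cannot tell the two apart.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cursor over the source collection.
// `interruptAfter` simulates a killOp landing after that many documents.
class SourceCursor {
public:
    SourceCursor(std::vector<std::string> docs, std::size_t interruptAfter)
        : docs_(std::move(docs)), interruptAfter_(interruptAfter) {}

    // Returns false on exhaustion AND on interruption; the caller can only
    // distinguish the two by checking interrupted() afterwards.
    bool more() {
        if (pos_ >= docs_.size()) return false;
        if (pos_ >= interruptAfter_) {
            interrupted_ = true;
            return false;
        }
        return true;
    }

    const std::string& next() {
        assert(pos_ < docs_.size());
        return docs_[pos_++];
    }

    bool interrupted() const { return interrupted_; }

private:
    std::vector<std::string> docs_;
    std::size_t interruptAfter_;
    std::size_t pos_ = 0;
    bool interrupted_ = false;
};

struct CopyResult {
    bool ok;
    std::size_t copied;
};

// Models the reported behaviour: trusts more() and always reports success.
CopyResult copyTrusting(SourceCursor& cursor, std::vector<std::string>& target) {
    std::size_t copied = 0;
    while (cursor.more()) {
        target.push_back(cursor.next());
        ++copied;
    }
    return {true, copied};
}

// One possible guard (an assumption, not the actual fix): report failure
// when the cursor ended because of an interruption.
CopyResult copyChecked(SourceCursor& cursor, std::vector<std::string>& target) {
    std::size_t copied = 0;
    while (cursor.more()) {
        target.push_back(cursor.next());
        ++copied;
    }
    return {!cursor.interrupted(), copied};
}
```

With 10000 source documents and an interruption after 102, `copyTrusting` reports success with only 102 documents copied, which matches the counts in the shell session below, while `copyChecked` reports failure for the same input.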

      Reproduce by interrupting renameCollection during the DBDirectClient source namespace query. The interruption point is ClientCursor::staticYield(). Interruption is allowed even while holding the write lock and after writes have already been issued by the request, because staticYield() uses heedMutex=false to override the consistency check.
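
The effect of heedMutex described above can be sketched as a simplified model. This is an illustration under stated assumptions, not the body of `KillCurrentOp::checkForInterrupt`: `OpState` and `wouldInterrupt` are hypothetical, and the guard condition is inferred only from the description in this report (write lock held and writes already issued).

```cpp
#include <stdexcept>

// Hypothetical, simplified state of one running operation.
struct OpState {
    bool killPending;     // a killOp has been requested for this operation
    bool holdsWriteLock;  // the operation currently holds a write lock
    bool hasWritten;      // the operation has already issued writes
};

// Simplified model of an interrupt check with a heedMutex flag.
// With heedMutex=true, an operation that holds the write lock and has
// already written is not interrupted, since aborting it could leave
// partial writes behind. With heedMutex=false that guard is skipped.
void checkForInterrupt(const OpState& op, bool heedMutex) {
    if (heedMutex && op.holdsWriteLock && op.hasWritten) {
        return;  // guard applies: do not interrupt
    }
    if (op.killPending) {
        throw std::runtime_error("operation was interrupted");
    }
}

// Convenience wrapper: true if the check would abort the operation.
bool wouldInterrupt(const OpState& op, bool heedMutex) {
    try {
        checkForInterrupt(op, heedMutex);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

In this model, an operation in the state renameCollection is in during the copy (kill pending, write lock held, writes issued) is left alone when heedMutex is true, but is interrupted when heedMutex is false, as in frame #0 of the stack trace below.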

      Shell session (killOp not shown):

      > use test
      switched to db test
      > db.foo.count()
      10000
      > db.adminCommand({renameCollection:"test.foo",to:"test2.foo"});
      { "ok" : 1 }
      > db.foo.count()
      0
      > db.getSiblingDB("test2").foo.count()
      102
      

      Stack trace for above, at interruption:

      #0  mongo::KillCurrentOp::checkForInterrupt (this=0x101d291c0, heedMutex=false) at kill_current_op.cpp:128
      #1  0x000000010024219f in mongo::ClientCursor::staticYield (micros=10, ns=@0x10671fe30, rec=0x0) at clientcursor.cpp:379
      #2  0x00000001002427fd in mongo::ClientCursor::yield (this=0x1060b2540, micros=10, recordToLoad=0x0) at clientcursor.cpp:765
      #3  0x0000000100242953 in mongo::ClientCursor::yieldSometimes (this=0x1060b2540, need=mongo::ClientCursor::WillNeed, yielded=0x0) at clientcursor.cpp:792
      #4  0x0000000100613820 in mongo::processGetMore (ns=0x1060df544 "test2.foo", ntoreturn=0, cursorid=218920423918495, curop=@0x1060f3700, pass=0, exhaust=@0x106720643, isCursorAuthorized=0x106720627) at query.cpp:266
      #5  0x0000000100542312 in mongo::receivedGetMore (dbresponse=@0x106720e98, m=@0x106721000, curop=@0x1060f3700) at instance.cpp:777
      #6  0x000000010054a377 in mongo::assembleResponse (m=@0x106721000, dbresponse=@0x106720e98, remote=@0x101d291b0) at instance.cpp:446
      #7  0x000000010053b02d in mongo::DBDirectClient::call (this=0x106721518, toSend=@0x106721000, response=@0x106190370, assertOk=true, actualServer=0x0) at instance.cpp:1009
      #8  0x000000010053b2cd in non-virtual thunk to mongo::DBDirectClient::call(mongo::Message&, mongo::Message&, bool, std::string*) () at instance.cpp:1022
      #9  0x00000001000e9b62 in mongo::DBClientCursor::requestMore (this=0x1061da1e0) at dbclientcursor.cpp:137
      #10 0x00000001000e73a7 in mongo::DBClientCursor::more (this=0x1061da1e0) at dbclientcursor.cpp:222
      #11 0x00000001002e6377 in mongo::CmdRenameCollection::run (this=0x101d28210, dbname=@0x1067224f8, cmdObj=@0x106722aa0, unnamed_arg=0, errmsg=@0x1067224c8, result=@0x1067230d8, fromRepl=false) at rename_collection.cpp:182
      #12 0x0000000100341766 in mongo::_execCommand (c=0x101d28210, dbname=@0x1067224f8, cmdObj=@0x106722aa0, queryOptions=0, errmsg=@0x1067224c8, result=@0x1067230d8, fromRepl=false) at dbcommands.cpp:1963
      #13 0x0000000100343efc in mongo::Command::execCommand (c=0x101d28210, client=@0x1061845b0, queryOptions=0, cmdns=0x1060e6014 "admin.$cmd", cmdObj=@0x106722aa0, result=@0x1067230d8, fromRepl=false) at dbcommands.cpp:2130
      #14 0x00000001003452af in mongo::_runCommands (ns=0x1060e6014 "admin.$cmd", _cmdobj=@0x1067231c0, b=@0x106723138, anObjBuilder=@0x1067230d8, fromRepl=false, queryOptions=0) at dbcommands.cpp:2194
      #15 0x00000001006141d5 in mongo::runCommands (ns=0x1060e6014 "admin.$cmd", jsobj=@0x1067231c0, curop=@0x1060f2c00, b=@0x106723138, anObjBuilder=@0x1067230d8, fromRepl=false, queryOptions=0) at query.cpp:68
      #16 0x0000000100614fba in mongo::runQuery (m=@0x106724990, q=@0x106723ae0, curop=@0x1060f2c00, result=@0x1061902c0) at query.cpp:1045
      #17 0x000000010054627b in receivedQuery (c=@0x1061845b0, dbresponse=@0x106724640, m=@0x106724990) at instance.cpp:280
      #18 0x000000010054a349 in mongo::assembleResponse (m=@0x106724990, dbresponse=@0x106724640, remote=@0x106724690) at instance.cpp:443
      #19 0x0000000100019f89 in mongo::MyMessageHandler::process (this=0x10608c0e8, m=@0x106724990, port=0x10609a1e0, le=0x1060a7400) at db.cpp:221
      #20 0x0000000100bdfd2e in mongo::PortMessageServer::handleIncomingMsg (arg=0x1061d99a0) at message_server_port.cpp:210
      #21 0x0000000100bde041 in boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> >::operator()<void*, void* (*)(void*), boost::_bi::list0> (this=0x1060b9df0, f=@0x1060b9de8, a=@0x106724e10, unnamed_arg=0) at bind.hpp:243
      #22 0x0000000100bde0a6 in boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > >::operator() (this=0x1060b9de8) at bind_template.hpp:20
      #23 0x0000000100bde0e1 in boost::detail::thread_data<boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > > >::run (this=0x1060b9c00) at thread.hpp:62
      #24 0x0000000100cb6179 in thread_proxy (param=0x1060b9c00) at thread.cpp:121
      #25 0x00007fff8bc7a782 in _pthread_start ()
      #26 0x00007fff8bc671c1 in thread_start ()
      

            Assignee: Unassigned
            Reporter: J Rassi (rassi)
            Votes: 0
            Watchers: 4
