  1. Core Server
  2. SERVER-11037

Interrupting repairDatabase can leak temporary collections

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • 2.5.2
    • MMAPv1, Storage
    • Storage Execution
    • ALL

    Description

      The repairDatabase command clones the target database into a temporary directory, and then moves the data files from that directory into their final location. If the repairDatabase command is interrupted with killOp during the clone, the temporary directory it created is never removed, even after a restart, and continues to consume disk space.
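
The clone-then-move flow described above, with the cleanup step that the interrupted path skips, can be sketched roughly as follows. This is an illustrative Python sketch only, not the server's C++ implementation: `temp_repair_dir`, `repair_sketch`, the `Interrupted` exception, and the `clone` callback are all hypothetical names, and the `preserve_on_failure` flag merely echoes the `preserveClonedFilesOnFailure` argument visible in the stack trace.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


class Interrupted(Exception):
    """Hypothetical stand-in for the 'operation was interrupted' error."""


@contextmanager
def temp_repair_dir(dbpath, preserve_on_failure=False):
    """Create a scratch directory under dbpath and remove it on exit.

    If the body raises, the directory is kept only when
    preserve_on_failure is set; otherwise it is always removed.
    """
    tmp = Path(tempfile.mkdtemp(prefix="_tmp_repairDatabase_", dir=dbpath))
    try:
        yield tmp
    except BaseException:
        if not preserve_on_failure:
            shutil.rmtree(tmp, ignore_errors=True)
        raise
    else:
        shutil.rmtree(tmp, ignore_errors=True)


def repair_sketch(dbpath, clone):
    """Clone into a scratch directory, then move the files into dbpath."""
    with temp_repair_dir(dbpath) as tmp:
        clone(tmp)  # may raise Interrupted part-way through the clone
        # Snapshot the listing first, since entries are moved out while looping.
        for f in list(tmp.iterdir()):
            shutil.move(str(f), str(Path(dbpath) / f.name))
```

The point of the sketch is that the removal is tied to leaving the scope, so an interruption raised from inside `clone` still passes through the cleanup path instead of leaving the scratch directory behind.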

      To reproduce, interrupt the clone while it is between collections. The interruption point is mayInterrupt(). repairDatabase explicitly allows interruption at all mayInterrupt() calls with heedMutex=false.

      Sample session (killOp not shown):

      > db.repairDatabase()
      {
      	"errmsg" : "exception: operation was interrupted",
      	"code" : 11601,
      	"ok" : 0
      }
      > db.repairDatabase()
      {
      	"errmsg" : "exception: operation was interrupted",
      	"code" : 11601,
      	"ok" : 0
      }
      > db.repairDatabase()
      {
      	"errmsg" : "exception: operation was interrupted",
      	"code" : 11601,
      	"ok" : 0
      }
      >
      [1]+  Stopped                 mongo
      rassi@laptop:~/work/mongo $ du -hs /data/db/_tmp_repairDatabase_*
      2.0G	/data/db/_tmp_repairDatabase_0
      2.0G	/data/db/_tmp_repairDatabase_1
      2.0G	/data/db/_tmp_repairDatabase_2
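
A small script along these lines could report leftover directories like the ones in the `du` output above. It is a sketch under assumptions: `leaked_repair_dirs` is a hypothetical helper, the `_tmp_repairDatabase_*` pattern is taken from the session above, and the sizes are summed file lengths, which can differ from the disk usage that `du` reports. It only lists directories and never deletes anything, because a directory matching the pattern may belong to a repair that is still running.

```python
from pathlib import Path


def leaked_repair_dirs(dbpath):
    """Return (path, size_in_bytes) for each _tmp_repairDatabase_* directory."""
    results = []
    for d in sorted(Path(dbpath).glob("_tmp_repairDatabase_*")):
        if d.is_dir():
            # Sum the apparent size of every regular file below the directory.
            size = sum(f.stat().st_size for f in d.rglob("*") if f.is_file())
            results.append((d, size))
    return results
```

For example, `for path, size in leaked_repair_dirs("/data/db"): print(path, size)` would print one line per leftover directory.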

      Stack trace for above, at interruption:

      #0  mongo::KillCurrentOp::checkForInterrupt (this=0x101d291c0, heedMutex=false) at kill_current_op.cpp:141
      #1  0x000000010025277e in mongo::mayInterrupt (mayBeInterrupted=true) at cloner.cpp:64
      #2  0x0000000100255a40 in mongo::Cloner::go (this=0x1067a3b38, masterHost=0x1060df6f8 "localhost:27017", opts=@0x1067a3a38, clonedColls=@0x1067a3a08, errmsg=@0x1067a54c8, errCode=0x0) at cloner.cpp:439
      #3  0x0000000100257fd5 in mongo::Cloner::go (this=0x1067a3b38, masterHost=0x1060df6f8 "localhost:27017", errmsg=@0x1067a54c8, fromdb=@0x1067a4828, logForRepl=false, slaveOk=false, useReplAuth=false, snapshot=false, mayYield=false, mayBeInterrupted=true, errCode=0x0) at cloner.cpp:326
      #4  0x0000000100258210 in mongo::Cloner::cloneFrom (masterHost=0x1060df6f8 "localhost:27017", errmsg=@0x1067a54c8, fromdb=@0x1067a4828, logForReplication=false, slaveOk=false, useReplAuth=false, snapshot=false, mayYield=false, mayBeInterrupted=true, errCode=0x0) at cloner.cpp:520
      #5  0x0000000100632d02 in mongo::repairDatabase (dbNameS=@0x1067a4d28, errmsg=@0x1067a54c8, preserveClonedFilesOnFailure=false, backupOriginalFiles=false) at pdfile.cpp:1407
      #6  0x000000010034b066 in mongo::CmdRepairDatabase::run (this=0x101d28470, dbname=@0x1067a54f8, cmdObj=@0x1067a5aa0, unnamed_arg=0, errmsg=@0x1067a54c8, result=@0x1067a60d8, fromRepl=false) at dbcommands.cpp:366
      #7  0x0000000100341766 in mongo::_execCommand (c=0x101d28470, dbname=@0x1067a54f8, cmdObj=@0x1067a5aa0, queryOptions=0, errmsg=@0x1067a54c8, result=@0x1067a60d8, fromRepl=false) at dbcommands.cpp:1963
      #8  0x0000000100343efc in mongo::Command::execCommand (c=0x101d28470, client=@0x106186680, queryOptions=0, cmdns=0x1061f0014 "test.$cmd", cmdObj=@0x1067a5aa0, result=@0x1067a60d8, fromRepl=false) at dbcommands.cpp:2130
      #9  0x00000001003452af in mongo::_runCommands (ns=0x1061f0014 "test.$cmd", _cmdobj=@0x1067a61c0, b=@0x1067a6138, anObjBuilder=@0x1067a60d8, fromRepl=false, queryOptions=0) at dbcommands.cpp:2194
      #10 0x00000001006141d5 in mongo::runCommands (ns=0x1061f0014 "test.$cmd", jsobj=@0x1067a61c0, curop=@0x1060f3180, b=@0x1067a6138, anObjBuilder=@0x1067a60d8, fromRepl=false, queryOptions=0) at query.cpp:68
      #11 0x0000000100614fba in mongo::runQuery (m=@0x1067a7990, q=@0x1067a6ae0, curop=@0x1060f3180, result=@0x106190160) at query.cpp:1045
      #12 0x000000010054627b in receivedQuery (c=@0x106186680, dbresponse=@0x1067a7640, m=@0x1067a7990) at instance.cpp:280
      #13 0x000000010054a349 in mongo::assembleResponse (m=@0x1067a7990, dbresponse=@0x1067a7640, remote=@0x1067a7690) at instance.cpp:443
      #14 0x0000000100019f69 in mongo::MyMessageHandler::process (this=0x10608c0e8, m=@0x1067a7990, port=0x10609a1e0, le=0x1060a7940) at db.cpp:221
      #15 0x0000000100bdfd2e in mongo::PortMessageServer::handleIncomingMsg (arg=0x1061d9600) at message_server_port.cpp:210
      #16 0x0000000100bde041 in boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> >::operator()<void*, void* (*)(void*), boost::_bi::list0> (this=0x1060b9df0, f=@0x1060b9de8, a=@0x1067a7e10, unnamed_arg=0) at bind.hpp:243
      #17 0x0000000100bde0a6 in boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > >::operator() (this=0x1060b9de8) at bind_template.hpp:20
      #18 0x0000000100bde0e1 in boost::detail::thread_data<boost::_bi::bind_t<void*, void* (*)(void*), boost::_bi::list1<boost::_bi::value<mongo::PortMessageServer::HandleIncomingMsgParam*> > > >::run (this=0x1060b9c00) at thread.hpp:62
      #19 0x0000000100cb6179 in thread_proxy (param=0x1060b9c00) at thread.cpp:121
      #20 0x00007fff8bc7a782 in _pthread_start ()
      #21 0x00007fff8bc671c1 in thread_start ()

          People

            Assignee: Backlog - Storage Execution Team (backlog-server-execution)
            Reporter: J Rassi (rassi)