  1. Core Server
  2. SERVER-5393

Killing mongod during a repair can leave it unable to start up

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.1.2
    • Affects Version/s: None
    • Component/s: Stability, Storage
    • Labels: None
    • Operating System: ALL

      Hard-killing a mongod process at the wrong moment during a repair can leave it unable to start up.

      Steps to repro:

      • Create a database with lots of large documents (in my test I inserted ~15,000 1MB documents)
      • Run a repair, and after it has been running for some time kill the process with kill -9
      • Try to start up mongod normally
      I see the following output:
      mongod --port 12345
      Sat Mar 24 13:12:38 [initandlisten] MongoDB starting : pid=16292 port=12345 dbpath=/data/db/ 64-bit host=Spencer-MacBook.local
      Sat Mar 24 13:12:38 [initandlisten] db version v2.0.2, pdfile version 4.5
      Sat Mar 24 13:12:38 [initandlisten] git version: 514b122d308928517f5841888ceaa4246a7f18e3
      Sat Mar 24 13:12:38 [initandlisten] build info: Darwin Spencer-MacBook.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:32:41 PDT 2011; root:xnu-1504.15.3~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_47
      Sat Mar 24 13:12:38 [initandlisten] options: { port: 12345 }
      Sat Mar 24 13:12:38 [initandlisten] journal dir=/data/db/journal
      Sat Mar 24 13:12:38 [initandlisten] recover begin
      Sat Mar 24 13:12:38 [initandlisten] recover lsn: 0
      Sat Mar 24 13:12:38 [initandlisten] recover /data/db/journal/j._3
      Sat Mar 24 13:12:38 [initandlisten] exception during recovery
      Sat Mar 24 13:12:38 [initandlisten] exception in initAndListen std::exception: boost::filesystem::file_size: No such file or directory: "/data/db/$tmp_repairDatabase_1/test.ns", terminating
      Sat Mar 24 13:12:38 dbexit:
      Sat Mar 24 13:12:38 [initandlisten] shutdown: going to close listening sockets...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: going to flush diaglog...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: going to close sockets...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: waiting for fs preallocator...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: lock for final commit...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: final commit...
      Sat Mar 24 13:12:38 [initandlisten] shutdown: closing all files...
      Sat Mar 24 13:12:38 [initandlisten] closeAllFiles() finished
      Sat Mar 24 13:12:38 [initandlisten] shutdown: removing fs lock...
      Sat Mar 24 13:12:38 dbexit: really exiting now
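
The fatal line in the log above names a file under a $tmp_repairDatabase_ directory that no longer exists. As a hedged illustration only, the sketch below pulls that path out of a log in the format shown above. The function names are hypothetical, the pattern is derived solely from this one log, and reading the temporary repair directory as the trigger is an interpretation of the log, not a confirmed root cause.

```python
import re

# Pattern for the fatal startup line, based only on the log in this report.
# Other mongod versions may word the message differently (assumption).
MISSING_FILE = re.compile(
    r"exception in initAndListen std::exception: "
    r'boost::filesystem::file_size: No such file or directory: "([^"]+)"'
)


def find_missing_file(log_text):
    """Return the path mongod could not find during recovery, or None."""
    match = MISSING_FILE.search(log_text)
    return match.group(1) if match else None


def is_temp_repair_path(path):
    """True if the path lies under a temporary repair directory."""
    return "$tmp_repairDatabase_" in path
```

Running find_missing_file over the log above yields /data/db/$tmp_repairDatabase_1/test.ns, which is_temp_repair_path flags as belonging to the interrupted repair.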
      

      Note that I wasn't able to reproduce this consistently; I had to run the repair and kill the process several times before it happened.
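
Because the failure is intermittent, the repair-and-kill steps above could be looped. The following is a minimal sketch, not the script used for this report: it assumes the repair is started with mongod --repair (the report does not say how the repair was run), that the database is already populated, and that mongod is on the PATH. The function names, delays, and attempt count are hypothetical.

```python
import subprocess
import time


def run_and_kill(cmd, delay_seconds):
    """Start cmd, wait delay_seconds, then hard-kill it (like kill -9)."""
    proc = subprocess.Popen(cmd)
    time.sleep(delay_seconds)
    if proc.poll() is None:  # still running, so send SIGKILL
        proc.kill()
    return proc.wait()


def starts_cleanly(cmd, grace_seconds):
    """True if cmd is still running after grace_seconds, i.e. it did not exit at startup."""
    proc = subprocess.Popen(cmd)
    time.sleep(grace_seconds)
    alive = proc.poll() is None
    if alive:
        proc.terminate()  # clean shutdown before the next attempt
        proc.wait()
    return alive


def repro(dbpath, port, attempts, kill_delay, grace):
    """Return the first attempt after which mongod fails to start, or None."""
    # Assumption: the repair is run from the command line with --repair.
    repair_cmd = ["mongod", "--dbpath", dbpath, "--repair"]
    start_cmd = ["mongod", "--dbpath", dbpath, "--port", str(port)]
    for attempt in range(1, attempts + 1):
        run_and_kill(repair_cmd, kill_delay)
        if not starts_cleanly(start_cmd, grace):
            return attempt
    return None
```

A call such as repro("/data/db", 12345, attempts=10, kill_delay=30, grace=5) would mirror the steps in this report. The delay before the kill probably needs tuning, since the failure depends on when the process is killed.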

            Assignee: Mathias Stearn (mathias@mongodb.com)
            Reporter: Spencer Brody (spencer@mongodb.com, inactive)
            Votes: 0
            Watchers: 4
