ISSUE SUMMARY
If a database is dropped while in use by a mapReduce operation, the mongod process crashes.
USER IMPACT
mongod process crashes and needs to be restarted.
WORKAROUNDS
Before dropping any database used by mapReduce operations, either interrupt the relevant mapReduce operations or wait for them to finish.
AFFECTED VERSIONS
MongoDB 2.6 production releases up to 2.6.4 are affected by this issue.
FIX VERSION
The fix is included in the 2.6.5 production release.
RESOLUTION DETAILS
Abort mapReduce operations with an error message if the output database has disappeared.
Original description
I can reproduce this crash with 2.6.4 and master (bd50249d) but not 2.4.11.
Server output before the crash:
2014-08-28T19:35:05.984-0400 D COMMANDS [conn3] mr ns: testdb.mr1 2014-08-28T19:35:05.984-0400 D COMMANDS [conn3] run command testdb.$cmd { listCollections: 1, filter: { name: "mr1" } } 2014-08-28T19:35:06.013-0400 D COMMANDS [conn2] run command testdb.$cmd { dropDatabase: 1.0 } 2014-08-28T19:35:06.064-0400 I COMMANDS [conn2] dropDatabase testdb starting 2014-08-28T19:35:06.064-0400 I QUERY [conn3] command testdb.$cmd command: listCollections { listCollections: 1, filter: { name: "mr1" } } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) r:79551 reslen:102 79ms 2014-08-28T19:35:06.064-0400 D QUERY [conn3] V8Scope 0x104cac780 unregistered for op 0 2014-08-28T19:35:06.064-0400 D QUERY [conn3] V8Scope 0x104cac780 registered for op 0 2014-08-28T19:35:06.064-0400 D COMMANDS [conn3] run command testdb.$cmd { drop: "tmp.mr.mr1_69" } 2014-08-28T19:35:06.064-0400 I COMMANDS [conn3] CMD: drop testdb.tmp.mr.mr1_69 2014-08-28T19:35:06.065-0400 D STORAGE [conn2] dropDatabase testdb 2014-08-28T19:35:06.065-0400 D JOURNAL [conn2] journal WRITETODATAFILES 0.097ms 2014-08-28T19:35:06.068-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 4 n:1 2ms 2014-08-28T19:35:06.072-0400 I JOURNAL [conn2] removeJournalFiles 2014-08-28T19:35:06.072-0400 D JOURNAL [conn2] removeJournalFiles end 2014-08-28T19:35:06.072-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 0 n:1 0ms 2014-08-28T19:35:06.074-0400 D STORAGE [conn2] remove file /data/db/testdb.ns 2014-08-28T19:35:06.074-0400 I COMMANDS [conn2] dropDatabase testdb finished 2014-08-28T19:35:06.074-0400 I QUERY [conn2] command testdb.$cmd command: dropDatabase { dropDatabase: 1.0 } keyUpdates:0 numYields:0 locks(micros) W:10818 reslen:57 61ms 2014-08-28T19:35:06.075-0400 I STORAGE [conn3] opening db: testdb 2014-08-28T19:35:06.075-0400 I QUERY [conn3] command testdb.$cmd command: drop { drop: "tmp.mr.mr1_69" } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) w:213 reslen:62 10ms 2014-08-28T19:35:06.075-0400 D COMMANDS [conn3] run command testdb.$cmd { drop: "tmp.mr.mr1_69_inc" } 2014-08-28T19:35:06.075-0400 I COMMANDS [conn3] CMD: drop testdb.tmp.mr.mr1_69_inc 2014-08-28T19:35:06.075-0400 I QUERY [conn3] command testdb.$cmd command: drop { drop: "tmp.mr.mr1_69_inc" } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) w:9 reslen:62 0ms 2014-08-28T19:35:06.075-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.ns, filling with zeroes... 2014-08-28T19:35:06.092-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.ns, size: 16MB, took 0.016 secs 2014-08-28T19:35:06.143-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.0, filling with zeroes... 2014-08-28T19:35:06.258-0400 D COMMANDS [conn2] run command testdb.$cmd { dropDatabase: 1.0 } 2014-08-28T19:35:06.294-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.0, size: 64MB, took 0.151 secs 2014-08-28T19:35:06.330-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:2000 2014-08-28T19:35:06.330-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:4000 2014-08-28T19:35:06.337-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:6000 2014-08-28T19:35:06.345-0400 D STORAGE [conn3] allocating new extent 2014-08-28T19:35:06.345-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:131072 fromFreeList: 0 eloc: 0:8000 2014-08-28T19:35:06.346-0400 D STORAGE [conn3] testdb.tmp.mr.mr1_69_inc: clearing plan cache - collection info cache reset 2014-08-28T19:35:06.346-0400 I COMMANDS [conn2] dropDatabase testdb starting 2014-08-28T19:35:06.411-0400 D STORAGE [conn2] dropDatabase testdb 2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] lsn set 98963 2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] journal WRITETODATAFILES 0.116ms 2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 1 n:1 0ms 2014-08-28T19:35:06.418-0400 I JOURNAL [conn2] removeJournalFiles 2014-08-28T19:35:06.418-0400 D JOURNAL [conn2] removeJournalFiles end 2014-08-28T19:35:06.418-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 2 n:1 0ms 2014-08-28T19:35:06.420-0400 D STORAGE [conn2] remove file /data/db/testdb.ns 2014-08-28T19:35:06.420-0400 I COMMANDS [conn2] dropDatabase testdb finished 2014-08-28T19:35:06.420-0400 I STORAGE [conn3] opening db: testdb 2014-08-28T19:35:06.420-0400 I QUERY [conn2] command testdb.$cmd command: dropDatabase { dropDatabase: 1.0 } keyUpdates:0 numYields:0 locks(micros) W:74035 reslen:57 161ms 2014-08-28T19:35:06.420-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.ns, filling with zeroes... 2014-08-28T19:35:06.437-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.ns, size: 16MB, took 0.016 secs 2014-08-28T19:35:06.484-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.0, filling with zeroes... 2014-08-28T19:35:06.637-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.0, size: 64MB, took 0.152 secs 2014-08-28T19:35:06.679-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:2000 2014-08-28T19:35:06.679-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:4000 2014-08-28T19:35:06.687-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:6000 2014-08-28T19:35:06.692-0400 D STORAGE [conn3] allocating new extent 2014-08-28T19:35:06.692-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:131072 fromFreeList: 0 eloc: 0:8000 2014-08-28T19:35:06.692-0400 D STORAGE [conn3] testdb.tmp.mr.mr1_69: clearing plan cache - collection info cache reset 2014-08-28T19:35:06.692-0400 D QUERY [conn3] Collection testdb.mr1 does not exist. Using EOF plan: query: {} sort: {} projection: {} skip: 0 limit: 0 2014-08-28T19:35:06.694-0400 F - [conn3] Invalid access at address: 0xd8 2014-08-28T19:35:06.694-0400 D JOURNAL [journal] lsn set 99293 2014-08-28T19:35:06.694-0400 D JOURNAL [journal] journal WRITETODATAFILES 0.132ms 2014-08-28T19:35:06.696-0400 F - [conn3] Got signal: 11 (Segmentation fault: 11).
Valgrind output:
==31589== Invalid read of size 8 ==31589== at 0x11DBEA8: std::vector<mongo::IndexCatalogEntry*, std::allocator<mongo::IndexCatalogEntry*> >::begin() const (stl_vector.h:548) ==31589== by 0x11DB547: mongo::IndexCatalogEntryContainer::begin() const (index_catalog_entry.h:125) ==31589== by 0x11D9C27: mongo::IndexCatalog::IndexIterator::IndexIterator(mongo::OperationContext*, mongo::IndexCatalog const*, bool) (index_catalog.cpp:843) ==31589== by 0x11BF578: mongo::IndexCatalog::getIndexIterator(mongo::OperationContext*, bool) const (index_catalog.h:154) ==31589== by 0x1227A26: mongo::mr::State::finalReduce(mongo::CurOp*, mongo::ProgressMeterHolder&) (mr.cpp:966) ==31589== by 0x122B550: mongo::mr::MapReduceCommand::run(mongo::OperationContext*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (mr.cpp:1433) ==31589== by 0x1283F64: mongo::_execCommand(mongo::OperationContext*, mongo::Command*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1156) ==31589== by 0x1284EA6: mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1369) ==31589== by 0x1285780: mongo::_runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (dbcommands.cpp:1443) ==31589== by 0x149DD6B: mongo::runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::CurOp&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (new_find.cpp:128) ==31589== by 0x149F663: mongo::newRunQuery(mongo::OperationContext*, mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) (new_find.cpp:469) ==31589== by 0x1385C3F: mongo::receivedQuery(mongo::OperationContext*, mongo::Client&, mongo::DbResponse&, mongo::Message&) (instance.cpp:262) ==31589== by 0x1386D43: mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:436) ==31589== by 0x10AECB9: mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (db.cpp:196) ==31589== by 0x17F7DC6: mongo::PortMessageServer::handleIncomingMsg(void*) (message_server_port.cpp:227) ==31589== by 0x4E3F181: start_thread (pthread_create.c:312) ==31589== by 0x5D7BFBC: clone (clone.S:111) ==31589== Address 0xd8 is not stack'd, malloc'd or (recently) free'd ==31589== 2014-08-28T17:48:42.128-0400 F - [conn148] Invalid access at address: 0xd8 2014-08-28T17:48:42.145-0400 I NETWORK [conn151] end connection 127.0.0.1:44054 (1 connection now open) 2014-08-28T17:48:42.265-0400 F - [conn148] Got signal: 11 (Segmentation fault).
- related to
-
SERVER-15842 Server aborts during concurrent mapreduce and drop database
- Closed