ISSUE SUMMARY
If a database is dropped while in use by a mapReduce operation, the mongod process crashes.
USER IMPACT
mongod process crashes and needs to be restarted.
WORKAROUNDS
Before dropping any database used by mapReduce operations, either interrupt the relevant mapReduce operations or wait for them to finish.
AFFECTED VERSIONS
MongoDB 2.6 production releases up to 2.6.4 are affected by this issue.
FIX VERSION
The fix is included in the 2.6.5 production release.
RESOLUTION DETAILS
Abort mapReduce operations with an error message if the output database has disappeared.
Original description
I can reproduce this crash with 2.6.4 and master (bd50249d) but not 2.4.11.
Server output before the crash:
2014-08-28T19:35:05.984-0400 D COMMANDS [conn3] mr ns: testdb.mr1
2014-08-28T19:35:05.984-0400 D COMMANDS [conn3] run command testdb.$cmd { listCollections: 1, filter: { name: "mr1" } }
2014-08-28T19:35:06.013-0400 D COMMANDS [conn2] run command testdb.$cmd { dropDatabase: 1.0 }
2014-08-28T19:35:06.064-0400 I COMMANDS [conn2] dropDatabase testdb starting
2014-08-28T19:35:06.064-0400 I QUERY [conn3] command testdb.$cmd command: listCollections { listCollections: 1, filter: { name: "mr1" } } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) r:79551 reslen:102 79ms
2014-08-28T19:35:06.064-0400 D QUERY [conn3] V8Scope 0x104cac780 unregistered for op 0
2014-08-28T19:35:06.064-0400 D QUERY [conn3] V8Scope 0x104cac780 registered for op 0
2014-08-28T19:35:06.064-0400 D COMMANDS [conn3] run command testdb.$cmd { drop: "tmp.mr.mr1_69" }
2014-08-28T19:35:06.064-0400 I COMMANDS [conn3] CMD: drop testdb.tmp.mr.mr1_69
2014-08-28T19:35:06.065-0400 D STORAGE [conn2] dropDatabase testdb
2014-08-28T19:35:06.065-0400 D JOURNAL [conn2] journal WRITETODATAFILES 0.097ms
2014-08-28T19:35:06.068-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 4 n:1 2ms
2014-08-28T19:35:06.072-0400 I JOURNAL [conn2] removeJournalFiles
2014-08-28T19:35:06.072-0400 D JOURNAL [conn2] removeJournalFiles end
2014-08-28T19:35:06.072-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 0 n:1 0ms
2014-08-28T19:35:06.074-0400 D STORAGE [conn2] remove file /data/db/testdb.ns
2014-08-28T19:35:06.074-0400 I COMMANDS [conn2] dropDatabase testdb finished
2014-08-28T19:35:06.074-0400 I QUERY [conn2] command testdb.$cmd command: dropDatabase { dropDatabase: 1.0 } keyUpdates:0 numYields:0 locks(micros) W:10818 reslen:57 61ms
2014-08-28T19:35:06.075-0400 I STORAGE [conn3] opening db: testdb
2014-08-28T19:35:06.075-0400 I QUERY [conn3] command testdb.$cmd command: drop { drop: "tmp.mr.mr1_69" } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) w:213 reslen:62 10ms
2014-08-28T19:35:06.075-0400 D COMMANDS [conn3] run command testdb.$cmd { drop: "tmp.mr.mr1_69_inc" }
2014-08-28T19:35:06.075-0400 I COMMANDS [conn3] CMD: drop testdb.tmp.mr.mr1_69_inc
2014-08-28T19:35:06.075-0400 I QUERY [conn3] command testdb.$cmd command: drop { drop: "tmp.mr.mr1_69_inc" } ntoreturn:1 keyUpdates:0 numYields:0 locks(micros) w:9 reslen:62 0ms
2014-08-28T19:35:06.075-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.ns, filling with zeroes...
2014-08-28T19:35:06.092-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.ns, size: 16MB, took 0.016 secs
2014-08-28T19:35:06.143-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.0, filling with zeroes...
2014-08-28T19:35:06.258-0400 D COMMANDS [conn2] run command testdb.$cmd { dropDatabase: 1.0 }
2014-08-28T19:35:06.294-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.0, size: 64MB, took 0.151 secs
2014-08-28T19:35:06.330-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:2000
2014-08-28T19:35:06.330-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:4000
2014-08-28T19:35:06.337-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:6000
2014-08-28T19:35:06.345-0400 D STORAGE [conn3] allocating new extent
2014-08-28T19:35:06.345-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:131072 fromFreeList: 0 eloc: 0:8000
2014-08-28T19:35:06.346-0400 D STORAGE [conn3] testdb.tmp.mr.mr1_69_inc: clearing plan cache - collection info cache reset
2014-08-28T19:35:06.346-0400 I COMMANDS [conn2] dropDatabase testdb starting
2014-08-28T19:35:06.411-0400 D STORAGE [conn2] dropDatabase testdb
2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] lsn set 98963
2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] journal WRITETODATAFILES 0.116ms
2014-08-28T19:35:06.412-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 1 n:1 0ms
2014-08-28T19:35:06.418-0400 I JOURNAL [conn2] removeJournalFiles
2014-08-28T19:35:06.418-0400 D JOURNAL [conn2] removeJournalFiles end
2014-08-28T19:35:06.418-0400 D JOURNAL [conn2] journal REMAPPRIVATEVIEW done startedAt: 2 n:1 0ms
2014-08-28T19:35:06.420-0400 D STORAGE [conn2] remove file /data/db/testdb.ns
2014-08-28T19:35:06.420-0400 I COMMANDS [conn2] dropDatabase testdb finished
2014-08-28T19:35:06.420-0400 I STORAGE [conn3] opening db: testdb
2014-08-28T19:35:06.420-0400 I QUERY [conn2] command testdb.$cmd command: dropDatabase { dropDatabase: 1.0 } keyUpdates:0 numYields:0 locks(micros) W:74035 reslen:57 161ms
2014-08-28T19:35:06.420-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.ns, filling with zeroes...
2014-08-28T19:35:06.437-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.ns, size: 16MB, took 0.016 secs
2014-08-28T19:35:06.484-0400 I STORAGE [FileAllocator] allocating new datafile /data/db/testdb.0, filling with zeroes...
2014-08-28T19:35:06.637-0400 I STORAGE [FileAllocator] done allocating datafile /data/db/testdb.0, size: 64MB, took 0.152 secs
2014-08-28T19:35:06.679-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:2000
2014-08-28T19:35:06.679-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:4000
2014-08-28T19:35:06.687-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:8192 fromFreeList: 0 eloc: 0:6000
2014-08-28T19:35:06.692-0400 D STORAGE [conn3] allocating new extent
2014-08-28T19:35:06.692-0400 D STORAGE [conn3] MmapV1ExtentManager::allocateExtent desiredSize:131072 fromFreeList: 0 eloc: 0:8000
2014-08-28T19:35:06.692-0400 D STORAGE [conn3] testdb.tmp.mr.mr1_69: clearing plan cache - collection info cache reset
2014-08-28T19:35:06.692-0400 D QUERY [conn3] Collection testdb.mr1 does not exist. Using EOF plan: query: {} sort: {} projection: {} skip: 0 limit: 0
2014-08-28T19:35:06.694-0400 F - [conn3] Invalid access at address: 0xd8
2014-08-28T19:35:06.694-0400 D JOURNAL [journal] lsn set 99293
2014-08-28T19:35:06.694-0400 D JOURNAL [journal] journal WRITETODATAFILES 0.132ms
2014-08-28T19:35:06.696-0400 F - [conn3] Got signal: 11 (Segmentation fault: 11).
Valgrind output:
==31589== Invalid read of size 8 ==31589== at 0x11DBEA8: std::vector<mongo::IndexCatalogEntry*, std::allocator<mongo::IndexCatalogEntry*> >::begin() const (stl_vector.h:548) ==31589== by 0x11DB547: mongo::IndexCatalogEntryContainer::begin() const (index_catalog_entry.h:125) ==31589== by 0x11D9C27: mongo::IndexCatalog::IndexIterator::IndexIterator(mongo::OperationContext*, mongo::IndexCatalog const*, bool) (index_catalog.cpp:843) ==31589== by 0x11BF578: mongo::IndexCatalog::getIndexIterator(mongo::OperationContext*, bool) const (index_catalog.h:154) ==31589== by 0x1227A26: mongo::mr::State::finalReduce(mongo::CurOp*, mongo::ProgressMeterHolder&) (mr.cpp:966) ==31589== by 0x122B550: mongo::mr::MapReduceCommand::run(mongo::OperationContext*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (mr.cpp:1433) ==31589== by 0x1283F64: mongo::_execCommand(mongo::OperationContext*, mongo::Command*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1156) ==31589== by 0x1284EA6: mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, mongo::Client&, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) (dbcommands.cpp:1369) ==31589== by 0x1285780: mongo::_runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (dbcommands.cpp:1443) ==31589== by 0x149DD6B: mongo::runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::CurOp&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) (new_find.cpp:128) ==31589== by 0x149F663: mongo::newRunQuery(mongo::OperationContext*, mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) (new_find.cpp:469) ==31589== by 0x1385C3F: mongo::receivedQuery(mongo::OperationContext*, mongo::Client&, mongo::DbResponse&, mongo::Message&) (instance.cpp:262) ==31589== by 0x1386D43: mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:436) ==31589== by 0x10AECB9: mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) (db.cpp:196) ==31589== by 0x17F7DC6: mongo::PortMessageServer::handleIncomingMsg(void*) (message_server_port.cpp:227) ==31589== by 0x4E3F181: start_thread (pthread_create.c:312) ==31589== by 0x5D7BFBC: clone (clone.S:111) ==31589== Address 0xd8 is not stack'd, malloc'd or (recently) free'd ==31589== 2014-08-28T17:48:42.128-0400 F - [conn148] Invalid access at address: 0xd8 2014-08-28T17:48:42.145-0400 I NETWORK [conn151] end connection 127.0.0.1:44054 (1 connection now open) 2014-08-28T17:48:42.265-0400 F - [conn148] Got signal: 11 (Segmentation fault).
- related to
-
SERVER-15842 Server aborts during concurrent mapreduce and drop database
-
- Closed
-