- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 2.4.8
- Component/s: Index Maintenance
- None
- ALL
ISSUE SUMMARY
If an index build would create the tenth index on a collection (counting the default _id index) and that build fails (due to a uniqueness constraint violation, for example), the index catalog can become corrupted. All subsequent inserts to the collection then fail.
USER IMPACT
If an index build in the tenth index slot fails or is interrupted, it will render the index catalog for the collection corrupt. To fix the corruption, repair the database or resync the replica set node from a healthy node.
SOLUTION
The issue was resolved by fixing an off-by-one error in the code that removes an in-progress index build.
WORKAROUNDS
The most common scenario that triggers the bug is an index build failure due to a uniqueness constraint violation. In this situation, the corruption can be avoided if:
- the 10th index is not a unique index, or
- there are no duplicate keys, or
- the dropDups option is specified
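The three conditions above can be read as a single check to run before attempting the build. The following is a minimal sketch in plain JavaScript, not shell or server code: `duplicateKeys` and `matchesCommonTrigger` are hypothetical helper names, the documents are passed in as an ordinary array, and treating a missing field as null is an assumption about how a non-sparse unique index compares keys. It covers only the common uniqueness-violation scenario, not other failures or interruptions of the tenth index build.

```javascript
// Hypothetical helpers for illustration; not part of the mongo shell or server.

// Return the values of `field` that occur more than once in `docs`.
function duplicateKeys(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    // Assumption: a missing field is compared as null.
    const value = doc[field] === undefined ? null : doc[field];
    const key = JSON.stringify(value);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([k]) => JSON.parse(k));
}

// True when a build matches the common trigger: it would become the 10th
// index (counting _id), it is unique, dropDups is off, and duplicates exist.
function matchesCommonTrigger({ existingIndexCount, unique, dropDups, docs, field }) {
  if (existingIndexCount !== 9) return false; // not the tenth index
  if (!unique || dropDups) return false;      // duplicates cannot fail the build
  return duplicateKeys(docs, field).length > 0;
}
```

For example, with 9 existing indexes and the documents `[{i: 0}, {i: 0}]`, a unique build on `i` without dropDups matches the trigger, while the same build with dropDups set does not.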
PATCHES
The fix is included in the production release 2.4.10. Version 2.6 is unaffected by this issue.
Original Description
If the 10th index created on a collection has a unique constraint and there are duplicate key violations, the index build fails (as expected) but then hits an assertion and leaves the ns file corrupted so that all subsequent inserts fail. To reproduce:
function repro() {
    db.dropDatabase()
    db.c.ensureIndex({a:1})
    db.c.ensureIndex({b:1})
    db.c.ensureIndex({c:1})
    db.c.ensureIndex({d:1})
    db.c.ensureIndex({e:1})
    db.c.ensureIndex({f:1})
    db.c.ensureIndex({g:1})
    db.c.ensureIndex({h:1})
    // now there are 9 indexes, including _id

    // create duplicate records with key i
    db.c.insert({i:0})
    db.c.insert({i:0})

    // create 10th index, unique constraint, with duplicate keys:
    printjson(db.c.ensureIndex({i:1}, {unique:true}))
    // fails with "Assertion: 14045:missing Extra"

    // and leaves ns file corrupted:
    db.c.insert({})
    // fails with "Assertion: 10295:getFile(): bad file number value (corrupt db?)"
}
The corruption does not occur if:
- the unique index is not the 10th index, or
- there are no duplicate keys, or
- dropDups is specified
When the conditions that trigger this issue are met, the catch block at pdfile.cpp:1552 calls IndexBuildsInProgress::remove, which has special logic to deal with the rollover from the 10 base indexes to the extra indexes, so the error may lie in this vicinity. For reference, here is the stack trace for the initial assertion on creating the index:
mongo::printStackTrace(std::ostream&)+0x21) [0xde46e1]
mongo::msgasserted(int, char const*)+0x9b) [0xda5e1b]
mongo::NamespaceDetails::idx(int, bool)+0x231) [0x8617a1]
mongo::IndexBuildsInProgress::remove(char const*, int)+0x81) [0xab8181]
mongo::insert_makeIndex(mongo::NamespaceDetails*, std::string const&, mongo::DiskLoc const&, bool)+0x96f) [0xac46ef]
mongo::DataFileMgr::insert(char const*, void const*, int, bool, bool, bool, bool*)+0x7d2) [0xac8842]
mongo::DataFileMgr::insertWithObjMod(char const*, mongo::BSONObj&, bool, bool)+0x4f) [0xaca5af]
mongo::checkAndInsert(char const*, mongo::BSONObj&)+0x119) [0x9f8a69]
mongo::receivedInsert(mongo::Message&, mongo::CurOp&)+0x929) [0x9f94d9]
mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&)+0xab8) [0x9ffd68]
mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*)+0x98) [0x6e8518]
mongo::PortMessageServer::handleIncomingMsg(void*)+0x42e) [0xdd0cae]
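The trace shows the assertion raised from NamespaceDetails::idx while IndexBuildsInProgress::remove is running. One way to picture how an off-by-one in a removal loop could produce a "missing Extra" error only at the tenth slot is the toy model below. It is a sketch under stated assumptions, not the server's code: `makeCatalog`, `getSlot`, `removeOffByOne`, and `removeBounded` are hypothetical names, and the loop shape is a guess at the kind of error involved. For simplicity the model handles at most 10 indexes.

```javascript
// Toy model; every name here is hypothetical and not taken from the server source.
const BASE_SLOTS = 10; // the report mentions 10 base indexes before the rollover

// Build a catalog with at most BASE_SLOTS entries and no extra region allocated.
function makeCatalog(names) {
  return { base: names.slice(0, BASE_SLOTS), extra: null, total: names.length };
}

// Slots 0..9 live in `base`; anything beyond needs the separately allocated `extra`.
function getSlot(catalog, i) {
  if (i < BASE_SLOTS) return catalog.base[i];
  if (catalog.extra === null) throw new Error("missing Extra");
  return catalog.extra[i - BASE_SLOTS];
}

// Off-by-one variant: the last iteration reads slot `total`, one past the end.
function removeOffByOne(catalog, pos) {
  for (let i = pos; i < catalog.total; i++) {
    catalog.base[i] = getSlot(catalog, i + 1);
  }
  catalog.total--;
}

// Bounded variant: never reads past the last used slot.
function removeBounded(catalog, pos) {
  for (let i = pos; i + 1 < catalog.total; i++) {
    catalog.base[i] = getSlot(catalog, i + 1);
  }
  catalog.base[catalog.total - 1] = undefined; // clear the vacated slot
  catalog.total--;
}
```

In this model, removing the entry at position 9 from a 10-entry catalog with `removeOffByOne` reads slot 10 and throws, while removing the last of 9 entries does not, because slot 9 is still in the base region. `removeBounded` succeeds in both cases.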
Is duplicated by:
- SERVER-12484 IndexRebuilder assertion at startup (Closed)
- SERVER-13299 Can't create 2dsphere index (Closed)