- Type: Bug
- Resolution: Fixed
- Priority: Critical - P2
- Affects Version/s: 4.2.0, 4.4.0, 5.0.0, 6.0.0, 6.3.0-rc3
- Component/s: Index Maintenance
- None
- Storage Execution
- Fully Compatible
- ALL
- v7.0, v6.3, v6.0, v5.0, v4.4, v4.2
- Execution Team 2023-04-17, Execution Team 2023-05-01
- (copied to CRM)
ISSUE DESCRIPTION AND IMPACT
Users who upgraded from MongoDB 4.0 to 4.2+ and set featureCompatibilityVersion (FCV) to 4.2+ may experience index inconsistencies in unique partial indexes (unique indexes that specify a partialFilterExpression), in the form of missing index keys. Unique primary (_id) indexes are not affected.
Affected versions of MongoDB incorrectly remove index entries from the unique partial index when all of the following take place:
- Prior to upgrading to FCV 4.2+, a document has a key in a unique partial index because it matches the index's partialFilterExpression.
- Prior to upgrading to FCV 4.2+, another document with the same unique field value exists but is not contained in the unique partial index (because it does not match the index's partialFilterExpression).
- The un-indexed document is deleted after the upgrade to FCV 4.2+.
When the un-indexed document is deleted, affected versions of MongoDB incorrectly delete the key for the indexed document.
Missing index entries in a unique partial index in v4.2+ can have the following effects:
- Queries using the affected index may return incomplete results.
- MongoDB will incorrectly allow the insertion of a new document that matches the partialFilterExpression, even though the insert should fail with a duplicate key error. As a result, queries that do not use an affected unique index may return documents with duplicate values that should not have been allowed.
- Attempts to drop and rebuild the unique partial index may fail due to duplicate keys existing in the collection.
DIAGNOSIS AND AFFECTED VERSIONS
Affected Versions: 4.2.0, 4.4.0, 5.0.0, 6.0.0, 6.3.0-rc3
Fixed Versions: 4.4.23, 5.0.19, 6.0.7, 7.0.0-rc3, 7.1.0-rc0
Users who upgraded to v4.2 but are still running with FCV 4.0 are not impacted. Even after upgrading to a fixed version, users can still be affected by previously introduced missing index entries and by documents with duplicate index keys.
In FCV 4.2+, users whose collections rely on unique indexes with a partialFilterExpression may be impacted by this bug. To check whether a collection has a unique partial index, call `getIndexes` on each collection. To check for missing unique index entries, run validate.
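The `getIndexes` check can be sketched as a small filter over the returned index specifications. This is a minimal sketch, not an official tool: the helper name `findUniquePartialIndexes` and the sample data are hypothetical, and the only assumption about the input is that each specification carries `unique` and `partialFilterExpression` fields when those options were set.

```javascript
// Sketch: pick out unique partial indexes from the array returned by
// db.<collection>.getIndexes(). Pure function, so it also runs outside the shell.
function findUniquePartialIndexes(indexSpecs) {
  return indexSpecs.filter(
    (spec) => spec.unique === true && spec.partialFilterExpression !== undefined
  );
}

// Sample data shaped like getIndexes() output (illustrative values only).
const sampleSpecs = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { user: 1 }, name: "user_1", unique: true,
    partialFilterExpression: { active: true } },
  { v: 2, key: { email: 1 }, name: "email_1", unique: true },
];

// Only "user_1" is both unique and partial.
const candidates = findUniquePartialIndexes(sampleSpecs).map((s) => s.name);
console.log(candidates);
```

In the shell, the same helper could be applied to every collection, for example `db.getCollectionNames().forEach(n => printjson(findUniquePartialIndexes(db.getCollection(n).getIndexes())))`.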
REMEDIATION AND WORKAROUNDS
User action is required to remediate this issue. Impacted users should follow the steps below:
- Upgrade to a fixed version.
- Check for “missingIndexEntries” by running the validate() command.
- If there are no missing index entries, no more remediation measures are needed.
- If there are missing index entries, drop and rebuild the index.
Rebuilding the index may fail due to existing documents with duplicate keys in the collection. In that case:
- Build a new valid index.
- Query using that index to find the duplicate values in the collection.
Example:
// bad index
db.coll.createIndex({ "user" : 1 }, { "partialFilterExpression" : { "active" : true }, "unique" : true })
// new index
db.coll.createIndex({ "user" : 1, "foo" : 1 }, { "partialFilterExpression" : { "active" : true } })
> db.coll.aggregate(
    [
      {
        $group: {
          _id: { "user" : "$user" },  // key fields of the old index
          count: { $sum: 1 },
        },
      },
      {
        $match: {
          count: { $gt: 1 }
        }
      },
    ],
    { hint: "user_1_foo_1" }  // name of the new index
  )
Returns:
{ "_id" : { "user" : "x" }, "count" : 2 }
> db.coll.find({ "user" : "x" })
{ "_id" : 2, "user" : "x", "active" : true }
{ "_id" : 3, "user" : "x", "active" : true }
- The duplicate documents may then be removed according to the application logic.
- Once the affected documents have been dealt with, drop and recreate the unique index for all collections.
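The validate step in the list above can be sketched as a small decision helper. This is a hedged illustration only: `nextStep` and the two sample result objects are hypothetical, and the one assumption about the validate() result is that it exposes a `missingIndexEntries` array, as named in the steps.

```javascript
// Sketch: decide the next remediation step from a validate() result document.
// Assumes the result exposes a `missingIndexEntries` array (treated as empty if absent).
function nextStep(validateResult) {
  const missing = validateResult.missingIndexEntries || [];
  if (missing.length === 0) {
    // No missing entries: no further remediation is needed.
    return { action: "none", missingCount: 0 };
  }
  // Missing entries: the index has to be dropped and rebuilt.
  return { action: "drop-and-rebuild", missingCount: missing.length };
}

// Illustrative result objects, not real server output.
const clean = { valid: true, missingIndexEntries: [] };
const damaged = { valid: false, missingIndexEntries: [{ indexName: "user_1" }] };

console.log(nextStep(clean));
console.log(nextStep(damaged));
```

In the shell this would be called as `nextStep(db.coll.validate())` for each collection that has a unique partial index.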
Original description
The index format for unique indexes changed between MongoDB 4.0 and 4.2.
- In MongoDB 4.0, the key portion of the index entry contains only the [KeyString blob of the indexed value]. The value portion of the index entry contains a list of RecordIds and the type bits of the indexed value. Outside of secondary oplog application when secondary unique index constraints are temporarily relaxed, the value portion of the index entry would be a list with exactly one RecordId.
- In MongoDB 4.2 and later, the key portion of the index entry contains the combination of the [KeyString blob of the indexed value] + [the RecordId of the indexed document]. The value portion of the index entry contains the RecordId of the indexed document (redundantly) and the type bits of the indexed value.
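The two layouts can be pictured with a toy model. This is not the server's encoding: the function names, the plain-string stand-in for the KeyString blob, the `|` separator, and the numeric `typeBits` placeholder are all assumptions made for illustration.

```javascript
// Toy model of the two unique-index entry layouts.
// keyString is a plain string standing in for the KeyString blob;
// typeBits is an opaque placeholder value.

// 4.0 layout: key = KeyString only; value = list of RecordIds + type bits.
function entryV40(keyString, recordIds, typeBits) {
  return { key: keyString, value: { recordIds: [...recordIds], typeBits } };
}

// 4.2+ layout: key = KeyString + RecordId; value = RecordId (redundant) + type bits.
function entryV42(keyString, recordId, typeBits) {
  return { key: `${keyString}|${recordId}`, value: { recordId, typeBits } };
}

// Two documents (RecordIds 1 and 2) with the same indexed value "x".
const old1 = entryV40("x", [1], 0);
const old2 = entryV40("x", [2], 0);
const new1 = entryV42("x", 1, 0);
const new2 = entryV42("x", 2, 0);

console.log(old1.key === old2.key); // same key: the RecordId lives only in the value
console.log(new1.key === new2.key); // distinct keys: the RecordId is part of the key
```

In the old layout the key alone does not say which document an entry belongs to, so that information has to be read from the value portion.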
In FCV 4.2 and greater, a mongod writes the new format for index entries while still supporting the ability to read both formats. The in-place conversion incorrectly assumes that, when a document is being unindexed, the index entry in the 4.0 format can always be removed. This assumption results in documents having missing index keys, in a way very similar to SERVER-28546. For partial indexes, the indexing behavior is for SortedDataInterface::unindex() to be called and for the storage glue layer to tolerate the possibility that the document was never indexed. The in-place conversion should instead check the value portion of the 4.0-format index entry and proceed to remove the entry only if the RecordId matches the document being unindexed (see also 052345f).
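The difference between the faulty assumption and the intended check can be sketched on a toy index. This is a simplified model and not the server's code: the index is a `Map` from a string key to a 4.0-style value, and the function names `unindexBuggy` and `unindexChecked` are hypothetical.

```javascript
// Toy index: Map from keyString -> { recordIds: [...] } (4.0-style entries).

// Faulty behaviour: assumes the entry always belongs to the document being unindexed.
function unindexBuggy(index, keyString, recordId) {
  index.delete(keyString);
}

// Intended behaviour: remove the entry only if its RecordId matches.
function unindexChecked(index, keyString, recordId) {
  const entry = index.get(keyString);
  if (!entry || !entry.recordIds.includes(recordId)) {
    return; // the document was never indexed: nothing to remove
  }
  entry.recordIds = entry.recordIds.filter((id) => id !== recordId);
  if (entry.recordIds.length === 0) {
    index.delete(keyString);
  }
}

// RecordId 1: { user: "x", active: true }  -> matches the filter, indexed.
// RecordId 2: { user: "x", active: false } -> does not match, not indexed.
function makeIndex() {
  return new Map([["x", { recordIds: [1] }]]);
}

const buggy = makeIndex();
unindexBuggy(buggy, "x", 2); // deleting the un-indexed document
const checked = makeIndex();
unindexChecked(checked, "x", 2);

console.log(buggy.has("x"));   // false: the key for RecordId 1 was lost
console.log(checked.has("x")); // true: the key for RecordId 1 survives
```

Once the key is gone in the faulty case, a later insert of another active document with user "x" would pass the uniqueness check in this model, which mirrors the duplicate-key effect described in the impact section.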
- is related to
  - SERVER-28546 Documents can erroneously be unindexed from a partial index (Closed)
  - SERVER-88225 Repairing a unique index with duplicates may not work with old format indexes (Closed)
  - SERVER-76344 Support multiversion testing with the oldest binaries up through the latest binaries (Closed)
  - SERVER-32821 Support rolling upgrade to new unique index format (Closed)
  - SERVER-51762 Delete code for old unique index format (Closed)
- related to
  - SERVER-85536 [4.4] removing unindexed unique partial index entries generates write conflicts (Closed)
  - SERVER-51762 Delete code for old unique index format (Closed)