[SERVER-58936] Unique index constraints may not be enforced Created: 28/Jul/21 Updated: 29/Oct/23 Resolved: 28/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.7, 5.0.0-rc2 |
| Fix Version/s: | 4.4.8, 5.0.2, 5.1.0-rc0 |
| Type: | Task | Priority: | Blocker - P1 |
| Reporter: | Jonathan Streets (Inactive) | Assignee: | Jonathan Streets (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
Issue Summary as of August 17, 2021
ISSUE DESCRIPTION AND IMPACT
If this bug is exercised and multiple documents exist in a collection violating a unique index constraint, subsequent delete operations using the affected unique index will only delete half (rounding up) of the targeted documents per execution. This is a result of internal optimizations that rely on uniqueness. Query and update operations are not affected and will return all targeted documents.
DIAGNOSIS AND REMEDIATION
Deployments on the affected versions that rely on unique indexes apart from the _id index should be upgraded to MongoDB 4.4.8 or 5.0.2 as soon as possible. After upgrading to a version that is not impacted by this bug, users can determine whether they have been impacted by using the validate() command to validate all collections or by running the attached script, findUniquenessViolations.js.
This script iterates through every database and collection in the cluster looking for unique indexes that are not the _id index. For each unique index that it finds, it will perform an operation to list:
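The attached script is not reproduced in this ticket text. As a rough illustration of the kind of check described above, the following minimal sketch builds an aggregation pipeline that groups documents by the fields of a unique index and keeps only the groups with more than one member. The function name buildDuplicatePipeline, the placeholder key names k0, k1, ... and the collection name in the usage comment are assumptions made for this example and are not taken from findUniquenessViolations.js. The sketch ignores sparse, partial, collated, and multikey (array) indexes.

```javascript
// Hypothetical helper, not part of the attached script.
// keyPattern is an index key specification such as { "child.prop1": 1, b: -1 }.
function buildDuplicatePipeline(keyPattern) {
  const groupId = {};
  Object.keys(keyPattern).forEach((field, i) => {
    // Use positional names (k0, k1, ...) so that dotted field paths never
    // appear as field names inside the $group _id expression.
    groupId["k" + i] = "$" + field;
  });
  return [
    // Collect the _id of every document sharing the same index key values.
    { $group: { _id: groupId, ids: { $push: "$_id" }, count: { $sum: 1 } } },
    // Keep only key values that occur more than once.
    { $match: { count: { $gt: 1 } } },
  ];
}

// Possible use in mongosh (collection name is illustrative):
//   db.getCollection("orders").aggregate(
//     buildDuplicatePipeline({ customerId: 1, orderNo: 1 }),
//     { allowDiskUse: true }
//   );
```

Each resulting group lists the _id values of one set of documents that share the same key values, which is the information needed to inspect them by hand.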
As this script will potentially perform multiple index scans, we recommend running it against a secondary to minimize production impact.
Running the script
Here is an example invocation of the script (you may also use the legacy mongo shell), which will output results to results.txt in the current directory:
You can inspect the affected documents by querying on the provided _ids. Depending on the results and application logic, it may be safe to remove the duplicated documents; otherwise, more involved reconciliation may be required. For example, in this case, reviewing the affected documents we can see that they all match:
Therefore, based on our knowledge of the application, we can safely remove all but one using the _id:
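A minimal sketch of that removal step, assuming the _id values of one duplicate set have already been collected into an array: the helper builds a filter that matches every document in the set except the one chosen to keep. The name buildRemovalFilter, the keepIndex parameter, and the collection name in the usage comment are hypothetical and are not taken from the ticket.

```javascript
// Hypothetical helper: ids holds the _id values of one set of duplicates.
// Returns a filter matching all of them except ids[keepIndex].
function buildRemovalFilter(ids, keepIndex = 0) {
  if (!Array.isArray(ids) || ids.length < 2) {
    throw new Error("expected at least two _id values");
  }
  if (!Number.isInteger(keepIndex) || keepIndex < 0 || keepIndex >= ids.length) {
    throw new RangeError("keepIndex is out of range");
  }
  const toRemove = ids.filter((_, i) => i !== keepIndex);
  return { _id: { $in: toRemove } };
}

// Possible use in mongosh, after reviewing the documents by hand
// (collection name is illustrative):
//   db.getCollection("orders").deleteMany(buildRemovalFilter(ids));
```

Because the filter targets _id rather than the affected unique index, the delete goes through the _id index.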
Additional Option: Specifying namespaces to query
You may want to run the script only against namespaces that have been skipped. You can do this by modifying the script and providing an array of namespaces with the format ‘database_name.collection_name’ in the namespace variable. Namespaces in the admin, local, and config databases are unlikely to contain duplicate documents and may be ignored. For all other namespaces, verify that the user running the script has sufficient permissions to read the namespace.
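As an illustration of the namespace format, here is a minimal sketch that takes a list of skipped namespaces and drops those belonging to the admin, local, and config databases. The function names and the sample namespaces are assumptions for this example; how the resulting array is assigned inside the script depends on the script itself.

```javascript
// Databases the ticket describes as unlikely to contain duplicates.
const SKIPPED_DATABASES = new Set(["admin", "local", "config"]);

// Split 'database_name.collection_name' at the first dot only, since
// collection names may themselves contain dots.
function splitNamespace(ns) {
  const dot = ns.indexOf(".");
  if (dot <= 0 || dot === ns.length - 1) {
    throw new Error("invalid namespace: " + ns);
  }
  return { db: ns.slice(0, dot), coll: ns.slice(dot + 1) };
}

// Hypothetical helper: keep only namespaces worth re-checking.
function namespacesToCheck(namespaces) {
  return namespaces.filter((ns) => !SKIPPED_DATABASES.has(splitNamespace(ns).db));
}

// Example with made-up namespaces:
//   namespacesToCheck(["admin.system.users", "shop.orders", "local.oplog.rs"])
```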
Additional Option: Automatic cleanup
If you are absolutely certain that inserted documents will be materially similar, this script can be leveraged to delete all but either the newest or oldest of each set of duplicates. This option is disabled by default and is only suitable if the contents of the duplicate documents are materially similar for your use case.
Warning: Use this facility only with extreme care, as documents targeted by the script will be permanently deleted and the script does not back up or output the contents of those documents.
To use this script to clean up duplicate documents without regard for application-specific logic, uncomment the declaration of cleanupType in the script and set that variable to either delete_oldest or delete_newest.
This ticket will track the reverts of |
| Comments |
| Comment by Edwin Zhou [ 23/Aug/21 ] |
|
Hi deyan@nuxni.com, Thanks for reporting an issue with this script. This seems to be an issue with the legacy mongo shell, where EJSON is not supported. Will you please attempt to run this script using mongosh? You may download the new shell using the instructions listed here. Please download the latest version of the script, findUniquenessViolations_latest.js.
Best, |
| Comment by Deyan Petrov [ 23/Aug/21 ] |
|
Hi,
Running the latest script now gives me errors like this:
The 30089_findUniquenessViolation-dotReplacement.js was actually working ...
Br, Deyan |
| Comment by Kelsey Schubert [ 13/Aug/21 ] |
|
Thanks for reporting this issue with the script. If you are seeing the following error message,
when running the original script as a user with sufficient privileges, you may need to either enable the allowDiskUse option or escape dots and dollar signs in your index specification. I've uploaded a new script, findUniquenessViolations_latest.js. Thank you, |
| Comment by Deyan Petrov [ 13/Aug/21 ] |
|
It also does not work for indexes on child object fields, e.g. child.prop1. I guess everything with a dot in the index does not work. |
| Comment by Deyan Petrov [ 13/Aug/21 ] |
|
It seems that the script does not support indexes on array fields, e.g. if you have an array trxs in your document and you want to filter on trxs.extRef for example ...
{ trxs: [ { "extRef": "bla" } ] } |
| Comment by Deyan Petrov [ 13/Aug/21 ] |
|
Getting "We are unauthorized to access xxxxxxxxxxx. Skipping collection.." where xxxxxxxxxxx is the collection name. My user is atlasAdmin@atlas ... what could be the issue?
The underlying error is
|
| Comment by Jonathan Streets (Inactive) [ 28/Jul/21 ] |
|
Author: {'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}
Message: Revert " This reverts commit c5ac2eb1ea145693e1c6b974e88a2cfc18780134. |
| Comment by Jonathan Streets (Inactive) [ 28/Jul/21 ] |
|
Author: {'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}
Message: Revert " This reverts commit ae2da27652e552f101559466d165b82a3c122d71. |
| Comment by Jonathan Streets (Inactive) [ 28/Jul/21 ] |
|
Author: {'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}
Message: Revert " This reverts commit 297e2977ef3e394e02d61aedc954c9aaadc37e73.
|