[SERVER-44050] Arrays along 'hashed' index key path are not correctly rejected Created: 16/Oct/19 Updated: 29/Oct/23 Resolved: 28/Oct/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 4.2.1 |
| Fix Version/s: | 3.6.15, 4.3.1, 3.4.24, 4.2.2, 4.0.14 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | David Storch | Assignee: | Arun Banala |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Minor Change |
| Operating System: | ALL |
| Backport Requested: | v4.2, v4.0, v3.6, v3.4 |
| Sprint: | Query 2019-11-04 |
| Participants: |
| Linked BF Score: | 20 |
| Description |
|
Issue Status as of Nov 25, 2019

ISSUE SUMMARY
For example, a hashed index {"a.b": "hashed"} would incorrectly index documents having an array at "a", instead of throwing an error and rejecting the write operation. Hashed indexes are typically only used to support a shard key, and validation on mongos prevents these invalid documents from being inserted or created via an update. But there are still plausible cases in which corruption of the hashed index may have occurred:

Users running on a sharded cluster who created their hashed index on an empty collection and who have not bypassed mongos to write documents directly to a shard will not be affected by this issue.

USER IMPACT

RECOVERY STEPS
To address the existing corruption, users will need to either delete all the illegal documents or update them such that the resulting documents no longer have an array at any point along the index path. Users can find documents which may have an illegal array using a {$type: 'array'} predicate. The documents identified by the {$type: 'array'} query should then be deleted or updated by _id. Note that users can only update a shard key value on version 4.2. For 4.0 and older versions, users will have to delete the documents. Following deletion, the documents may be reformatted to eliminate the illegal array paths and then re-inserted.

AFFECTED VERSIONS

FIX VERSION

Original Description
Creating a hashed index on a field imposes the constraint that this field cannot contain an array:
This constraint is not correctly enforced if the hashed index is against a dotted field, and the array is present mid-path in the to-be-indexed document:
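To make "array present mid-path" concrete, here is a minimal Python sketch that is not taken from the server code. The helper names first_array_prefix and type_array_filter are hypothetical. The first walks a dotted path through a plain dictionary and reports the shortest prefix at which an array is found. The second builds one {$type: 'array'} predicate per path prefix, which is an assumption about how the predicate mentioned in the recovery steps might be applied to every component of the index path.

```python
from typing import Any, Dict, Optional


def first_array_prefix(doc: Dict[str, Any], dotted_path: str) -> Optional[str]:
    """Return the shortest prefix of dotted_path whose value is an array, else None."""
    current: Any = doc
    parts = dotted_path.split(".")
    for i, part in enumerate(parts):
        if not isinstance(current, dict) or part not in current:
            return None  # path ends early, so no array was reached
        current = current[part]
        if isinstance(current, list):
            return ".".join(parts[: i + 1])
    return None


def type_array_filter(dotted_path: str) -> Dict[str, Any]:
    """Build one {$type: 'array'} predicate per prefix of dotted_path (hypothetical helper)."""
    parts = dotted_path.split(".")
    prefixes = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return {"$or": [{p: {"$type": "array"}} for p in prefixes]}


print(first_array_prefix({"a": [{"b": 1}]}, "a.b"))  # a
print(first_array_prefix({"a": {"b": 1}}, "a.b"))    # None
print(type_array_filter("a.b"))
```

For the index {"a.b": "hashed"}, the first document above has its array at "a", which is the mid-path case this ticket describes. An array at "a.b" itself is the terminal case that was already rejected.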
The key generation implementation calls dotted_path_support::extractElementAtPath(), which returns an empty BSONElement if there is an array along the path. In downstream code, this empty BSONElement causes us to insert a null key into the index. The result is a corrupt index that can lead to missing query results:
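The failure mode described here can be modeled with a toy Python sketch. It is not the server's C++ implementation and does not claim to mirror the actual fix. All names are hypothetical, hashing of the key is omitted, and None stands in for the empty BSONElement and for the resulting null index key.

```python
from typing import Any, Dict, Optional


class ArrayAlongHashedPath(Exception):
    """Raised when a hashed index path traverses or ends at an array."""


def extract_element_at_path(doc: Dict[str, Any], path: str) -> Optional[Any]:
    # Toy model: returns None (the "empty element") when the path is missing
    # or when an array is met before the final path component.
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def buggy_hashed_key(doc: Dict[str, Any], path: str) -> Optional[Any]:
    elem = extract_element_at_path(doc, path)
    if isinstance(elem, list):
        raise ArrayAlongHashedPath(path)  # only a terminal array is rejected
    return elem  # None silently becomes a null index key


def strict_hashed_key(doc: Dict[str, Any], path: str) -> Optional[Any]:
    # Rejects an array at any point along the path, including the last component.
    current: Any = doc
    for part in path.split("."):
        if isinstance(current, list):
            raise ArrayAlongHashedPath(path)
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    if isinstance(current, list):
        raise ArrayAlongHashedPath(path)
    return current


doc = {"_id": 1, "a": [{"b": 5}]}
index = {buggy_hashed_key(doc, "a.b"): [doc["_id"]]}  # {None: [1]}
print(index.get(5, []))  # [] : an index lookup for a.b == 5 misses the document
```

In this model the document is filed under the null key, so a lookup by the value 5 finds nothing even though a query on "a.b" would match the document without the index. strict_hashed_key raises for the same document instead of producing a key.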
Note that we get the correct query result only after dropping the corrupt index.

Although this is both an index corruption and a query correctness issue, the issue cannot be encountered when the hashed index is supporting the shard key, because shard key fields cannot be arrays. The primary use case for hashed indexes is hashed sharding, so this may be an uncommon issue for hashed indexes that exist in the wild.

I have only tested 4.2.0 and a recent version of master, but I suspect that this bug affects all stable versions. The incorrect key generation code has not been substantially altered recently. |
| Comments |
| Comment by Githook User [ 30/Oct/19 ] |
|
Author: {'username': 'banarun', 'email': 'arun.banala@10gen.com', 'name': 'Arun Banala'}
Message: (cherry picked from commit 888f7e6fc10ccb999be203b8cbad4dbe19d0a5d2) |
| Comment by Githook User [ 30/Oct/19 ] |
|
Author: {'name': 'Arun Banala', 'username': 'banarun', 'email': 'arun.banala@10gen.com'}
Message: (cherry picked from commit 888f7e6fc10ccb999be203b8cbad4dbe19d0a5d2) |
| Comment by Githook User [ 29/Oct/19 ] |
|
Author: {'name': 'Arun Banala', 'username': 'banarun', 'email': 'arun.banala@10gen.com'}
Message: (cherry picked from commit 888f7e6fc10ccb999be203b8cbad4dbe19d0a5d2) |
| Comment by Githook User [ 29/Oct/19 ] |
|
Author: {'name': 'Arun Banala', 'username': 'banarun', 'email': 'arun.banala@10gen.com'}
Message: (cherry picked from commit 888f7e6fc10ccb999be203b8cbad4dbe19d0a5d2) |
| Comment by Githook User [ 25/Oct/19 ] |
|
Author: {'name': 'Arun Banala', 'username': 'banarun', 'email': 'arun.banala@10gen.com'}
Message: |