[SERVER-38949] Incorrect index bounds for {$ne: ["String"]} query Created: 11/Jan/19 Updated: 29/Oct/23 Resolved: 14/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.9, 4.1.9 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Charlie Swanson | Assignee: | Ian Boros |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | afz | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||
| Backport Requested: |
v4.0, v3.6, v3.4, v3.2
|
||||||||||||||||||||||||
| Steps To Reproduce: | Saving this file as repro.js:
You can reproduce this failure with
|
||||||||||||||||||||||||
| Sprint: | Query 2019-02-25 | ||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||
| Linked BF Score: | 24 | ||||||||||||||||||||||||
| Description |
|
For a query like
a collection scan will return the document
because "String" != ["String"] and so "String" is not in the array [["String"]]. However, if you build an index on {x: 1} then the above document is not returned. Using explain, you can see the index bounds are:
Compare this to what should be the equivalent query
we then get the bounds:
It looks like the index bounds building logic for an $in within a $not. |
| Comments |
| Comment by Tom Scott [ 17/Apr/19 ] | |||||||||||||||||||||||||
| Comment by Eric Sedor [ 16/Apr/19 ] | |||||||||||||||||||||||||
|
tubbo in this case can you please open a new ticket to describe the issue you're experiencing? Thanks in advance! | |||||||||||||||||||||||||
| Comment by Tom Scott [ 16/Apr/19 ] | |||||||||||||||||||||||||
|
We had some issues with this when updating to MongoDB 4.0.9 (actually, our CI updated to 4.0.9 and locally we were still using 4.0.6, so it took a minute to reproduce the actual error) when using the notablescan configuration setting (which is turned on for all our tests and in production). We received an error when trying to execute a $nin query that had an index, because apparently these types of queries are no longer indexable? If that's the case, might I suggest making $nin queries exempt from the error generated by notablescan? Not sure if I should make a new issue or if it's OK to comment here? | |||||||||||||||||||||||||
| Comment by Githook User [ 03/Apr/19 ] | |||||||||||||||||||||||||
|
Author: {'email': 'puppyofkosh@gmail.com', 'name': 'Ian Boros', 'username': 'puppyofkosh'}Message: | |||||||||||||||||||||||||
| Comment by Githook User [ 14/Feb/19 ] | |||||||||||||||||||||||||
|
Author: {'name': 'Ian Boros', 'email': 'puppyofkosh@gmail.com', 'username': 'puppyofkosh'}Message: | |||||||||||||||||||||||||
| Comment by James Wahlin [ 18/Jan/19 ] | |||||||||||||||||||||||||
|
We currently consider indexes as incompatible for negation for operators like $mod and $type in QueryPlannerIXSelect::_compatible. A quick fix for this issue would be to add a clause for negation of equality to array. We discussed alternatives to this however which include:
In addition to the above, charlie.swanson suggested we could explore adding an invariant that we have exact bounds, at the point where we complement bounds for the not case in the index bounds builder to help catch other cases where we may be generating incorrect bounds. Along with this we could consider performing an audit to see if there are other cases we suspect are not handled correctly. | |||||||||||||||||||||||||
| Comment by Charlie Swanson [ 11/Jan/19 ] | |||||||||||||||||||||||||
|
Ok I've narrowed this down a bit and I think I've found the problem. If we're dealing with a NOT inside of the index bounds builder, then we first build bounds for the inner predicate and then invert them. This usually works fine for when the inner predicate is an equality predicate, but doesn't work when that equality predicate is comparing to an array. This is because when comparing to array, we cannot build exact bounds for the query: see the bounds building logic here. Because the bounds for the equality are INEXACT_FETCH, this means that there will be some values in those index bounds which do not match the equality predicate. Such values will be filtered out by the FETCH. If we invert those bounds then, the index scan will never even see the values which would have been filtered out by the FETCH. It's incorrect to skip these values on inversion for NOT, since they should match the NOT. This very simple and hacky patch fixes the reproducer:
|