[SERVER-18653] Answering "equality to null" predicates with an index requires a FETCH stage Created: 26/May/15 Updated: 06/Dec/22 Resolved: 03/Jun/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.0.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Andrey Hohutkin | Assignee: | Backlog - Query Optimization |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
I have a primary collection users: { _id: ObjectId, follow: [ObjectId], primary: ObjectId }primary field is optional in document When I try to query all followers without primary: ) Here is a result from explain: }, , } It shows 'filter' stage that absolutely not needed. I checked that if I put primary as 0 - it works fine. |
| Comments |
| Comment by mils [ 13/Nov/17 ] | ||||||||||||
|
Hi, So firstly thank you both to andrey and @rassi for the super-useful information. It has been vital to my db performance. Simply put, I looked through the 2.6 changelist and this is the only documentation on why the null-semantic change was made: > null equality conditions on array elements (e.g. "a.b": null) no longer match document missing the nested field a.b (e.g. a: [ 2, 3 ]) @rassi can you please provide any more info on why this used to be indexed, and is no longer indexed? I think a lot of mongodb users out there would gain confidence in the engine if they had a better understanding of why the change was made. Thanks again, | ||||||||||||
| Comment by Andrey Hohutkin [ 08/Jun/15 ] | ||||||||||||
|
Because nobody replied to my comment I will create a ticket with a feature request. | ||||||||||||
| Comment by Andrey Hohutkin [ 05/Jun/15 ] | ||||||||||||
|
I want to show you that this issue has "bad design" and I will argue that. "null" is not reserved word or value for mongodb. It is value that used in documents as well as a lot of others like Number, String or ObjectId. So I can freely to use "null" value in db and expect that I can use full power of engine. Basing on this fact, I give you example that contains contradiction in it. Now I want to explain my opinion about your explanation how it works now. I think, If a collection already has an index on specified field it should be ready for "any" case of data. Otherwise it is just takes a resources and does nothing. What is your opinion and what others think of it? Thanks for attention. | ||||||||||||
| Comment by J Rassi [ 04/Jun/15 ] | ||||||||||||
|
I will attempt to clearly spell out my previous explanation:
Index keys don't have the field name attached, so the key is indeed {"": null}.
I was referring to the second bullet in this list, not the third bullet. The second bullet reads as follows: "null equality conditions on array elements (e.g. 'a.b': null) no longer match document missing the nested field a.b (e.g. a: [ 2, 3 ])."
As I mentioned above, this query is using an index, however the query cannot be covered by this index. | ||||||||||||
| Comment by Andrey Hohutkin [ 04/Jun/15 ] | ||||||||||||
|
Hi Jason! I do not agree with you that it "Works as Designed". Now I want to know how your explanation is related to my issue. According to spec "undefined" is deprecated. IMHO, this is ridiculous, that a simple and clear issue not assumed as a bug. Please, reopen the issue or let someone else to say his opinion. P.S. generates the index key {"": null}." | ||||||||||||
| Comment by J Rassi [ 03/Jun/15 ] | ||||||||||||
|
Hi Andrey, I've investigated further, and determined that this was in fact not a regression introduced in the 2.6 query engine rewrite, but instead was an intentional behavior change. Version 2.6.0 of the server changed the semantics of a null equality match predicate, such that the document {a: []} was no longer considered a match for the query predicate {"a.b": null} (in prior versions of the server, this document was considered a match for this predicate). This is documented in the 2.6 compatibility notes, under the "null comparison" section. For an index with key pattern {"a.b": 1}, this document {a: []} generates the index key {"": null}. Other documents like {a: null} and the empty document {} also generate the index key {"": null}. As a result, if a query with predicate {"a.b": null} uses this index, the query system cannot tell just from the index key {"": null} whether or not the associated document matches the predicate. As a result, INEXACT_FETCH bounds are assigned instead of EXACT bounds, and hence a FETCH stage is added to the query execution tree. As such, the existing code is correct, and I will close this ticket with resolution "Works as Designed". If you require an index-only count for this operation, you must implement a workaround in your application (for example, you could change your schema such that followers without a primary are represented by "primary: false" instead of "primary: null"). ~ Jason Rassi | ||||||||||||
| Comment by Andrey Hohutkin [ 30/May/15 ] | ||||||||||||
|
I think I found a place with a bug.
| ||||||||||||
| Comment by J Rassi [ 26/May/15 ] | ||||||||||||
|
Hi, Thanks for the report. I can confirm that this count operation should be able to use a covered scan on this index, but it does not. This appears to be a regression introduced by the query engine rewrite for the 2.6 release. I can see that predicates of the form {a: null} (or {a: {$eq: null}}, equivalently) generate "inexact bounds" in cases where "exact bounds" could be generated. This means that query plans in which these predicates are indexed will unnecessarily cause the document to be fetched and re-checked against the predicate (note that the index is indeed being used, but the query is not being covered by the index scan). I'm updating the summary and moving this ticket to the "Needs Triage" state; please continue to watch this ticket for updates on when a fix may be scheduled. ~ Jason Rassi |