[SERVER-29326] count query with null criterion can't be covered by non-sparse index Created: 23/May/17 Updated: 21/Jun/17 Resolved: 23/May/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.2.6, 3.2.12 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kay Agahd | Assignee: | Mark Agarunov |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Participants: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
Count is unable to cover queries which are using null as criterion even though an appropriate non-sparse index exists. The problem is that such a query slows down the whole database because lot of data needs to be read from disk (75-100 MB/sec while the query is running). The used index in production is about 5 GB on each mongodb node. The total size of all indexes in production is 32 GB and fit perfectly in RAM since each node has 128 GB RAM. I broke down the problem to a minimalistic, unsharded setup. I inserted the following types of documents:
I created the non-sparse index {shopId:1, missingSince:1}. The execution plan of the query count({"shopId":1, "missingSince":null}) indicated "totalDocsExamined":12 which means that 12 documents had to be fetched. These must be the 5 documents of point 2 plus the 7 documents of point 3. All these 12 documents should be in the index with shopId:1, missingSince:null, thus satisfying the query. |
| Comments |
| Comment by Asya Kamsky [ 24/May/17 ] |
|
You are correct that for count queries or other queries which don't need to return the indexed field that is either null or missing the FETCH is unnecessary and the ticket this one was marked a duplicate of (SERVER-18861) actually tracks improvements of current index use for null equality, I agree that the ticket description is not particularly clear about that. Its counterpart for $ne null queries is We will try to clarify the description of these tickets to make it more clear which one tracks which work. The main reason I brought up SERVER-12869 is that we have been seriously considering implementing this improvement and when we do, the work to improve performance of "old" indexes will be unnecessary, so we want to make sure that we schedule work that will improve things for the largest number of cases while staying relevant for multiple future releases. |
| Comment by Kay Agahd [ 24/May/17 ] |
|
Thank you for jumping in asya, very much appreciated! |
| Comment by Asya Kamsky [ 24/May/17 ] |
|
There is a design issue underlying this. MongoDB query language semantics dictate that {x:null} must match both documents where x is not present, and ones where x is present and equal to null. On the other hand, in the indexes, the value null represents both, documents with field absent and documents with field set to null. To resolve this will likely require work captured in SERVER-12869 which would be a major undertaking requiring a new index version and has to be scheduled against all the other large projects that the query team has in the backlog. Regards, |
| Comment by Kay Agahd [ 24/May/17 ] |
|
Thanks Mark.Agarunov but I'd rather say "has to do with" or "depends on" instead of "is duplicated" because this issue is for combined indexes and not single field indexes as SERVER-18861 . If you are aware of such a critical bug you should at least update mongodb's documentation! Not only because it has taken us and probably others many hours to trace down this issue but also because if we knew that queries with null predicates never can be covered we would have chosen another schema design! |
| Comment by Mark Agarunov [ 23/May/17 ] |
|
Hello kay.agahd@idealo.de, Thank you for the report. Looking over the output you've provided, this behavior appears to be the same issue as described in SERVER-18861. I have closed this ticket as a duplicate, so please follow SERVER-18861 for any updates on this issue. Thanks, |