[SERVER-22574] Allow COUNT stage to sum COUNT_SCANS for queries with $in operator Created: 11/Feb/16 Updated: 06/Dec/22 Resolved: 17/Feb/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Vitaly Puzrin | Assignee: | Backlog - Query Team (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Query
|
||||||||
| Participants: | |||||||||
| Description |
|
Count over covered index happens too slow if $in condition exists. Example:
takes ~50ms with 20000 keys examined, while separate queries takes 1 & 0 ms:
Not we split such $in queries to mutiple, but: 1. That's not convenient |
| Comments |
| Comment by Kelsey Schubert [ 17/Feb/16 ] | |||
|
Hi vitaly, After review, the work described in this ticket is already encapsulated in SERVER-17266, so I will be closing this a duplicate. Please free to vote for it and watch it for updates. Regarding your second note, I would recommend posting to the slow query to mongodb-users group with more details about your schema, indexes, and use case. The user group provides MongoDB-related support and may suggest modifications to improve the performance you are seeing. Kind regards, | |||
| Comment by Vitaly Puzrin [ 12/Feb/16 ] | |||
|
Tomas, i have 2 notes to your comment: Note 1
Is a bit faster than with $in [1, 2], but still much slower than strict condition:
Use case: count number of visible topics in forum section before specific date/_id. So, to make optimization of { y: $in [1,2] } useful, it's speed should be comparable with strict condition { y: 1 }, not with current single { y: $in [1] }. I mean, such $in should not be considered as interval. Note 2 Consider another case with $in, that needs improvement - "pagination" query (also with index for IXSCAN):
It suffers with the same performance problem, but can not be splitted manually as count. Replace $in with $or does not help much. I understand that skip+limit is slow by nature, but mongo's performance is significantly slower than expected (compared with mysql). Is it possible to generalize request to cover this case too (or create a new one)? | |||
| Comment by Kelsey Schubert [ 12/Feb/16 ] | |||
|
Hi vitaly, Thank you for opening this ticket. The behavior you are observing is expected: the separate queries are able to take advantage of optimizations in the COUNT_SCAN stage to complete more rapidly. One of the current requirements of COUNT_SCAN is that index bounds must be a single interval. The following query is able to take advantage of the COUNT_SCAN stage which utilizes a number of optimizations over INDEX_SCAN since it has only one index bound interval.
Whereas this query:
is composed of two discrete index bounds: [1.0, 1.0] and [2.0, 2.0], and consequently the COUNT_SCAN stage is not selected. You may want to consider whether the following query would provide the same results as the query with the $in operator.
Since the index bound is a single interval it is able to take advantage of the COUNT_SCAN optimizations. This workaround would be dependent on your particular use case. I am modifying to the ticket summary to highlight the specific improvement request and sending it to be scheduled during the next round of planning. Please continue to watch this ticket for updates. Thank you, |