[SERVER-44057] Find on non-indexed and sort on indexed is slower then find on non-indexed and sort on non-indexed Created: 17/Oct/19 Updated: 07/Nov/19 Resolved: 07/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 4.0.13 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Sean Morris | Assignee: | Eric Sedor |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | Create a collection with 2 million entries containing 3 fields. Create an index on field C.
Do a find of A and sort on A. Do a find of A and sort on C. The second query is a magnitude slower then the first. |
||||||||
| Participants: | |||||||||
| Description |
|
I have a collection with over 2 million entries in it. I allow find and sort on both indexed and non-indexed fields.
When I do a find on a non-indexed field but sort on an indexed field it is significantly slower which is not what I would expect. I would expect it to be the same or faster then the previous query since it should be using the index for the sort. Here is the query plan
|
| Comments |
| Comment by Eric Sedor [ 23/Oct/19 ] |
|
SERVER-23406 is considered an improvement rather than a bug. Importantly, directly comparing the performance between these queries is misleading because the differing sort makes them different queries, even if the result set is the same. In the case of the above queries, the Query Planner is more at the mercy of certain basic concerns. Consider:
Generally it is best to avoid these concerns by ensuring very good indexes for queries, such as {lifecycleState:1, repId:1} for the latter. Does this make sense? For further assistance reasoning about query performance, I encourage you to ask our community by posting on the mongodb-user group or on Stack Overflow with the mongodb tag. Gratefully, |
| Comment by Sean Morris [ 22/Oct/19 ] |
|
I see that SERVER-23406 has been open for awhile. Is this a confirmed bug and just not resolved yet, or is it considered to be working as expected? Thanks, Sean |
| Comment by David Storch [ 22/Oct/19 ] |
|
eric.sedor sean.morris@xyleminc.com, at a cursory glance it appears that this may be the same issue as SERVER-23406. The fast plan is a collection scan followed by a blocking SORT. The slow plan has to scan an entire index. Although the fast plan needs to perform a blocking SORT operation and the slow one does not, the execution stats show that we only need to sort 111 matching documents, which should be pretty cheap. So the difference in performance appears to be coming primarily from the fact that scanning the whole collection is faster than scanning the entire {"repId": 1} index. SERVER-23406 has additional technical details about this issue. |
| Comment by Sean Morris [ 18/Oct/19 ] |
|
Here is the first query: db.getCollection('radios').find({"lifecycleState":"INVENTORY"}).sort({"lifecycleState":1}) Here is the second one: db.getCollection('radios').find({"lifecycleState":"INVENTORY"}).sort({"repId":1})
There is no index on lifecycleState. The index on repId is unique and ascending.
|
| Comment by Eric Sedor [ 17/Oct/19 ] |
|
Hi sean.morris@xyleminc.com, thanks for this report. To help us interpret the explain output you're seeing and assist us in attempting an internal reproduction, could you please provide the exact operation you ran to produce these explain plans? Gratefully, |