[SERVER-69359] Aggregate query bails on DISTINCT_SCAN and uses IXSCAN Created: 01/Sep/22 Updated: 21/Sep/22 Resolved: 21/Sep/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Terence Honles | Assignee: | Asya Kamsky |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
I have a query which I believe can still be satisfied by the DISTINCT_SCAN and it ends up taking too long so I'm going to have to issue N distinct queries instead of using the aggregate framework like I expected to work. The scenario is the following: collection with index {one: 1, two: 1} The following query uses a DISTINCT_SCAN as expected (comments inline how to force it to bail):
In our environment there are many thousands of records that are forced to be scanned through, and in the code I had written I was previously using $in and I thought that might have been the problem, but using $or instead still had the same issue. I did narrow it down to using the index in reverse order, and that means it would effectively be solved if I could use $last as described in https://jira.mongodb.org/browse/SERVER-37304 but since this is specifically using $first it seems like there might be a bug in the query planner. |
| Comments |
| Comment by Asya Kamsky [ 21/Sep/22 ] | |||||
|
Looks like your issue was resolved by changing the `$sort` expression in the aggregation. I'm going to close this ticket but feel free to watch SERVER-65159 - when that gets resolved, issue like the one you encountered would also get resolved. | |||||
| Comment by Asya Kamsky [ 21/Sep/22 ] | |||||
|
This is related to SERVER-65159 which is tracking allowing compound index to satisfy sort order that's not exact match to it. | |||||
| Comment by Terence Honles [ 12/Sep/22 ] | |||||
|
Interesting, I had seen that but for some reason I didn't think of that being an issue because I don't care about the order and was merely trying to hint what index to use and I needed only the second field in reverse order. I can check and confirm that specifying both fields in reverse order works as you suggest. | |||||
| Comment by Yuan Fang [ 12/Sep/22 ] | |||||
|
Hi terence@nexleaf.org , Thank you for your report. If I understand correctly, the query failed when it was trying to sort on:
and succeed when sorting on:
I believe this is due to sorting on multiple fields does not match the index key pattern:
For more information, see MongoDB doc: sort-on-multiple-fields. Based on the configuration you described, the collection was set with a compound index: {one: 1, two: 1 }, thus it can support a sort on { one: 1, two: 1 } and { one: -1, two: -1 } but not on { one: 1, two: -1 } or { one: -1, two: 1 }. Failing on following this rule may result in poor query performance just like what you observed. To check for the usage of the indexes in aggregation, it would be helpful to run explain results for the aggregate operation:
In order for us to fully diagnose the problem, I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time. For each node in the replica set spanning a time period that includes the incident, would you please archive (tar or zip) and upload it to that link:
Regards, Yuan
|