[SERVER-27076] Query Planner uses inefficient plan for regular expression + sort with aggregations Created: 16/Nov/16 Updated: 17/Nov/16 Resolved: 17/Nov/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jacob Jewell | Assignee: | Kelsey Schubert |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
|||||||||||||||||||||
| Operating System: | ALL | |||||||||||||||||||||
| Steps To Reproduce: |
|
|||||||||||||||||||||
| Participants: | ||||||||||||||||||||||
| Description |
|
If a collection has a compound index on a string field + a sort field and another index on just the sort field then an aggregation using a regular expression match will only use the index with just the sort field. This does not happen with the equivalent find command. Adding an index on sort field + string field doesn't improve performance. Changing the location of the sort in the aggregation also has no effect. The reason for both indexes is that some queries need to be able to do a find based just on the sort key. |
| Comments |
| Comment by Charlie Swanson [ 17/Nov/16 ] |
|
Hi jakesjews, Unfortunately that is the only known workaround at this time. Note the problem here is that if your regex happened to match 90% of the collection, then a sort using the {created: 1} index would be the better choice. Only when the result set is small enough does it make sense to favor an in-memory (blocking) sort. Unfortunately, our aggregation optimizer is not sophisticated enough to determine the point where one becomes faster than the other, so it instead always chooses non-blocking, index-provided sorts when they are available. Please vote for |
| Comment by Jacob Jewell [ 16/Nov/16 ] |
|
Sorry about the duplicate issue you can close this one. Is the only way to work around this to add a $project? We have a pattern throughout our web app where users have an optional search textbox for serial numbers above reports. We don't want to hurt performance when the text search is empty so we need the ability to sort collections with or without a text search. Adding a conditional $project to all our queries based on the presence of a wildcard would be a pretty difficult task. |