[SERVER-13197] Tighten index bounds and allow compound index to be chosen when predicate on leading field is not provided Created: 14/Mar/14 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.6.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Neil Lunn | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 8 |
| Labels: | query, query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||
| Description |
|
Updated description: For a compound index {a: 1, b: 1}, a range query {b: {$gt: 10, $lt: 20}} (using a hint on the compound index) produces the following index bounds:
Bounds could be tighter under 2.6. As a workaround for 2.6, consider adding the leading field to the query.
--------
The range shown in the index bound comes from the range on the query, and is the second element in the compound key. The same data in 2.6.0-rc0 and rc1 produces this:
So the range is now gone, nscanned is up to the full size of the collection as also has the execution time increased as a result of the scan. As stated, the matching range is on the second element of the compound key, but if this is intentional why does the prior version match the same number of documents and in considerably less time? |
| Comments |
| Comment by Neil Lunn [ 18/Apr/14 ] | ||||
|
Glad that this is cleared up. I actually did become aware that specifying the MinKey and MaxKey values in the query would correct the behavior in the optimizer a few days ago, and then only after an exhaustive search of the new Index interface code. The usage of the hint in the example was actually just a simulation of the intended sort operation as referenced in this question that was asked on stackoverflow http://stackoverflow.com/questions/22276988/mongodb-how-does-it-avoid-full-collection-scan So the point was to explain that the optimizer should be using the bounds when specified as such, yet this led to a false call of a bug in the 2.4 release, so hence the submission here. So I am glad that this confirms that the bounds should be used, and that use of MinKey and MaxKey should be an acceptable workaround until the previous behavior is restored. Regards, P.S Much better title for the issue as it is very descriptive of the actual problem | ||||
| Comment by Benety Goh [ 17/Apr/14 ] | ||||
|
HI neillunn, Thanks for bringing this to our attention. This is indeed a regression in our index bounds logic. We have been able to reproduce your results and would like suggest a workaround for your application. Instead of specifying a predicate on age and using an explicit hint on the compound index, we'd like to suggest amending the query to include a MinKey, MaxKey range on name:
With the additional predicate on name in the query, the hint is no longer necessary. Regards, | ||||
| Comment by Neil Lunn [ 03/Apr/14 ] | ||||
|
It would be nice to have some kind of progress update on this. Given the extensive re-factoring of the index API this very much "smells" of a regression. A quick perusal of the code seems to indicate that various methods that were present, prior to the re-factor seem have been omitted. But do to the re-factor this is very hard to lock down without adequate unit tests. What seems to be missing is the correct detection of the "IndexBounds" that seemed to exist before. Any comment? | ||||
| Comment by Asya Kamsky [ 14/Mar/14 ] | ||||
|
Note: this requires an explicit hint with the compound index to reproduce. |