[SERVER-56865] explodeForSort logic potentially suboptimal when multikey field(s) follow sort fields in the index Created: 11/May/21 Updated: 31/Oct/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.14, 4.4.6 |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Chris Harris | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Optimization
|
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Consider the query:
If following the ESR Guidance, one would create the following index:
This works fine and produces a SORT_MERGE plan when the index is not multikey:
Making the index multikey on the trailing range predicate converts the plan into one that does NOT utilize the SORT_MERGE stage and instead performs a blocking SORT:
This behavior does not occur when the array/multikey field is one of the prefixed equality conditions:
The problem appears to be related to how we examine the index key patterns to determine if they provide the desired functionality. More specifically, it looks like there is an assumption made when identifying portion of the index that provides the sort and checking it for multikeyness. The associated comments in the code say:
The assumption that all of the remaining fields are relevant to the sort is not necessarily correct. In the example above, it is only the s field in the index that is associated with the sort. The r field after that is causing the code to exit as it is a multikey field, but since it is NOT part of the sort the fact that it is an array should not violate any constraints on sorting. Investigated using "version" : "4.5.0-7162-g7581854". |
| Comments |
| Comment by Diego Rodriguez (Inactive) [ 01/Jul/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I have been able to reproduce this in the latest v4.2 and v4.4 builds: 4.2.14 and 4.4.6 respectively. Sample Data:
Query:
Tested Indexes:
Out of the three indexes only the last two produce a SORT_MERGE plan. The first one, which includes a multi key path on _rperm, produces a blocking sort. Regards | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Chris Harris [ 12/May/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
It occurred to me after writing the description text that the problem is independent of the query actually using the trailing key(s) from the index:
The only requirement is that there are fields with multikey paths present after the (full) sort definition. I'll update the title of the ticket accordingly, but leaving the description of the case as-is for now since it describes a particular example of when the problem might be encountered. |