[SERVER-53626] Minimize index scanning when retrieving distinct values grouped by more than one field Created: 06/Jan/21 Updated: 21/Jan/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Chris Harris | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | indexv3 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Query Optimization
|
| Sprint: | QO 2021-10-04, QO 2021-10-18 |
| Participants: | |
| Description |
|
MongoDB 4.2 introduced the ability to avoid full index scans for aggregation pipelines that include a specific type of $group stage - those that logically request distinct information which can be obtained by scanning a single document. This was initially implemented via As confirmed by this comment from a related enhancement in |
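For illustration, the idea of answering a multi-field distinct request without a full index scan can be sketched in Python. This is a toy simulation only, not server code: the compound index is modelled as a sorted list of tuples, and `seek_past` and `distinct_prefixes` are hypothetical helper names. After reading one key per distinct prefix, the scan seeks directly past every other key that shares that prefix.

```python
def seek_past(keys, prefix, lo):
    """Binary-search for the first position whose leading fields exceed `prefix`."""
    hi = len(keys)
    n = len(prefix)
    while lo < hi:
        mid = (lo + hi) // 2
        if keys[mid][:n] <= prefix:
            lo = mid + 1
        else:
            hi = mid
    return lo


def distinct_prefixes(keys, prefix_len):
    """Return the distinct leading-field combinations and the number of keys examined.

    `keys` must be sorted, as the entries of a compound index would be.
    """
    results = []
    examined = 0
    pos = 0
    while pos < len(keys):
        prefix = keys[pos][:prefix_len]
        results.append(prefix)
        examined += 1
        # Skip every remaining key that shares this prefix.
        pos = seek_past(keys, prefix, pos + 1)
    return results, examined


# Toy stand-in for an index on (a, b, c); the data is made up for this sketch.
index_keys = sorted([
    (1, "x", 10), (1, "x", 11), (1, "y", 12),
    (2, "x", 13), (2, "x", 14), (2, "x", 15),
])
groups, examined = distinct_prefixes(index_keys, 2)
```

In this sketch only one key per distinct `(a, b)` pair is examined, rather than all six entries, which is the behaviour this ticket asks for when grouping by more than one field.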
| Comments |
| Comment by Katherine Wu (Inactive) [ 26/Oct/21 ] | ||||
|
The proposed change in SERVER-21992 is different from WIP patch: https://github.com/10gen/mongo/tree/kaywux/SERVER-53626 | ||||
| Comment by Katherine Wu (Inactive) [ 25/Oct/21 ] | ||||
|
The distinct command (and our current version of the index) conflates null and missing values. However, $group with a multi-field document _id currently distinguishes between the two. Without SERVER-21992, this optimization would therefore return different results from the existing $group behavior. | ||||
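A small Python sketch of the mismatch described in this comment, under simplifying assumptions: documents are plain dicts, `group_id` is a hypothetical stand-in for how a document-valued _id omits a missing field, and `index_key` is a hypothetical stand-in for an index that stores a missing field as null.

```python
def group_id(doc, fields):
    """Model a document-valued group key: a missing field is left out entirely."""
    return tuple((f, doc[f]) for f in fields if f in doc)


def index_key(doc, fields):
    """Model an index key: a missing field is stored the same way as null."""
    return tuple(doc.get(f) for f in fields)


# Made-up documents: one has b explicitly null, the other has no b at all.
docs = [{"a": 1, "b": None}, {"a": 1}]
fields = ["a", "b"]

group_ids = {group_id(d, fields) for d in docs}
index_keys = {index_key(d, fields) for d in docs}
```

Here `group_ids` holds two distinct keys while `index_keys` holds one, so a result built purely from index keys would merge groups that the group-by-document form keeps apart.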
| Comment by Ruslan Abdulkhalikov (Inactive) [ 23/Apr/21 ] | ||||
|
As part of this ticket, it would be nice to handle the case where the group key is an array containing a single indexed field:
The case with an object works fine:
|