-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Optimization
-
(copied to CRM)
Given an index on {X:1, Y:1, Z:1, D:1} and a distinct() query which has filters on X, Y, and Z, and requests a distinct on D, we currently will not use DISTINCT_SCAN if D is multikey. SERVER-28952 explains why this was done.
I believe there is a special case where we can actually use DISTINCT_SCAN when D is multikey:
-D has [MinKey, MaxKey] bounds.
AND one of the following is true:
1) None of X, Y,Z share a multikey path prefix with D.
2) All of X,Y,Z which do share a multikey path prefix with D have [MinKey,MaxKey] bounds.
For example, assuming D has [MinKey, MaxKey] bounds:
If X is 'a.b' and D is 'c.d' then the optimization can be done.
If X is 'a.b' and D is 'a.c' and 'a' is not multikey, the optimization can be done.
If X is 'a.b' and D is 'a.c' and 'a' is multikey and the bounds on 'a.b' are not [MinKey, MaxKey] the optimization cannot be done.
These are the same conditions we use for determining whether a multikey index can provide a sort. See this code for a more detailed explanation.
- is duplicated by
-
SERVER-90524 Distinct command should use index when distinct field is array sub-field and query is used
- Closed
-
SERVER-95427 IXScan chosen over distinct scan for multikey query
- Closed