Loading...

XML

Word

Printable

JSON

Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Optimization
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Given an index on {X:1, Y:1, Z:1, D:1} and a distinct() query which has filters on X, Y, and Z, and requests a distinct on D, we currently will not use DISTINCT_SCAN if D is multikey. ~~SERVER-28952~~ explains why this was done.

I believe there is a special case where we can actually use DISTINCT_SCAN when D is multikey:

-D has [MinKey, MaxKey] bounds.

AND one of the following is true:

1) None of X, Y,Z share a multikey path prefix with D.

2) All of X,Y,Z which do share a multikey path prefix with D have [MinKey,MaxKey] bounds.

For example, assuming D has [MinKey, MaxKey] bounds:

If X is 'a.b' and D is 'c.d' then the optimization can be done.

If X is 'a.b' and D is 'a.c' and 'a' is not multikey, the optimization can be done.

If X is 'a.b' and D is 'a.c' and 'a' is multikey and the bounds on 'a.b' are not [MinKey, MaxKey] the optimization cannot be done.

These are the same conditions we use for determining whether a multikey index can provide a sort. See this code for a more detailed explanation.

is duplicated by

SERVER-90524 Distinct command should use index when distinct field is array sub-field and query is used

Closed

SERVER-95427 IXScan chosen over distinct scan for multikey query

Closed

Assignee:: Unassigned
Reporter:: Ian Boros
Participants:: Daniel Gottlieb, Ian Boros
Votes:: 2 Vote for this issue
Watchers:: 15 Start watching this issue

Created:: Aug 12 2021 04:37:13 PM UTC
Updated:: Feb 12 2025 02:06:19 PM UTC

Details

Description

Attachments

Issue Links

Activity

People

Dates