- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 6.0.6
- Component/s: None
- None
- Query Optimization
- ALL
Suppose we have a collection with a large number of documents, where each document has two fields: one is a scalar value and the other is a list of strings.
Suppose there's an index on {scalar: 1, list: 1}, and we perform operations of the form:
foo.distinct('list', {scalar: <exact match>})
We expect it to be a legal optimization to use a distinct scan on the index, bounded on the scalar match (despite multikey idiosyncrasies).
What we're observing is that mongod instead does an index scan and fetches each matching document to aggregate the distinct "list" values.
I suspect this (lack of) optimization dates back to SERVER-28952. IIUC, that ticket describes a correctness problem, but I believe its case is a slight variation, as its query predicate also depends on the multikey field. Happy to be wrong here and learn that the proposed optimization is in fact not legal for this simpler case.
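To make the argument concrete, here is a toy model in plain JavaScript (runnable in Node, no mongod involved). It is only a sketch under stated assumptions: the sample documents and the helper names `buildIndex`, `distinctByFetch`, and `distinctByIndex` are hypothetical, and the "index" is just a sorted array with one key per array element, which is how I understand multikey indexes to store entries. It is not how mongod implements DISTINCT_SCAN. The first case mirrors the query in this ticket; the second is my own guess at the kind of variation where the predicate also touches the multikey field.

```javascript
// Toy model only: NOT mongod's implementation. All data and helper names are hypothetical.
const docs = [
  { _id: 1, scalar: 1, list: ["a", "b"] },
  { _id: 2, scalar: 1, list: ["b", "c"] },
  { _id: 3, scalar: 2, list: ["c", "d"] },
];

// Assumed model of a multikey index on {scalar: 1, list: 1}:
// one (scalar, list element) key per array element, kept in sorted order.
function buildIndex(documents) {
  const keys = [];
  for (const d of documents) {
    for (const item of d.list) {
      keys.push({ scalar: d.scalar, list: item, id: d._id });
    }
  }
  keys.sort((x, y) =>
    x.scalar - y.scalar || (x.list < y.list ? -1 : x.list > y.list ? 1 : 0));
  return keys;
}

// Reference semantics: fetch every matching document, then collect all of its list elements.
function distinctByFetch(documents, match) {
  const out = new Set();
  for (const d of documents) {
    if (match(d)) d.list.forEach((v) => out.add(v));
  }
  return [...out].sort();
}

// Index-only approach: look at keys inside the bounds and never fetch a document.
// (A real distinct scan would skip ahead between distinct keys; here we simply dedupe.)
function distinctByIndex(index, inBounds) {
  const out = new Set();
  for (const k of index) {
    if (inBounds(k)) out.add(k.list);
  }
  return [...out].sort();
}

const index = buildIndex(docs);

// Case 1: exact match on the scalar only, as in this ticket.
const fetchScalar = distinctByFetch(docs, (d) => d.scalar === 1);
const indexScalar = distinctByIndex(index, (k) => k.scalar === 1);
console.log(fetchScalar, indexScalar); // both are [ 'a', 'b', 'c' ]

// Case 2 (assumed variation): the predicate is on the array field itself.
const fetchList = distinctByFetch(docs, (d) => d.list.some((v) => v > "b"));
const indexList = distinctByIndex(index, (k) => k.list > "b");
console.log(fetchList, indexList); // [ 'b', 'c', 'd' ] versus [ 'c', 'd' ]
```

In this model, bounding only on the scalar keeps every index key of every matching document, so the index-only answer agrees with fetch-and-aggregate. Bounding on the array field drops keys that belong to matching documents (the "b" of document 2), which is where the two approaches diverge.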
- duplicates
  - SERVER-59320 Use DISTINCT_SCAN on multikey indexes in special cases (Backlog)
- related to
  - SERVER-28952 Multikey indexes should not be eligible for DISTINCT_SCAN if distinct key is an array component (Closed)