Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 6.0.6
Component/s: None
Labels: None
Assigned Teams: Query Optimization
Operating System: ALL
Steps To Reproduce:
db.foo.drop()
for (var cnt = 0; cnt < 10; cnt++) {
    db.foo.insert({cust: 1, labels: ["Alpha", "Beta"]})
}
db.foo.createIndex({cust: 1, labels: 1})
printjson(db.foo.explain('executionStats').distinct('labels', {cust: 1})) // shows IXScan->Fetch with nReturned=10 -- expect a distinct scan
print(db.foo.distinct('labels', {cust: 1})) // ['Alpha', 'Beta']
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Suppose we have a collection with a large number of documents, where each document has two fields: one is a scalar value and the other is a list of strings.

Suppose there's an index on {scalar: 1, list: 1}, and we perform operations of the form:

 foo.distinct('list', {scalar: <exact match>})

We expect that using a distinct scan on the index, bounded by the scalar match, is a legal optimization (despite multikey idiosyncrasies).
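To make the intuition concrete, here is a toy in-memory model, written as plain JavaScript rather than anything taken from the server. It assumes a multikey index stores one sorted (cust, label) entry per array element, and that a distinct scan emits a key and then seeks past all entries with the same key. All function names (buildMultikeyIndex, seekAfter, distinctScan, fetchAndAggregate) are hypothetical, and the model ignores edge cases such as empty arrays or missing fields, so it illustrates the idea without settling whether the optimization is legal in mongod.

```javascript
// Toy model only: NOT MongoDB code. All names here are hypothetical.

// Build sorted index entries, one {cust, label, id} per array element.
function buildMultikeyIndex(docs) {
  const keys = [];
  docs.forEach((doc, id) => {
    // Simplification: dedupe repeated elements within one document.
    for (const label of new Set(doc.labels)) {
      keys.push({ cust: doc.cust, label, id });
    }
  });
  keys.sort((a, b) =>
    a.cust - b.cust ||
    (a.label < b.label ? -1 : a.label > b.label ? 1 : a.id - b.id));
  return keys;
}

// Binary search for the first position >= lo whose (cust, label) sorts
// strictly after the given pair. A null label means "start of this cust".
function seekAfter(keys, cust, label, lo) {
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    const k = keys[mid];
    const greater =
      k.cust > cust || (k.cust === cust && (label === null || k.label > label));
    if (greater) hi = mid; else lo = mid + 1;
  }
  return lo;
}

// Distinct scan bounded on cust: examine one entry per distinct label.
function distinctScan(keys, cust) {
  const values = [];
  let examined = 0;
  let pos = seekAfter(keys, cust, null, 0);
  while (pos < keys.length && keys[pos].cust === cust) {
    examined++;
    values.push(keys[pos].label);
    pos = seekAfter(keys, cust, keys[pos].label, pos + 1);
  }
  return { values, examined };
}

// Baseline resembling the observed plan: visit every matching document.
function fetchAndAggregate(docs, cust) {
  const seen = new Set();
  let fetched = 0;
  for (const doc of docs) {
    if (doc.cust !== cust) continue;
    fetched++;
    doc.labels.forEach((label) => seen.add(label));
  }
  return { values: [...seen].sort(), fetched };
}

// Same data shape as the repro: 10 identical documents.
const docs = Array.from({ length: 10 }, () => ({ cust: 1, labels: ["Alpha", "Beta"] }));
const index = buildMultikeyIndex(docs);
console.log(distinctScan(index, 1));
console.log(fetchAndAggregate(docs, 1));
```

In this model both approaches return the same values for the repro data, but the distinct scan touches one index entry per distinct label, while the baseline visits every matching document.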

What we're observing is that mongod instead does an index scan and fetches each matching document to aggregate the distinct "list" values.
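One way to check which plan was chosen is to flatten the explain tree into a list of stage names. The helper below is a small sketch: planStages is a hypothetical name, and the two plan objects are hand-written stand-ins for the shape of an explain tree (nested stage / inputStage fields), not output captured from a server.

```javascript
// Hypothetical helper: list the stage names in an explain plan tree,
// outermost stage first. Handles both inputStage and inputStages children.
function planStages(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap(planStages)];
}

// Hand-written stand-ins, NOT real explain output.
const observedPlan = { stage: "FETCH", inputStage: { stage: "IXSCAN" } };
const hopedForPlan = { stage: "PROJECTION_COVERED", inputStage: { stage: "DISTINCT_SCAN" } };

console.log(planStages(observedPlan));
console.log(planStages(hopedForPlan).includes("DISTINCT_SCAN"));
```

In the shell, the same helper could be applied to db.foo.explain().distinct('labels', {cust: 1}).queryPlanner.winningPlan, assuming the plan tree sits directly under winningPlan. Depending on the server version and execution engine it may be nested one level deeper.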

I suspect this (lack of) optimization dates back to SERVER-28952. IIUC, that ticket describes a correctness problem, but I believe SERVER-28952 is a slight variation, as its query predicate also depends on the multikey field. I'm happy to be wrong here, though, and learn that the proposed optimization is in fact not legal for this simpler case.

duplicates: SERVER-59320 Use DISTINCT_SCAN on multikey indexes in special cases (Backlog)

related to: SERVER-28952 Multikey indexes should not be eligible for DISTINCT_SCAN if distinct key is an array component (Closed)

Assignee: Unassigned
Reporter: Daniel Gottlieb
Participants: Chris Kelly, Daniel Gottlieb
Votes: 0
Watchers: 11

Created: Oct 01 2024 07:40:25 PM UTC
Updated: Oct 21 2024 05:54:08 PM UTC
Resolved: Oct 17 2024 03:15:43 PM UTC
