Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.4.0
Component/s: Aggregation Framework, Performance
Labels: qopt-team
Assigned Teams: Query Optimization
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Today we have two separate optimized index scans: COUNT_SCAN and DISTINCT_SCAN. COUNT_SCAN returns simple sentinel values as it scans the index, avoiding the cost of translating the index key format into the format the rest of the query plan understands. DISTINCT_SCAN can seek over large sections of the index where the values are identical, but it still materializes an object outside the index key for consumption by the query plan. These two optimizations could be combined for a query like:

// Assume index {value1: 1, value2: 1, value3: 1} exists.
db.collection.aggregate([
    { $match: {
        value1: 1,
        value2: { $gte: new Date(1000) }
    }},
    { $group: { _id: "$value3" } },
    { $count: "distinct" }  // the field name here doesn't matter
])

Combining them would likely improve performance for this kind of query, though it is unclear by how much.
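
To make the idea concrete, here is a minimal sketch of "seek over runs of equal keys and only bump a counter". It is a toy model, not the server's implementation: a sorted array of numbers stands in for a single-field index, and `seekPast` and `countDistinctBySeeking` are hypothetical names invented for this illustration. The compound-index case from the query above is not modeled.

```javascript
// Toy model only: a sorted array of numbers stands in for an index.
// None of these names come from the MongoDB server code.

// Return the position of the first key strictly greater than `key`,
// searching only in [lo, keys.length). This plays the role of a "seek".
function seekPast(keys, key, lo) {
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (keys[mid] <= key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Count distinct keys by seeking over each run of equal keys and
// incrementing a counter, without building any per-key result object.
function countDistinctBySeeking(sortedKeys) {
  let count = 0;
  let seeks = 0;
  let pos = 0;
  while (pos < sortedKeys.length) {
    count += 1;
    pos = seekPast(sortedKeys, sortedKeys[pos], pos + 1);
    seeks += 1;
  }
  return { count, seeks };
}

const keys = [1, 1, 1, 1, 2, 2, 5, 5, 5, 5, 5, 9];
const result = countDistinctBySeeking(keys);
console.log(result.count, result.seeks);
```

In this model the number of seeks equals the number of distinct keys rather than the number of index entries, and nothing is produced per key beyond the counter, which is roughly the saving the combined scan is after.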

Related to: SERVER-12159 Distinct Count support in Agg pipeline (Closed)

Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Charlie Swanson
Participants: [DO NOT USE] Backlog - Query Optimization, Charlie Swanson
Votes: 0
Watchers: 5

Created: May 12 2020 10:35:27 PM UTC
Updated: Dec 06 2022 02:27:09 AM UTC
