Loading...

XML

Word

Printable

JSON

Type: Task
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Optimization
Sprint:
QO 2025-02-03
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Estimation of $exists doesn't take into account the count of missing values and can produce incorrect results.

In the following test case there is a filed with 90% of the values missing:

const collName = 'wrong_exists_estimate';
const coll = db[collName];
coll.drop();
coll.insertMany(Array.from({length: 500}, (_, i) => {
    const doc = {a: 1};
    if (i % 9 === 0) {
        doc.missing_90_percent = i;
    }
    return doc;
}));
coll.insertMany(Array.from({length: 500}, (_, i) => {
    const doc = {a: i};
    if (i % 9 === 0) {
        doc.missing_90_percent = i % 3;
    }
    return doc;
}));
coll.createIndexes([{missing_90_percent: 1}]);
coll.runCommand({analyze: collName, key: "missing_90_percent", numberBuckets: 10});
db.adminCommand({setParameter: 1, planRankerMode: "histogramCE"});
coll.find({ "missing_90_percent" : { "$exists" : false } }).explain();
coll.find({ "missing_90_percent" : { "$exists" : true  } }).explain();

Index scan of the first query is estimated as 888 documents, while index scan of the second query is estimated as 1000 (the whole collection). The second result is clearly wrong. The reason is that the bounds for the second query are

indexBounds: { missing_90_percent: [ '[MinKey, MaxKey]' ] }

These bounds are estimated as fetching the whole collection but this is incorrect because they should take into account the number of missing values.

duplicates

SERVER-99029 Histogram CE for AND/OR trees of all sargable ME nodes

Closed

Assignee:: Timour Katchaounov
Reporter:: Timour Katchaounov
Participants:: Timour Katchaounov
Votes:: 0 Vote for this issue
Watchers:: 5 Start watching this issue

Created:: Jan 15 2025 01:29:44 PM UTC
Updated:: Jan 23 2025 10:01:49 AM UTC
Resolved:: Jan 23 2025 10:01:49 AM UTC
Confidence Status Last Update:: 23/Jan/25 10:00 AM

Details

Description

Attachments

Issue Links

Activity

People

Dates