- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- None
- Query Optimization
- ALL
- QO 2025-02-03, QO 2025-02-17
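If the number of distinct values (NDV) in the collection exceeds numberBuckets, histogram cardinality estimates for range predicates such as $gt and $gte become wildly inaccurate: they are either zero or some very low value. In the example below the estimate is 0 while the actual count is 92: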
Enterprise test> db.foo.find({a: {$gte: "mmmmmmmm"}}).explain().queryPlanner.winningPlan.cardinalityEstimate;
0
Enterprise test> db.foo.find({a: {$gte: "mmmmmmmm"}}).count();
92
The estimate is zero on both sides of the pivot point, for $gte as well as for $lt:
Enterprise test> db.foo.find({a: {$gte: "mmmmmmmm"}}).explain().queryPlanner.winningPlan.cardinalityEstimate;
0
Enterprise test> db.foo.find({a: {$lt: "mmmmmmmm"}}).explain().queryPlanner.winningPlan.cardinalityEstimate;
0
The histogram is also quite skewed; I will open a separate issue for that.
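For anyone trying to reproduce this, a baseline of exact counts helps when judging how far off the estimates are. The following is a minimal sketch in plain JavaScript (it also runs in mongosh) that generates documents with many distinct string values and computes the true $gte / $lt counts around the pivot. The dataset shape (200 pseudo-random 8-letter lowercase strings), the seed, and the helper names makeDocs and exactCounts are assumptions for illustration; this is not the data set used in this report.

```javascript
// Sketch only: the dataset shape, the seed, and the helper names are
// assumptions for illustration, not the data behind this report.
const PIVOT = "mmmmmmmm";

// Build n documents {a: <8 lowercase letters>} from a small seeded LCG,
// so repeated runs produce the same data.
function makeDocs(n, seed) {
  let state = seed >>> 0;
  const next = () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state;
  };
  const docs = [];
  for (let i = 0; i < n; i++) {
    let s = "";
    for (let j = 0; j < 8; j++) {
      // Use the higher bits of the LCG state; the low bits are weak.
      s += String.fromCharCode(97 + ((next() >>> 16) % 26));
    }
    docs.push({ a: s });
  }
  return docs;
}

// Exact counts on each side of the pivot, plus the number of distinct values.
// For lowercase ASCII strings, JavaScript's >= gives the same order as a
// simple binary string comparison.
function exactCounts(docs, pivot) {
  let gte = 0;
  let lt = 0;
  for (const d of docs) {
    if (d.a >= pivot) gte++;
    else lt++;
  }
  const ndv = new Set(docs.map((d) => d.a)).size;
  return { gte, lt, ndv };
}

const docs = makeDocs(200, 1);
const truth = exactCounts(docs, PIVOT);
console.log(truth);
```

In mongosh, the generated docs can be loaded into a scratch collection with db.foo.insertMany(docs); after rebuilding the histogram on "a" with a numberBuckets smaller than truth.ndv, the cardinalityEstimate values from explain() can be compared against truth.gte and truth.lt.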
- is related to SERVER-99629 histogramCE: Degenerate histogram if NDV > numberBuckets
- Needs Scheduling