- Type: Improvement
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
Query Optimization
This resulted from the investigation around SERVER-72899.
Suppose we want to estimate the cardinality of a range (l, u) that falls within a single bucket with lower bound L and upper bound U, ndv distinct values, and rangeFreq values in the bucket. The estimate can come out negative, because we compute it as:
card(<u) - card(<l) = card(<=u) - card(=u) - card(<l) = card(<=u) - card(<l) - rangeFreq/ndv
If rangeFreq/ndv > card(<=u) - card(<l), the estimate is negative. Because the two sides of this inequality are independent of each other, nothing currently guarantees that this case cannot occur.
The current fix (SERVER-72899) clamps the estimate to 0.0, but we can do better. See the comments for more.
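To make the failure mode concrete, here is a minimal Python sketch of the arithmetic above. It is not the server's implementation: the function names are hypothetical, and it assumes that card(<=x) inside the bucket is obtained by linear interpolation between L and U, which the ticket does not state. Only the subtraction of rangeFreq/ndv and the clamp to 0.0 come from the description.

```python
def interpolated_card(x, lower, upper, range_freq):
    """Hypothetical helper: share of the bucket's values at or below x,
    assuming values are spread uniformly between the bucket bounds."""
    return range_freq * (x - lower) / (upper - lower)


def estimate_open_range(l, u, lower, upper, range_freq, ndv, clamp=True):
    """Estimate card(<u) - card(<l) for a range (l, u) inside one bucket.

    card(<u) is rewritten as card(<=u) - card(=u), with card(=u)
    approximated by range_freq / ndv, as in the ticket.
    """
    card_le_u = interpolated_card(u, lower, upper, range_freq)
    # Assumption: under interpolation, card(<l) and card(<=l) coincide.
    card_lt_l = interpolated_card(l, lower, upper, range_freq)
    card_eq_u = range_freq / ndv
    raw = card_le_u - card_lt_l - card_eq_u
    # The SERVER-72899 fix as described: clamp negative estimates to 0.0.
    return max(raw, 0.0) if clamp else raw


# Narrow range in a wide bucket with few distinct values:
# the interpolated part is small, while range_freq / ndv is large.
raw = estimate_open_range(10, 12, lower=0, upper=100, range_freq=10, ndv=2, clamp=False)
clamped = estimate_open_range(10, 12, lower=0, upper=100, range_freq=10, ndv=2)
print(raw < 0, clamped)
```

With these illustrative numbers the interpolated term is about 0.2 while rangeFreq/ndv is 5, so the unclamped result is negative and the clamped result is 0.0. The clamp hides the symptom, but the two terms are still not tied together.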
- related to
  - SERVER-72899 Invalid cardinality estimate (Closed)