- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Query Optimization
- Fully Compatible
We should have integration and/or unit tests that exercise the following scenarios in histogram generation and in estimation of predicates:
- minimum and maximum values for each type (most importantly numeric)
- inf/NaN/invalid values: if these can be inserted into a collection, we have to make sure we handle them correctly during both bucket creation and estimation
- a wide range of values including extreme types
- extreme date/time values
- Decimal128 types that are too large to fit in a double
- very large arrays
- very large strings
We need to ensure both that histogram creation on these types results in a valid histogram, and that cardinality estimation for these values works adequately, whether the value is present in the histogram or absent from it.
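As a rough illustration of the kind of check described above, here is a minimal, self-contained Python sketch. It is not the server's histogram code: `extreme_numeric_values`, `build_equi_depth_histogram`, and `is_valid_histogram` are hypothetical helpers invented for this example, and the toy builder simply counts NaN separately, which is an assumption rather than the behavior of the real implementation.

```python
import math
import sys


def extreme_numeric_values():
    """Hypothetical helper: boundary values worth feeding to a histogram builder."""
    return [
        -math.inf, -sys.float_info.max, -sys.float_info.min, -0.0, 0.0,
        sys.float_info.min, sys.float_info.max, math.inf, math.nan,
        -(2**63), 2**63 - 1,  # 64-bit integer limits
        -(2**31), 2**31 - 1,  # 32-bit integer limits
    ]


def _is_nan(v):
    return isinstance(v, float) and math.isnan(v)


def build_equi_depth_histogram(values, num_buckets):
    """Toy builder (assumption, not the real algorithm).

    Returns (buckets, nan_count), where buckets is a list of
    (upper_bound, count) pairs over the sorted non-NaN values.
    """
    nan_count = sum(1 for v in values if _is_nan(v))
    ordered = sorted(v for v in values if not _is_nan(v))
    n = len(ordered)
    buckets = []
    for i in range(num_buckets):
        lo = i * n // num_buckets
        hi = (i + 1) * n // num_buckets
        if hi > lo:  # skip empty buckets
            buckets.append((ordered[hi - 1], hi - lo))
    return buckets, nan_count


def is_valid_histogram(buckets, nan_count, total):
    """Bounds are non-decreasing, counts are positive, and nothing is lost."""
    bounds_ok = all(a <= b for (a, _), (b, _) in zip(buckets, buckets[1:]))
    counts_ok = all(count > 0 for _, count in buckets)
    total_ok = sum(count for _, count in buckets) + nan_count == total
    return bounds_ok and counts_ok and total_ok


values = extreme_numeric_values()
buckets, nan_count = build_equi_depth_histogram(values, num_buckets=4)
assert is_valid_histogram(buckets, nan_count, len(values))
```

The same validity predicate could be reused for the other scenarios in the list (dates, very large strings or arrays) by swapping in a different value generator, provided the values are mutually comparable.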
- is depended on by
  - SERVER-72819 Estimate the cardinality of extreme values in histograms (Closed)
- is related to
  - SERVER-72850 Allow strings with unicode characters to be added to histograms (Backlog)
  - SERVER-72997 [CQF] Allow histograms with number of buckets equal to number of types (Closed)
- related to
  - SERVER-72807 [CQF] Allow NaN to be added to histograms (Closed)