The primary input to CE is statistics, above all histograms. This task implements C++ unit tests that test the CE module strictly in isolation. Testing should follow the Test Plan section of the design document.
That is, tests should vary queries in at least the following ways:
- data types of constants in comparison conditions,
- distance (spread) between constants,
- edge case values (empty string, 0, very long string, max integer, etc.).
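One way to cover the variations listed above is to keep the comparison constants in a single table that every test iterates over. The sketch below is only an illustration: `Constant` and `edgeCaseConstants` are hypothetical names, not part of the CE module, and the chosen values (including the 10000-character string) are assumptions about what counts as an edge case. It requires C++17 for `std::variant`.

```cpp
#include <cstdint>
#include <limits>
#include <string>
#include <variant>
#include <vector>

// Hypothetical type for a constant used in a comparison condition.
// A real test would use whatever value type the CE module accepts.
using Constant = std::variant<std::int64_t, double, std::string>;

// Returns a table of edge-case constants spanning several data types.
std::vector<Constant> edgeCaseConstants() {
    return {
        std::int64_t{0},                           // zero
        std::numeric_limits<std::int64_t>::max(),  // max integer
        std::numeric_limits<std::int64_t>::min(),  // min integer
        0.0,                                       // zero as a double
        std::numeric_limits<double>::max(),        // largest finite double
        std::string{},                             // empty string
        std::string(10000, 'x'),                   // very long string (length is an assumption)
    };
}
```

A test can loop over this table and build one comparison query per constant, so adding a new edge case means adding one line.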
Each test should assert that, given specific statistics, the CE module produces the expected estimate for each query.
The statistics used in these tests are hand-crafted histograms that explore various edge cases, such as 1-bucket and 2-bucket histograms, boundary values of all types, etc.
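To make the idea of hand-crafted histograms concrete, here is a minimal sketch of a 1-bucket and a 2-bucket case. Everything in it is an assumption made for illustration: `Bucket`, `Histogram`, and `estimateLessOrEqual` are hypothetical stand-ins, and the linear interpolation inside a bucket is a toy estimator, not the CE module's actual algorithm.

```cpp
#include <vector>

// Hypothetical histogram bucket (not the CE module's real type).
struct Bucket {
    double upperBound;  // inclusive upper boundary of the bucket
    double equalFreq;   // number of values equal to upperBound
    double rangeFreq;   // number of values strictly inside the bucket
};

// Hypothetical histogram: buckets are sorted by upperBound.
struct Histogram {
    double lowerBound;            // exclusive lower boundary of the first bucket
    std::vector<Bucket> buckets;
};

// Toy estimator for the predicate `field <= value`.
// Fully covered buckets contribute all their values; a partially covered
// bucket contributes a linearly interpolated share of its rangeFreq.
double estimateLessOrEqual(const Histogram& h, double value) {
    double total = 0.0;
    double lo = h.lowerBound;
    for (const Bucket& b : h.buckets) {
        if (value >= b.upperBound) {
            total += b.rangeFreq + b.equalFreq;
        } else {
            if (value > lo) {
                total += b.rangeFreq * (value - lo) / (b.upperBound - lo);
            }
            break;  // later buckets lie entirely above `value`
        }
        lo = b.upperBound;
    }
    return total;
}

// Hand-crafted 1-bucket histogram: 90 values in (0, 100), 10 equal to 100.
Histogram oneBucketHistogram() {
    return Histogram{0.0, {{100.0, 10.0, 90.0}}};
}

// Hand-crafted 2-bucket histogram with 50 values in total.
Histogram twoBucketHistogram() {
    return Histogram{0.0, {{10.0, 5.0, 15.0}, {20.0, 10.0, 20.0}}};
}
```

With these inputs, a unit test would compare the estimate for each query constant against a value computed by hand, for example 45 for `field <= 50` on the 1-bucket histogram under the interpolation assumption above.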