-
Type: Task
-
Resolution: Won't Do
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization
Sequential scan sampling CE (cardinality estimation) is used in test suites because it is deterministic. However, it is highly sensitive to partially sorted data (see SERVER-123925), which can skew the estimates.
Randomly shuffle the TPC-H datasets before running tests, then update the test suites to use the shuffled datasets. This combines the benefits of both sampling approaches: the randomness of random sampling CE and the predictability of sequential scan sampling CE.
As a result, TPC-H-based tests pass consistently with shuffled input data and show no regression from the partial-sort sensitivity issue.
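A minimal sketch of the shuffle step, assuming the dataset rows fit in memory and that a fixed seed is acceptable so the shuffled order is identical on every run. The function name `shuffle_deterministically`, the seed value, and the sample rows are hypothetical and not taken from the test suites.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shuffle_deterministically(rows: Sequence[T], seed: int = 0) -> List[T]:
    """Return a shuffled copy of rows; the same seed always gives the same order."""
    shuffled = list(rows)                   # copy, so the input stays untouched
    random.Random(seed).shuffle(shuffled)   # private RNG, independent of global state
    return shuffled


# Hypothetical rows sorted on a key, standing in for a partially sorted TPC-H table.
sorted_rows = [{"l_orderkey": k} for k in range(10)]

first = shuffle_deterministically(sorted_rows, seed=42)
second = shuffle_deterministically(sorted_rows, seed=42)
```

Because the seed is fixed, a sequential scan over the shuffled copy would read the same documents on every run, while the sort order of the original data no longer determines which rows land in the sample.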
- is related to
- SERVER-123925 Join optimization: cardinality estimate for join is off by 10^2 (overestimation) (In Code Review)
- SERVER-127872 CBR testing: Implement stride sampling (Closed)