- Type: Improvement
- Resolution: Done
- Priority: Blocker - P1
- Affects Version/s: None
- Component/s: None
- Query Optimization
- Fully Compatible
Sampling performance suffers as the $in list size increases. I tested arrays of up to 50k elements, where sampling takes nearly 700ms compared to 20ms with multi-planning. The degradation appears to be linear, as shown in the charts.

The charts cover a set of queries, numbered along the bottom axis. Each query has a conjunctive predicate on the mark and salary fields: the mark predicate is a $in list and the salary predicate is a range. Each query is run 4 times: once with the mark index hinted, once with the salary index hinted, and once each with no hint using samplingCE (red line) and multi-planning (green line).

samplingCE is much slower than multi-planning, and the attached flame graph shows a significant amount of time spent in sampling CE code.
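For reference, the query shape and the four runs per query described above can be sketched roughly as follows. This is only an illustration, not the actual benchmark: the helper names `build_filter` and `build_runs`, the integer values in the $in list, the salary bounds, and the single-field index key patterns `{"mark": 1}` and `{"salary": 1}` are all assumptions. How the server is switched between samplingCE and multi-planning is not stated in this ticket, so it is left out.

```python
from typing import Any, Dict, List


def build_filter(in_list_size: int, salary_min: int, salary_max: int) -> Dict[str, Any]:
    """Conjunctive predicate: a $in list on `mark` and a range on `salary`.

    The list contents (0..in_list_size-1) and the bounds are placeholders.
    """
    return {
        "mark": {"$in": list(range(in_list_size))},
        "salary": {"$gte": salary_min, "$lt": salary_max},
    }


def build_runs(query_filter: Dict[str, Any]) -> List[Dict[str, Any]]:
    """The four runs per query: two hinted, two unhinted.

    The hint key patterns assume single-field ascending indexes (hypothetical).
    The two unhinted runs differ only in the server-side planner mode,
    which is configured outside this sketch.
    """
    return [
        {"label": "hint mark index", "filter": query_filter, "hint": {"mark": 1}},
        {"label": "hint salary index", "filter": query_filter, "hint": {"salary": 1}},
        {"label": "no hint, samplingCE", "filter": query_filter, "hint": None},
        {"label": "no hint, multi-planning", "filter": query_filter, "hint": None},
    ]


if __name__ == "__main__":
    # Largest case mentioned above: a $in list of 50k elements.
    runs = build_runs(build_filter(50_000, 1_000, 2_000))
    for run in runs:
        print(run["label"], len(run["filter"]["mark"]["$in"]), run["hint"])
```

Each entry could then be passed to a driver's find call with the given filter and, where present, the hint, timing each run while the $in list size is varied.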
is related to
- SERVER-112930 Slow estimation of $all with many arguments (Closed)

related to
- SERVER-122687 FLE2 conjunctions with huge tag counts slow if CBR is enabled (Backlog)
- SERVER-108958 Improve sampling CE performance in the presence of large $in lists: Index bounds checks (Closed)
- SERVER-124126 Prevent CE caching above a threshold number of intervals in an index bounds set (Closed)
- SERVER-124542 Investigate CE cache performance degradation for large MatchExpressions (Needs Scheduling)
- SERVER-124880 Add synthetic benchmarks for queries with huge number of index intervals (Needs Scheduling)