- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
If the $match expression contains $function, the function is run once for every document in the sample, i.e. hundreds of times. This is inadvisable if the function is expensive and/or has side effects (e.g. logging).
I think samplingCE needs to detect the presence of $function, and possibly other expensive operations, and refuse to estimate them via sampling. Alternatively, the predicate being sampled could be broken down into cheap and expensive parts, with only the cheap parts estimated via sampling.
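
As a rough illustration of the second option, here is a minimal Python sketch of how a predicate might be split. It is not the server's samplingCE code: the names `EXPENSIVE_OPERATORS`, `contains_expensive`, and `split_predicate` are hypothetical, the predicate is modeled as a plain dict, and the set of "expensive" operators is an assumption made for the example.

```python
# Hypothetical sketch only; none of these names exist in the server code.
# Assumed set of operators considered too expensive to run per sampled document.
EXPENSIVE_OPERATORS = {"$function", "$where"}


def contains_expensive(expr):
    """Return True if an expensive operator appears as a key anywhere in expr."""
    if isinstance(expr, dict):
        return any(
            key in EXPENSIVE_OPERATORS or contains_expensive(value)
            for key, value in expr.items()
        )
    if isinstance(expr, list):
        return any(contains_expensive(item) for item in expr)
    return False


def split_predicate(match_expr):
    """Split a top-level conjunction into (cheap, expensive) conjunct lists."""
    if set(match_expr) == {"$and"}:
        conjuncts = match_expr["$and"]
    else:
        # Top-level entries of a $match document are implicitly ANDed.
        conjuncts = [{key: value} for key, value in match_expr.items()]
    cheap = [c for c in conjuncts if not contains_expensive(c)]
    expensive = [c for c in conjuncts if contains_expensive(c)]
    return cheap, expensive


predicate = {
    "status": "A",
    "$expr": {
        "$function": {
            "body": "function(qty) { return qty > 1; }",
            "args": ["$qty"],
            "lang": "js",
        }
    },
}
cheap, expensive = split_predicate(predicate)
```

Here `cheap` holds the `status` conjunct, which could still be estimated via sampling, while the `$expr`/`$function` conjunct lands in `expensive` and would need some other estimate. The split is only valid for conjunctions; a disjunction containing an expensive branch would have to be treated as expensive as a whole.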