-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization
In the input cardinality estimation of an IndexScanNode under samplingCE, the optimization that avoids calculating index keys is currently skipped for any index scan node that contains a residual filter.
Logically, we could still avoid calculating index keys for such nodes and use a MatchExpression to estimate the interval, with one minor difference in the calculation: the residual filter must not be included when calculating the input CE.
We kept the condition because a set of tests fails without it.
An example of such a test, along with some debug information:
[js_test:plan_stability2] {">>>pipeline": [{"$match":{"$and":[{"field8_int_idx":{"$ne":353},"field45_bool":{"$type":"bool"}},{"$or":[{"field31_list_idx":{"$lte":220},"field25_str_idx":{"$eq":"Gl"}},{"field19_datetime_idx":{"$gte":"2024-01-27T00:00:00.000Z"},"field24_mixed_idx":{"$all":[]}},{"field8_int_idx":{"$lt":831},"field28_datetime_idx":{"$eq":"2024-01-10T00:00:00.000Z"},"field23_dict_idx":{"$gte":{"c":1,"b":3}},"field6_mixed_idx":{"$in":[6,99562]}},{"field17_int_idx":{"$lte":239},"field44_int":{"$lte":7}}]},{"$or":[{"$or":[{"field16_str_idx":{"$ne":"Z"},"field5_dict_idx":{"$eq":{"b":1,"e":1}}},{"field45_bool":{"$lte":false},"field16_str_idx":{"$gte":"hS"}}]},{"field4_list_idx":{"$ne":21},"field35_int_idx":{"$eq":4539}}]}],"field21_Decimal128_idx":{"$type":"decimal"},"field26_int_idx":{"$lt":9043},"field19_datetime_idx":{"$ne":"2024-01-05T00:00:00.000Z"},"$nor":[{"field24_mixed_idx":["k","r","l","p"],"field31_list_idx":{"$ne":50}},{"field4_list_idx":["t","i","g","y","v"],"field18_bool_idx":{"$exists":true},"field47_Timestamp":{"$eq":{"$timestamp":{"t":1755595335,"i":0}}}}]}},{"$skip":52},{"$project":{"field38_Timestamp":1,"_id":0}}],
Specifically, the following nodes make different estimations compared to estimateKeysScanned:
[j0] node->filter: { field21_Decimal128_idx: { $type: [ 19 ] } }
[j0] node->bounds: field #0['field21_Decimal128_idx']: [inf, nan]
[j0] est.outCE: { Cardinality: 0.0, Source: "Sampling" }
[j0] estimateKeysScanned: { Cardinality: 100000.0, Source: "Sampling" }
[j0] prefix: { field21_Decimal128_idx: [ "[inf, nan]" ] }
[j0] isEqPrefix: 0
[j0] ridsEstFunct(*prefix.eqPrefixPtr, nullptr) : { Cardinality: 0.0, Source: "Sampling" }
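The gap between est.outCE (0.0) and estimateKeysScanned (100000.0) in the debug output can be illustrated with a toy model. The sketch below is not the server code and is not a confirmed root cause. It assumes, hypothetically, that the index bounds for the $type predicate cover every numeric key while the predicate itself matches only decimals, and that the sample happens to contain no decimal values. The function names, the sample contents, and the collection size are all made up for illustration.

```python
from decimal import Decimal

FIELD = "field21_Decimal128_idx"


def estimate_by_predicate(sample, predicate, collection_size):
    """Scale the fraction of sampled documents that satisfy the predicate."""
    if not sample:
        return 0.0
    matches = sum(1 for doc in sample if predicate(doc))
    return collection_size * matches / len(sample)


def estimate_keys_scanned(sample, field, in_bounds, collection_size):
    """Scale the fraction of sampled index keys that fall inside the bounds."""
    if not sample:
        return 0.0
    keys = sum(1 for doc in sample if field in doc and in_bounds(doc[field]))
    return collection_size * keys / len(sample)


def is_decimal(doc):
    # Stand-in for the {$type: "decimal"} predicate (assumption).
    return isinstance(doc.get(FIELD), Decimal)


def is_numeric_key(value):
    # Stand-in for inexact bounds that span all numeric keys (assumption).
    return isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)


# Hypothetical sample: the field holds only integers, never decimals.
sample = [{FIELD: i} for i in range(100)]
collection_size = 100_000

out_ce = estimate_by_predicate(sample, is_decimal, collection_size)
keys_ce = estimate_keys_scanned(sample, FIELD, is_numeric_key, collection_size)
print(out_ce, keys_ce)
```

Under these assumptions the predicate-based estimate is 0.0 while the bounds-based estimate is 100000.0, so replacing one with the other changes the input CE whenever the bounds are a strict superset of what the predicate matches.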
This ticket should re-evaluate this condition and address the problem.