-
Type:
Improvement
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Overview
CardinalityEstimator::clampZeroEstimates replaces every approximate-source zero CE with the same kMinCE = 1, regardless of how many predicates were absorbed into the index bounds. This violates the prefix-selectivity heuristic that a longer index prefix is at least as selective as a shorter one.
Background
SERVER-123070 introduced the clamp on approximate-source zero cardinality estimates so that structurally distinct plans no longer tie at the per-stage minCost. It works for the structurally distinct cases the ticket was scoped to fix (covered IXSCAN over compound index vs FETCH+IXSCAN with residual filter on disjoint fields), where the cost differential survives the uniform clamp.
It does not preserve the prefix-selectivity heuristic. Consider the predicate {{ {a: 1, b: 1, c: 1} }} with indexes {{ {a: 1} }} and {{ {a: 1, b: 1} }} and no matching documents in the sample:
- IXSCAN(a_1, a:[1,1]) — pushes only a==1 into the bounds, leaves b==1 AND c==1 as a residual filter. Approximate-source CE = 0, clamped to 1.
- IXSCAN(a_1_b_1, a:[1,1], b:[1,1]) — pushes a==1 AND b==1 into the bounds, leaves c==1 as a residual filter. Approximate-source CE = 0, clamped to 1.
After clamping both IXSCANs to CE = 1, the cost ordering is decided by small calibrated terms (the per-field IXSCAN cost differential) and CBR can pick the structurally weaker plan.
Scope of Work
The clamp must produce CEs that respect the invariant: an IXSCAN whose bounds absorb a superset of another IXSCAN's bound predicates on the same fields must have a clamped CE no greater than the other's.
Possible approaches:
- Scale the floor by a function of the number of equality bounds pushed into the index — for example kMinCE / 2^(numEqualityBounds). Preserves strict monotonicity for nested prefixes.
- Detect the prefix-subset relationship across enumerated plans during clampZeroEstimates and assign a strictly ordered floor.
Either way, the post-clamp invariant must be cardinalityEstimate(longer-prefix) <= cardinalityEstimate(shorter-prefix) for compatible candidate plans.
Acceptance Criteria
- For predicate {{ {a: 1, b: 1, c: 1} }} against indexes {{ {a: 1} }} and {{ {a: 1, b: 1} }} with no matching documents and the planner clamping all approximate zeros, CBR picks the compound index regardless of index creation order.
- The commented-out assertion in jstests/noPassthroughWithMongod/query/cbr/cbr_sampling.js (inside runZeroCEScenarios, marked with a TODO referencing this ticket) is enabled and passes.
- No regressions in the existing zero-CE assertions.
Technical Notes
- The clamp lives in CardinalityEstimator::clampZeroEstimates in src/mongo/db/query/compiler/optimizer/cost_based_ranker/cardinality_estimator.cpp.
- The motivating scenario surfaced during review of
SERVER-123070; see PR 52398 review thread on cbr_sampling.js:212.
- is related to
-
SERVER-123070 CBR: Random/sub-optimal index picked in the face of cardinality estimate of zero
-
- Closed
-