Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Optimization
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The precision of a sampling estimate depends on the number of qualifying documents in the sample. When this number is small (fewer than 10, according to the literature), there are no guarantees on the precision of the estimate.
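One way to see the dependence is a rough sketch under a simple binomial sampling model, which is an assumption here and not something the ticket states about samplingCE. With k qualifying documents in a sample of n, the relative standard error of the estimated selectivity k/n is approximately sqrt((1 - k/n) / k), so it is driven by k and not by n alone. The function name below is hypothetical.

```python
import math


def relative_standard_error(qualifying: int, sample_size: int) -> float:
    """Approximate relative standard error of the selectivity estimate k/n.

    Assumes a binomial model (independent, uniformly drawn samples); this is
    an illustrative assumption, not the estimator's documented behavior.
    """
    if not 0 < qualifying <= sample_size:
        raise ValueError("need 0 < qualifying <= sample_size")
    p_hat = qualifying / sample_size
    # SE(p_hat) = sqrt(p(1-p)/n); dividing by p_hat gives sqrt((1-p)/k).
    return math.sqrt((1.0 - p_hat) / qualifying)


if __name__ == "__main__":
    for k in (2, 5, 10, 100):
        print(k, round(relative_standard_error(k, 1000), 3))
```

Under this model the relative error shrinks only as k grows, which is consistent with small qualifying counts giving unstable estimates.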

This ticket aims to improve the situation in this case. There are several options to be investigated:

When fewer than 10 documents qualify, round the count up to 10 and compute the estimate from there. This does not improve the precision per se, but it provides an upper bound on the estimate, which can be sufficient for the CBR to pick a good plan.
When fewer than 10 documents qualify, return the estimate with an error status. This would allow the CBR, for example, to switch to histograms.
Always assign a 'reliability' metric to the estimate, and use a low reliability for estimates derived from too few qualifying samples.
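The three options above could be sketched as follows. This is a minimal illustration, not the server's implementation: every name here (`Estimate`, `Status`, `estimate_cardinality`, the `clamp` flag) is hypothetical, and the linear reliability score is only one possible choice that the ticket does not specify.

```python
from dataclasses import dataclass
from enum import Enum

MIN_QUALIFYING = 10  # threshold mentioned in the ticket


class Status(Enum):
    OK = "ok"
    TOO_FEW_SAMPLES = "too_few_samples"


@dataclass
class Estimate:
    cardinality: float
    status: Status      # option 2: signals an unreliable estimate
    reliability: float  # option 3: hypothetical score in [0, 1]


def estimate_cardinality(qualifying: int, sample_size: int,
                         collection_size: int, clamp: bool = False) -> Estimate:
    """Scale a sample count to the collection, flagging small counts."""
    if sample_size <= 0 or not 0 <= qualifying <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= qualifying <= sample_size")
    too_few = qualifying < MIN_QUALIFYING
    effective = qualifying
    if clamp and too_few:
        # Option 1: round the count up to the threshold (capped at the
        # sample size), which yields an upper bound rather than more precision.
        effective = min(MIN_QUALIFYING, sample_size)
    cardinality = effective * collection_size / sample_size
    status = Status.TOO_FEW_SAMPLES if too_few else Status.OK
    # Assumed scoring: grows linearly with the qualifying count, capped at 1.
    reliability = min(1.0, qualifying / MIN_QUALIFYING)
    return Estimate(cardinality, status, reliability)
```

A caller could inspect `status` to fall back to another estimator such as histograms, or weigh `reliability` when comparing plans. For example, `estimate_cardinality(3, 1000, 1_000_000, clamp=True)` scales a count of 10 rather than 3.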

is depended on by

SERVER-99095 samplingCE: Estimates gyrate wildly for low cardinality predicates

Needs Scheduling

Assignee: Unassigned
Reporter: Milena Ivanova
Participants: Milena Ivanova
Votes: 0
Watchers: 5

Created: Feb 04 2025 05:26:13 PM UTC
Updated: Mar 04 2026 01:38:57 PM UTC
