  1. Core Server
  2. SERVER-104501

Create MatchExpressions from IntervalBounds to estimate RIDs via sampling

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • 3
    • TBD

      To estimate IntervalBounds, CBR sampling CE currently scans the sample and generates keys from it on every estimation call. Repeating this work on each call is very inefficient.

      A potentially much more efficient approach is to transform each interval into a MatchExpression, and apply the expression directly to the sample. This task is about:

      • Create equivalent MatchExpression (ME) from an IntervalBounds.
      • Replace the calls to SamplingEstimator::estimateRIDs with SamplingEstimator::estimateCardinality.
      • Benchmark interval estimation via SamplingEstimator::estimateRIDs as a baseline, and compare it against the new estimation method based on Interval->ME conversion. Benchmarking can be done at two levels at least:
        • Microbenchmark estimation itself excluding/including the conversion Interval->ME, and
        • Run Genny multi-planner benchmarks (or similar).
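
      The Interval->ME idea can be illustrated with a minimal, self-contained sketch. It does not use the server's actual classes: SimpleInterval, SimplePredicate, intervalToPredicate and the free function estimateCardinality below are hypothetical stand-ins, and the sample is assumed to be a plain vector of integers for a single field. The real IntervalBounds, MatchExpression and SamplingEstimator types are richer than this.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical, simplified stand-in for one interval of an IntervalBounds:
// a range over a single integer field with inclusive/exclusive endpoints.
struct SimpleInterval {
    int low;
    int high;
    bool lowInclusive;
    bool highInclusive;
};

// Hypothetical stand-in for a MatchExpression: a predicate over one sampled value.
using SimplePredicate = std::function<bool(int)>;

// Convert an interval into an equivalent predicate.
// For example, [3, 7) becomes (value >= 3 && value < 7).
SimplePredicate intervalToPredicate(const SimpleInterval& iv) {
    return [iv](int value) {
        const bool aboveLow = iv.lowInclusive ? value >= iv.low : value > iv.low;
        const bool belowHigh = iv.highInclusive ? value <= iv.high : value < iv.high;
        return aboveLow && belowHigh;
    };
}

// Estimate cardinality by applying the predicate directly to the sample and
// scaling the fraction of matching values up to the collection size.
// No keys are generated from the sample here.
double estimateCardinality(const std::vector<int>& sample,
                           const SimplePredicate& pred,
                           double collectionSize) {
    assert(collectionSize >= 0.0);
    if (sample.empty()) {
        return 0.0;  // Nothing to extrapolate from.
    }
    std::size_t matches = 0;
    for (int value : sample) {
        if (pred(value)) {
            ++matches;
        }
    }
    return collectionSize * static_cast<double>(matches) /
        static_cast<double>(sample.size());
}
```

      In this sketch the conversion happens once per interval, and the resulting predicate can then be reused across estimation calls. That separation is also what lets a microbenchmark time estimation with and without the conversion step.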

            Assignee:
            milena.ivanova@mongodb.com Milena Ivanova
            Reporter:
            timour.katchaounov@mongodb.com Timour Katchaounov
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: