  1. Core Server
  2. SERVER-97504

Enable distinction of CE estimation algorithm for range queries on multi-key fields.

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • Fully Compatible

      The semantics of filtering on arrays vary across operators. Specifically,

      [{$match: {a: {$gte: n, $lte: m}}}]
      

       differs from 

      [{$match: {a: {$elemMatch: {$gte: n, $lte: m}}}}]
      

      in two respects:

      1. The former considers both scalar and array values, whereas `$elemMatch` considers only arrays.
      2. When filtering on arrays, the former translates the condition into a conjunction of intervals, and each interval may be satisfied by a different array element. `$elemMatch` requires a single element to satisfy the whole condition, so the two queries can return different results.

      Example for (2): consider a collection containing the following documents:

      { "_id" : "R1", "a" : [ -100, -20] }
      { "_id" : "R2", "a" : [ 5, 7 ] }
      { "_id" : "R3", "a" : [ 100, 1000 ] }
      { "_id" : "R4", "a" : [ -20, 5 ] }
      { "_id" : "R5", "a" : [ 7, 100 ] }
      { "_id" : "R6", "a" : [ -100, 100 ] }
      

      Query: 

      [{$match: {a: {$gte: 0, $lte: 10}}}] 
      

      This query splits the condition into the intervals (-inf, 10] and [0, +inf), and each interval may be satisfied by a different array element. The output includes documents "R2", "R4", "R5", and "R6".

      Query:

      [{$match: {a: {$elemMatch: {$gte: 0, $lte: 10}}}}]
      

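      This query includes in the output only documents "R2", "R4", and "R5", because a single array element must satisfy both bounds. "R6" is excluded: neither -100 nor 100 lies in [0, 10].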
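      The two result sets above can be checked with a small in-memory model. The sketch below is plain JavaScript, not MongoDB code: `matchesRange` and `matchesElemMatch` are hypothetical helpers that assume numeric values only and ignore nested arrays and type bracketing. Both predicates use the bounds 0 and 10.

```javascript
// Toy in-memory model of the two match semantics; not MongoDB code.
const docs = [
  { _id: "R1", a: [-100, -20] },
  { _id: "R2", a: [5, 7] },
  { _id: "R3", a: [100, 1000] },
  { _id: "R4", a: [-20, 5] },
  { _id: "R5", a: [7, 100] },
  { _id: "R6", a: [-100, 100] },
];

// A plain range predicate sees a scalar as a single candidate value.
const elems = (v) => (Array.isArray(v) ? v : [v]);

// Models {a: {$gte: lo, $lte: hi}}: each bound may be met by a different element.
function matchesRange(v, lo, hi) {
  const xs = elems(v);
  return xs.some((x) => x >= lo) && xs.some((x) => x <= hi);
}

// Models {a: {$elemMatch: {$gte: lo, $lte: hi}}}: arrays only, and one
// element must meet both bounds.
function matchesElemMatch(v, lo, hi) {
  return Array.isArray(v) && v.some((x) => x >= lo && x <= hi);
}

// Ids of the documents whose field `a` satisfies the predicate for [0, 10].
const ids = (pred) => docs.filter((d) => pred(d.a, 0, 10)).map((d) => d._id);

console.log(ids(matchesRange));     // [ 'R2', 'R4', 'R5', 'R6' ]
console.log(ids(matchesElemMatch)); // [ 'R2', 'R4', 'R5' ]

// Difference (1): a scalar value matches the plain range but not $elemMatch.
console.log(matchesRange(5, 0, 10), matchesElemMatch(5, 0, 10)); // true false
```

      "R6" is the only document that separates the two semantics here: 100 satisfies the lower bound and -100 satisfies the upper bound, but no single element satisfies both.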

      This semantic difference has an impact on the cardinality estimation algorithm. The MaxDiff histogram already implements different strategies to accommodate both cases.

      This ticket introduces an additional query semantics input to HistogramCardinalityEstimator so that the caller can decide which of the two cardinality estimation algorithms the histogram estimator should use when estimating range filters over arrays.
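      As a rough illustration of a caller-supplied semantics input, the sketch below dispatches on an enum-like value. Every name in it (`ArrayRangeSemantics`, `countArrayRange`) is hypothetical and does not reflect the actual HistogramCardinalityEstimator interface. It also counts exactly over raw arrays instead of estimating from histogram buckets, so it shows only the shape of the dispatch, not the estimation algorithms themselves.

```javascript
// Hypothetical names throughout; this is not the server's estimator API.
const ArrayRangeSemantics = Object.freeze({
  conjunctionOfIntervals: "conjunctionOfIntervals", // {a: {$gte: lo, $lte: hi}}
  elemMatch: "elemMatch", // {a: {$elemMatch: {$gte: lo, $lte: hi}}}
});

// Toy stand-in for an estimator: exact counts over in-memory numeric arrays.
function countArrayRange(arrays, lo, hi, semantics) {
  switch (semantics) {
    case ArrayRangeSemantics.conjunctionOfIntervals:
      // Some element >= lo exists iff max >= lo; some element <= hi exists
      // iff min <= hi. Only the per-array minimum and maximum matter here.
      return arrays.filter(
        (xs) => Math.max(...xs) >= lo && Math.min(...xs) <= hi
      ).length;
    case ArrayRangeSemantics.elemMatch:
      // A single element must fall inside [lo, hi].
      return arrays.filter((xs) => xs.some((x) => x >= lo && x <= hi)).length;
    default:
      throw new Error(`unknown semantics: ${semantics}`);
  }
}

const arrays = [
  [-100, -20], [5, 7], [100, 1000], [-20, 5], [7, 100], [-100, 100],
];

console.log(countArrayRange(arrays, 0, 10, ArrayRangeSemantics.conjunctionOfIntervals)); // 4
console.log(countArrayRange(arrays, 0, 10, ArrayRangeSemantics.elemMatch)); // 3
```

      Making the semantics an explicit argument keeps the choice with the caller, which knows whether the predicate came from a plain range or from `$elemMatch`.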

            Assignee:
            matt.olma@mongodb.com Matt Olma
            Reporter:
            matt.olma@mongodb.com Matt Olma
            Votes:
            0
            Watchers:
            5
