samplingCE: cardinalityEstimate is incorrect for $ne

XMLWordPrintableJSON

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Blocker - P1
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      samplingCE returns the correct numKeysEstimate for $ne but the wrong cardinalityEstimate. So the top-level FETCH has an almost-zero cost, so a bad plan may win.

      To reproduce:

      db.foo.drop();
      db.adminCommand({setParameter: 1, planRankerMode: "samplingCE"});
      db.foo.createIndex({a:1});
      db.foo.insert({a: "abc"});
      # This returns 1 row
      db.foo.aggregate([{ '$match': { 'a': { '$ne': null }}}]);
      # This returns cardinalityEstimate = 0
      db.foo.aggregate([{ '$match': { 'a': { '$ne': null }}}]).explain().queryPlanner.winningPlan;
      ...
          cardinalityEstimate: 0,
          numKeysEstimate: 1,
      ...
      

      This is not an off-by-one issue , it is observed with various dataset sizes and selectivites. Contrast with histogramCE:

          cardinalityEstimate: 1,
          numKeysEstimate: 1,
      

      git revision 068f5a66c02dc39afe1958556ae5883b34d29f41

            Assignee:
            Kartal Kaan Bozdogan
            Reporter:
            Philip Stoev
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: