CBR: Random/sub-optimal index picked in the face of cardinality estimate of zero

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • ALL
    • Hide

      See body

      Show
      See body
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      In SERVER-98102, the cost formula was apparently changed so that, in the presence of index {a: 1} and index {a:1, b:1} the right index will be chosen based on cost.

      However, if the cardinality estimate is zero, both of those indexes continue to have the same almost-zero cost. This means that the first enumerated index will be picked, but the plan enumerator enumerates indexes based on their order of creation, which can change e.g. because of a mongorestore.

      Both indexes may get a cardinality estimate of zero due to the value not being present in the sample, but the indexes are not equivalent – one may perfectly match the predicate while the other may only match it partially. They should not have the same cost.

      Consider this example:

      db.adminCommand({setParameter: 1, featureFlagCostBasedRanker: true});
      db.adminCommand({setParameter: 1, internalQueryCBRCEMode: "samplingCE"});
      
      db.coll.drop();
      db.coll.insert({c:1});
      db.coll.createIndexes([{a:1}, {a:1, b:1}]);
      db.coll.getIndexes();
      db.coll.find({a:1, b:1}, {_id: 0, a: 1, b:1}).explain().queryPlanner.winningPlan;
      
      db.coll.drop();
      db.coll.insert({c:1});
      db.coll.createIndexes([{a:1, b:1}, {a:1}]);
      db.coll.getIndexes();
      db.coll.find({a:1,b:1}, {_id: 0, a:1 , b:1}).explain().queryPlanner.winningPlan;
      
        

       

      Depending on the order in which the indexes were created, the plan can be either:

      {
        isCached: false,
        usedJoinOptimization: false,
        stage: 'PROJECTION_SIMPLE',
        costEstimate: 0.00001167,
        cardinalityEstimate: 0,
        estimatesMetadata: { ceSource: 'Metadata' },
        transformBy: { _id: 0, a: 1, b: 1 },
        inputStage: {
          usedJoinOptimization: false,
          stage: 'FETCH',
          costEstimate: 0.00001167,
          cardinalityEstimate: 0,
          estimatesMetadata: { ceSource: 'Metadata' },
          filter: { b: { '$eq': 1 } },
          nss: 'test.coll',
          inputStage: {
            usedJoinOptimization: false,
            stage: 'IXSCAN',
            costEstimate: 0.00001167,
            cardinalityEstimate: 0,
            estimatesMetadata: { ceSource: 'Metadata' },
            numKeysEstimate: 0,
            nss: 'test.coll',
            keyPattern: { a: 1 },
            indexName: 'a_1',
            isMultiKey: false,
            multiKeyPaths: { a: [] },
            isUnique: false,
            isSparse: false,
            isPartial: false,
            indexVersion: 2,
            direction: 'forward',
            indexBounds: { a: [ '[1, 1]' ] }
          }
        }
      }
       

      or 

      {
        isCached: false,
        usedJoinOptimization: false,
        stage: 'PROJECTION_COVERED',
        costEstimate: 0.00001167,
        cardinalityEstimate: 0,
        estimatesMetadata: { ceSource: 'Sampling' },
        transformBy: { _id: 0, a: 1, b: 1 },
        inputStage: {
          usedJoinOptimization: false,
          stage: 'IXSCAN',
          costEstimate: 0.00001167,
          cardinalityEstimate: 0,
          estimatesMetadata: { ceSource: 'Sampling' },
          numKeysEstimate: 0,
          nss: 'test.coll',
          keyPattern: { a: 1, b: 1 },
          indexName: 'a_1_b_1',
          isMultiKey: false,
          multiKeyPaths: { a: [], b: [] },
          isUnique: false,
          isSparse: false,
          isPartial: false,
          indexVersion: 2,
          direction: 'forward',
          indexBounds: { a: [ '[1, 1]' ], b: [ '[1, 1]' ] }
        }
      }
       

      Those plans may have the same estimate and the same cost, but they are in no shape or form equivalent in performance – one does a covering index scan, the other uses an IXSCAN on a potentially inferior index, followed by a FETCH. In the case the estimate was stale or incorrect, those two plans will have vastly different performance characteristics.

            Assignee:
            Unassigned
            Reporter:
            Philip Stoev
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: