Cost of INDEX_PROBE_NODE does not take into account number of index keys visited per 1 probe

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      This plan is costed incorrectly:

      queryPlan: {
              stage: 'INDEXED_NESTED_LOOP_JOIN_EMBEDDING',
              planNodeId: 4,
              costEstimate: 1.0263967529296874,
              cardinalityEstimate: 0,
              estimatesMetadata: { ceSource: 'Metadata' },
              leftEmbeddingField: 'supplier',
              rightEmbeddingField: 'none',
              joinPredicates: [ 's_suppkey = ps_suppkey' ],
              inputStages: [
                {
                  stage: 'COLLSCAN',
                  planNodeId: 1,
                  costEstimate: 0.9219967529296874,
                  cardinalityEstimate: 522,
                  estimatesMetadata: { ceSource: 'Code' },
                  filter: {
                    '$and': [
                      {
                        '$or': [
                          { s_acctbal: { '$gte': 4327.86 } },
                          { s_acctbal: { '$in': [ -707.02, 91.39 ] } },
                          { s_nationkey: { '$in': [ 3, 16 ] } }
                        ]
                      },
                      { s_acctbal: { '$not': { '$lt': 1333.75 } } }
                    ]
                  },
                  nss: 'plan_stability_join_opt.supplier',
                  direction: 'forward'
                },
                {
                  stage: 'FETCH',
                  planNodeId: 3,
                  filter: { ps_supplycost: { '$eq': 904.85 } },
                  nss: 'plan_stability_join_opt.partsupp',
                  inputStage: {
                    stage: 'INDEX_PROBE_NODE',
                    planNodeId: 2,
                    nss: 'plan_stability_join_opt.partsupp',
                    keyPattern: { ps_suppkey: 1 },
                    indexName: 'ps_suppkey_1',
                    isMultiKey: false,
                    isUnique: false,
                    isSparse: false,
                    isPartial: false,
                    indexVersion: 2
                  }
                }
              ]
            },

      The cost of the INDEX_PROBE_NODE is not reported but the cost of the overall join is 1.02, and the cost of the left join input is 0.92. This leaves a cost of 0.10 for the right input and the join itself, which is too low. I suspect that the cost of the right input assumes that 522 index keys will be fetched into the index, whereas in fact 41760 will be examined.

       

      So this plan has an unnaturally low cost and is picked instead of a better plan.

       

      To reproduce, use the tpch-0.1 dataset:

       

       const pipeline = [
       {"$lookup":{"from":"supplier","localField":"ps_suppkey","foreignField":"s_suppkey","pipeline":[
       {"$match":{"$nor":[{"s_acctbal":{"$lt":1333.75}}]}},
       {"$match":{"$or":[{"s_nationkey":16},{"s_acctbal":{"$eq":-707.02}},{"s_nationkey":3},{"s_acctbal":{"$eq":91.39}},{"s_acctbal":{"$gte":4327.86}}]}}],"as":"supplier"}},
       {"$unwind":"$supplier"},
       {"$match":{"$and":[{"ps_supplycost":{"$eq":904.85}}]}}];
       db.adminCommand({setParameter: 1, internalEnableJoinOptimization: false});const start = performance.now();  
      db.partsupp.aggregate(pipeline).toArray().length;
      const end = performance.now();  
      console.log(`Elapsed time: ${end - start} ms`); db.adminCommand({setParameter: 1, internalEnableJoinOptimization: true});const start = performance.now();  
      db.partsupp.aggregate(pipeline).toArray().length;
      const end = performance.now();  
      console.log(`Elapsed time: ${end - start} ms`); 

      The plan with join optimization enabled is slower than the syntactic join order plan.

      The execution stats have the following fragment, which I assume corresponds to the index probe:

       

                    innerStage: {
                      stage: 'ixseek',
                      planNodeId: 3,
                      nReturned: 41760,
                      executionTimeMillisEstimate: 10,
                      opens: 522,
      ...
                      indexName: 'ps_suppkey_1',
                      keysExamined: 41760,
                      seeks: 522,
                      numReads: 42282,
      ...
                    }
       

      That fragment is doing a lot of work, so it can not possibly cost > 0.10

            Assignee:
            Unassigned
            Reporter:
            Philip Stoev
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: