-
Type:
Bug
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Optimization
-
None
-
None
-
None
-
None
-
None
-
None
-
None
This plan is costed incorrectly:
queryPlan: {
stage: 'INDEXED_NESTED_LOOP_JOIN_EMBEDDING',
planNodeId: 4,
costEstimate: 1.0263967529296874,
cardinalityEstimate: 0,
estimatesMetadata: { ceSource: 'Metadata' },
leftEmbeddingField: 'supplier',
rightEmbeddingField: 'none',
joinPredicates: [ 's_suppkey = ps_suppkey' ],
inputStages: [
{
stage: 'COLLSCAN',
planNodeId: 1,
costEstimate: 0.9219967529296874,
cardinalityEstimate: 522,
estimatesMetadata: { ceSource: 'Code' },
filter: {
'$and': [
{
'$or': [
{ s_acctbal: { '$gte': 4327.86 } },
{ s_acctbal: { '$in': [ -707.02, 91.39 ] } },
{ s_nationkey: { '$in': [ 3, 16 ] } }
]
},
{ s_acctbal: { '$not': { '$lt': 1333.75 } } }
]
},
nss: 'plan_stability_join_opt.supplier',
direction: 'forward'
},
{
stage: 'FETCH',
planNodeId: 3,
filter: { ps_supplycost: { '$eq': 904.85 } },
nss: 'plan_stability_join_opt.partsupp',
inputStage: {
stage: 'INDEX_PROBE_NODE',
planNodeId: 2,
nss: 'plan_stability_join_opt.partsupp',
keyPattern: { ps_suppkey: 1 },
indexName: 'ps_suppkey_1',
isMultiKey: false,
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2
}
}
]
},
The cost of the INDEX_PROBE_NODE is not reported but the cost of the overall join is 1.02, and the cost of the left join input is 0.92. This leaves a cost of 0.10 for the right input and the join itself, which is too low. I suspect that the cost of the right input assumes that 522 index keys will be fetched into the index, whereas in fact 41760 will be examined.
So this plan has an unnaturally low cost and is picked instead of a better plan.
To reproduce, use the tpch-0.1 dataset:
const pipeline = [ {"$lookup":{"from":"supplier","localField":"ps_suppkey","foreignField":"s_suppkey","pipeline":[ {"$match":{"$nor":[{"s_acctbal":{"$lt":1333.75}}]}}, {"$match":{"$or":[{"s_nationkey":16},{"s_acctbal":{"$eq":-707.02}},{"s_nationkey":3},{"s_acctbal":{"$eq":91.39}},{"s_acctbal":{"$gte":4327.86}}]}}],"as":"supplier"}}, {"$unwind":"$supplier"}, {"$match":{"$and":[{"ps_supplycost":{"$eq":904.85}}]}}]; db.adminCommand({setParameter: 1, internalEnableJoinOptimization: false});const start = performance.now(); db.partsupp.aggregate(pipeline).toArray().length; const end = performance.now(); console.log(`Elapsed time: ${end - start} ms`); db.adminCommand({setParameter: 1, internalEnableJoinOptimization: true});const start = performance.now(); db.partsupp.aggregate(pipeline).toArray().length; const end = performance.now(); console.log(`Elapsed time: ${end - start} ms`);
The plan with join optimization enabled is slower than the syntactic join order plan.
The execution stats have the following fragment, which I assume corresponds to the index probe:
innerStage: {
stage: 'ixseek',
planNodeId: 3,
nReturned: 41760,
executionTimeMillisEstimate: 10,
opens: 522,
...
indexName: 'ps_suppkey_1',
keysExamined: 41760,
seeks: 522,
numReads: 42282,
...
}
That fragment is doing a lot of work, so it can not possibly cost > 0.10
- is blocked by
-
SERVER-120017 Initial calibration of HJ vs INLJ
-
- Closed
-
- is related to
-
SERVER-121264 Check for currentOp result before dereferencing in write_conflicts_with_non_txns.js
-
- Closed
-