[SERVER-75077] Suboptimal plan selection in Bonsai Created: 21/Mar/23  Updated: 26/Apr/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alexander Ignatyev Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-71803 Explain displays incorrect cost for s... Closed
is related to SERVER-76544 [CQF] Fix for residual predicate sele... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Participants:

 Description   

If we make the following changes to this test jstests/cqf/residual_pred_costing.js test

const bulk = t.initializeUnorderedBulkOp();
const nDocs = 2000;
for (let i = 0; i < nDocs; i++) {
    bulk.insert({a: i % 10, b: i % 10, c: i % 10, d: i % 10, p: 1, e: i % 12});
}
assert.commandWorked(bulk.execute());
 
assert.commandWorked(t.createIndex({e: 1, a: 1, b: 1, c: 1, d: 1}));
assert.commandWorked(t.createIndex({a: 1, d: 1}));
 
const res = runWithParams([{key: "internalCascadesOptimizerFastIndexNullHandling", value: true}],
                          () => t.explain("executionStats").aggregate([
                              {$match: {a: {$eq: 0}, b: {$eq: 0}, c: {$eq: 0}, p: {$eq: 1}}},
                              {$sort: {d: 1}}
                          ]));
assert.eq(nDocs * 0.1, res.executionStats.nReturned);

A suboptimal plan which involves index scan on {e: 1, a: 1, b: 1, c: 1, d: 1} is selected. Bank of the envelope calculations suggests that other plan should be preferred:

We have a collection of 2000 documents, and 200 of them are selected. Index Scan on {e: 1, a: 1, b: 1, c: 1, d: 1} plan saves 20ns per doc, but then we have to pay 1100ns per seek, which is equalled to 200*1100 = 220,000ns. On scans the index plan is 180,000ns more expensive than collection scan plan. There should be any saving on filter since we still need to apply filter on all 4 predicates on both plans. The same for the sorting stage, in both plans we have to sort 200 documents.



 Comments   
Comment by Alexander Ignatyev [ 21/Mar/23 ]

Linking "SERVER-71803 Explain displays incorrect cost for some ABT nodes" ticket since fixing that issue would allow investigate the suboptimal plan selection more effectively just checking what ABT makes one plan more preferable other others, which is not possible right now since the cost is not correctly displayed in the explain.

Generated at Thu Feb 08 06:29:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.