If we make the following changes to the jstests/cqf/residual_pred_costing.js test:
const bulk = t.initializeUnorderedBulkOp();
const nDocs = 2000;
for (let i = 0; i < nDocs; i++) {
    bulk.insert({a: i % 10, b: i % 10, c: i % 10, d: i % 10, p: 1, e: i % 12});
}
assert.commandWorked(bulk.execute());

assert.commandWorked(t.createIndex({e: 1, a: 1, b: 1, c: 1, d: 1}));
assert.commandWorked(t.createIndex({a: 1, d: 1}));

const res = runWithParams([{key: "internalCascadesOptimizerFastIndexNullHandling", value: true}],
                          () => t.explain("executionStats").aggregate([
                              {$match: {a: {$eq: 0}, b: {$eq: 0}, c: {$eq: 0}, p: {$eq: 1}}},
                              {$sort: {d: 1}}
                          ]));
assert.eq(nDocs * 0.1, res.executionStats.nReturned);
A suboptimal plan involving an index scan on {e: 1, a: 1, b: 1, c: 1, d: 1} is selected. A back-of-the-envelope calculation suggests that the other plan should be preferred:
We have a collection of 2000 documents, and 200 of them are selected. The index scan plan on {e: 1, a: 1, b: 1, c: 1, d: 1} saves 20ns per document on the scan, or 2000 * 20 = 40,000ns in total, but it then has to pay 1100ns per seek, which amounts to 200 * 1100 = 220,000ns. On scans and seeks alone, the index plan is therefore 220,000 - 40,000 = 180,000ns more expensive than the collection scan plan. There should not be any savings on the filter, since we still need to apply all 4 predicates in both plans. The same holds for the sort stage: in both plans we have to sort 200 documents.
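
To make the arithmetic explicit, here is a minimal sketch of that comparison; the 20ns per-document scan saving and 1100ns per-seek cost are the rough figures quoted above, not measured constants:

// Back-of-the-envelope cost comparison using the figures from the estimate above.
const nDocsTotal = 2000;         // total documents in the collection
const nSelected = 200;           // documents matching the predicates (10% selectivity)
const scanSavingPerDocNs = 20;   // ns saved per document by scanning the index instead
const seekCostNs = 1100;         // ns paid per seek from the index into the collection

const scanSavingNs = nDocsTotal * scanSavingPerDocNs;  // 2000 * 20   = 40,000ns saved
const totalSeekCostNs = nSelected * seekCostNs;        // 200  * 1100 = 220,000ns paid
print(`index plan net penalty: ${totalSeekCostNs - scanSavingNs}ns`);

This prints a 180,000ns net penalty for the index plan, matching the estimate above, yet the optimizer still picks the index scan.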