-
Type:
Improvement
-
Resolution: Works as Designed
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Execution
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Consider the following example query:
let pipeline = [
{$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}},
{$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}, {$eq: ["$c", "MY_CONST"]}]}}}
];
let explain = db.coll.explain("executionStats").aggregate(pipeline)
printjson(explain);
Generally, DocumentSource heuristic rewrites are able to push predicates past projections. In this case, the predicates on fields a and b can be pushed beneath the $project stage, since these fields are "simple inclusions". In contrast, the predicate on c cannot be trivially pushed past the $project, because the project adds field c with the constant value "MY_CONST" to every document. Pushing the equalities on a and b past the projection could be critical for performance, since these are sargable predicates which could use indexes if the $match is pushed down. So to summarize, our expectation is that the query is rewritten to something like this:
[
{$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}}}
{$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}},
{$match: {$expr: {$eq: ["$c", "MY_CONST"]}}}
]
However, if you run the explain as shown above you will see that no $match pushdown occurs. We should fix this.
I found that a slight variation of the same query will behave as expected. This may be a suitable workaround in some contexts:
// Variation where the comparison to a constant is not a $expr pipeline = [ {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}}, {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}, c: "MY_CONST"}} ]; explain = db.left.explain("executionStats").aggregate(pipeline) printjson(explain);
By expressing the predicate on c as a match expression outside of $expr, the system is able to perform the $match pushdown as expected:
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 0,
"executionTimeMillis" : 3,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1,
"executionStages" : {
"isCached" : false,
"stage" : "PROJECTION_DEFAULT",
"nReturned" : 0,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 0,
"needTime" : 1,
"needYield" : 0,
"saveState" : 2,
"restoreState" : 1,
"isEOF" : 1,
"transformBy" : {
"_id" : true,
"a" : true,
"b" : true,
"c" : {
"$const" : "MY_CONST"
}
},
"inputStage" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"$expr" : {
"$and" : [
{
"$eq" : [
"$a",
{
"$const" : 4
}
]
},
{
"$eq" : [
"$b",
{
"$const" : 5
}
]
}
]
}
}
]
},
"nReturned" : 0,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 0,
"needTime" : 1,
"needYield" : 0,
"saveState" : 2,
"restoreState" : 1,
"isEOF" : 1,
"direction" : "forward",
"docsExamined" : 1
}
}
}
},
"nReturned" : NumberLong(0),
"executionTimeMillisEstimate" : NumberLong(3)
},
{
"$match" : {
"c" : {
"$eq" : "MY_CONST"
}
},
"nReturned" : NumberLong(0),
"executionTimeMillisEstimate" : NumberLong(3)
}
While this may appear to be a contrived query, it came up in the context of a relational migration project. Specifically, the application has a view which projects some columns from a base table and adds an additional column with a constant value. Then, there is higher-level view which joins with this lower-level view. When the join is converted into MQL, it is expressed as a $lookup into the view, and $expr is needed to express the conjunctive join predicate – including a join predicate on the constant column! That's how we end up with an $expr matching against a projected constant column, which defeats pushdown of the other predicates as described above.
- is related to
-
SERVER-82570 Bucket-level filters in time-series are translated to SBE as trivially "true"
-
- Closed
-
-
SERVER-83462 Remove more imprecise InternalExpr predicates in residual filters
-
- Closed
-
-
SERVER-106505 $sequentialCache optimization can interfere with $match pushdown in a $lookup sub-pipeline
-
- Closed
-