-
Type:
Improvement
-
Resolution: Works as Designed
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Execution
-
None
-
3
-
TBD
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Consider the following example query:
let pipeline = [ {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}}, {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}, {$eq: ["$c", "MY_CONST"]}]}}} ]; let explain = db.coll.explain("executionStats").aggregate(pipeline) printjson(explain);
Generally, DocumentSource heuristic rewrites are able to push predicates past projections. In this case, the predicates on fields a and b can be pushed beneath the $project stage, since these fields are "simple inclusions". In contrast, the predicate on c cannot be trivially pushed past the $project, because the project adds field c with the constant value "MY_CONST" to every document. Pushing the equalities on a and b past the projection could be critical for performance, since these are sargable predicates which could use indexes if the $match is pushed down. So to summarize, our expectation is that the query is rewritten to something like this:
[ {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}}} {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}}, {$match: {$expr: {$eq: ["$c", "MY_CONST"]}}} ]
However, if you run the explain as shown above you will see that no $match pushdown occurs. We should fix this.
I found that a slight variation of the same query will behave as expected. This may be a suitable workaround in some contexts:
// Variation where the comparison to a constant is not a $expr pipeline = [ {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}}, {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}, c: "MY_CONST"}} ]; explain = db.left.explain("executionStats").aggregate(pipeline) printjson(explain);
By expressing the predicate on c as a match expression outside of $expr, the system is able to perform the $match pushdown as expected:
"executionStats" : { "executionSuccess" : true, "nReturned" : 0, "executionTimeMillis" : 3, "totalKeysExamined" : 0, "totalDocsExamined" : 1, "executionStages" : { "isCached" : false, "stage" : "PROJECTION_DEFAULT", "nReturned" : 0, "executionTimeMillisEstimate" : 0, "works" : 2, "advanced" : 0, "needTime" : 1, "needYield" : 0, "saveState" : 2, "restoreState" : 1, "isEOF" : 1, "transformBy" : { "_id" : true, "a" : true, "b" : true, "c" : { "$const" : "MY_CONST" } }, "inputStage" : { "stage" : "COLLSCAN", "filter" : { "$and" : [ { "$expr" : { "$and" : [ { "$eq" : [ "$a", { "$const" : 4 } ] }, { "$eq" : [ "$b", { "$const" : 5 } ] } ] } } ] }, "nReturned" : 0, "executionTimeMillisEstimate" : 0, "works" : 2, "advanced" : 0, "needTime" : 1, "needYield" : 0, "saveState" : 2, "restoreState" : 1, "isEOF" : 1, "direction" : "forward", "docsExamined" : 1 } } } }, "nReturned" : NumberLong(0), "executionTimeMillisEstimate" : NumberLong(3) }, { "$match" : { "c" : { "$eq" : "MY_CONST" } }, "nReturned" : NumberLong(0), "executionTimeMillisEstimate" : NumberLong(3) }
While this may appear to be a contrived query, it came up in the context of a relational migration project. Specifically, the application has a view which projects some columns from a base table and adds an additional column with a constant value. Then, there is higher-level view which joins with this lower-level view. When the join is converted into MQL, it is expressed as a $lookup into the view, and $expr is needed to express the conjunctive join predicate – including a join predicate on the constant column! That's how we end up with an $expr matching against a projected constant column, which defeats pushdown of the other predicates as described above.
- is related to
-
SERVER-82570 Bucket-level filters in time-series are translated to SBE as trivially "true"
-
- Closed
-
-
SERVER-83462 Remove more imprecise InternalExpr predicates in residual filters
-
- Closed
-
-
SERVER-106505 $sequentialCache optimization can interfere with $match pushdown in a $lookup sub-pipeline
-
- Backlog
-