$expr equality predicate on a constant projected column defeats match pushdown optimization

XMLWordPrintableJSON

    • Type: Improvement
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Execution
    • None
    • 3
    • TBD
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Consider the following example query:

      let pipeline = [
          {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}},
          {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}, {$eq: ["$c", "MY_CONST"]}]}}}
      ];
      let explain = db.coll.explain("executionStats").aggregate(pipeline)
      printjson(explain);
      

      Generally, DocumentSource heuristic rewrites are able to push predicates past projections. In this case, the predicates on fields a and b can be pushed beneath the $project stage, since these fields are "simple inclusions". In contrast, the predicate on c cannot be trivially pushed past the $project, because the project adds field c with the constant value "MY_CONST" to every document. Pushing the equalities on a and b past the projection could be critical for performance, since these are sargable predicates which could use indexes if the $match is pushed down. So to summarize, our expectation is that the query is rewritten to something like this:

      [
          {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}}}
          {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}},
          {$match: {$expr: {$eq: ["$c", "MY_CONST"]}}}
      ]
      

      However, if you run the explain as shown above you will see that no $match pushdown occurs. We should fix this.

      I found that a slight variation of the same query will behave as expected. This may be a suitable workaround in some contexts:

      // Variation where the comparison to a constant is not a $expr
      pipeline = [
          {$project: {a: 1, b: 1, c: {$const: "MY_CONST"}}},
          {$match: {$expr: {$and: [{$eq: ["$a", 4]}, {$eq: ["$b", 5]}]}, c: "MY_CONST"}}
      ];
      explain = db.left.explain("executionStats").aggregate(pipeline)
      printjson(explain);
      

      By expressing the predicate on c as a match expression outside of $expr, the system is able to perform the $match pushdown as expected:

      				"executionStats" : {
      					"executionSuccess" : true,
      					"nReturned" : 0,
      					"executionTimeMillis" : 3,
      					"totalKeysExamined" : 0,
      					"totalDocsExamined" : 1,
      					"executionStages" : {
      						"isCached" : false,
      						"stage" : "PROJECTION_DEFAULT",
      						"nReturned" : 0,
      						"executionTimeMillisEstimate" : 0,
      						"works" : 2,
      						"advanced" : 0,
      						"needTime" : 1,
      						"needYield" : 0,
      						"saveState" : 2,
      						"restoreState" : 1,
      						"isEOF" : 1,
      						"transformBy" : {
      							"_id" : true,
      							"a" : true,
      							"b" : true,
      							"c" : {
      								"$const" : "MY_CONST"
      							}
      						},
      						"inputStage" : {
      							"stage" : "COLLSCAN",
      							"filter" : {
      								"$and" : [
      									{
      										"$expr" : {
      											"$and" : [
      												{
      													"$eq" : [
      														"$a",
      														{
      															"$const" : 4
      														}
      													]
      												},
      												{
      													"$eq" : [
      														"$b",
      														{
      															"$const" : 5
      														}
      													]
      												}
      											]
      										}
      									}
      								]
      							},
      							"nReturned" : 0,
      							"executionTimeMillisEstimate" : 0,
      							"works" : 2,
      							"advanced" : 0,
      							"needTime" : 1,
      							"needYield" : 0,
      							"saveState" : 2,
      							"restoreState" : 1,
      							"isEOF" : 1,
      							"direction" : "forward",
      							"docsExamined" : 1
      						}
      					}
      				}
      			},
      			"nReturned" : NumberLong(0),
      			"executionTimeMillisEstimate" : NumberLong(3)
      		},
      		{
      			"$match" : {
      				"c" : {
      					"$eq" : "MY_CONST"
      				}
      			},
      			"nReturned" : NumberLong(0),
      			"executionTimeMillisEstimate" : NumberLong(3)
      		}
      

      While this may appear to be a contrived query, it came up in the context of a relational migration project. Specifically, the application has a view which projects some columns from a base table and adds an additional column with a constant value. Then, there is higher-level view which joins with this lower-level view. When the join is converted into MQL, it is expressed as a $lookup into the view, and $expr is needed to express the conjunctive join predicate – including a join predicate on the constant column! That's how we end up with an $expr matching against a projected constant column, which defeats pushdown of the other predicates as described above.

              Assignee:
              David Storch
              Reporter:
              David Storch
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: