$match pushdown past a renamed field is done incorrectly when the expression root is a renamed field path expression

XMLWordPrintableJSON

    • Query Optimization
    • Fully Compatible
    • ALL
    • v8.2, v8.0, v7.0
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      The problem is best explained by example. Imagine we have the following collection:

      > db.foo.find()
      { "_id" : 1, "a" : true, "b" : false }
      { "_id" : 2, "a" : false, "b" : true }
      

      The following query renames field "a" to field "b" and then matches documents where "b" (the result of the rename) contains a truthy value. It should return the document with _id:1 but instead incorrectly returns the document with _id:2:

      > db.foo.aggregate([{$project: {_id: 1, b: "$a"}}, {$match: {$expr: "$b"}}])
      { "_id" : 2, "b" : false }
      

      The result is clearly inconsistent with the query, since the query is only supposed to return documents containing a truthy value in field "b" – but the returned document has "b": false! This bug means that for affected queries the system can 1) fail to return documents which should be returned, 2) include extra documents in the result set which should be filtered out, and 3) return results which are logically inconsistent with the query.

      The buggy behavior is the result of an incorrect pushdown of the $match past the $project. This pushdown can be done so long as the rename is taken into account. That is, after the $match pushdown we should get a query like this:

      > db.foo.aggregate([{$match: {$expr: "$a"}}, {$project: {_id: 1, b: "$a"}}])
      { "_id" : 1, "b" : true }
      

      However, if we look at the explain of the original query, we can see that the pushdown has occurred without correctly accounting for the rename:

      > db.foo.explain().aggregate([{$project: {_id: 1, b: "$a"}}, {$match: {$expr: "$b"}}])
      {
      	"explainVersion" : "1",
      	"queryPlanner" : {
      		"namespace" : "test.foo",
      		"parsedQuery" : {
      			"$expr" : "$b"
      		},
      		"indexFilterSet" : false,
      		"queryHash" : "40171405",
      		"planCacheShapeHash" : "40171405",
      		"planCacheKey" : "119C707A",
      		"optimizationTimeMillis" : 0,
      		"optimizedPipeline" : true,
      		"maxIndexedOrSolutionsReached" : false,
      		"maxIndexedAndSolutionsReached" : false,
      		"maxScansToExplodeReached" : false,
      		"prunedSimilarIndexes" : false,
      		"winningPlan" : {
      			"isCached" : false,
      			"stage" : "PROJECTION_DEFAULT",
      			"transformBy" : {
      				"_id" : true,
      				"b" : "$a"
      			},
      			"inputStage" : {
      				"stage" : "COLLSCAN",
      				"filter" : {
      					"$expr" : "$b" // **This should be "$a"!**
      				},
      				"direction" : "forward"
      			}
      		},
      		"rejectedPlans" : [ ]
      	},
      ...
      

      Myself and hana.pearlman@mongodb.com discovered this bug through code inspection and unit testing while I was working on SERVER-106505. I've explicitly confirmed that it affects all production branches (8.2, 8.0, and 7.0).

            Assignee:
            David Storch
            Reporter:
            David Storch
            Votes:
            0 Vote for this issue
            Watchers:
            14 Start watching this issue

              Created:
              Updated:
              Resolved: