-
Type:
Bug
-
Resolution: Fixed
-
Priority:
Major - P3
-
Affects Version/s: 8.2.1, 7.0.25, 8.0.16
-
Component/s: Query Planning
-
Query Optimization
-
Fully Compatible
-
ALL
-
v8.2, v8.0, v7.0
-
None
-
None
-
None
-
None
-
None
-
None
-
None
The problem is best explained by example. Imagine we have the following collection:
> db.foo.find()
{ "_id" : 1, "a" : true, "b" : false }
{ "_id" : 2, "a" : false, "b" : true }
The following query renames field "a" to field "b" and then matches documents where "b" (the result of the rename) contains a truthy value. It should return the document with _id:1 but instead incorrectly returns the document with _id:2:
> db.foo.aggregate([{$project: {_id: 1, b: "$a"}}, {$match: {$expr: "$b"}}])
{ "_id" : 2, "b" : false }
The result is clearly inconsistent with the query, since the query is only supposed to return documents containing a truthy value in field "b" – but the returned document has "b": false! This bug means that for affected queries the system can 1) fail to return documents which should be returned, 2) include extra documents in the result set which should be filtered out, and 3) return results which are logically inconsistent with the query.
The buggy behavior is the result of an incorrect pushdown of the $match past the $project. This pushdown can be done so long as the rename is taken into account. That is, after the $match pushdown we should get a query like this:
> db.foo.aggregate([{$match: {$expr: "$a"}}, {$project: {_id: 1, b: "$a"}}])
{ "_id" : 1, "b" : true }
However, if we look at the explain of the original query, we can see that the pushdown has occurred without correctly accounting for the rename:
> db.foo.explain().aggregate([{$project: {_id: 1, b: "$a"}}, {$match: {$expr: "$b"}}])
{
"explainVersion" : "1",
"queryPlanner" : {
"namespace" : "test.foo",
"parsedQuery" : {
"$expr" : "$b"
},
"indexFilterSet" : false,
"queryHash" : "40171405",
"planCacheShapeHash" : "40171405",
"planCacheKey" : "119C707A",
"optimizationTimeMillis" : 0,
"optimizedPipeline" : true,
"maxIndexedOrSolutionsReached" : false,
"maxIndexedAndSolutionsReached" : false,
"maxScansToExplodeReached" : false,
"prunedSimilarIndexes" : false,
"winningPlan" : {
"isCached" : false,
"stage" : "PROJECTION_DEFAULT",
"transformBy" : {
"_id" : true,
"b" : "$a"
},
"inputStage" : {
"stage" : "COLLSCAN",
"filter" : {
"$expr" : "$b" // **This should be "$a"!**
},
"direction" : "forward"
}
},
"rejectedPlans" : [ ]
},
...
Myself and hana.pearlman@mongodb.com discovered this bug through code inspection and unit testing while I was working on SERVER-106505. I've explicitly confirmed that it affects all production branches (8.2, 8.0, and 7.0).
- is related to
-
SERVER-106505 $sequentialCache optimization can interfere with $match pushdown in a $lookup sub-pipeline
-
- Closed
-
- related to
-
SERVER-113935 Write a targeted PBT to test $match-splitting optimizations
-
- Backlog
-