[SERVER-39938] aggregation $match before $lookup optimization doesn't happen when $expr: $eq is used Created: 04/Mar/19  Updated: 29/Oct/23  Resolved: 31/Mar/21

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 4.0.6
Fix Version/s: 5.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Asya Kamsky
Assignee: Alya Berciu
Resolution: Fixed
Votes: 1
Labels: optimization, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
causes SERVER-66072 $match sampling and $group aggregatio... Closed
Related
is related to SERVER-34926 allow $expr with comparison bounded o... Closed
is related to SERVER-39943 Create match expressions for aggregat... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2019-12-30, Query Optimization 2021-02-22, Query Optimization 2021-03-08, Query Optimization 2021-03-22, Query Optimization 2021-04-05
Participants:
Case:
Linked BF Score: 120

 Description   

We check whether a $match has a predicate that isn't on the "as" field and move it before the $lookup; if it has a predicate on "as.x", we move that into the $lookup.
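
The rule above can be sketched as a small partitioning function. This is a toy model in plain JavaScript, not server code: the function name and the three buckets are hypothetical, it only looks at top-level field paths, and it deliberately leaves operator keys such as $expr or $and where they are.

```javascript
// Toy model (not server code): partition the top-level predicates of a
// $match that follows {$lookup: {as: asField}}.
function splitMatchAfterLookup(match, asField) {
  const beforeLookup = {}; // predicates that never touch the "as" field
  const intoLookup = {};   // predicates on "<as>.<path>", re-rooted to "<path>"
  const stayAfter = {};    // everything this toy does not try to analyze
  for (const [key, predicate] of Object.entries(match)) {
    if (key.startsWith("$") || key === asField) {
      // Operator keys ($expr, $and, ...) and predicates on the whole
      // "as" field are left in place by this sketch.
      stayAfter[key] = predicate;
    } else if (key.startsWith(asField + ".")) {
      intoLookup[key.slice(asField.length + 1)] = predicate;
    } else {
      beforeLookup[key] = predicate;
    }
  }
  return { beforeLookup, intoLookup, stayAfter };
}

// The $match from the expected-behavior example in this ticket.
const parts = splitMatchAfterLookup({ "a.z": 10, x: { $eq: 5 } }, "a");
console.log(JSON.stringify(parts, null, 2));
```

For the $match in this report, the sketch puts x in the "before" bucket and z in the "into $lookup" bucket, which corresponds to the "query" and "matching" entries of the expected explain output.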

Expected behavior:

db.foo.explain().aggregate({$lookup:{as:"a",from:"b",localField:"x",foreignField:"y"}},{$unwind:"$a"},{$match:{"a.z":10,x:{$eq:5}}})
{
        "stages" : [
                {
                        "$cursor" : {
                                "query" : {
                                        "x" : {
                                                "$eq" : 5
                                        }
                                },
                                "queryPlanner" : {
                                        "plannerVersion" : 1,
                                        "namespace" : "tpcds10.foo",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                                "x" : {
                                                        "$eq" : 5
                                                }
                                        },
                                        "winningPlan" : {
                                                "stage" : "EOF"
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        }
                },
                {
                        "$lookup" : {
                                "from" : "b",
                                "as" : "a",
                                "localField" : "x",
                                "foreignField" : "y",
                                "unwinding" : {
                                        "preserveNullAndEmptyArrays" : false
                                },
                                "matching" : {
                                        "z" : {
                                                "$eq" : 10
                                        }
                                }
                        }
                }
        ],
        "ok" : 1
}
 

Using $expr for x:5 breaks the $lookup matching optimization. The equality on x still gets moved before the $lookup, but it is not pushed into the cursor:

db.foo.explain().aggregate({$lookup:{as:"a",from:"b",localField:"x",foreignField:"y"}},{$unwind:"$a"},{$match:{"a.z":10,$expr:{$eq:["$x",5]}}})
{
        "stages" : [
                {
                        "$cursor" : {
                                "query" : {
                                },
                                "queryPlanner" : {
                                        "plannerVersion" : 1,
                                        "namespace" : "tpcds10.foo",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                        },
                                        "winningPlan" : {
                                                "stage" : "EOF"
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        }
                },
                {
                        "$match" : {
                                "x" : {
                                        "$_internalExprEq" : 5
                                }
                        }
                },
                {
                        "$lookup" : {
                                "from" : "b",
                                "as" : "a",
                                "localField" : "x",
                                "foreignField" : "y",
                                "unwinding" : {
                                        "preserveNullAndEmptyArrays" : false
                                }
                        }
                },
                {
                        "$match" : {
                                "$and" : [
                                        {
                                                "a.z" : {
                                                        "$eq" : 10
                                                }
                                        },
                                        {
                                                "x" : {
                                                        "$_internalExprEq" : 5
                                                }
                                        },
                                        {
                                                "$expr" : {
                                                        "$eq" : [
                                                                "$x",
                                                                {
                                                                        "$const" : 5
                                                                }
                                                        ]
                                                }
                                        }
                                ]
                        }
                }
        ],
        "ok" : 1
}
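
On affected versions (the ticket lists 4.0.6, with the fix in 5.0.0-rc0), one possible workaround is to do the split by hand: put the $expr predicate in its own leading $match and keep the "a.z" predicate alone after the $unwind. The following is only a sketch using the collection and field names from this report; it has not been checked against a specific server version, so the resulting plan should be confirmed with explain().

```javascript
// Hand-split version of the pipeline from this report.
const pipeline = [
  // The $expr predicate only reads "x" from foo, so it can lead the pipeline.
  { $match: { $expr: { $eq: ["$x", 5] } } },
  { $lookup: { from: "b", as: "a", localField: "x", foreignField: "y" } },
  { $unwind: "$a" },
  // Kept on its own so the whole $match is a candidate for absorption
  // into $lookup (the "matching" field in the explain output).
  { $match: { "a.z": 10 } },
];

// Only runs inside the mongo shell, where `db` and `printjson` are defined.
if (typeof db !== "undefined") {
  printjson(db.foo.explain().aggregate(pipeline));
}
```

Keeping the two predicates in separate $match stages avoids relying on the optimizer to split a combined $match that contains $expr.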
 



 Comments   
Comment by Alya Berciu [ 31/Mar/21 ]

Closing as the fix has been merged.

Comment by Githook User [ 31/Mar/21 ]

Author:

{'name': 'Alya Berciu', 'email': 'alyacarina@gmail.com', 'username': 'alyacb'}

Message: SERVER-39938: Pushdown $match on $expr before $lookup
Branch: master
https://github.com/mongodb/mongo/commit/7142cc2a3dfb59212836a6184f4f6c0e6385f104

Comment by David Storch [ 17/Dec/19 ]

If I understand correctly, the patch I'm working on for SERVER-44258 would address this. I'll try to confirm once I have a satisfactory fix for SERVER-44258.

Comment by James Wahlin [ 05/Mar/19 ]

Currently when aggregation pipeline optimization takes place, we first optimize for relationships in the pipeline (splitting, merging and moving stages) and then perform stage-specific optimization. This removes any $match with $expr/$eq from consideration for split/merge/move, as we will not consider it in its pre-optimized form (meaning prior to the rewrite of $expr:$eq to $and: [$expr:$eq, $_internalExprEq]).

Additionally, for the example above, we would only consider pushing the {"a.z":10} match into $lookup if the $expr has been split and moved, as we only consider pushdown for the entire MatchExpression and will not split for this case alone.

Both of the above could be fixed if we allowed a $expr MatchExpression to be split/merged/moved prior to optimization. This solution would also optimize $expr statements with comparisons other than $eq, pushing the comparison earlier.

An alternative fix would be to optimize $match with $expr prior to the pipeline-level optimization, allowing the split & move to take place. This would still leave us with the $lookup pushdown issue, as only the $_internalExprEq would be moved, leaving the $expr paired with the "a.z":10 statement.
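
For reference, the shape of the $expr/$eq rewrite discussed in this comment can be modeled in a few lines. This is a toy JavaScript sketch, not the server's implementation: the function name is hypothetical, and it only handles a top-level $expr whose $eq compares a field path with a scalar constant. The output shape follows the trailing $match in the explain output from the description.

```javascript
// Toy model (not server code) of the rewrite:
//   {$expr: {$eq: ["$<field>", <constant>]}}
// becomes
//   {$and: [{<field>: {$_internalExprEq: <constant>}}, {$expr: ...}]}
function rewriteExprEq(match) {
  const args = match.$expr && match.$expr.$eq;
  if (!Array.isArray(args) || args.length !== 2) return match;
  const [lhs, rhs] = args;
  // "$x" is a field path; "$$var" is a variable and is not handled here.
  const isFieldPath = typeof lhs === "string" && /^\$[^$]/.test(lhs);
  const isScalarConstant =
    typeof rhs === "number" ||
    typeof rhs === "boolean" ||
    (typeof rhs === "string" && !rhs.startsWith("$"));
  if (!isFieldPath || !isScalarConstant) return match;

  const { $expr, ...rest } = match;
  const clauses = Object.entries(rest).map(([k, v]) => ({ [k]: v }));
  clauses.push({ [lhs.slice(1)]: { $_internalExprEq: rhs } });
  clauses.push({ $expr }); // the original $expr is kept alongside
  return { $and: clauses };
}

const rewritten = rewriteExprEq({ "a.z": 10, $expr: { $eq: ["$x", 5] } });
console.log(JSON.stringify(rewritten, null, 2));
```

Only after such a rewrite does the predicate on x exist as a separate clause that a split/move pass could pick up, which is why the ordering of the two optimization phases matters here.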

Comment by James Wahlin [ 04/Mar/19 ]

Currently the only aggregation expression that we will rewrite to the match language is $eq. Without such a rewrite, expressions embedded in a $expr will not be eligible for use in our optimizations. This includes both repositioning match statements in the pipeline and pushing them to the query engine.

While in most cases we would not be able to push range-based operators such as $gt and $lt to the query engine (due to differences in type bracketing between match and aggregation), it would be fine to reposition them in the pipeline when valid. asya - I created SERVER-39943 for this issue, as it is separate from the $eq issue reported. I will reduce the scope of the current ticket to investigate the $eq issue.
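
The type bracketing difference mentioned above can be illustrated with a toy comparison. This is a plain JavaScript sketch restricted to numbers and strings, with hypothetical function names; it only models the fact that a match-language $gt ignores values of a different type, while the aggregation $gt orders values of different types by BSON comparison order (numbers sort before strings).

```javascript
// Toy model restricted to numbers and strings; real BSON has many more types.
const bsonRank = (v) => (typeof v === "number" ? 1 : 2); // numbers before strings

// Match-language $gt, e.g. {x: {$gt: 5}}:
// a value in a different type bracket never matches.
function matchGt(value, bound) {
  return typeof value === typeof bound && value > bound;
}

// Aggregation $gt, e.g. {$expr: {$gt: ["$x", 5]}}:
// values of different types are ordered by their BSON rank.
function aggGt(value, bound) {
  if (typeof value !== typeof bound) return bsonRank(value) > bsonRank(bound);
  return value > bound;
}

for (const x of [3, 7, "abc"]) {
  console.log(JSON.stringify(x), "match:", matchGt(x, 5), "agg:", aggGt(x, 5));
}
```

The two functions agree on 3 and 7 but disagree on "abc", which is the kind of mismatch that prevents a direct translation of range comparisons inside $expr into match-language bounds.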

Generated at Thu Feb 08 04:53:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.