[SERVER-26442] Push $sort before $project and $addFields Created: 03/Oct/16  Updated: 02/Feb/24

Status: Open
Project: Core Server
Component/s: Aggregation Framework, Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Storch Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 7
Labels: bonsai, optimization, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-7568 Aggregation framework favors non-bloc... Closed
Duplicate
is duplicated by SERVER-38102 Project stage prevents sort stage fro... Closed
is duplicated by SERVER-59910 Queries with sort+projection+limit fa... Closed
Problem/Incident
Related
related to SERVER-19153 Conditionally push $match before $pro... Closed
related to SERVER-53638 Enable pushdown of config.cache.chunk... Closed
related to SERVER-55417 Conditionally reorder $sort and $look... Backlog
related to SERVER-54822 Push $sort before $unwind when possible Closed
related to SERVER-54823 $sort should be pushed before $lookup... Closed
is related to SERVER-54768 Sort/project re-ordering is inconsist... Backlog
is related to SERVER-15200 Query planner should move projection ... Closed
is related to SERVER-31072 reorder $limit before $lookup in pipe... Closed
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

In some cases, a $sort aggregation stage can swap with a preceding $project. Consider the following example:

db.c.createIndex({a: 1});
db.c.aggregate([{$project: {a: 1, b: 1}}, {$sort: {a: 1}}]);

An explain of this aggregation shows that a COLLSCAN will feed the $project => $sort pipeline. If the agg optimization phase were to swap the $sort with the $project, however, it could push down the $sort and obtain the desired ordering via an index scan. This would remove the sort stage from the query plan entirely, which could result in a substantial performance improvement.

We would probably want to apply this optimization only if the query planner's sort analysis shows that the sort can be obtained via an appropriate index scan.



 Comments   
Comment by Anshu Avinash [ 05/Jun/23 ]

We have a use case where this would be useful. We are using "$addFields" to dynamically add new fields to every aggregation pipeline. Our aggregation pipeline currently looks like this:
```
{"$addFields": {}, "$match": {}, {"$sort":{ "key": -1 }},{ "$skip": 0 },{"$limit":50}}
```
We have added $addFields as the first step because we can have the same key in match or sort. However that is very rare case and majority of our queries will not have match/sort on these addFields. It will be great if mongodb can optimize this for us. It is already optimizing by moving $addFields after $match, but it is not moving it after $sort and $limit.

Comment by Githook User [ 06/Sep/22 ]

Author:

{'name': 'Ted Tuckman', 'email': 'TedTuckman@users.noreply.github.com', 'username': 'TedTuckman'}

Message: SERVER-26442 Remove extra perf test (#733)
Branch: master
https://github.com/mongodb/genny/commit/1d6089bb331f34b899365045391ad269b7cbf9ef

Comment by Githook User [ 16/Aug/22 ]

Author:

{'name': 'Ted Tuckman', 'email': 'TedTuckman@users.noreply.github.com', 'username': 'TedTuckman'}

Message: SERVER-26442 Add perf test for project sort swap for sort pushdown (#725)
Branch: master
https://github.com/mongodb/genny/commit/b6e51f71dfa599f88216a850d08dc3829d88e04e

Comment by Asya Kamsky [ 30/Sep/19 ]

This is the same when sort is followed by a limit and that makes it even more important to push it down so that we don't return a full batch to aggregation to find a single highest value.

Comment by Danny Hatcher (Inactive) [ 13/Nov/18 ]

This issue also can prevent queries on views from obtaining non-blocking sort plans.

Comment by Asya Kamsky [ 27/Sep/17 ]

I also realized that when we have

{$match:{a:{$gt:1}}},{$project:{a:1,_id:0}},{$sort:{a:1}}

we sort in memory while pushing match and project to the query system, but changing it to

 {$match:{a:{$gt:1}}},{$sort:{a:1}},{$project:{a:1,_id:0}}

all three get pushed into query. That seems strange.

Comment by Asya Kamsky [ 06/Jan/17 ]

$sort can also be pushed in front of $*lookup and $unwind when on fields not involved in being unwound and not coming from lookups.

Generated at Thu Feb 08 04:12:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.