[SERVER-23732] Aggregation should optimize an irrelevant $sort preceding a $group Created: 14/Apr/16  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Benjamin Murphy
Assignee: Backlog - Query Optimization
Resolution: Unresolved
Votes: 0
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-18287 Restore the "impossible match" heuris... Closed
is related to SERVER-28980 aggregation can subsume $sort into $g... Backlog
is related to SERVER-9507 Optimize $sort+$group+$first pipeline... Closed
Assigned Teams:
Query Optimization
Participants:

 Description   

At the moment, we execute both stages of the following pipeline:
[{$sort: {a: 1}}, {$group: {_id: "$b"}}]
even though the $sort has no effect on the output of the $group.

We should change DocumentSourceSort::optimizeAt() so that the $sort removes itself when the following stage is a $group that doesn't reference any of the sort's keys.
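
The proposed rule can be sketched outside the server as a small pipeline rewrite. This is only an illustration in Python over plain dicts, not the server's C++ code: the helper names (referenced_paths, sort_is_irrelevant, drop_irrelevant_sort) are hypothetical, and the sketch implements only the rule as stated in this description. It conservatively treats a dotted path and its prefix (e.g. "a" and "a.b") as the same key.

```python
def referenced_paths(expr):
    """Collect field paths such as "b" or "a.b" from a nested expression."""
    if isinstance(expr, str):
        # "$b" is a field path; "$$VAR" is a variable and is ignored here.
        if expr.startswith("$") and not expr.startswith("$$"):
            return {expr[1:]}
        return set()
    if isinstance(expr, dict):
        return set().union(*(referenced_paths(v) for v in expr.values()))
    if isinstance(expr, (list, tuple)):
        return set().union(*(referenced_paths(v) for v in expr))
    return set()


def paths_overlap(p, q):
    """True if the paths are equal or one is a dotted prefix of the other."""
    return p == q or p.startswith(q + ".") or q.startswith(p + ".")


def sort_is_irrelevant(sort_spec, group_spec):
    """True if the $group references none of the $sort's keys."""
    used = referenced_paths(group_spec)
    return not any(paths_overlap(key, path) for key in sort_spec for path in used)


def drop_irrelevant_sort(pipeline):
    """Return a copy of the pipeline without $sort stages that are
    immediately followed by a $group that ignores the sort keys."""
    result = []
    for i, stage in enumerate(pipeline):
        following = pipeline[i + 1] if i + 1 < len(pipeline) else None
        if (
            "$sort" in stage
            and following is not None
            and "$group" in following
            and sort_is_irrelevant(stage["$sort"], following["$group"])
        ):
            continue  # skip the irrelevant $sort
        result.append(stage)
    return result
```

Applied to the pipeline above, drop_irrelevant_sort would return only the $group stage; if the $group also computed something from "$a", the $sort would be kept.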



 Comments   
Comment by Asya Kamsky [ 25/Oct/17 ]

This optimization must not happen if the $sort is able to use an index via pushdown into the query subsystem.

Also, if the following $group does anything with the full document (e.g. a $push of "$$ROOT" into an array), that should be considered as "using" the sort keys, since the sort determines the order of the documents in that array.
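
The two caveats above could be folded into the removal check roughly as follows. This is a hedged sketch in Python over plain dicts, not server code: may_remove_sort and group_references are hypothetical names, and sort_can_use_index is an assumed flag that the caller would have to supply (this sketch does not attempt to decide index eligibility). Any reference starting with "$$ROOT" or "$$CURRENT" is conservatively treated as a use of the full document.

```python
WHOLE_DOC_PREFIXES = ("$$ROOT", "$$CURRENT")


def group_references(expr):
    """Return (field_paths, uses_whole_document) for a $group spec."""
    paths, whole = set(), False
    stack = [expr]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            if node.startswith(WHOLE_DOC_PREFIXES):
                whole = True  # e.g. {$push: "$$ROOT"}
            elif node.startswith("$") and not node.startswith("$$"):
                paths.add(node[1:])  # "$b" -> "b"
        elif isinstance(node, dict):
            stack.extend(node.values())
        elif isinstance(node, (list, tuple)):
            stack.extend(node)
    return paths, whole


def paths_overlap(p, q):
    """True if the paths are equal or one is a dotted prefix of the other."""
    return p == q or p.startswith(q + ".") or q.startswith(p + ".")


def may_remove_sort(sort_spec, group_spec, sort_can_use_index):
    """Decide whether a $sort directly before a $group may be dropped."""
    if sort_can_use_index:
        return False  # caveat 1: keep a $sort that can be pushed down to an index
    paths, whole = group_references(group_spec)
    if whole:
        return False  # caveat 2: full-document use counts as using the sort keys
    return not any(paths_overlap(key, path) for key in sort_spec for path in paths)
```

For example, may_remove_sort({"a": 1}, {"_id": "$b", "docs": {"$push": "$$ROOT"}}, False) would return False, because the order of the pushed documents depends on the sort.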

Comment by J Rassi [ 15/Apr/16 ]

FWIW, I'm somewhat skeptical about making the server responsible for removing predicates or pipeline stages that have no effect on the result set. These kinds of optimizations are trivial for users to apply themselves; doing it on the server side encourages poor application design patterns, requires server code that has to be maintained, and slows down well-written application code by adding additional run-time analysis.
