Details
-
Task
-
Resolution: Unresolved
-
Major - P3
-
Query Integration
Description
Sometimes users will write queries of the form:
[{$sort: {x: 1}},
 {$group: {_id: ..., first: {$first: "$y"}, last: {$last: "$y"}}}]
Actually performing a complete sort is wasteful, since we only need the first and last elements per bucket, not the total order.
We should rewrite these queries into the following form, which has no blocking sort:
[{$group: {_id: ...,
           first: {$top: {sortBy: {x: 1}, output: "$y"}},
           last: {$bottom: {sortBy: {x: 1}, output: "$y"}}}}]
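To illustrate why the two forms agree, here is a minimal sketch in plain JavaScript with no MongoDB dependency. The sample documents, the group key field `k`, and the helper names `sortThenGroup` and `topBottomGroup` are all made up for illustration. It assumes the values of `x` are distinct within each group, since ties would make the result of either form depend on input order.

```javascript
// Hypothetical sample data: k is the group key, x the sort key, y the output.
const docs = [
  { k: "a", x: 3, y: "a3" },
  { k: "b", x: 1, y: "b1" },
  { k: "a", x: 1, y: "a1" },
  { k: "b", x: 2, y: "b2" },
  { k: "a", x: 2, y: "a2" },
];

// Order results by _id so the two plans can be compared directly.
const byId = (p, q) => (p._id < q._id ? -1 : p._id > q._id ? 1 : 0);

// Original plan: fully sort by x, then take the first and last y per group.
function sortThenGroup(input) {
  const sorted = [...input].sort((p, q) => p.x - q.x);
  const groups = new Map();
  for (const d of sorted) {
    if (!groups.has(d.k)) {
      groups.set(d.k, { _id: d.k, first: d.y, last: d.y });
    } else {
      groups.get(d.k).last = d.y;
    }
  }
  return [...groups.values()].sort(byId);
}

// Rewritten plan: one pass, keeping only the smallest-x and largest-x
// document seen so far in each group (what $top and $bottom would track).
function topBottomGroup(input) {
  const groups = new Map();
  for (const d of input) {
    const g = groups.get(d.k);
    if (!g) {
      groups.set(d.k, { _id: d.k, minX: d.x, first: d.y, maxX: d.x, last: d.y });
      continue;
    }
    if (d.x < g.minX) { g.minX = d.x; g.first = d.y; }
    if (d.x > g.maxX) { g.maxX = d.x; g.last = d.y; }
  }
  return [...groups.values()]
    .map(({ _id, first, last }) => ({ _id, first, last }))
    .sort(byId);
}

const viaSort = sortThenGroup(docs);
const viaTopBottom = topBottomGroup(docs);
console.log(JSON.stringify(viaSort) === JSON.stringify(viaTopBottom));
```

The second function holds a constant amount of state per group, whereas the first must materialize and order the whole input before grouping.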
Other requirements:
1) This optimization should still work if other accumulators are present.
2) The optimization should work even if there's a stage between the sort and group, as long as it's correct to do so (for example, an $addFields).
3) The optimization should work for compound group keys.
4) It should work for compound sorts.
5) This optimization is often applicable to time series, but it is not TS-specific.
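The requirements above can be sketched as a rewrite over pipeline objects. This is only an illustration in plain JavaScript, not how the server's optimizer is implemented. `rewriteSortGroup`, `PASS_THROUGH`, and `ORDER_INSENSITIVE` are hypothetical names. The sketch assumes that the only safe intervening stage is an $addFields that does not assign to a top-level sort key (dotted paths are not handled), and that other accumulators are safe only if they appear in a small list of order-insensitive ones. Anything else leaves the pipeline unchanged.

```javascript
// Hypothetical allow-lists; a real implementation would need a fuller analysis.
const PASS_THROUGH = new Set(["$addFields"]);
const ORDER_INSENSITIVE = new Set(["$sum", "$avg", "$min", "$max", "$count"]);

function rewriteSortGroup(pipeline) {
  const sortIdx = pipeline.findIndex((stage) => "$sort" in stage);
  if (sortIdx === -1) return pipeline;
  const sortSpec = pipeline[sortIdx].$sort; // may be compound, e.g. {x: 1, t: -1}
  const sortKeys = Object.keys(sortSpec);

  // Skip over stages between $sort and $group, bailing out unless each one
  // is an allowed stage that leaves every sort key untouched.
  let groupIdx = sortIdx + 1;
  while (groupIdx < pipeline.length && !("$group" in pipeline[groupIdx])) {
    const stage = pipeline[groupIdx];
    const name = Object.keys(stage)[0];
    if (!PASS_THROUGH.has(name)) return pipeline;
    if (Object.keys(stage[name]).some((f) => sortKeys.includes(f))) return pipeline;
    groupIdx++;
  }
  if (groupIdx === pipeline.length) return pipeline;

  const newGroup = {};
  for (const [field, spec] of Object.entries(pipeline[groupIdx].$group)) {
    if (field === "_id") { newGroup._id = spec; continue; } // simple or compound key
    const op = Object.keys(spec)[0];
    if (op === "$first") {
      newGroup[field] = { $top: { sortBy: sortSpec, output: spec.$first } };
    } else if (op === "$last") {
      newGroup[field] = { $bottom: { sortBy: sortSpec, output: spec.$last } };
    } else if (ORDER_INSENSITIVE.has(op)) {
      newGroup[field] = spec; // unaffected by dropping the sort
    } else {
      return pipeline; // e.g. $push depends on input order, so keep the sort
    }
  }

  // Drop the $sort, keep the intervening stages, and swap in the new $group.
  return [
    ...pipeline.slice(0, sortIdx),
    ...pipeline.slice(sortIdx + 1, groupIdx),
    { $group: newGroup },
    ...pipeline.slice(groupIdx + 1),
  ];
}

const example = rewriteSortGroup([
  { $sort: { x: 1, t: -1 } },
  { $addFields: { z: 1 } },
  { $group: { _id: { a: "$a", b: "$b" }, first: { $first: "$y" },
              last: { $last: "$y" }, total: { $sum: "$v" } } },
]);
console.log(JSON.stringify(example, null, 2));
```

Returning the input pipeline unchanged whenever a check fails keeps the sketch conservative: it never removes a sort that some later accumulator or stage might rely on.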