[SERVER-85213] Rewrite $sort+$group with $first/$last to use $top/$bottom Created: 14/Jan/24  Updated: 25/Jan/24

Status: Investigating
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Ian Boros Assignee: Arun Banala
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Integration
Participants:

 Description   

Sometimes users will write queries of the form

[{$sort: {x:1}},
 {$group: {_id: ..., first: {$first: "$y"}, last: {$last: "$y"}}}]

 

Actually performing a complete sort is wasteful, since we only need the first and last elements per bucket, not the total order.

 

We should rewrite these queries to the following form, which has no blocking sort:

[{$group: {_id: ...,
           first: {$top: {sortBy: {x: 1}, output: "$y"}},
           last: {$bottom: {sortBy: {x: 1}, output: "$y"}}}}]
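To illustrate why the two forms should agree, here is a minimal sketch in plain JavaScript that models both plans over an in-memory array. It does not call MongoDB; the sample documents, the group key `k`, and the function names `sortThenGroup` and `topBottomGroup` are all made up for illustration, and the data deliberately has no ties on `x` within a group.

```javascript
// Plain-JavaScript model of the two plans; no MongoDB server is involved.
// Hypothetical sample data: k is the group key, x the sort key, y the output.
const docs = [
  { k: "a", x: 3, y: "a3" },
  { k: "b", x: 1, y: "b1" },
  { k: "a", x: 1, y: "a1" },
  { k: "b", x: 2, y: "b2" },
  { k: "a", x: 2, y: "a2" },
];

// Plan A: blocking sort on x, then take the first and last y seen per group.
function sortThenGroup(input) {
  const sorted = [...input].sort((l, r) => l.x - r.x);
  const groups = new Map();
  for (const d of sorted) {
    if (!groups.has(d.k)) groups.set(d.k, { first: d.y, last: d.y });
    else groups.get(d.k).last = d.y;
  }
  return groups;
}

// Plan B: one pass over unsorted input that keeps only the smallest-x and
// largest-x document per group, which is all $top and $bottom need to track.
function topBottomGroup(input) {
  const groups = new Map();
  for (const d of input) {
    const g = groups.get(d.k);
    if (!g) {
      groups.set(d.k, { top: d, bottom: d });
    } else {
      if (d.x < g.top.x) g.top = d;
      if (d.x > g.bottom.x) g.bottom = d;
    }
  }
  const out = new Map();
  for (const [k, g] of groups) out.set(k, { first: g.top.y, last: g.bottom.y });
  return out;
}

const planA = Object.fromEntries(sortThenGroup(docs));
const planB = Object.fromEntries(topBottomGroup(docs));
```

Plan B holds two documents per group instead of materializing the whole sorted input, which is the saving the rewrite is after.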

 

Other requirements:

1) This optimization should still work if other accumulators are present

2) The optimization should work even if there's a stage between the sort and the group, as long as it's correct to do so (for example, an $addFields that does not modify the sort key)

3) The optimization should work for compound group keys

4) It should work for compound sorts

5) This optimization is often applicable to time series, but it is not TS-specific.
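The requirements above could be sketched as a rewrite over a pipeline expressed as plain JavaScript objects. This is not the server's optimizer; `rewriteSortGroup`, `rewriteGroup`, `isTransparent`, and the `ORDER_INSENSITIVE` list are hypothetical names. It assumes an intervening `$addFields` is safe only when it leaves the sort fields untouched, and that the sort can be dropped only when every other accumulator is insensitive to input order.

```javascript
// Hypothetical helper, not MongoDB's optimizer. Accumulators assumed here to
// give the same result on unsorted input.
const ORDER_INSENSITIVE = new Set([
  "$sum", "$avg", "$min", "$max", "$count", "$addToSet",
]);

// Assumption: an $addFields stage may sit between $sort and $group as long as
// it does not overwrite (any prefix of) a field named in the sort pattern.
function isTransparent(stage, sortPattern) {
  if (!stage.$addFields) return false;
  const sortRoots = Object.keys(sortPattern).map((p) => p.split(".")[0]);
  return Object.keys(stage.$addFields).every(
    (f) => !sortRoots.includes(f.split(".")[0])
  );
}

// Returns a rewritten $group stage, or null if the rewrite does not apply.
function rewriteGroup(stage, sortBy) {
  if (!stage.$group) return null;
  const spec = {};
  let changed = false;
  for (const [field, acc] of Object.entries(stage.$group)) {
    if (field === "_id") {
      spec._id = acc; // simple or compound group keys pass through unchanged
      continue;
    }
    const [op] = Object.keys(acc);
    if (op === "$first" || op === "$last") {
      const target = op === "$first" ? "$top" : "$bottom";
      // The full (possibly compound) sort pattern becomes sortBy.
      spec[field] = { [target]: { sortBy, output: acc[op] } };
      changed = true;
    } else if (ORDER_INSENSITIVE.has(op)) {
      spec[field] = acc; // other accumulators are kept as-is
    } else {
      return null; // e.g. $push depends on input order, so keep the $sort
    }
  }
  return changed ? { $group: spec } : null;
}

function rewriteSortGroup(pipeline) {
  const out = [];
  let i = 0;
  while (i < pipeline.length) {
    const stage = pipeline[i];
    if (stage.$sort) {
      let j = i + 1;
      while (j < pipeline.length && isTransparent(pipeline[j], stage.$sort)) j++;
      const rewritten =
        j < pipeline.length ? rewriteGroup(pipeline[j], stage.$sort) : null;
      if (rewritten) {
        // Drop the $sort, keep the stages in between, emit the new $group.
        out.push(...pipeline.slice(i + 1, j), rewritten);
        i = j + 1;
        continue;
      }
    }
    out.push(stage);
    i++;
  }
  return out;
}

// Usage with a compound sort, a compound group key, and an extra accumulator.
const example = rewriteSortGroup([
  { $sort: { x: 1, t: -1 } },
  { $addFields: { z: 1 } },
  { $group: { _id: { a: "$a", b: "$b" }, first: { $first: "$y" },
              last: { $last: "$y" }, n: { $sum: 1 } } },
]);
```

The sketch is deliberately conservative: any stage or accumulator it does not recognize leaves the original pipeline unchanged.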

