[SERVER-59070] SBE support for $group in 'needsMerge' state Created: 03/Aug/21  Updated: 29/Oct/23  Resolved: 19/Oct/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.2.0

Type: Task Priority: Major - P3
Reporter: Ian Boros Assignee: Yoon Soo Kim
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Sprint: QE 2021-10-04, QE 2021-10-18, QE 2021-11-01
Participants:
Linked BF Score: 68

 Description   

When a 'global' $group is split across multiple mongods, a different SBE plan must be generated so that the output can be collected on the merging node.



 Comments   
Comment by Yoon Soo Kim [ 07/Oct/21 ]

Ian's clarification on the scope:

running the mongod-side using SBE and continuing to use the classic engine on mongos for re-grouping the partially accumulated results

Comment by Yoon Soo Kim [ 06/Oct/21 ]

IIUC, the support for merge in $group pushdown is all about how each accumulator merges partial aggregation results from multiple shards at the mongo*S*-side and how each accumulator produces partial aggregation results at the shard-side.

One quite interesting aspect: for $avg, for example, we may need a finalization step and a different accumulation step at the mongo*S*-side, while at the shard-side we need the accumulation and finalization steps for both $sum and $count, but not the normal $avg finalization step.
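
A minimal sketch of the split described above, in Python rather than the server's own code. All names here (PartialAvg, shard_accumulate, merge_partials, finalize_avg) are hypothetical and exist only for illustration; this is not the SBE implementation. It assumes the shard-side partial state for $avg is just a running sum and a count, and that the division happens only on the merging side.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PartialAvg:
    """Hypothetical partial state for $avg: a running sum and a count."""
    total: float = 0.0
    count: int = 0


def shard_accumulate(values: Iterable[float]) -> PartialAvg:
    """Shard side: accumulate sum and count, but skip the final division."""
    state = PartialAvg()
    for v in values:
        state.total += v
        state.count += 1
    return state


def merge_partials(partials: Iterable[PartialAvg]) -> PartialAvg:
    """Merging side: the accumulation step combines partial states, not raw values."""
    merged = PartialAvg()
    for p in partials:
        merged.total += p.total
        merged.count += p.count
    return merged


def finalize_avg(state: PartialAvg) -> Optional[float]:
    """Merging side only: the usual $avg finalization (sum divided by count).

    Returning None for an empty input is an assumption of this sketch.
    """
    if state.count == 0:
        return None
    return state.total / state.count
```

With shard inputs [1, 2, 3] and [10], merging the partial states and then finalizing gives 16 / 4 = 4.0. If each shard finalized its own $avg first, the merger would see 2.0 and 10.0 and could not recover the correct result, which is why the shard side skips the normal $avg finalization.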

Another interesting question is whether we would return a fully materialized BSON object for the partial aggregation results, or just return them as a mixed-type array and find the relevant elements by position, to save some (de)serialization overhead and network bandwidth. For example, for $avg, we may want to return [partial_sum, partial_count] instead of {"s": partial_sum, "c": partial_count} for partial $avg results.
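
The two candidate shapes can be sketched as follows. This is an illustration only, using plain Python dicts and lists as stand-ins for BSON values. The function names and the SUM_POS / COUNT_POS constants are hypothetical. The sketch shows how the merging side would read each shape; it does not measure serialization cost or network size.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical element positions for the array form.
# The shard side and the merging side would both have to agree on these.
SUM_POS, COUNT_POS = 0, 1


def encode_as_object(partial_sum: float, partial_count: int) -> Dict[str, Any]:
    """Keyed form: fields are found by name."""
    return {"s": partial_sum, "c": partial_count}


def encode_as_array(partial_sum: float, partial_count: int) -> List[Any]:
    """Positional form: a mixed-type array with no field names."""
    return [partial_sum, partial_count]


def decode_object(doc: Dict[str, Any]) -> Tuple[float, int]:
    """Merging side reads the keyed form by field name."""
    return doc["s"], doc["c"]


def decode_array(arr: List[Any]) -> Tuple[float, int]:
    """Merging side reads the positional form by agreed index."""
    return arr[SUM_POS], arr[COUNT_POS]
```

Both forms carry the same information. The trade-off is that the array form relies on a fixed layout shared by both sides, while the object form is self-describing.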

Generated at Thu Feb 08 05:46:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.