[SERVER-82689] Investigate updating heap-based $maxN/$topN in SBE group implementation to multiset/multimap Created: 02/Nov/23  Updated: 17/Nov/23  Resolved: 17/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Foteini Alvanaki Assignee: Projjal Chanda
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Sprint: QE 2023-11-27
Participants:

 Description   

In SBE, the group stage implements $minN/$topN using a heap. When $minN/$topN were implemented in SBE for the window stage, a multiset/multimap data structure was introduced. We should replace the heap with the multiset/multimap if doing so yields performance gains.



 Comments   
Comment by Projjal Chanda [ 17/Nov/23 ]

I tested replacing the heap implementation with a multiset for $maxN/$minN (changes: https://spruce.mongodb.com/version/6554f64261837d42989f8a97/changes?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC). In the performance workload, the maxN/minN tests showed either no change or, in some cases, a regression (https://performance-analyzer.server-tig.prod.corp.mongodb.com/perf-analyzer-viz/?comparison_id=3c54ef7b-88e2-48b2-9746-a96df3cb5f1d&selected_tab=data-table&percent_filter=0%7C%7C100&z_filter=0%7C%7C10).

For $topN/$bottomN, the heap implementation uses cheapsortkey, which gives it a performance advantage over the multimap implementation, which uses the normal sortkey.

As such, it's better to keep the heap-based implementation.
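
For readers unfamiliar with the two approaches compared in this ticket, the following is a minimal standalone sketch of a max-N accumulator written both ways. It is not the SBE code: maxNHeap and maxNMultiset are hypothetical names, and plain ints stand in for SBE values. Both variants keep at most n elements and evict the smallest kept value when a larger one arrives. The heap stores its elements in a contiguous vector, while std::multiset allocates a tree node per element, which is one possible reason a multiset would not be faster here.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <vector>

// Hypothetical helper (not SBE code): keep the n largest values using a
// bounded min-heap. The heap top is the smallest value currently kept.
std::vector<int> maxNHeap(const std::vector<int>& input, std::size_t n) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int v : input) {
        if (heap.size() < n) {
            heap.push(v);
        } else if (n > 0 && v > heap.top()) {
            heap.pop();  // evict the smallest kept value
            heap.push(v);
        }
    }
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    std::reverse(out.begin(), out.end());  // return in descending order
    return out;
}

// Hypothetical helper (not SBE code): same result using an ordered multiset.
// begin() is the smallest value currently kept.
std::vector<int> maxNMultiset(const std::vector<int>& input, std::size_t n) {
    std::multiset<int> kept;
    for (int v : input) {
        if (kept.size() < n) {
            kept.insert(v);
        } else if (n > 0 && v > *kept.begin()) {
            kept.erase(kept.begin());  // evict the smallest kept value
            kept.insert(v);
        }
    }
    return std::vector<int>(kept.rbegin(), kept.rend());  // descending order
}
```

Unlike the heap, a multiset can also erase an arbitrary element by value, which matters when values must leave a sliding window. A group accumulator only ever adds values, so it does not need that capability.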

Generated at Thu Feb 08 06:50:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.