[SERVER-82886] Address Aggregation.Unwind microbenchmark regressions from SERVER-80563 Created: 07/Nov/23  Updated: 23/Nov/23  Resolved: 23/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.3.0-rc0

Type: Bug Priority: Major - P3
Reporter: Kevin Cherkauer Assignee: Kevin Cherkauer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File bf30701.sv82886_fix_2.numbers    
Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QE 2023-11-13, QE 2023-11-27
Participants:
Linked BF Score: 20

 Description   

About half of the regression is due to known issue SERVER-82641, where there are currently two ProjectStages if the array index is requested. This microbenchmark executes the pipeline

[{$unwind: {path: "$array", includeArrayIndex: "index"}}]

against documents of the form

{_id: i, array: [1, "some string data", new ObjectId(), null, NumberLong(23), [4, 5], {x: 1}]}
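
For readers unfamiliar with includeArrayIndex, the following is a minimal plain-JavaScript sketch of what this pipeline produces for one such document. It is not how SBE or the server implements $unwind. The helper name unwindWithIndex is hypothetical, the sample array omits the ObjectId and NumberLong elements so the snippet runs outside the mongo shell, and the index is a plain number here, whereas the server emits it as a NumberLong.

```javascript
// Hypothetical helper emulating the default behavior of
//   {$unwind: {path: "$array", includeArrayIndex: "index"}}
// on in-memory objects. Illustration only, not a MongoDB API.
function unwindWithIndex(docs, path, indexField) {
  const out = [];
  for (const doc of docs) {
    const value = doc[path];
    // Missing or null fields produce no output by default.
    if (value === undefined || value === null) continue;
    // A non-array value passes through as a single element with a null index.
    if (!Array.isArray(value)) {
      out.push({ ...doc, [indexField]: null });
      continue;
    }
    // One output document per array element; an empty array yields none.
    value.forEach((element, i) => {
      out.push({ ...doc, [path]: element, [indexField]: i });
    });
  }
  return out;
}

// Shortened version of the benchmark document shape (assumption: shell-only
// types such as ObjectId and NumberLong are left out).
const input = [{ _id: 0, array: [1, "some string data", null, [4, 5], { x: 1 }] }];
const result = unwindWithIndex(input, "array", "index");
console.log(JSON.stringify(result, null, 2));
```

Each input document fans out into one output document per array element, and every output document carries one extra field, so the per-document cost of adding that field is multiplied by the array length.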

* On my arm64 VWS, an optimized build achieves about 275 ops/sec on this pipeline.

* Removing the "includeArrayIndex" piece increases this to about 410 ops/sec.

(410 - 275) / 410 * 100 ≈ 33% regression from the extra ProjectStage, which makes a second copy of the output document.

The second half of the BF regression is probably from ProjectStage being slow in general in SBE, as these stages mostly run in the VM, not native C++, and write projections always make a new copy of the output document. These things are probably not fixable in SBE without writing a new variant of ProjectStage that does not suffer from these issues, since the $unwind at minimum needs one read ProjectStage to get its inputs and one write ProjectStage to add its outputs to the result doc.

UnwindStage itself runs in native C++ but does not do very much work. Most of the work of any $unwind in SBE is due to ProjectStage.

Keeping this ticket open for the time being to look into handling this second half of the regression.

 



 Comments   
Comment by Githook User [ 22/Nov/23 ]

Author:

{'name': 'Kevin Cherkauer', 'email': 'kevin.cherkauer@mongodb.com', 'username': 'kevin-cherkauer'}

Message: SERVER-82886 Fix pushdown Aggregation.Unwind microbenchmark regressions
Branch: master
https://github.com/mongodb/mongo/commit/976ce50f6134789e73c639848b35f10040f0ff4a

Comment by Kevin Cherkauer [ 09/Nov/23 ]

To run the benchmark, from the mongo-perf repo:

python benchrun.py \
  -f testcases/pipelines.js \
  -t 1 \
  --readCmd true \
  --includeFilter 'Aggregation.Unwind' \
  --shell /home/ubuntu/mongo_curr/mongo/build/install/bin/mongo

Generated at Thu Feb 08 06:50:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.