[SERVER-54078] [SBE] Improve perf of the bestbuy_agg_merge_wordcount benchmark suite Created: 27/Jan/21  Updated: 29/Oct/23  Resolved: 23/Jul/21

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 5.0.2

Type: Task Priority: Major - P3
Reporter: Drew Paroski Assignee: Mohammad Dashti (Inactive)
Resolution: Fixed Votes: 0
Labels: qexec-team, sbe-post-v1, sbe-rollout
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2021-07-22 at 5.50.49 AM.png    
Issue Links:
Duplicate
is duplicated by SERVER-54079 Improve SBE perf for bestbuy_agg_merg... Closed
is duplicated by SERVER-54089 Improve SBE perf for bestbuy_agg_merg... Closed
is duplicated by SERVER-54092 Improve SBE perf for bestbuy_agg_merg... Closed
Related
is related to SERVER-57741 SBE should order projection fields ah... Closed
is related to SERVER-27744 merge sequential $project stages and ... Backlog
is related to SERVER-51655 Investigate sys-perf benchmark perfor... Closed
is related to SERVER-54423 Re-run the sys-perf/bestbuy benchmarks Closed
Backwards Compatibility: Fully Compatible
Sprint: Query Execution 2021-05-03, Query Execution 2021-05-17, Query Execution 2021-05-31, Query Execution 2021-06-14, Query Execution 2021-06-28, Query Execution 2021-07-12, Query Execution 2021-07-26
Participants:

 Description   

When SBE is enabled, the aggregate() commands from the filtered_word_count_no_merge, filtered_word_count, all_word_count_no_merge, and all_word_count benchmarks (from the bestbuy_agg_merge_wordcount suite) are noticeably slower vs. the classic execution engine.

Instructions on how to run the aggregate() commands:

1) Download the bestbuy dataset.

2) Start mongod. If you want to enabled SBE, pass '--setParameter "featureFlagSBE=true"' on the command line.

3) Load the bestbuy dataset using the following command:

mongorestore --gzip --archive=bestbuyproducts.bson.gz

4) Launch the mongo shell and run the following commands to prepare your environment:

> function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }
> use bestbuy

5) Run the following commands multiple times to measure the performance of the aggregate() commands:

> ## filtered_word_count_no_merge:
> millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}}]).itcount());

> ## filtered_word_count:
> db.target_range_id.remove({})
> millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount());

> ## all_word_count_no_merge:
> millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}}]).itcount())

> ## all_work_count:
> db.target_range_id.remove({})
> millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount())

Here is how long each of the aggregate() commands above took on my local machine when SBE was enabled vs. when SBE was disabled (as of 4/23/2021):
  308ms (SBE) vs. 150ms (classic) – filtered_word_count_no_merge
  663ms (SBE) vs. 502ms (classic) – filtered_word_count
  23979ms (SBE) vs. 11602ms (classic) – all_word_count_no_merge
  38506ms (SBE) vs. 25419ms (classic) – all_word_count

Note: The examples above will print out the total amount of time that each "aggregate(..).itcount()" shell command takes from the client's perspective. If you are interested in how long each individual mongo command takes on the server, consider using the slow query timer.



 Comments   
Comment by Mohammad Dashti (Inactive) [ 23/Jul/21 ]

So far, for the benchmarks mentioned in this issue, here are the performance improvements in SBE after applying 5de138 and 28812:

- SBE Performance Regression compared to the Classic Engine (Before 5de138 and 28812) SBE Performance Regression compared to the Classic Engine (After 5de138 and 28812)
filtered_word_count_no_merge 105% 19%
filtered_word_count 32% 6%
all_word_count_no_merge 106% 14%
all_word_count 51% 9%

As you see, there were significant improvements done up to this point.

I did some additional investigations via VTune and these seem to be the remaining sources of performance difference between SBE and the classic engine (for the benchmarks mentioned in this issue):

  1. Different query plans used between SBE and the classic engine
    • Currently, the plans used between SBE and the classic engine are not easily comparable and there are other efforts in optimizing the generated SBE plans. Then, it doesn't make sense to get into the details of this kind of plan inefficiencies in this issue.
  2. Inefficient (temporary) plumbing used for connecting SBE to the rest of the execution engine
    • As SBE does not cover the whole MQL (at the moment), there's some plumbing in-place that consumes partial queries handled by SBE into the rest of the query execution engine. Usually, this means that we need to do an extra result materialization in the boundary between SBE and the rest of the execution engine. It's hard to measure the impact of this materialization (as it's an intrusive measurement itself), but it can (roughly) have a 5-10% impact.
  3. Lack of references between Value s in SBE
    • It's an important design decision in SBE (according to Martin and this document) that SBE Value s own their content and there should not be any sharing or referencing between SBE Value s. For example:
      • an SBE array should own all its elements,
      • an SBE object should own all its fields
      • an SBE string should own all its content. If you have a string value A and want to create a substring B of it (as an SBE Value), then you have to allocate enough memory for B and copy the whole substring from A into B. There shouldn't be any sharing between B and A.
    • There are good reasons for having this constraint on SBE Value s, but it's possible to create an example query that SBE might perform asymptotically slower compared to the classic engine (where the Value s can share the content).
    • Out of curiosity, I worked on adding a new SBE Value type called StringRef on this branch. It's a limited implementation, but unfortunately, did not show any significant performance improvement for the benchmarks mentioned in this issue. Especially, I'd expected all_word_count_no_merge to benefit from this, as it has a split operation on a long string field.
    • The experiment done via this experimental branch is not conclusive and there is some room in the future to work on having special reference Value s in SBE that can share their content with other Value s, while the dependency between the slots that own those values are also maintained.

martin.neupauer and I looked at the current state of this issue, and we came to the conclusion that continuing the work on this issue at this point is probably not the best investment. We've already doubled the performance on some benchmarks mentioned in this issue. The remaining residual difference is not easily attributable to a specific problem in the implementation and is related to the items mentioned above. We can revisit this issue after handling the first two items discussed above (i.e., "Different query plans used between SBE and the classic engine" and "Inefficient (temporary) plumbing used for connecting SBE to the rest of the execution engine").

Comment by Mohammad Dashti (Inactive) [ 22/Jul/21 ]

After merging https://github.com/mongodb/mongo/commit/5de13812ae182fae098fecd5f3a6874fa6b0c749 and https://github.com/mongodb/mongo/commit/28812fce0bd40c9320740f0b9cbcb5f7b686eba7 with master, I used this command to launch a mongod (on master@97e7a659d01f8f5ceef69a4f738cdf76396d99db):

numactl --physcpubind=4,5,6,7 mongod --nojournal --setParameter ttlMonitorEnabled=false --setParameter diagnosticDataCollectionEnabled=false --wiredTigerCacheSizeGB 16 --syncdelay 3600

Then, the following script to run the benchmarks:

#!/bin/sh
 
echo "#############################################################################################"
echo "### filtered_word_count_no_merge"
echo "#############################################################################################"
echo ""
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> SBE: filtered_word_count_no_merge"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using SBE*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: true}); /*filtered_word_count_no_merge*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}}]).itcount());' bestbuy
done
 
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> classic engine: filtered_word_count_no_merge"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using classic engine*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: false}); /*filtered_word_count_no_merge*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}}]).itcount());' bestbuy
done
 
####################################################################################################
 
echo "#############################################################################################"
echo "### filtered_word_count"
echo "#############################################################################################"
echo ""
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> SBE: filtered_word_count"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using SBE*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: true}); /*filtered_word_count*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount());' bestbuy
done
 
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> classic engine: filtered_word_count"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using classic engine*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: false}); /*filtered_word_count*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$match":{"type":{"$in":["Software","Game"]}}},{"$project":{"wordOfName":{"$split":["$name"," "]}}},{"$unwind":"$wordOfName"},{"$group":{"_id":"$wordOfName","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount());' bestbuy
done
 
####################################################################################################
 
echo "#############################################################################################"
echo "### all_word_count_no_merge"
echo "#############################################################################################"
echo ""
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> SBE: all_word_count_no_merge"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using SBE*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: true}); /*all_word_count_no_merge*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}}]).itcount());' bestbuy
done
 
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> classic engine: all_word_count_no_merge"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using classic engine*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: false}); /*all_word_count_no_merge*/ db.target_range_id.remove({}); millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}}]).itcount());' bestbuy
done
 
####################################################################################################
 
echo "#############################################################################################"
echo "### all_word_count"
echo "#############################################################################################"
echo ""
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> SBE: all_word_count"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using SBE*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: true}); /*all_word_count*/ millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount());' bestbuy
done
 
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
echo "> classic engine: all_word_count"
echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
for i in 1 2 3 4
do
numactl --physcpubind=1,2,3 -i 0 mongo --eval 'function millis(command) { const t1 = new Date(); const result = command(); const t2 = new Date(); print("time: " + (t2 - t1) + "ms"); return (t2-t1); }; /*using classic engine*/ db.adminCommand({setParameter: 1, internalQueryEnableSlotBasedExecutionEngine: false}); /*all_word_count*/ millis(() => db.products.aggregate([{"$project":{"wordOfDesc":{"$split":["$longDescription"," "]}}},{"$unwind":"$wordOfDesc"},{"$group":{"_id":"$wordOfDesc","count":{"$sum":1.0}}},{"$merge":{"into":"target_range_id","on":"_id","whenMatched":"replace","whenNotMatched":"insert"}}]).itcount());' bestbuy
done

Here's the outcome on my Ubuntu Evergreen workstation:

#############################################################################################
### filtered_word_count_no_merge
#############################################################################################
 
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> SBE: filtered_word_count_no_merge
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 142ms
time: 137ms
time: 137ms
time: 134ms
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> classic engine: filtered_word_count_no_merge
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 119ms
time: 116ms
time: 115ms
time: 115ms
#############################################################################################
### filtered_word_count
#############################################################################################
 
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> SBE: filtered_word_count
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 346ms
time: 360ms
time: 359ms
time: 353ms
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> classic engine: filtered_word_count
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 335ms
time: 336ms
time: 340ms
time: 335ms
#############################################################################################
### all_word_count_no_merge
#############################################################################################
 
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> SBE: all_word_count_no_merge
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 12408ms
time: 12384ms
time: 12400ms
time: 12419ms
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> classic engine: all_word_count_no_merge
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 10806ms
time: 11085ms
time: 10859ms
time: 10867ms
#############################################################################################
### all_word_count
#############################################################################################
 
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> SBE: all_word_count
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 19972ms
time: 17727ms
time: 17922ms
time: 17754ms
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> classic engine: all_word_count
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
time: 16103ms
time: 16242ms
time: 16080ms
time: 16157ms

Here's the same data in CSV format:

Type, Benchmark, Run #1, Run #2, Run #3, Run #4
SBE, filtered_word_count_no_merge,142,137,137,134
classic engine, filtered_word_count_no_merge,119,116,115,115
SBE, filtered_word_count,346,360,359,353
classic engine, filtered_word_count,335,336,340,335
SBE, all_word_count_no_merge,12408,12384,12400,12419
classic engine, all_word_count_no_merge,10806,11085,10859,10867
SBE, all_word_count,19972,17727,17922,17754
classic engine, all_word_count,16103,16242,16080,16157

Then, visualized it in https://docs.google.com/spreadsheets/d/1BH_-R-lYLrKbaEge8KTwJCaEfuu6map2FT5ScCVidZA/edit?usp=sharing

As you see, the gap is much smaller comapred to the one described in the issue. Still, I'll try to see if I can find any other source of performance improvement.

Comment by Githook User [ 14/Jul/21 ]

Author:

{'name': 'Mohammad Dashti', 'email': 'mdashti@gmail.com', 'username': 'mdashti'}

Message: SERVER-54078 [SBE] Exponential Vector Expansion (Instead of Linear Expansion) in SBE Arrays and Objects.
Branch: master
https://github.com/mongodb/mongo/commit/5de13812ae182fae098fecd5f3a6874fa6b0c749

Generated at Thu Feb 08 05:32:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.