[SERVER-64612] evaluate micro benchmarks perf for v4 Created: 17/Mar/22  Updated: 15/Dec/22  Resolved: 29/Nov/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Task Priority: Major - P3
Reporter: Daniel Moody Assignee: Andrew Morrow (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screenshot 2022-11-21 at 2.22.36 PM.png    
Issue Links:
Depends
Related
Backwards Compatibility: Fully Compatible
Participants:
Linked BF Score: 4

 Description   

In SERVER-62993, we were able to evaluate v4 performance against the sys-perf project, but not against the micro benchmarks project, because that project did not yet support v4. Once micro benchmarks updates its platform (BUILD-14593), we should run the v4 toolchain there to check for any significant perf changes.



 Comments   
Comment by Sviatlana Zuiko [ 22/Nov/22 ]

andrew.morrow@mongodb.com, alexander.neben@mongodb.com,

  • on performance - 20% improvement in pipeline-updates/ PipelineUpdate.MmsSetDeepDistinctPaths across all variants.
  • on sys-perf - it narrowed down to a single "regression": a 20% increase in latency on filter_with_complex_logical.../ MatchExpressionTwoHundredClauseRootedAnd.Crud, on only one builder (linux_standalone) and on a new workload.
    Yes, we would be willing to accept all the changes in sys-perf (if any).
Comment by Sviatlana Zuiko [ 22/Nov/22 ]

acm, everything has cleared up (in some cases only on the third run) except:

linux_standalone/ filter_with_complex_logical.../ MatchExpressionTwoHundredClauseRootedAnd.Crud - 20% increase in latency (regression) - still declares regression
but it's a new workload introduced in November, so it is probably still establishing its stable region.

So the conclusion is that we are good to go. Sorry that it took a day to confirm.

Comment by Sviatlana Zuiko [ 21/Nov/22 ]

UPDATE:
linux_3_node_replSet/ change_streams_latency/ lookup_1c_avg_latency - 79% regression in ops_per_sec - noise
linux_3_node_replSet/ update_with_secondary/ InitializeDatabase.DatabaseOperation.0.0 - 20% regression in ops_per_sec - noise
linux_3_node_replSet/ parallel_insert_replica/ ParallelInsert-32.Insert_W1_JTrue.10 - 20% regression in OperationThroughput - noise
linux_3_node_replSet/ big_update/ Loader.TotalBulkInsert - 20% regression in OperationThroughput - noise
linux_3_node_replSet/ tsbs_query/ lastpoint - 20% regression - still declares regression
linux_3_node_replSet/ mixed_workloads/ mixed_findOne - 20% regression in ops_per_sec - noise
linux_standalone/ contention_ttl_deletions/ InsertData.IndexBuild - 15% decrease in latency (improvement) - still declares improvement
linux_standalone/ filter_with_complex_logical.../ MatchExpressionTwoHundredClauseRootedAnd.Crud - 20% increase in latency (regression) - still declares regression
linux_1_node_replSet/ tsbs_query/ groupby-orderby-limit - 20% improvement in ops_per_sec - noise
linux_1_node_replSet/ llt_mixed_small/ Medium.Query.Baseline.findOne, Short.Update.Baseline.updateOne, Short.Update.Baseline.Crud, Long.Query.Baseline.findOne - ~16% improvement in OperationThroughput - is in progress
linux_1_node_replSet/ change_streams_throughput/ 0_1c_delete - 78% regression in ops_per_sec - noise
linux_1_node_replSet/ tpch_1_denormalized/ TPCHDenormalizedQuery2Cold.Query2 - 16% increase in latency - is in progress

Comment by Sviatlana Zuiko [ 21/Nov/22 ]

acm,
Based on the analysis of the performance patch build, I can say with confidence that it should bring a ~20% improvement in pipeline-updates/ PipelineUpdate.MmsSetDeepDistinctPaths in performance.

As for the sys-perf analysis, the results are not deterministic: there are many deviations from the stable region, but I couldn't see enough consistency across tasks to confirm that any of them are real.
linux_3_node_replSet/ change_streams_latency/ lookup_1c_avg_latency - 79% regression in ops_per_sec
linux_3_node_replSet/ update_with_secondary/ InitializeDatabase.DatabaseOperation.0.0 - 20% regression in ops_per_sec
linux_3_node_replSet/ parallel_insert_replica/ ParallelInsert-32.Insert_W1_JTrue.10 - 20% regression in OperationThroughput
linux_3_node_replSet/ big_update/ Loader.TotalBulkInsert - 20% regression in OperationThroughput
linux_3_node_replSet/ tsbs_query/ lastpoint - 20% regression
linux_3_node_replSet/ mixed_workloads/ mixed_findOne - 20% regression in ops_per_sec
linux_standalone/ contention_ttl_deletions/ InsertData.IndexBuild - 15% decrease in latency (improvement)
linux_standalone/ filter_with_complex_logical.../ MatchExpressionTwoHundredClauseRootedAnd.Crud - 20% increase in latency (regression)
linux_1_node_replSet/ tsbs_query/ groupby-orderby-limit - 20% improvement in ops_per_sec
linux_1_node_replSet/ llt_mixed_small/ Medium.Query.Baseline.findOne, Short.Update.Baseline.updateOne, Short.Update.Baseline.Crud, Long.Query.Baseline.findOne - ~16% improvement in OperationThroughput
linux_1_node_replSet/ change_streams_throughput/ 0_1c_delete - 78% regression in ops_per_sec
linux_1_node_replSet/ tpch_1_denormalized/ TPCHDenormalizedQuery2Cold.Query2 - 16% increase in latency

To make the sys-perf results more conclusive, I can restart the questionable tasks and check the results again afterwards - let me know whether we are willing to wait another ~2-4 hours.
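
For readers skimming the list above: whether a percentage change counts as a regression depends on the metric's direction (higher is better for ops_per_sec and OperationThroughput, lower is better for latency). A minimal sketch of that rule follows. It is not how sys-perf itself detects changes; the function names, the metric groupings, and the 10% threshold are assumptions made for illustration only.

```python
# Illustrative only: names, metric groupings, and threshold are hypothetical,
# not taken from the sys-perf tooling.
HIGHER_IS_BETTER = {"ops_per_sec", "OperationThroughput"}
LOWER_IS_BETTER = {"latency"}


def percent_change(baseline: float, candidate: float) -> float:
    """Signed percent change of candidate relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


def classify(metric: str, baseline: float, candidate: float,
             threshold_pct: float = 10.0) -> str:
    """Label a change as a regression or improvement based on metric direction."""
    change = percent_change(baseline, candidate)
    if abs(change) < threshold_pct:
        return "within threshold"
    if metric in HIGHER_IS_BETTER:
        return "improvement" if change > 0 else "regression"
    if metric in LOWER_IS_BETTER:
        return "regression" if change > 0 else "improvement"
    raise ValueError(f"unknown metric direction: {metric}")


if __name__ == "__main__":
    # Made-up numbers that only mirror the directions reported in the list.
    print(classify("latency", 100.0, 120.0))      # latency went up
    print(classify("ops_per_sec", 100.0, 80.0))   # throughput went down
```

A single run classified this way can still be noise, which is why the questionable tasks are rerun before a change is accepted as real.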
