[SERVER-85122] Collect workload parameters for C2C Perf testing Created: 13/Dec/21  Updated: 12/Jan/24  Resolved: 08/Mar/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Judah Schvimer Assignee: Alan Zheng
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

In our MOU with Amadeus, no requirements are defined for the topology, workload, or cluster size that MongoDB's Cluster-to-Cluster solution must support. However, Amadeus expects the Cluster-to-Cluster solution to work with any cluster they have in production. As a result, we want to determine whether our solution will scale to Amadeus' cluster sizes and workloads.

 

We have gathered the following information about Amadeus' clusters: 

  • Summary of the ~40 (production) and ~70 (development) on-prem clusters
  • Google Drive with a detailed audit on each cluster
  • Workload 
    • The following list shows the maximum value observed for each metric over all mongods managed by Amadeus' production Ops Manager (averaged to one data point per day):
      "OPCOUNTER_DELETE"             3880.894012685649
      "OPCOUNTER_INSERT"              547.351558876741
      "OPCOUNTER_UPDATE"            12035.873877845057
      "DOCUMENT_METRICS_INSERTED"     328.51541991846045
      "DOCUMENT_METRICS_DELETED"      166.70489563422413
      "DOCUMENT_METRICS_UPDATED"     5072.436333308759
      "OPLOG_RATE_GB_PER_HOUR"         51.537250799999995
      Op counters are per second; the oplog rate of ~51.54 GB per hour is equivalent to about 14.32 MB per second (see the conversion sketch below).
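
For reference, a minimal sketch of the unit conversion behind the MB-per-second figure, assuming decimal units (1 GB = 1000 MB):

# Convert the peak oplog rate from GB/hour to MB/second.
# Value taken from the metric list above; 1 GB = 1000 MB is an assumption.
OPLOG_RATE_GB_PER_HOUR = 51.537250799999995
oplog_rate_mb_per_sec = OPLOG_RATE_GB_PER_HOUR * 1000 / 3600
print(f"{oplog_rate_mb_per_sec:.6f} MB/s")  # prints 14.315903 MB/s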



 Comments   
Comment by Alan Zheng [ 24/Feb/22 ]

There is an ongoing health check/audit of all 40+ production clusters at Amadeus. This is a continuous process, and we do not expect to know everything about every cluster before REP-31 is complete.

Here is the Google Drive with a detailed audit on each cluster

 

Comment by Alan Zheng [ 24/Feb/22 ]

Workload information from Kai (Jan 2022): 

 

We still have the monitoring data that we collected for January 2021. I have now calculated the maximum values for all the write metrics over all projects and mongod instances. So the following list shows the maximum value for each metric over all mongods managed by Amadeus' production Ops Manager (these are not true maximum values, as the monitoring data is already averaged to one data point per day):

"OPCOUNTER_DELETE"  3880.894012685649

"OPCOUNTER_INSERT" 547.351558876741

"OPCOUNTER_UPDATE"  12035.873877845057

"DOCUMENT_METRICS_INSERTED"   328.51541991846045

"DOCUMENT_METRICS_DELETED" 166.70489563422413

"DOCUMENT_METRICS_UPDATED" 5072.436333308759

"OPLOG_RATE_GB_PER_HOUR"    51.537250799999995

I am not exactly sure why the document metrics are smaller than the op counters, as it is usually the other way around. There is probably something missing in the data.

Op counters are per second; the oplog rate of ~51.54 GB per hour is equivalent to about 14.32 MB per second.
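
A minimal sketch of how per-metric maxima like these could be reproduced from the daily-averaged monitoring data, assuming a hypothetical CSV export with one row per day/host/metric; the file name and the "metric"/"value" column names are assumptions for illustration, not the actual Ops Manager export format:

import csv
from collections import defaultdict

# Hypothetical export: one row per (day, project, host, metric) with a
# daily-averaged value. Column names are assumptions for illustration only.
def max_per_metric(path):
    maxima = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            maxima[row["metric"]] = max(maxima[row["metric"]], float(row["value"]))
    return dict(maxima)

# Example usage (hypothetical file name):
# print(max_per_metric("amadeus_jan2021_daily_averages.csv"))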

Comment by Alan Zheng [ 26/Jan/22 ]

Attached is a summary of the use case and information gathered from Amadeus

https://docs.google.com/document/d/1F_qQAFw04zUj-loph54JR8qqxmsp_5kCj7Du9DodHsw/edit?usp=sharing
