[SERVER-77983] Investigate performance regressions in lookup and graph_lookup workloads with a config shard Created: 12/Jun/23  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Wenqin Ye Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-74266 Run existing genny workloads in catal... Closed
Assigned Teams:
Cluster Scalability
Participants:
Story Points: 5

 Description   

As part of SERVER-74266, we ran several genny workloads in config shard mode to check for significant performance regressions. Those runs surfaced several regressions in the lookup and graph_lookup workloads.

After some initial investigation, it was not clear whether the regressions were caused by the test setup (the config shard was not set up identically to a regular shard server) or by actual issues in the config shard code. This ticket should investigate the root cause of the observed regressions in the lookup and graph_lookup workloads and determine whether it is a test setup issue or an issue with the config shard code.

For reference, the results from SERVER-74266 are in this spreadsheet:
https://docs.google.com/spreadsheets/d/1l1LwDNAreDKoM6JjE0j2U3mzhgL0kUDOJXsMrZBj3tU/edit#gid=1114981944

Here is an example of the setup used for the genny workloads with a config shard:
https://spruce.mongodb.com/version/6481f437e3c331486c4f3c8d/changes?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

 


Generated at Thu Feb 08 06:37:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.