[SERVER-46683] Profile the slow path of the sdam rsm Created: 06/Mar/20  Updated: 08/May/20  Resolved: 08/May/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Lamont Nelson Assignee: Lamont Nelson
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File scanning2.perf.dat     File sdam2.perf.dat    
Sprint: Service Arch 2020-03-23, Service Arch 2020-04-06, Service arch 2020-04-20, Service arch 2020-05-04, Service arch 2020-05-18
Participants:

 Description   

We need performance coverage of the slow path of the RSM, which is engaged when a server selection cannot immediately return an answer.

Topology: 3 shards with 3 replica set members each, 1 config server, 1 mongos

Test scenario 1:
1. Start a 50% read / 50% write workload.
2. Every x ms, concurrently with the workload, choose a server and kill it.
3. Measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes.

Test scenario 2:
1. Start a 50% read / 50% write workload.
2. Every x ms, concurrently with the workload, choose a chunk and move it to another shard.
3. Measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes.

Both tests should run long enough that many kills or moveChunks can occur during the workload.
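
A minimal sketch of the "every x ms, choose a target and disrupt it" loop shared by both scenarios, for illustration only. The names run_disruptions, targets, and action are hypothetical and not part of any existing test harness; action stands in for whatever actually kills a server or issues a moveChunk, and each call runs on its own thread so that a slow disruption does not delay the next one.

```python
import random
import threading
import time


def run_disruptions(targets, action, interval_ms, duration_s, rng=None):
    """Every interval_ms, pick a random target and apply action to it.

    Hypothetical helper: `action` is a caller-supplied callable, for example
    one that kills a server (scenario 1) or moves a chunk (scenario 2).
    Returns the list of targets chosen, in order.
    """
    rng = rng or random.Random()
    deadline = time.monotonic() + duration_s
    workers = []
    chosen = []
    while time.monotonic() < deadline:
        target = rng.choice(targets)
        chosen.append(target)
        # Run the disruption on its own thread so the schedule keeps ticking.
        worker = threading.Thread(target=action, args=(target,))
        worker.start()
        workers.append(worker)
        time.sleep(interval_ms / 1000.0)
    # Wait for disruptions still in flight before returning.
    for worker in workers:
        worker.join()
    return chosen
```

The workload itself would run in separate threads or processes for the whole duration; duration_s should be large relative to interval_ms so that many disruptions occur during the run.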

We want to compare these metrics against those of the old RSM (now called scanning_replica_set_monitor) and fix any regressions introduced by the new implementation.
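
A possible way to summarize and compare the measurements from two runs, sketched under assumptions: the ticket does not say which percentile definition to use, so this uses the nearest-rank method, and the function names (percentile, summarize, compare) are hypothetical. Latencies are assumed to be recorded per operation in milliseconds.

```python
def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list; p is an integer in 1..100."""
    if not sorted_values:
        raise ValueError("no samples")
    n = len(sorted_values)
    # Integer ceiling of p * n / 100, which avoids floating-point rounding.
    rank = max(1, (p * n + 99) // 100)
    return sorted_values[rank - 1]


def summarize(latencies_ms, duration_s):
    """Latency percentiles and throughput for one run of one operation type."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "throughput_ops_s": len(ordered) / duration_s,
    }


def compare(baseline, candidate):
    """Relative change of each metric in candidate versus baseline."""
    return {
        name: (candidate[name] - baseline[name]) / baseline[name]
        for name in baseline
    }
```

Here baseline would be the summary from a run against scanning_replica_set_monitor and candidate the summary from the new implementation. A positive change in a latency percentile or a negative change in throughput would indicate a regression.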


Generated at Thu Feb 08 05:12:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.