- Type: Task
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Labels: None
Service Arch 2020-03-23, Service Arch 2020-04-06, Service arch 2020-04-20, Service arch 2020-05-04, Service arch 2020-05-18
We need performance coverage of the slow path of the replica set monitor (RSM), which is engaged when server selection cannot immediately return an answer.
Topology: 3 shards with 3 replicas each, 1 config server, 1 mongos
Test scenario 1:
1. start a 50% read / 50% write workload
2. every x ms, concurrently with the workload, choose a server and kill it
3. measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes
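A minimal sketch of the periodic kill loop in step 2, written in Python for illustration only. The `kill_server` callback, the server list, and the function name `run_fault_injector` are hypothetical; the ticket does not say how the harness stops a server or how x is chosen.

```python
import random
import threading


def run_fault_injector(servers, kill_server, interval_ms, stop_event, rng=None):
    """Every interval_ms, pick one server at random and call kill_server on it.

    kill_server is a hypothetical callback supplied by the test harness; this
    sketch makes no assumption about how a real server process is stopped.
    Returns the servers that were chosen, in order.
    """
    rng = rng or random.Random()
    killed = []
    # Event.wait returns False when the timeout expires and True once
    # stop_event has been set, so the loop ends when the workload finishes.
    while not stop_event.wait(interval_ms / 1000.0):
        target = rng.choice(servers)
        kill_server(target)
        killed.append(target)
    return killed
```

The injector is meant to run in its own thread next to the workload, with the workload setting `stop_event` when it is done. The same loop shape would fit scenario 2 if the callback moved a chunk instead of killing a server.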
Test scenario 2:
1. start a 50% read / 50% write workload
2. every x ms, concurrently with the workload, choose a chunk and move it to another shard
3. measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes
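The measurement in step 3 of both scenarios could be computed along these lines. This is a sketch that assumes the harness records one latency sample per completed operation and knows the run duration; it uses the nearest-rank percentile definition, which is one common choice and not something the ticket specifies. The names `percentile` and `summarize` are illustrative.

```python
import math


def percentile(ordered, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not ordered:
        raise ValueError("no samples recorded")
    # Multiply before dividing so whole-number ranks stay exact.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, duration_s):
    """Return p50/p90/p95 latency and throughput for one operation type."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "throughput_ops_per_s": len(ordered) / duration_s,
    }
```

Reads and writes would each get their own call to `summarize`, so the two operation types are reported separately.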
Both tests should run long enough that many kills or moveChunks can occur during the workload.
We want to compare the metrics to the old RSM (now called scanning_replica_set_monitor) and fix any negative impact of the new implementation.
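One possible way to flag a negative impact when comparing the two implementations is sketched below. It assumes each run is summarized as a dict of metric name to value, that keys starting with `throughput` are higher-is-better, and that every other key is a latency. The 5% tolerance and the function name `find_regressions` are illustrative assumptions, not thresholds from this ticket.

```python
def find_regressions(baseline, candidate, tolerance=0.05):
    """Return metrics where candidate is worse than baseline by more than tolerance.

    baseline would hold the old RSM results and candidate the new RSM results.
    The returned dict maps metric name to the relative amount it got worse.
    """
    worse = {}
    for name, base in baseline.items():
        new = candidate[name]
        if name.startswith("throughput"):
            change = (base - new) / base  # positive means throughput dropped
        else:
            change = (new - base) / base  # positive means latency grew
        if change > tolerance:
            worse[name] = change
    return worse
```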