Profile the slow path of the sdam rsm


    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Service Arch 2020-03-23, Service Arch 2020-04-06, Service arch 2020-04-20, Service arch 2020-05-04, Service arch 2020-05-18

      We need performance coverage of the slow path of the RSM (ReplicaSetMonitor), which is taken when server selection cannot return an answer immediately.

      Topology: 3 shards with 3 replicas each, 1 config server, 1 mongos

      Test scenario 1:
      1. start a 50% read / 50% write workload
      2. concurrently, every x ms, choose a server and kill it
      3. measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes
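The steps of scenario 1 could be driven by a harness along the following lines. This is a minimal sketch, not the harness used for this ticket: `run_scenario`, `read_op`, `write_op`, and `disrupt` are hypothetical names, and the callables stand in for real driver operations and for whatever code picks a server and kills it.

```python
import random
import threading
import time


def run_scenario(read_op, write_op, disrupt, duration_s, interval_ms, seed=0):
    """Run a roughly 50/50 read/write workload while calling disrupt() every interval_ms.

    Returns (samples, disruptions): samples is a list of (kind, latency_s, ok)
    tuples and disruptions is the number of times disrupt() was invoked.
    """
    rng = random.Random(seed)
    stop = threading.Event()
    disruptions = 0

    def disruptor():
        nonlocal disruptions
        # Event.wait returns False on timeout, so the body runs once per
        # interval and the loop exits promptly once stop is set.
        while not stop.wait(interval_ms / 1000.0):
            disrupt()
            disruptions += 1

    worker = threading.Thread(target=disruptor, daemon=True)
    worker.start()

    samples = []
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            # Pick a read or a write with equal probability.
            kind, op = ("read", read_op) if rng.random() < 0.5 else ("write", write_op)
            start = time.perf_counter()
            try:
                op()
                ok = True
            except Exception:
                # Failed operations are recorded so errors during failover stay visible.
                ok = False
            samples.append((kind, time.perf_counter() - start, ok))
    finally:
        stop.set()
        worker.join()
    return samples, disruptions
```

The sketch issues operations from a single thread; a real run would likely use several client threads so that many server selections are in flight when a kill lands.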

      Test scenario 2:
      1. start a 50% read / 50% write workload
      2. concurrently, every x ms, choose a chunk and move it to another shard
      3. measure latency (50th, 90th, and 95th percentiles) and throughput of the reads and writes
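Step 2 of scenario 2 needs a chunk and a destination shard that differs from the chunk's current owner. A minimal sketch of that selection is below. `pick_chunk_move` is a hypothetical helper, and the chunk shape (dicts with `min`, `max`, and `shard` keys) is an assumption for illustration. The function only builds a `moveChunk` command document using the `bounds` and `to` fields; it does not send anything to a cluster.

```python
import random


def pick_chunk_move(namespace, chunks, shards, rng):
    """Pick a random chunk and a destination shard other than its current owner.

    chunks: list of dicts with "min", "max" and "shard" keys (assumed shape).
    shards: list of shard names.
    Returns a moveChunk command document as a dict.
    """
    chunk = rng.choice(chunks)
    candidates = [s for s in shards if s != chunk["shard"]]
    if not candidates:
        raise ValueError("need at least two shards to move a chunk")
    # The command name must be the first key; dicts preserve insertion order.
    return {
        "moveChunk": namespace,
        "bounds": [chunk["min"], chunk["max"]],
        "to": rng.choice(candidates),
    }
```

Passing a seeded `random.Random` as `rng` keeps the sequence of moves reproducible between the runs being compared.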

      Both tests should run long enough that many kills or moveChunks can occur during the workload.

      We want to compare the metrics to those of the old RSM (now called scanning_replica_set_monitor) and fix any regressions introduced by the new implementation.
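The comparison could be made by reducing each run to the metrics named in the scenarios and then taking the relative change between the two runs. The sketch below assumes per-operation latencies in seconds and uses the nearest-rank percentile definition, which is only one of several conventions; `percentile`, `summarize`, and `relative_change` are hypothetical helper names.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_s, elapsed_s):
    """Reduce one run to p50/p90/p95 latency and throughput."""
    ordered = sorted(latencies_s)
    return {
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "ops_per_s": len(ordered) / elapsed_s,
    }


def relative_change(baseline, candidate):
    """Fractional change of each metric from baseline to candidate."""
    return {k: (candidate[k] - baseline[k]) / baseline[k] for k in baseline}
```

With the scanning_replica_set_monitor run as `baseline` and the new RSM run as `candidate`, a positive change in a latency percentile or a negative change in `ops_per_s` would point to a regression worth investigating.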

        1. scanning2.perf.dat
          1 kB
          Lamont Nelson
        2. sdam2.perf.dat
          1 kB
          Lamont Nelson

            Assignee:
            Lamont Nelson
            Reporter:
            Lamont Nelson
            Votes:
            0
            Watchers:
            2
