  1. WiredTiger
  2. WT-7336

Million collection: Write a new many-collection test suited to discovering stalls

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • 3
    • Storage - Ra 2021-04-05

      Larger goal:
      The test relies on an external Java load generator, so we get no feedback on individual operation latency, and unfortunately we will not be able to get latency information at per-operation granularity. Today the test extracts latency and throughput from serverStatus every second and then averages them over the whole test duration. We can do better: the per-second samples can be used to find the worst seconds for latency and throughput. It would also be useful to check whether, and how, those seconds correlate with a running checkpoint, which is again detectable through serverStatus. We can potentially use statistics from WiredTiger as well. We already have histograms for operation latency, and we could add a "longest operation until now" statistic that gets reset with the collection. I will have to spend more time understanding how serverStatus gets some of these statistics from WiredTiger and whether we can use them.
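
      To illustrate the idea of picking out the worst seconds and checking them against checkpoint activity, here is a minimal Python sketch. It assumes the per-second samples have already been collected; the `Sample` type, its field names, and both helper functions are hypothetical and are not part of the existing test or of serverStatus.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One per-second observation; all field names are hypothetical."""
    second: int                # seconds since the start of the test
    avg_latency_us: float      # average operation latency during this second
    ops_per_sec: float         # operations completed during this second
    checkpoint_running: bool   # whether a checkpoint was in progress


def worst_seconds(samples: List[Sample], n: int = 3) -> Tuple[List[Sample], List[Sample]]:
    """Return the n highest-latency and the n lowest-throughput samples."""
    by_latency = sorted(samples, key=lambda s: s.avg_latency_us, reverse=True)[:n]
    by_throughput = sorted(samples, key=lambda s: s.ops_per_sec)[:n]
    return by_latency, by_throughput


def checkpoint_overlap(worst: List[Sample]) -> float:
    """Fraction of the given samples taken while a checkpoint was running."""
    if not worst:
        return 0.0
    return sum(s.checkpoint_running for s in worst) / len(worst)
```

      A high overlap fraction for the worst seconds, compared with the fraction of the whole run spent in checkpoints, would suggest that checkpoints are worth investigating as a cause of the stalls.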

      Immediate goal:
      Find an immediate way to at least capture the worst latency and throughput among the per-second samples.
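
      One possible shape for this, sketched in Python: it assumes the readings taken each second are cumulative (operation count, total latency in microseconds) pairs, so per-second values come from differencing consecutive readings. The input format and the function names are assumptions made for illustration, not the test's actual code.

```python
from typing import List, Optional, Tuple


def per_second_stats(
    cumulative: List[Tuple[int, int]]
) -> List[Tuple[int, Optional[float]]]:
    """Turn cumulative (ops, total_latency_us) readings taken one second apart
    into per-second (ops, average latency in microseconds) pairs."""
    out = []
    for (ops0, lat0), (ops1, lat1) in zip(cumulative, cumulative[1:]):
        ops = ops1 - ops0
        # No operation finished in this second, so the average latency is
        # undefined; record None rather than dividing by zero.
        avg = (lat1 - lat0) / ops if ops > 0 else None
        out.append((ops, avg))
    return out


def worst_case(
    stats: List[Tuple[int, Optional[float]]]
) -> Tuple[Optional[int], Optional[float]]:
    """Return (lowest per-second throughput, highest per-second average latency)."""
    min_ops = min((ops for ops, _ in stats), default=None)
    max_lat = max((avg for _, avg in stats if avg is not None), default=None)
    return min_ops, max_lat
```

      A second in which no operation completes shows up as zero throughput with no latency value, which is itself a likely sign of a stall.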

      The rest of the work will be scheduled later through WT-7344.

      Update:
      Use this ticket to write a new test that aligns with PM-1407's goal of finding and reducing stalls with many collections.

            Assignee:
            sulabh.mahajan@mongodb.com Sulabh Mahajan
            Reporter:
            sulabh.mahajan@mongodb.com Sulabh Mahajan
            Votes:
            0
            Watchers:
            4

              Created:
              Updated:
              Resolved: