WiredTiger / WT-8664

Rework the monitoring component and statistics in the cpp suite

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: WT11.0.0, 5.3.0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Story Points: 5
    • Sprint: Storage - Ra 2022-01-24

      Summary

      We want to refactor how statistics are defined and handled by the runtime monitor in the cpp suite. The motivation comes from WT-8624, where we would like to print plots from the stats. Today, some statistics are checked at run time and others are checked after the test has run. Both kinds are defined in a cpp suite configuration file, but with very different syntaxes, which is probably unnecessary.

      The ticket aims at:

      • Refactoring the code so it is easier to manipulate statistics.
      • We should create a class "statistics" that defines what a statistic is (i.e., name, min/max values, ...).
      • We should be able to indicate whether a stat needs to be checked at run time or after the test has run.
      • Each child class of the "statistics" class should implement a function "check" that retrieves the statistic value and compares it against its limits.
        • The derived classes could be: cache_limit_statistic, db_size_statistic and postrun_statistics.
      • We should update the file test_data.py as well to allow the configuration to be defined differently. For instance, we could define a stat this way:
      runtime_monitor=
      (
          stat_cache_size=
          (
              min=10,
              max=1000,
              runtime=true,
          ),
          stat_db_size=
          (
              max=100000,
              runtime=true,
              postrun=true, 
          ),
      ) 
      • This means stat_cache_size must stay between 10 and 1000 and is checked at runtime. stat_db_size cannot exceed 100000 and is checked both at runtime and post run.
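
      The class layout proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the cpp suite's actual code: the stat_reader callback is a hypothetical stand-in for however the value is really fetched from WiredTiger, and the constructor signatures, the check_phase enum and the failed_checks helper are invented for the example. Only the class names and the "check" function come from the ticket.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for reading a statistic value from WiredTiger.
using stat_reader = std::function<int64_t()>;

// Base class: describes what a statistic is (name, limits, when it is checked).
class statistics {
public:
    statistics(std::string name, int64_t min, int64_t max, bool runtime, bool postrun)
        : _name(std::move(name)), _min(min), _max(max), _runtime(runtime), _postrun(postrun)
    {
        assert(_min <= _max);
    }
    virtual ~statistics() = default;

    // Retrieve the statistic value and compare it against its limits.
    virtual bool check() = 0;

    const std::string &name() const { return _name; }
    bool runtime() const { return _runtime; }
    bool postrun() const { return _postrun; }

protected:
    bool within_limits(int64_t value) const { return value >= _min && value <= _max; }

private:
    std::string _name;
    int64_t _min;
    int64_t _max;
    bool _runtime;
    bool _postrun;
};

// Cache size must stay between min and max.
class cache_limit_statistic : public statistics {
public:
    cache_limit_statistic(int64_t min, int64_t max, bool runtime, bool postrun, stat_reader reader)
        : statistics("stat_cache_size", min, max, runtime, postrun), _reader(std::move(reader))
    {
    }
    bool check() override { return within_limits(_reader()); }

private:
    stat_reader _reader;
};

// Database size only has an upper bound; the lower bound is assumed to be 0.
class db_size_statistic : public statistics {
public:
    db_size_statistic(int64_t max, bool runtime, bool postrun, stat_reader reader)
        : statistics("stat_db_size", 0, max, runtime, postrun), _reader(std::move(reader))
    {
    }
    bool check() override { return within_limits(_reader()); }

private:
    stat_reader _reader;
};

enum class check_phase { runtime, postrun };

// Run every statistic enabled for the given phase and return the names of those that fail.
std::vector<std::string>
failed_checks(const std::vector<std::unique_ptr<statistics>> &stats, check_phase phase)
{
    std::vector<std::string> failures;
    for (const auto &stat : stats) {
        const bool enabled = phase == check_phase::runtime ? stat->runtime() : stat->postrun();
        if (enabled && !stat->check())
            failures.push_back(stat->name());
    }
    return failures;
}
```

      With the example configuration above, the runtime monitor would build one cache_limit_statistic (min=10, max=1000, runtime only) and one db_size_statistic (max=100000, runtime and postrun), then call failed_checks once per phase. Keeping the runtime/postrun flags in the base class means the monitor does not need to know which derived class it is iterating over.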

      Motivation

      Does this affect any team outside of WT?

      No

      How likely is it that this use case or problem will occur?

      N/A

      If the problem does occur, what are the consequences and how severe are they?

      N/A

      Is this issue urgent?

      It is becoming urgent: every now and then we see failures related to the statistics currently set in the cpp suite, but we are not sure whether they are real issues. Completing this ticket will make WT-8624 easier, so we will be able to print plots that show trends and make the analysis of those "failures" easier.

      Acceptance Criteria (Definition of Done)

      The code has been refactored and the above has been implemented.

      Testing

      All cpp tests pass.

      Documentation update

      N/A

      [Optional] Suggested Solution

      See summary for guidance.

            Assignee:
            etienne.petrel@mongodb.com Etienne Petrel
            Reporter:
            etienne.petrel@mongodb.com Etienne Petrel
            Votes:
            0
            Watchers:
            1
