Investigation: Use tooling to aid in finding low value statistics

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None

      WiredTiger has a huge number of stats, approximately 1100. We'd like to remove a set of them, but it's a difficult task as we have little signal on which stats are low value. One interesting finding from our initial investigation was that older stats are somewhat higher on the candidate list for removal. Another take was that stats that are effectively "flat", meaning their value never changes, would also be due for removal.

      Two investigation paths are:

      1. Sort stats by date added to WiredTiger, accounting for renaming / re-categorisation where possible. AI is very good at this.
      2. Query fleet-wide metrics and determine which stats are unchanging.

      Combine the two data points to try and create a candidate list for stat removal. Note: an unchanged stat isn't necessarily a bad statistic. In fact, some stats never change because they track config artifacts, and those stats should probably stay in place.
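
      As a rough illustration of how the two data points could be combined, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the StatInfo record, its field names, the tracks_config flag and the sample stat names are hypothetical, and the inputs would in practice come from the repository history and the fleet-wide metrics query.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class StatInfo:
    # Hypothetical record; none of these field names come from WiredTiger.
    name: str                    # stat name (placeholder values below)
    added: date                  # date the stat was first added, per history
    samples: list[int]           # values observed across the fleet
    tracks_config: bool = False  # True if the stat mirrors a config value


def is_flat(samples: list[int]) -> bool:
    """A stat is 'flat' if all observed samples share one value.

    A stat with no samples at all is also treated as flat here.
    """
    return len(set(samples)) <= 1


def removal_candidates(stats: list[StatInfo]) -> list[StatInfo]:
    """Return flat stats that do not track config, oldest first."""
    flat = [s for s in stats if is_flat(s.samples) and not s.tracks_config]
    return sorted(flat, key=lambda s: s.added)


# Made-up example data, not real WiredTiger statistics.
example_stats = [
    StatInfo("example_new_flat", date(2023, 6, 1), [0, 0, 0]),
    StatInfo("example_old_flat", date(2014, 3, 1), [0, 0, 0]),
    StatInfo("example_active", date(2015, 1, 1), [3, 9, 27]),
    StatInfo("example_config", date(2013, 1, 1), [64, 64, 64], tracks_config=True),
]

print([s.name for s in removal_candidates(example_stats)])
# prints ['example_old_flat', 'example_new_flat']
```

      The config-tracking stat is excluded even though it is both the oldest and flat, which matches the note above. Ordering by date only ranks the candidates; each one would still need a manual review before removal.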

      Another interesting point is that data source statistics are higher-value removal candidates: each one is reported once per data source, so removing one removes N reported values, where N is the number of data sources MongoDB gathers stats from.
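
      The arithmetic behind this can be sketched as follows. The function name and all of the counts are made up for illustration, and the sketch assumes connection-level stats are reported once while data source stats are reported once per data source.

```python
def reported_values(connection_stats: int, data_source_stats: int,
                    data_sources: int) -> int:
    """Total values reported: connection stats once, data source stats
    once for every data source."""
    return connection_stats + data_source_stats * data_sources


# All numbers below are hypothetical.
N = 100                                   # data sources being monitored
before = reported_values(5, 3, N)         # 5 + 3 * 100 = 305

drop_connection = reported_values(4, 3, N)   # one connection stat removed
drop_data_source = reported_values(5, 2, N)  # one data source stat removed

print(before - drop_connection)   # prints 1
print(before - drop_data_source)  # prints 100
```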

       

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Luke Pearson
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: