Core Server / SERVER-73237

Collect PSI (Pressure Stall Information) in FTDC

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Server Security

      Hello,

      Recent Linux kernel versions (4.20 and later) include PSI (Pressure Stall Information), which is very useful for understanding the pressure on resources such as CPU, memory, and storage I/O.

      The PSI feature identifies and quantifies the disruptions caused by such resource crunches and the time impact they have on complex workloads or even entire systems.

      We only need to read `/proc/pressure/<resource_name>`, where `resource_name` can be `cpu`, `memory`, or `io` (the kernel's name for storage pressure).
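
      As a rough illustration of how little work the collection involves, here is a minimal Python sketch that reads and parses these files. It assumes the standard kernel layout, where each file holds a `some` line and usually a `full` line of `avg10=`, `avg60=`, `avg300=`, and `total=` fields, and where the storage file is named `io`. The names `parse_psi` and `read_psi` are hypothetical helpers for this example only; they are not existing FTDC or server functions.

```python
from pathlib import Path

# Standard location of the PSI files on Linux.
PSI_DIR = Path("/proc/pressure")
# The kernel exposes storage pressure under the name "io".
RESOURCES = ("cpu", "memory", "io")


def parse_psi(text):
    """Parse the contents of one /proc/pressure/<resource> file.

    Returns a dict such as {"some": {"avg10": 0.0, ..., "total": 0}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        metrics = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "total" is cumulative stall time in microseconds; the avg* fields are percentages.
            metrics[key] = int(value) if key == "total" else float(value)
        result[kind] = metrics
    return result


def read_psi(psi_dir=PSI_DIR):
    """Read every available PSI file, skipping resources that cannot be read."""
    samples = {}
    for name in RESOURCES:
        try:
            samples[name] = parse_psi((psi_dir / name).read_text())
        except OSError:
            # PSI may be absent (older kernel) or disabled at boot.
            continue
    return samples


if __name__ == "__main__":
    for resource, data in read_psi().items():
        print(resource, data)
```

      On hosts without PSI support, `read_psi()` simply returns an empty dict, so a collector built this way could degrade gracefully rather than fail.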

      This information would be really helpful when analyzing issues for our customers, since it would give us a good metric of resource pressure prior to the event being investigated.

      In the future, once this is implemented, we could even graph it in our tools and the Atlas interface to see whether clusters are close to the stall point, or use that information to alert our customers before the stall itself happens.

      Let me know if you have any questions.

      Regards.

            Assignee:
            backlog-server-security [DO NOT USE] Backlog - Security Team
            Reporter:
            miguel.nieto@mongodb.com Miguel Ángel Nieto
            Votes:
            5
            Watchers:
            14

              Created:
              Updated:
              Resolved: