Developer Tools: Close System Alerting Gaps in Cloud Control Plane

    • Developer Tools
    • Not Needed

      1. How does this change affect a user?
      2. Why would a user want to use this?
      3. What command or series of steps does the user follow to make this change?
      4. What is the expected result?
      5. Can this affect database performance? How?
      6. Is there a minimum or maximum setting for this change (if this is a configurable parameter)?
      7. How could the user break something by using this incorrectly?
      8. Is there an ideal setting or way of using this feature?
      9. Does this feature affect any other settings or parts of the product?
      10. Does this feature affect upgrade / downgrade compatibility?
      11. Does this replace some other thing that we should downplay / retire / remove?
      12. When are docs required? Iteratively throughout the project, at project close/feature release, or whenever the Docs team is able to prioritize after release?
    • To Do

      Add updates here

      Epic Summary

      Summary

      In light of an ever-evolving system, recent production outages and issues, and the departure of several tenured VP+ engineers, we must ensure that the Cloud control plane for each environment is properly and consistently monitored.

      This epic covers your team's responsibilities, as outlined in WRITING-18912, for ensuring consistent alerting across all control plane environments. That includes alerting for the commercial backing DBs (mms-onprem and IA) and for all critical systems (frameworks or components), ensuring that PD escalations are tied to all critical alerts, and ensuring that there are no single points of failure on escalation paths.

      It is possible that your team already has consistent alerting for each environment. If you believe so, please still use this epic to audit your existing alerts for consistency and correctness.

      Please ensure commercial and FedRAMP environments are covered.
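
As a rough illustration of the audit described above, the checks could be sketched along these lines. This is only a minimal sketch under stated assumptions: the `Alert` record, its field names, the `audit_alerts` helper, and the sample alert names are all hypothetical and do not correspond to any existing tooling or to WRITING-18912. It assumes alert definitions can be exported per environment together with the responders on their PD escalation path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One exported alert definition. Every field name here is hypothetical."""
    name: str
    environment: str                      # e.g. "commercial" or "fedramp"
    critical: bool = True
    escalation_targets: tuple = ()        # responders on the PD escalation path


def audit_alerts(alerts, environments):
    """Return sorted findings; an empty list means no gaps were detected."""
    findings = []

    # Group alert definitions by name so environments can be compared.
    by_name = {}
    for alert in alerts:
        by_name.setdefault(alert.name, {})[alert.environment] = alert

    for name, per_env in by_name.items():
        # Consistency: the same alert should exist in every environment.
        for env in environments:
            if env not in per_env:
                findings.append(f"{name}: missing in {env}")

        for env, alert in per_env.items():
            if not alert.critical:
                continue
            targets = set(alert.escalation_targets)
            if not targets:
                # Every critical alert should be tied to a PD escalation.
                findings.append(f"{name} ({env}): critical alert has no PD escalation")
            elif len(targets) < 2:
                # A single responder is a single point of failure.
                findings.append(f"{name} ({env}): escalation path has a single point of failure")

    return sorted(findings)


if __name__ == "__main__":
    sample = [
        Alert("backing-db-replication-lag", "commercial", True, ("primary-oncall", "secondary-oncall")),
        Alert("backing-db-replication-lag", "fedramp", True, ("primary-oncall",)),
        Alert("job-queue-stalled", "commercial", True, ()),
    ]
    for finding in audit_alerts(sample, ["commercial", "fedramp"]):
        print(finding)
```

Treating "fewer than two distinct responders" as a single point of failure is one possible reading of the requirement; a team may prefer a stricter rule, such as requiring distinct escalation levels.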

            Assignee: Unassigned
            Reporter: Jack Weir
            Votes: 0
            Watchers: 2