- Type: Epic
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Data Explorer
- Developer Tools
- To Do
Developer Tools: Close System Alerting Gaps in Cloud Control Plane
Summary
In light of an ever-evolving system, recent production outages and issues, and the departures of several tenured VP+ engineers, we must ensure that the Cloud control plane for each environment is properly and consistently monitored.
This epic covers your team's responsibilities, as outlined in WRITING-18912, for ensuring consistent alerting across all control plane environments for commercial backing DBs (mms-onprem and IA) and all critical systems (frameworks or components), for ensuring PD escalations are tied to all critical alerts, and for ensuring there are no single points of failure on escalation paths.
It is possible your team already has consistent alerting for each environment. If you believe so, please still use this epic to audit your existing alerts for consistency and correctness.
Please ensure commercial and FedRAMP environments are covered.
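As a starting point for the audit, the three checks described above (the same alerts in every environment, an escalation on every critical alert, and no single point of failure on an escalation path) could be scripted against an export of each environment's alert definitions. The following is only a minimal sketch: the `Alert` record, the `audit_alerts` function, the environment names, and the alert and target names are all hypothetical, and it assumes alert definitions can be exported into this shape from whatever alerting system your team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical record for one alert definition in one environment."""
    name: str
    critical: bool
    escalation_targets: tuple[str, ...] = ()  # assumed: escalation path entries


def audit_alerts(alerts_by_env: dict[str, list[Alert]]) -> list[str]:
    """Return human-readable findings for gaps across environments."""
    findings: list[str] = []

    # Union of alert names seen in any environment; each environment
    # is expected to define all of them.
    all_names: set[str] = set()
    for alerts in alerts_by_env.values():
        all_names.update(a.name for a in alerts)

    for env in sorted(alerts_by_env):
        alerts = alerts_by_env[env]
        present = {a.name for a in alerts}

        # Consistency: alerts defined elsewhere but missing here.
        for name in sorted(all_names - present):
            findings.append(f"{env}: missing alert '{name}'")

        # Escalation coverage for critical alerts only.
        for alert in sorted(alerts, key=lambda a: a.name):
            if not alert.critical:
                continue
            distinct_targets = set(alert.escalation_targets)
            if not distinct_targets:
                findings.append(
                    f"{env}: critical alert '{alert.name}' has no PD escalation"
                )
            elif len(distinct_targets) == 1:
                findings.append(
                    f"{env}: critical alert '{alert.name}' has a single escalation target"
                )

    return findings
```

Each finding from a run over both the commercial and FedRAMP exports could then be tracked as a ticket under this epic. An empty result only means the checks sketched here passed; it does not confirm that the alert thresholds themselves are correct.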