💼 Responding to events
- ID: /frameworks/aws-well-architected/operational-excellence/operate/ops10
Description
You should anticipate operational events, both planned (for example, sales promotions, deployments, and failure tests) and unplanned (for example, surges in utilization and component failures). You should use your existing runbooks and playbooks to deliver consistent results when you respond to alerts. Defined alerts should be owned by a role or a team that is accountable for the response and escalations. You will also want to know the business impact of your system components and use this to target efforts when needed. You should perform a root cause analysis (RCA) after events, and then prevent recurrence of failures or document workarounds.
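As a rough illustration of the description above, the following minimal sketch shows one way to record an accountable owner and a runbook for each alert, and to order firing alerts by business impact. Everything in it is an assumption made for the example: the `AlertDefinition` type, the `triage` helper, the alert names, team names, runbook paths, and the 1 to 5 impact scale are hypothetical and are not part of the framework or of any AWS API.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical illustration only: the alert names, teams, runbook paths,
# and impact scores below are invented for this sketch.
@dataclass(frozen=True)
class AlertDefinition:
    name: str
    owner: str            # role or team accountable for response and escalation
    runbook: str          # runbook or playbook to follow when the alert fires
    business_impact: int  # assumed scale: higher means more business impact


ALERTS = {
    "checkout-error-rate": AlertDefinition(
        "checkout-error-rate", "payments-oncall", "runbooks/checkout-errors.md", 5
    ),
    "batch-report-delay": AlertDefinition(
        "batch-report-delay", "analytics-team", "runbooks/report-delay.md", 2
    ),
}


def triage(firing: list[str]) -> list[AlertDefinition]:
    """Return definitions for firing alerts, highest business impact first."""
    unknown = [name for name in firing if name not in ALERTS]
    if unknown:
        # An alert with no owner or runbook is a process gap, so fail loudly.
        raise KeyError(f"alerts without a defined owner/runbook: {unknown}")
    return sorted(
        (ALERTS[name] for name in firing),
        key=lambda alert: alert.business_impact,
        reverse=True,
    )


if __name__ == "__main__":
    for alert in triage(["batch-report-delay", "checkout-error-rate"]):
        print(f"{alert.name}: page {alert.owner}, follow {alert.runbook}")
```

Rejecting unknown alerts instead of silently skipping them is a design choice in this sketch: it surfaces any alert that has not yet been assigned an owner and a runbook.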
Sub Sections
| Section | Sub Sections | Internal Rules | Policies | Flags | Compliance |
|---|---|---|---|---|---|
| 💼 OPS10-BP01 Use a process for event, incident, and problem management | no data | | | | |
| 💼 OPS10-BP02 Have a process per alert | no data | | | | |
| 💼 OPS10-BP03 Prioritize operational events based on business impact | no data | | | | |
| 💼 OPS10-BP04 Define escalation paths | no data | | | | |
| 💼 OPS10-BP05 Define a customer communication plan for service-impacting events | no data | | | | |
| 💼 OPS10-BP06 Communicate status through dashboards | no data | | | | |
| 💼 OPS10-BP07 Automate responses to events | no data | | | | |