Incident Response
Severity Levels
- P0: Service down, all users affected → fix immediately, communicate within 1 hour
- P1: Major feature broken → fix within 24 hours
- P2: Minor bug, workaround exists → fix within 1 week
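The severity levels above can be written down as a small lookup table, which makes the deadlines easy to compute in tooling. This is a minimal sketch, not an existing implementation: the names Severity, Policy, POLICIES, and fix_deadline are hypothetical, "fix immediately" is modelled as a zero-length deadline, and a missing communication deadline is recorded as None because this page only states one for P0.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Severity(Enum):
    P0 = "P0"
    P1 = "P1"
    P2 = "P2"


@dataclass(frozen=True)
class Policy:
    description: str
    fix_within: timedelta  # zero means "fix immediately" (assumption)
    communicate_within: Optional[timedelta]  # None: no deadline stated on this page


# Values copied from the severity list; the structure itself is illustrative.
POLICIES = {
    Severity.P0: Policy("Service down, all users affected",
                        timedelta(0), timedelta(hours=1)),
    Severity.P1: Policy("Major feature broken",
                        timedelta(hours=24), None),
    Severity.P2: Policy("Minor bug, workaround exists",
                        timedelta(weeks=1), None),
}


def fix_deadline(severity: Severity, opened_at: datetime) -> datetime:
    """Return the latest time by which the fix is due for this severity."""
    return opened_at + POLICIES[severity].fix_within
```

For example, fix_deadline(Severity.P1, opened_at) returns a time 24 hours after opened_at, while a P0 is due at opened_at itself.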
Response Steps
- Acknowledge the issue
- Assess severity
- Communicate with affected users (P0 and P1 only)
- Fix and deploy
- Write a post-mortem (P0 only)
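The response steps depend on severity: user communication applies to P0 and P1, and the post-mortem applies to P0 only. A short sketch of that branching, assuming severities are passed as the strings "P0", "P1", and "P2"; the function name response_steps is hypothetical and only illustrates the checklist, it does not perform any of the steps.

```python
from typing import List

VALID_SEVERITIES = ("P0", "P1", "P2")


def response_steps(severity: str) -> List[str]:
    """Return the ordered checklist that applies to the given severity."""
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")

    # The first two steps apply to every incident.
    steps = ["Acknowledge the issue", "Assess severity"]

    # User communication is required for P0 and P1 only.
    if severity in ("P0", "P1"):
        steps.append("Communicate with affected users")

    steps.append("Fix and deploy")

    # A post-mortem is required for P0 only.
    if severity == "P0":
        steps.append("Write a post-mortem")

    return steps
```

Calling response_steps("P2") yields three steps (acknowledge, assess, fix and deploy), while response_steps("P0") yields all five.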
Last modified: 17 Mar 2026