Incident Response

Severity Levels

  • P0: Service down, all users affected → fix immediately, communicate to affected users within 1 hour
  • P1: Major feature broken → fix within 24 hours
  • P2: Minor bug, workaround exists → fix within 1 week
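
The severity levels above can be captured as a small lookup table so that tooling computes deadlines the same way every time. This is a minimal sketch, not an existing implementation: the names Severity, Policy, POLICIES and fix_deadline are hypothetical, "fix immediately" is modeled as a zero-length window, and a communication deadline is recorded only for P0 because that is the only one stated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Severity(Enum):
    P0 = "P0"
    P1 = "P1"
    P2 = "P2"


@dataclass(frozen=True)
class Policy:
    description: str
    fix_within: timedelta                    # timedelta(0) models "fix immediately"
    communicate_within: Optional[timedelta]  # None: no deadline stated in the policy


# Hypothetical table mirroring the three severity levels listed above.
POLICIES = {
    Severity.P0: Policy("Service down, all users affected",
                        timedelta(0), timedelta(hours=1)),
    Severity.P1: Policy("Major feature broken",
                        timedelta(hours=24), None),
    Severity.P2: Policy("Minor bug, workaround exists",
                        timedelta(weeks=1), None),
}


def fix_deadline(severity: Severity, reported_at: datetime) -> datetime:
    """Return the latest time by which the fix is due for this severity."""
    return reported_at + POLICIES[severity].fix_within


if __name__ == "__main__":
    reported = datetime(2026, 3, 17, 9, 0)
    for sev in Severity:
        print(sev.value, "fix due by", fix_deadline(sev, reported))
```

Keeping the windows in one table means a change to the policy is a one-line edit rather than a search through alerting scripts.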

Response Steps

  1. Acknowledge the issue
  2. Assess severity
  3. Communicate to affected users (if P0/P1)
  4. Fix and deploy
  5. Write post-mortem (P0 only)
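
The conditional steps above (communication for P0/P1, post-mortem for P0 only) can be sketched as a checklist generator. This is an illustrative sketch under the assumption that severity has already been assessed and is passed in as a string; the function name response_steps is hypothetical.

```python
from typing import List


def response_steps(severity: str) -> List[str]:
    """Return the ordered checklist that applies to an assessed severity."""
    if severity not in ("P0", "P1", "P2"):
        raise ValueError("unknown severity: " + repr(severity))

    steps = ["Acknowledge the issue", "Assess severity"]
    # Users are notified only for P0 and P1 incidents.
    if severity in ("P0", "P1"):
        steps.append("Communicate to affected users")
    steps.append("Fix and deploy")
    # A post-mortem is required for P0 only.
    if severity == "P0":
        steps.append("Write post-mortem")
    return steps


if __name__ == "__main__":
    for sev in ("P0", "P1", "P2"):
        print(sev, response_steps(sev))
```

A P2 incident therefore yields three steps, a P1 four, and a P0 all five.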
Last modified: 17 Mar 2026