Runbook

Runbook - <System / Service Name>

Owner:
Last updated: YYYY-MM-DD
Severity scope: SEV-1 | SEV-2 | SEV-3 (if applicable)

Overview

Briefly describe what this runbook covers and when it should be used.

Example:
Procedure to restart the authentication service in case of partial or total unavailability.

Symptoms

How to recognize that the issue is occurring.

Example:

Quick checks (Initial diagnosis)

Fast steps to confirm the problem and assess impact.

Goal: confirm what is broken and how bad it is in under a few minutes.

Immediate actions (Mitigation)

Optional - use when the goal is to reduce impact while root cause is investigated.

Example:

⚠️ Note any side effects or risks of these actions.

Resolution

Steps to fix the root cause once identified.

Example:

Validation

How to confirm the issue is fully resolved.

Example:

Escalation

Who to contact if the issue cannot be resolved using this runbook.

Include when to escalate (e.g., after X minutes or if SEV-1).

Post-incident follow-up

Checklist to complete after recovery.

(Optional)