Escalation Policies
Define who gets notified, when, and what happens if they don't respond.
How Escalation Works
When an incident triggers, NotifyHero walks through your escalation policy level by level:
- Level 1 — notify immediately (e.g., on-call engineer)
- Wait for acknowledgment
- If no one acknowledges before the timeout, escalate to Level 2 (e.g., team lead)
- Continue until someone acknowledges or all levels are exhausted
Level 1: On-Call Schedule "Backend Primary" → wait 5 min
Level 2: On-Call Schedule "Backend Backup" → wait 10 min
Level 3: User "vp-engineering@your-company.com" → wait 15 min
After Level 3: Repeat from Level 1
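The walk described above can be sketched in a few lines of Python. This is an illustrative model only, not NotifyHero's implementation: the function name `notified_before_ack` and the `(target, timeout)` tuple layout are hypothetical, and the sketch assumes that an acknowledgment arriving exactly when a timeout expires counts as in time.

```python
def notified_before_ack(levels, ack_after_min):
    """Return the targets notified during one pass through the policy.

    levels: list of (target, timeout_min) tuples, in escalation order.
    ack_after_min: minutes after the trigger at which someone acknowledges,
                   or None if nobody ever acknowledges.
    """
    notified = []
    elapsed = 0
    for target, timeout_min in levels:
        notified.append(target)      # notify this level
        elapsed += timeout_min       # then wait out its timeout
        if ack_after_min is not None and ack_after_min <= elapsed:
            break                    # acknowledged in time: stop escalating
    return notified


# The example policy from this page (one pass, repeat not modeled here).
policy = [
    ('Schedule "Backend Primary"', 5),
    ('Schedule "Backend Backup"', 10),
    ("User vp-engineering@your-company.com", 15),
]

early = notified_before_ack(policy, ack_after_min=3)
late = notified_before_ack(policy, ack_after_min=7)
never = notified_before_ack(policy, ack_after_min=None)
```

With this policy, an acknowledgment at minute 3 stops at Level 1, one at minute 7 reaches Level 2, and no acknowledgment at all reaches all three levels.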
Creating an Escalation Policy
Go to Settings → Escalation Policies → New Policy.
Each level requires:
- Target — a schedule, a specific user, or a team
- Timeout — minutes to wait before escalating (0 = notify simultaneously with next level)
Targets
| Target Type | Description |
|---|---|
| Schedule | Notify whoever is currently on-call |
| User | Notify a specific person |
| Team | Notify all members of a team |
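To make the difference between the three target types concrete, here is a minimal sketch of how each might resolve to a list of recipients. The `TargetType` enum, the `resolve_recipients` function, and the sample schedule and team data are all hypothetical and exist only for illustration.

```python
from enum import Enum


class TargetType(Enum):
    SCHEDULE = "schedule"
    USER = "user"
    TEAM = "team"


# Hypothetical sample data standing in for real schedules and teams.
CURRENT_ON_CALL = {"Backend Primary": "alice@example.com"}
TEAM_MEMBERS = {"Backend": ["alice@example.com", "bob@example.com"]}


def resolve_recipients(target_type, name):
    """Return the people a target would notify right now."""
    if target_type is TargetType.SCHEDULE:
        return [CURRENT_ON_CALL[name]]   # whoever is currently on-call
    if target_type is TargetType.USER:
        return [name]                    # one specific person
    if target_type is TargetType.TEAM:
        return list(TEAM_MEMBERS[name])  # every member of the team
    raise ValueError(f"unknown target type: {target_type!r}")


schedule_recipients = resolve_recipients(TargetType.SCHEDULE, "Backend Primary")
team_recipients = resolve_recipients(TargetType.TEAM, "Backend")
```

A schedule always resolves to a single person, but that person changes with the rotation. A team target fans out to everyone at once.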
Multi-Level Escalation
Add as many levels as needed. Common patterns:
Standard (3-level)
L1: Primary on-call → 5 min
L2: Backup on-call → 10 min
L3: Engineering manager → 15 min
Aggressive (for critical services)
L1: Primary on-call → 2 min
L2: Primary + Backup on-call → 3 min
L3: Entire team → 5 min
L4: VP Engineering → 10 min
Flat (small team)
L1: Entire team → 0 min (everyone notified immediately)
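One way to compare these patterns is to add up their timeouts. The sketch below uses the timeout values listed above; the helper names are hypothetical, and it assumes each level waits out its full timeout before the next one is notified.

```python
# Timeouts in minutes, one entry per level, taken from the patterns above.
PATTERNS = {
    "standard": [5, 10, 15],
    "aggressive": [2, 3, 5, 10],
    "flat": [0],
}


def minutes_until_last_level(timeouts):
    """Minutes after the trigger at which the final level is notified."""
    return sum(timeouts[:-1])


def minutes_per_pass(timeouts):
    """Minutes for one full pass, including the final level's timeout."""
    return sum(timeouts)


summary = {
    name: (minutes_until_last_level(t), minutes_per_pass(t))
    for name, t in PATTERNS.items()
}
```

Under these assumptions the standard pattern reaches its last level after 15 minutes and completes a pass in 30, the aggressive pattern reaches its last level after 10 minutes and completes a pass in 20, and the flat pattern notifies everyone at minute 0.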
Repeat Behavior
At the bottom of the policy, choose what happens after all levels are exhausted:
- Repeat from Level 1 — loops back (recommended for critical services)
- Stop escalating — no further notifications
- Repeat N times — loops a set number of times, then stops
Tip: For production services, always enable repeat. An unacknowledged critical incident should never go silent.
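The three repeat options can be modeled as a generator that yields level numbers in notification order. This is a sketch under stated assumptions: the mode strings and the `level_sequence` name are hypothetical, and "Repeat N times" is interpreted here as the initial pass plus N additional passes.

```python
from itertools import count, islice


def level_sequence(num_levels, repeat_mode, repeat_count=0):
    """Yield 1-based level numbers in the order they would be notified."""
    if repeat_mode == "stop":
        passes = range(1)                  # a single pass, then silence
    elif repeat_mode == "repeat_n":
        passes = range(1 + repeat_count)   # assumption: initial pass + N repeats
    elif repeat_mode == "repeat_forever":
        passes = count()                   # loop until someone acknowledges
    else:
        raise ValueError(f"unknown repeat mode: {repeat_mode!r}")
    for _ in passes:
        for level in range(1, num_levels + 1):
            yield level


stop_order = list(level_sequence(3, "stop"))
repeat_once_order = list(level_sequence(2, "repeat_n", repeat_count=1))
# The endless variant must be cut off explicitly when inspected.
forever_prefix = list(islice(level_sequence(2, "repeat_forever"), 5))
```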
Urgency Rules
Set urgency based on severity, time of day, or both:
By Severity
- Critical / Error → High urgency (phone + SMS)
- Warning / Info → Low urgency (email + Slack)
By Time
- Business hours (09:00–18:00) → Low urgency
- Off-hours → High urgency
Combined
- Critical during off-hours → High urgency, aggressive escalation
- Warning at 2 PM → Low urgency, standard escalation
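The rules above can be expressed as small functions. This is only a sketch: the function names are hypothetical, business hours are treated as 09:00 inclusive to 18:00 exclusive with weekends ignored, and the combined rule assumes that high urgency wins whenever either rule says high. That assumption matches the two combined examples above, but this page does not define the mixed cases, so a real configuration may differ.

```python
from datetime import time

HIGH, LOW = "high", "low"

# Notification channels per urgency, as listed under "By Severity".
CHANNELS = {HIGH: ["phone", "sms"], LOW: ["email", "slack"]}

SEVERITY_URGENCY = {"critical": HIGH, "error": HIGH, "warning": LOW, "info": LOW}


def urgency_by_severity(severity):
    return SEVERITY_URGENCY[severity.lower()]


def urgency_by_time(t):
    # Assumption: 09:00 is inside business hours, 18:00 is not.
    return LOW if time(9, 0) <= t < time(18, 0) else HIGH


def combined_urgency(severity, t):
    # Assumption: either rule returning HIGH makes the result HIGH.
    if HIGH in (urgency_by_severity(severity), urgency_by_time(t)):
        return HIGH
    return LOW


critical_at_night = combined_urgency("Critical", time(2, 0))
warning_afternoon = combined_urgency("Warning", time(14, 0))
```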
Fallback Behavior
If an incident goes unacknowledged through all escalation levels and all repeats:
- The incident is marked as Escalation Exhausted
- Optionally notify a fallback target (e.g., a Slack channel or email alias)
- The incident remains open and visible on the dashboard
Configure fallback at Escalation Policy → Advanced → Fallback Target.
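The fallback steps can be summarized in a short sketch. The `Incident` class, its field names, and the `handle_exhausted` function are hypothetical; only the "Escalation Exhausted" status and the optional fallback target come from this page.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    title: str
    is_open: bool = True
    acknowledged: bool = False
    escalation_status: str = "escalating"  # hypothetical initial value


def handle_exhausted(incident, notify, fallback_target=None):
    """Apply the fallback behavior once all levels and repeats are used up."""
    if incident.acknowledged:
        return  # someone responded, so nothing to do
    incident.escalation_status = "Escalation Exhausted"
    if fallback_target is not None:
        notify(fallback_target, f"Unacknowledged incident: {incident.title}")
    # is_open is left untouched: the incident stays open on the dashboard.


sent = []
incident = Incident("API latency above threshold")
handle_exhausted(
    incident,
    notify=lambda target, message: sent.append((target, message)),
    fallback_target="#backend-alerts",
)
```

Passing `notify` as a callback keeps the sketch independent of any particular delivery channel, whether that is a Slack channel or an email alias.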
Assigning to Services
Each service has exactly one escalation policy. Assign it at Service → Settings → Escalation Policy.
Multiple services can share the same policy. For example, all services owned by the Backend team can use the "Backend Escalation" policy.
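The relationship is many-to-one, which a plain dictionary captures well: each service key maps to exactly one policy, while several keys can share a value. The service names and the "Frontend Escalation" policy below are made up for illustration.

```python
from collections import defaultdict

# Each service maps to exactly one policy; policies may be shared.
SERVICE_POLICY = {
    "api-gateway": "Backend Escalation",
    "billing-worker": "Backend Escalation",
    "web-frontend": "Frontend Escalation",
}


def services_by_policy(service_policy):
    """Group services under the policy they use."""
    grouped = defaultdict(list)
    for service, policy in service_policy.items():
        grouped[policy].append(service)
    return dict(grouped)


grouped = services_by_policy(SERVICE_POLICY)
```

A reverse lookup like this is handy before editing a shared policy, since it shows every service the change would affect.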
Best Practices
- Keep timeouts short for critical services — 2–5 minutes per level
- Use schedules, not users — individual users go on vacation; schedules don't
- Test your policies — trigger a test incident and verify the escalation chain works
- Review quarterly — teams change, so should your policies
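Some of these practices can be checked mechanically during a quarterly review. The sketch below is a hypothetical lint helper, not a NotifyHero feature: it flags timeouts above 5 minutes on critical services, levels that target an individual user, and critical policies with repeat disabled.

```python
def lint_policy(levels, repeat_enabled, critical=True):
    """Return a list of warnings for a policy.

    levels: list of (target_type, timeout_min) tuples, where target_type
            is "schedule", "user", or "team".
    """
    warnings = []
    for number, (target_type, timeout_min) in enumerate(levels, start=1):
        if critical and timeout_min > 5:
            warnings.append(
                f"Level {number}: timeout of {timeout_min} min exceeds 5 min"
            )
        if target_type == "user":
            warnings.append(
                f"Level {number}: targets a user; consider a schedule"
            )
    if critical and not repeat_enabled:
        warnings.append("Repeat is disabled for a critical service")
    return warnings


risky = lint_policy([("schedule", 5), ("user", 15)], repeat_enabled=False)
clean = lint_policy([("schedule", 2), ("schedule", 5)], repeat_enabled=True)
```

A check like this complements, rather than replaces, triggering a test incident to confirm that notifications actually arrive.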