NotifyHero Docs

Escalation Policies

Define who gets notified, when, and what happens if they don't respond.

How Escalation Works

When an incident triggers, NotifyHero walks through your escalation policy level by level:

  1. Level 1 — notify immediately (e.g., on-call engineer)
  2. Wait for acknowledgment
  3. If no response after the timeout → Level 2 (e.g., team lead)
  4. Continue until someone acknowledges or all levels are exhausted

Example policy:

Level 1: On-Call Schedule "Backend Primary" → wait 5 min
Level 2: On-Call Schedule "Backend Backup" → wait 10 min
Level 3: User "vp-engineering@your-company.com" → wait 15 min
Level 4: Repeat from Level 1
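
To make the timing concrete, here is a minimal Python sketch of the walk described above. It is not NotifyHero code: the function name `notification_times` and its parameters are hypothetical. It assumes that each level's timeout starts when that level is notified, and that an acknowledgment arriving exactly at the timeout still stops escalation.

```python
from typing import List, Optional, Tuple

def notification_times(
    levels: List[Tuple[str, int]],
    ack_at_min: Optional[int] = None,
    passes: int = 1,
) -> List[Tuple[int, str]]:
    """Return (minute, target) pairs for every notification that gets sent.

    levels     -- (target, timeout in minutes) per level, in order
    ack_at_min -- minute at which someone acknowledges, or None for never
    passes     -- how many times to walk the full list of levels
    """
    sent = []
    clock = 0  # minutes since the incident triggered
    for _ in range(passes):
        for target, timeout_min in levels:
            sent.append((clock, target))
            clock += timeout_min
            # Assumption: an ack at or before the timeout stops escalation.
            if ack_at_min is not None and ack_at_min <= clock:
                return sent
    return sent

example = [
    ("Backend Primary", 5),
    ("Backend Backup", 10),
    ("vp-engineering@your-company.com", 15),
]
timeline = notification_times(example, passes=2)
```

With the example policy and no acknowledgment, the levels are notified at minutes 0, 5, and 15, and the second pass starts at minute 30. A timeout of 0 leaves the clock unchanged, so the next level is notified at the same minute.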

Creating an Escalation Policy

Go to Settings → Escalation Policies → New Policy.

Each level requires:

  • Target — a schedule, a specific user, or a team
  • Timeout — minutes to wait before escalating (0 = notify simultaneously with next level)

Targets

Target Type    Description
Schedule       Notify whoever is currently on-call
User           Notify a specific person
Team           Notify all members of a team
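
One way to picture a level is as a small record holding a target and a timeout. The sketch below is illustrative only: `TargetType` and `EscalationLevel` are hypothetical names, not part of any NotifyHero API, and the validation rules are inferred from the requirements listed above.

```python
from dataclasses import dataclass
from enum import Enum

class TargetType(Enum):
    SCHEDULE = "schedule"  # whoever is currently on-call
    USER = "user"          # a specific person
    TEAM = "team"          # all members of a team

@dataclass(frozen=True)
class EscalationLevel:
    target_type: TargetType
    target_name: str
    timeout_min: int  # minutes to wait before escalating; 0 = no wait

    def __post_init__(self):
        # Each level requires both a target and a timeout.
        if not self.target_name:
            raise ValueError("each level requires a target")
        if self.timeout_min < 0:
            raise ValueError("timeout must be 0 or more minutes")

primary = EscalationLevel(TargetType.SCHEDULE, "Backend Primary", 5)
```

Rejecting bad values when the level is built means a policy with a missing target or a negative timeout never reaches the escalation logic.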

Multi-Level Escalation

Add as many levels as needed. Common patterns:

Standard (3-level)

L1: Primary on-call → 5 min
L2: Backup on-call → 10 min
L3: Engineering manager → 15 min

Aggressive (for critical services)

L1: Primary on-call → 2 min
L2: Primary + Backup on-call → 3 min
L3: Entire team → 5 min
L4: VP Engineering → 10 min

Flat (small team)

L1: Entire team → 0 min (everyone notified immediately)

Repeat Behavior

At the bottom of the policy, choose what happens after all levels are exhausted:

  • Repeat from Level 1 — loops back (recommended for critical services)
  • Stop escalating — no further notifications
  • Repeat N times — loops a set number of times, then stops
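
The three options differ in how long an unacknowledged incident keeps notifying people. The following sketch works that out in Python. The mode names (`"repeat"`, `"stop"`, `"repeat_n"`) are made up for illustration, and it assumes that "Repeat N times" means N additional passes after the first one. Check how your account counts repeats before relying on the numbers.

```python
from typing import List, Optional

def total_passes(mode: str, repeat_count: int = 0) -> Optional[int]:
    """How many times the policy walks its levels; None means no limit."""
    if mode == "repeat":      # Repeat from Level 1
        return None
    if mode == "stop":        # Stop escalating
        return 1
    if mode == "repeat_n":    # Repeat N times (assumed: N extra passes)
        if repeat_count < 0:
            raise ValueError("repeat_count must be 0 or more")
        return 1 + repeat_count
    raise ValueError(f"unknown repeat mode: {mode!r}")

def minutes_until_exhausted(
    timeouts_min: List[int], mode: str, repeat_count: int = 0
) -> Optional[int]:
    """Minutes from trigger until all levels and repeats are used up."""
    passes = total_passes(mode, repeat_count)
    if passes is None:
        return None  # loops until someone acknowledges
    return passes * sum(timeouts_min)

standard = [5, 10, 15]  # the Standard (3-level) pattern
```

For the Standard pattern, one pass takes 30 minutes, so under these assumptions "Stop escalating" goes quiet after 30 minutes and "Repeat 2 times" after 90.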

Tip: For production services, always enable repeat. An unacknowledged critical incident should never go silent.


Urgency Rules

Set urgency based on severity, time of day, or a combination of both:

By Severity

  • Critical / Error → High urgency (phone + SMS)
  • Warning / Info → Low urgency (email + Slack)

By Time

  • Business hours (09:00–18:00) → Low urgency
  • Off-hours → High urgency

Combined

  • Critical during off-hours → High urgency, aggressive escalation
  • Warning at 2 PM → Low urgency, standard escalation
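
The rules above can be sketched as plain functions. Everything here is hypothetical rather than product behavior. The page does not say how severity and time rules combine when they disagree, so this sketch assumes the higher urgency wins. It also treats business hours as 09:00 up to but not including 18:00.

```python
from datetime import time

def urgency_by_severity(severity: str) -> str:
    level = severity.strip().lower()
    if level in ("critical", "error"):
        return "high"   # phone + SMS
    if level in ("warning", "info"):
        return "low"    # email + Slack
    raise ValueError(f"unknown severity: {severity!r}")

def urgency_by_time(at: time) -> str:
    # Assumption: 18:00 itself already counts as off-hours.
    in_business_hours = time(9, 0) <= at < time(18, 0)
    return "low" if in_business_hours else "high"

def combined_urgency(severity: str, at: time) -> str:
    # Assumption: if either rule says "high", the result is "high".
    if "high" in (urgency_by_severity(severity), urgency_by_time(at)):
        return "high"
    return "low"

night_critical = combined_urgency("Critical", time(2, 0))
afternoon_warning = combined_urgency("Warning", time(14, 0))
```

Both documented combinations hold under this assumption: a critical incident at 02:00 is high urgency, and a warning at 14:00 is low. A stricter rule that requires both inputs to be high would also match those two cases, so confirm the actual behavior in your policy settings.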

Fallback Behavior

If an incident goes unacknowledged through all escalation levels and all repeats:

  1. The incident is marked as Escalation Exhausted
  2. If a fallback target is configured (e.g., a Slack channel or email alias), it is notified
  3. The incident remains open and visible on the dashboard

Configure fallback at Escalation Policy → Advanced → Fallback Target.
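
As a rough model of the three fallback steps, the sketch below updates a toy incident record. The `Incident` class, its field names, and the status strings are invented for illustration and do not reflect NotifyHero's data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Incident:
    title: str
    status: str = "open"
    escalation_state: str = "escalating"
    notified: List[str] = field(default_factory=list)

def on_escalation_exhausted(
    incident: Incident, fallback_target: Optional[str] = None
) -> Incident:
    # Step 1: mark the incident as Escalation Exhausted.
    incident.escalation_state = "escalation_exhausted"
    # Step 2: notify the fallback target, but only if one is configured.
    if fallback_target is not None:
        incident.notified.append(fallback_target)
    # Step 3: the status is deliberately left as "open".
    return incident

incident = on_escalation_exhausted(Incident("API latency"), "#backend-alerts")
```

The point of the model is that exhaustion changes the escalation state without closing the incident, so it stays visible until someone acts on it.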


Assigning to Services

Each service has exactly one escalation policy. Assign it at Service → Settings → Escalation Policy.

Multiple services can share the same policy. For example, all services owned by the Backend team can use the "Backend Escalation" policy.


Best Practices

  • Keep timeouts short for critical services — 2–5 minutes per level
  • Use schedules, not users — individual users go on vacation; schedules don't
  • Test your policies — trigger a test incident and verify the escalation chain works
  • Review quarterly — teams change, so should your policies
