Escalation Policies

Define who gets notified, when, and what happens if they don't respond.


How Escalation Works

When an incident triggers, NotifyHero walks through your escalation policy level by level:

  1. Level 1 — notify immediately (e.g., on-call engineer)
  2. Wait for acknowledgment
  3. If no response after the timeout → Level 2 (e.g., team lead)
  4. Continue until someone acknowledges or all levels are exhausted

Example policy:

Level 1: On-Call Schedule "Backend Primary" → wait 5 min
Level 2: On-Call Schedule "Backend Backup" → wait 10 min
Level 3: User "vp-engineering@your-company.com" → wait 15 min
Level 4: Repeat from Level 1
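
The level-by-level walk can be sketched as a small simulation. This is an illustrative model only, not NotifyHero's implementation: the `Level` class, the `notifications_until_ack` function, and the rule that each timeout starts counting when its level is notified are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    target: str        # who is notified at this level
    timeout_min: int   # minutes to wait for an acknowledgment before escalating


def notifications_until_ack(levels, ack_at_min=None):
    """Return (minute, target) pairs sent during one pass through the levels.

    ack_at_min is the minute the incident is acknowledged, or None if nobody
    acknowledges. Assumption: each timeout starts when its level is notified.
    """
    sent = []
    clock = 0
    for level in levels:
        sent.append((clock, level.target))
        clock += level.timeout_min
        # Stop escalating if the acknowledgment arrived before this timeout ran out.
        if ack_at_min is not None and ack_at_min <= clock:
            break
    return sent


policy = [
    Level("Backend Primary", 5),
    Level("Backend Backup", 10),
    Level("vp-engineering@your-company.com", 15),
]

# Acknowledged 7 minutes in: Level 1 and Level 2 were notified, Level 3 never was.
print(notifications_until_ack(policy, ack_at_min=7))
# Never acknowledged: every level is notified, at minutes 0, 5, and 15.
print(notifications_until_ack(policy))
```

Repeat behavior is left out here on purpose; the sketch covers a single pass through the levels.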

Creating an Escalation Policy

Go to Settings → Escalation Policies → New Policy.

Each level requires:

  • Target — a schedule, a specific user, or a team
  • Timeout — minutes to wait before escalating (0 = notify simultaneously with next level)

Targets

| Target Type | Description |
|-------------|-------------|
| Schedule | Notify whoever is currently on-call |
| User | Notify a specific person |
| Team | Notify all members of a team |
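
For readers who think in data structures, a level can be modeled as a target plus a timeout. The sketch below is hypothetical: `TargetType`, `EscalationLevel`, and the validation rules are names and assumptions chosen for illustration, not a NotifyHero API.

```python
from dataclasses import dataclass
from enum import Enum


class TargetType(Enum):
    SCHEDULE = "schedule"   # whoever is currently on-call
    USER = "user"           # one specific person
    TEAM = "team"           # all members of a team


@dataclass(frozen=True)
class EscalationLevel:
    target_type: TargetType
    target_name: str
    timeout_min: int   # 0 means the next level is notified at the same time

    def __post_init__(self):
        # Every level needs both a target and a non-negative timeout.
        if not self.target_name:
            raise ValueError("each level needs a target")
        if self.timeout_min < 0:
            raise ValueError("timeout must be 0 or more minutes")


level = EscalationLevel(TargetType.SCHEDULE, "Backend Primary", 5)
print(level.target_type.value, level.target_name, level.timeout_min)
```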


Multi-Level Escalation

Add as many levels as needed. Common patterns:

Standard (3-level)

L1: Primary on-call → 5 min
L2: Backup on-call → 10 min
L3: Engineering manager → 15 min

Aggressive (for critical services)

L1: Primary on-call → 2 min
L2: Primary + Backup on-call → 3 min
L3: Entire team → 5 min
L4: VP Engineering → 10 min

Flat (small team)

L1: Entire team → 0 min (everyone notified immediately)
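
To compare patterns, it helps to work out when each level is first notified. The sketch below assumes timeouts run one after another, each starting when its level is notified; the text above does not state this, and `first_notified_at` is a made-up helper name.

```python
from itertools import accumulate


def first_notified_at(timeouts_min):
    """Minute at which each level is first notified, given per-level timeouts."""
    if not timeouts_min:
        return []
    # Level 1 fires at minute 0; each later level fires once all earlier timeouts elapse.
    return [0] + list(accumulate(timeouts_min))[:-1]


patterns = {
    "Standard": [5, 10, 15],
    "Aggressive": [2, 3, 5, 10],
    "Flat": [0],
}

for name, timeouts in patterns.items():
    print(name, first_notified_at(timeouts), "full pass:", sum(timeouts), "min")
```

Under that assumption, the Standard pattern reaches the engineering manager 15 minutes after the trigger, while the Aggressive pattern reaches the VP Engineering after 10 minutes.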

Repeat Behavior

At the bottom of the policy, choose what happens after all levels are exhausted:

  • Repeat from Level 1 — loops back (recommended for critical services)
  • Stop escalating — no further notifications
  • Repeat N times — loops a set number of times, then stops

Tip: For production services, always enable repeat. An unacknowledged critical incident should never go silent.
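
The three repeat options can be pictured as different ways of looping over the same levels. This is a rough sketch with hypothetical names (`escalation_passes`, the mode strings). It also assumes that "Repeat N times" means one initial pass plus N repeats, which the text above does not spell out.

```python
from itertools import count, islice


def escalation_passes(levels, repeat="stop", times=0):
    """Yield (pass_number, level) pairs in notification order.

    repeat: "stop"    -> a single pass, then no further notifications
            "n_times" -> the initial pass plus `times` repeats (assumed meaning)
            "forever" -> loop back to Level 1 indefinitely
    """
    if not levels:
        return
    if repeat == "stop":
        passes = range(1, 2)
    elif repeat == "n_times":
        passes = range(1, times + 2)
    elif repeat == "forever":
        passes = count(1)
    else:
        raise ValueError(f"unknown repeat mode: {repeat!r}")
    for pass_number in passes:
        for level in levels:
            yield pass_number, level


levels = ["L1", "L2", "L3"]
print(list(escalation_passes(levels)))                          # one pass
print(len(list(escalation_passes(levels, "n_times", 2))))       # three passes in total
print(list(islice(escalation_passes(levels, "forever"), 7)))    # first 7 of an endless loop
```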


Urgency Rules

Set urgency based on severity or time of day:

By Severity

  • Critical / Error → High urgency (phone + SMS)
  • Warning / Info → Low urgency (email + Slack)

By Time

  • Business hours (09:00–18:00) → Low urgency
  • Off-hours → High urgency

Combined

  • Critical at 3 AM → High urgency, aggressive escalation
  • Warning at 2 PM → Low urgency, standard escalation
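
The rules above can be written as small functions. The severity and time rules come straight from the lists. The way they are combined (high urgency if either rule says high) is an assumption that matches the two combined examples but is not documented behavior. All function names are hypothetical, and 18:00 itself is treated as off-hours.

```python
from datetime import time

BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def urgency_by_severity(severity):
    """Critical / Error -> high; Warning / Info -> low."""
    level = severity.lower()
    if level in {"critical", "error"}:
        return "high"
    if level in {"warning", "info"}:
        return "low"
    raise ValueError(f"unknown severity: {severity!r}")


def urgency_by_time(at):
    """Business hours -> low; off-hours -> high."""
    return "low" if BUSINESS_START <= at < BUSINESS_END else "high"


def combined_urgency(severity, at):
    # Assumption, not documented behavior: high if either rule says high.
    if "high" in (urgency_by_severity(severity), urgency_by_time(at)):
        return "high"
    return "low"


print(combined_urgency("Critical", time(3, 0)))   # high
print(combined_urgency("Warning", time(14, 0)))   # low
```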

Fallback Behavior

If an incident goes unacknowledged through all escalation levels and all repeats:

  1. The incident is marked as Escalation Exhausted
  2. Optionally notify a fallback target (e.g., a Slack channel or email alias)
  3. The incident remains open and visible on the dashboard

Configure fallback at Escalation Policy → Advanced → Fallback Target.
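
As a mental model, the three fallback steps might look like the sketch below. The `Incident` class, its fields, and `on_escalation_exhausted` are invented for illustration and do not describe NotifyHero's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    title: str
    status: str = "open"
    escalation_exhausted: bool = False
    notified: list = field(default_factory=list)


def on_escalation_exhausted(incident, fallback_target=None):
    """Apply the fallback steps once all levels and repeats have run out."""
    incident.escalation_exhausted = True          # step 1: mark as Escalation Exhausted
    if fallback_target is not None:               # step 2: optional fallback notification
        incident.notified.append(fallback_target)
    # Step 3: status is deliberately left untouched, so the incident stays open.
    return incident


incident = on_escalation_exhausted(Incident("API latency spike"), "#backend-alerts")
print(incident.status, incident.escalation_exhausted, incident.notified)
```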


Assigning to Services

Each service has exactly one escalation policy. Assign it at Service → Settings → Escalation Policy.

Multiple services can share the same policy. For example, all services owned by the Backend team can use the "Backend Escalation" policy.
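
The relationship is many-to-one: a plain mapping from service to policy captures it, because each key can hold only one value. The service names and the "Web Escalation" policy below are hypothetical, and `services_by_policy` is a helper written for this example.

```python
from collections import defaultdict

# Each service maps to exactly one policy; several services may share one.
service_policy = {
    "payments-api": "Backend Escalation",
    "orders-api": "Backend Escalation",
    "marketing-site": "Web Escalation",
}


def services_by_policy(mapping):
    """Group services under the policy they are assigned to."""
    grouped = defaultdict(list)
    for service, policy in mapping.items():
        grouped[policy].append(service)
    return dict(grouped)


print(services_by_policy(service_policy))
```

Grouping by policy like this is a quick way to see which services are affected before editing a shared policy.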


Best Practices

  • Keep timeouts short for critical services — 2–5 minutes per level
  • Use schedules, not users — individual users go on vacation; schedules don't
  • Test your policies — trigger a test incident and verify the escalation chain works
  • Review quarterly — teams change, so should your policies
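
Two of these practices are mechanical enough to check automatically during a quarterly review. The following is a minimal sketch, assuming a policy is exported as a list of `(target_type, timeout_min)` pairs. That format and the `review_policy` function are hypothetical.

```python
def review_policy(levels, critical=False):
    """Return warnings for a policy given as (target_type, timeout_min) pairs."""
    warnings = []
    for number, (target_type, timeout_min) in enumerate(levels, start=1):
        # Individual users go on vacation; schedules do not.
        if target_type == "user":
            warnings.append(f"Level {number}: targets a user; consider a schedule")
        # Critical services should escalate within 5 minutes per level.
        if critical and timeout_min > 5:
            warnings.append(f"Level {number}: {timeout_min} min timeout exceeds 5 min")
    return warnings


policy = [("schedule", 5), ("schedule", 10), ("user", 15)]
for warning in review_policy(policy, critical=True):
    print(warning)
```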