PagerDuty Routing Rules: The Hard Cases
Routing alerts to the right team: the hard cases and the patterns that handle them.
The easy cases
Service to team, severity to escalation policy, business hours vs after hours. PagerDuty handles these natively with event rules.
Most teams stop here. The catalog is small; the rules are clear. Worth keeping it that way as long as possible.
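The native layer above amounts to a lookup from service and severity to an escalation policy. A minimal sketch, with hypothetical service and policy names:

```python
# Sketch of what PagerDuty event rules do natively: map a
# (service, severity) pair to an escalation policy. The service
# names and policy names below are hypothetical placeholders.

ROUTES = {
    ("checkout", "critical"): "payments-primary",
    ("checkout", "warning"): "payments-low-urgency",
    ("search", "critical"): "search-primary",
}

def route(service: str, severity: str) -> str:
    """Return the escalation policy for a service/severity pair."""
    return ROUTES.get((service, severity), "default-triage")
```

While the catalog stays this small, the mapping is auditable at a glance; that is the property worth preserving.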
Trouble starts when business and topology force exceptions.
Cross-team services
The service is owned by team A but hosts a feature owned by team B. A page on the feature should hit team B; a page on the platform should hit team A.
Use PagerDuty event orchestration with custom fields. The alert payload tags the feature; rules route accordingly.
Document the routing decision in the runbook. The on-call from team B should not have to dig through team A's history to understand why the page landed on them.
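The orchestration rule above can be sketched as a function over the alert payload. The `feature` key and the team names are assumptions; substitute whatever custom field your payloads actually carry:

```python
# Hedged sketch of cross-team routing on a custom field, mirroring
# an event orchestration rule. Feature names and team identifiers
# are hypothetical.

FEATURE_OWNERS = {"recommendations": "team-b"}
PLATFORM_OWNER = "team-a"

def route_page(alert: dict) -> str:
    """Page the feature owner if the payload tags a known feature;
    otherwise fall back to the platform-owning team."""
    feature = alert.get("custom_details", {}).get("feature")
    return FEATURE_OWNERS.get(feature, PLATFORM_OWNER)
```

Note the fallback: an untagged or unknown feature routes to the platform owner, so a missing tag degrades to the pre-exception behavior rather than dropping the page.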
Time-of-day routing
Some teams have follow-the-sun coverage. Pages route to APAC overnight, EU during European day, US during American day.
PagerDuty schedules support this; event orchestration can override per service.
Test the boundaries. The 06:00 UTC handover is where routing bugs live; verify a synthetic page right at the boundary.
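A boundary test is easy to make explicit in code. The handover hours below are hypothetical; substitute your schedule's actual boundaries:

```python
from datetime import datetime, timezone

# Sketch of a follow-the-sun region lookup. The handover hours
# (06:00, 14:00, 22:00 UTC) are hypothetical placeholders for
# your schedule's real boundaries.

def region_for(ts: datetime) -> str:
    """Map a timestamp to the on-call region by UTC hour."""
    hour = ts.astimezone(timezone.utc).hour
    if 6 <= hour < 14:
        return "eu"
    if 14 <= hour < 22:
        return "us"
    return "apac"

# Exercise the 06:00 UTC handover from both sides:
# 05:59 should still be APAC, 06:00 should be EU.
```

Pinning both sides of each handover in a test is what catches the classic off-by-one: a rule written with `<=` where the schedule meant `<`.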
Vendor and third-party pages
AWS Health, Cloudflare incidents, GitHub status. These are signals, not pages, for most services.
Route to a Slack channel by default. Page only if the affected service is tier 1.
Use Statuspage's API to fan in vendor incidents to your alerting backbone, then apply your own routing logic.
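The page-versus-Slack decision reduces to a tier check over the affected services. A minimal sketch, with a hypothetical tier catalog:

```python
# Sketch of the vendor-incident decision: page only when an
# affected service is tier 1, otherwise post to Slack. The tier
# catalog and service names are hypothetical.

TIERS = {"api-gateway": 1, "batch-reports": 3}

def vendor_incident_action(affected_services: list[str]) -> str:
    """Return "page" if any affected service is tier 1, else "slack"."""
    if any(TIERS.get(s) == 1 for s in affected_services):
        return "page"
    return "slack"
```

An unknown service falls through to Slack here; if your fleet is mostly tier 1, invert the default so unknowns page instead.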
Apply this quarter
Audit your event rules. Any rule that has gone six months without an edit is suspect; the topology it encoded has probably drifted.
Document each non-trivial rule in a comment field. The next person to touch it needs the context.
Run a synthetic page test monthly. Routing breaks silently; only synthetic tests catch it.
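A synthetic page can be driven through the PagerDuty Events API v2 (`POST https://events.pagerduty.com/v2/enqueue`). The sketch below only builds the trigger event; the routing key and service name are placeholders, and sending it with an HTTP client is left to your test harness:

```python
import json

# Sketch of a monthly synthetic page: build a PagerDuty Events API v2
# trigger event that exercises the routing rules end to end. The
# routing key and service name are hypothetical placeholders.

def synthetic_event(routing_key: str, service: str) -> dict:
    """Build a low-severity test event, clearly marked as synthetic."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": f"[SYNTHETIC] routing test for {service}",
            "source": "routing-test",
            "severity": "info",
            "custom_details": {"synthetic": True},
        },
    }

event = synthetic_event("YOUR-INTEGRATION-KEY", "checkout")
body = json.dumps(event)  # POST this to the enqueue endpoint
```

Mark the event unmistakably as synthetic in the summary and custom details, so the on-call who receives it can acknowledge and close without a scramble.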