By Samson Tanimawo, PhD. Published Nov 9, 2025.

Pre-Paging Context Loading

Context loaded before the on-call sees the page.

The idea

By the time a human is paged, the system already has 90 seconds of context: what changed recently, which alerts also fired, which dashboards are relevant.

Pre-paging context loading attaches that data to the alert payload before the page goes out.

This saves 2 to 5 minutes of triage per incident. Compounded over a year, that adds up to days of on-call time recovered.
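To see how the per-incident saving compounds, here is a rough calculation. The page volume of 1,000 per year is a made-up figure for illustration, not a number from this article; substitute your own.

```python
def hours_saved_per_year(pages_per_year: int, minutes_saved_per_page: float) -> float:
    """Convert a per-page triage saving into hours recovered per year."""
    return pages_per_year * minutes_saved_per_page / 60


# Hypothetical volume: 1,000 pages per year (an assumption, not article data).
PAGES_PER_YEAR = 1_000

low = hours_saved_per_year(PAGES_PER_YEAR, 2)   # 2 minutes saved per page
high = hours_saved_per_year(PAGES_PER_YEAR, 5)  # 5 minutes saved per page
print(f"{low:.1f} to {high:.1f} hours per year")  # prints "33.3 to 83.3 hours per year"
```

At that assumed volume the range works out to roughly 1.4 to 3.5 full days.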

What to pre-load

Recent deploys for the affected service (Argo CD events, GitHub Actions runs). 80% of incidents follow a deploy.

Related alerts within the last 15 minutes. Cluster the firing signals so the on-call sees the full picture.

Top affected endpoints, top affected customers, current load. All derivable from APM data.
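The related-alerts lookup is the easiest of these to sketch. The version below is a minimal illustration, assuming alerts are already available as in-memory records; the `Alert` fields and the sample alert names are hypothetical, and it keeps every alert in the window rather than filtering by service or dependency.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    # Hypothetical record shape; a real alerting system will have its own schema.
    service: str
    name: str
    fired_at: datetime


def related_alerts(
    page: Alert,
    recent: list[Alert],
    window: timedelta = timedelta(minutes=15),
) -> list[Alert]:
    """Return alerts that fired within `window` before the page, newest first."""
    earliest = page.fired_at - window
    hits = [
        a for a in recent
        if a != page and earliest <= a.fired_at <= page.fired_at
    ]
    return sorted(hits, key=lambda a: a.fired_at, reverse=True)


t0 = datetime(2025, 11, 9, 12, 0)
page = Alert("checkout", "HighErrorRate", t0)
recent = [
    Alert("checkout", "HighLatency", t0 - timedelta(minutes=5)),
    Alert("payments", "QueueDepth", t0 - timedelta(minutes=10)),
    Alert("checkout", "DiskFull", t0 - timedelta(minutes=40)),  # outside the window
    page,
]
names = [a.name for a in related_alerts(page, recent)]
print(names)  # prints ['HighLatency', 'QueueDepth']
```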

How to load

A PagerDuty webhook triggers a Lambda function or a Cloud Run job. The job queries Datadog, Argo CD, and the service catalog, then writes the results back onto the alert payload.
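The core of that job can be sketched as a pure function, with the vendor calls passed in as plain callables. Everything named here is an assumption for illustration: the `service` and `context` keys are not PagerDuty's webhook schema, and the stub lookups stand in for real Datadog, Argo CD, and service-catalog clients.

```python
from typing import Callable

# Hypothetical lookup signature: takes a service name, returns short facts.
Lookup = Callable[[str], list[str]]


def enrich(alert: dict, lookups: dict[str, Lookup]) -> dict:
    """Return a copy of the alert with a `context` section added."""
    service = alert["service"]
    context = {name: lookup(service) for name, lookup in lookups.items()}
    return {**alert, "context": context}


# Stub lookups; real ones would call the vendor APIs.
lookups = {
    "deploys": lambda service: [f"{service}: deploy 12 min ago"],
    "owners": lambda service: [f"{service}: owned by team-payments"],
}

alert = {"id": "A1", "service": "checkout", "title": "High error rate"}
enriched = enrich(alert, lookups)
print(enriched["context"]["deploys"])  # prints ['checkout: deploy 12 min ago']
```

Keeping the lookups injectable makes the handler easy to test without network access and easy to extend one source at a time.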

Latency target: 30 seconds. Slower than that and the human reaches the alert before the context arrives.
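One way to hold that budget is to run the lookups in parallel and keep only what finishes in time. This is a sketch using the standard library, assuming Python 3.9 or later; the lookup names are hypothetical, and the demo uses a 0.2-second deadline so it runs quickly, where production would pass 30.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def gather_with_deadline(lookups: dict, service: str, deadline_s: float = 30.0) -> dict:
    """Run lookups in parallel; keep only those that finish before the deadline."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(lookups)))
    futures = {pool.submit(fn, service): name for name, fn in lookups.items()}
    done, _not_done = wait(futures, timeout=deadline_s)
    # Do not block on stragglers. Lookups already running cannot be killed;
    # they finish in the background and their results are discarded.
    pool.shutdown(wait=False, cancel_futures=True)
    results = {}
    for fut in done:
        if fut.exception() is None:  # skip lookups that raised
            results[futures[fut]] = fut.result()
    return results


def fast_deploys(service: str) -> list[str]:  # hypothetical lookup
    return ["deploy 12 min ago"]


def slow_apm(service: str) -> list[str]:  # hypothetical lookup that misses the deadline
    time.sleep(1.0)
    return ["p99 latency up 4x"]


results = gather_with_deadline(
    {"deploys": fast_deploys, "apm": slow_apm}, "checkout", deadline_s=0.2
)
print(results)  # prints {'deploys': ['deploy 12 min ago']}
```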

Cache aggressively. Most incidents share context within a 5-minute window; one query per service per minute is enough.
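A per-service cache with a 60-second lifetime gives the "one query per service per minute" behaviour. The sketch below is in-memory and single-process, which is an assumption; a Lambda fleet would need a shared store. The injectable clock exists only to make the example deterministic.

```python
import time


class TTLCache:
    """Cache one fetched value per service for `ttl_s` seconds."""

    def __init__(self, fetch, ttl_s: float = 60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # service -> (fetched_at, value)

    def get(self, service: str):
        now = self._clock()
        entry = self._entries.get(service)
        if entry is not None and now - entry[0] < self._ttl_s:
            return entry[1]  # still fresh: no upstream query
        value = self._fetch(service)
        self._entries[service] = (now, value)
        return value


calls = []


def fetch_deploys(service: str) -> list[str]:  # hypothetical upstream query
    calls.append(service)
    return [f"{service}: deploy 12 min ago"]


fake_now = [0.0]
cache = TTLCache(fetch_deploys, ttl_s=60.0, clock=lambda: fake_now[0])

cache.get("checkout")   # miss: queries upstream
fake_now[0] = 30.0
cache.get("checkout")   # hit: 30 s old
fake_now[0] = 61.0
cache.get("checkout")   # expired: queries upstream again
print(len(calls))  # prints 2
```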

When it fails

Stale data. If the deploy lookup is 30 minutes behind, it's worse than no data; the on-call trusts wrong information.
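A simple guard is to record when each piece of context was fetched and drop anything past a maximum age. The 5-minute threshold below is an assumed value for illustration; the right limit depends on how often the source changes.

```python
from datetime import datetime, timedelta, timezone


def fresh_or_none(value, fetched_at: datetime, now: datetime,
                  max_age: timedelta = timedelta(minutes=5)):
    """Return `value` if it is recent enough, otherwise None so it is omitted."""
    return value if now - fetched_at <= max_age else None


now = datetime(2025, 11, 9, 12, 0, tzinfo=timezone.utc)
deploys = ["checkout: deploy 12 min ago"]

recent = fresh_or_none(deploys, now - timedelta(minutes=2), now)
stale = fresh_or_none(deploys, now - timedelta(minutes=30), now)
print(recent)  # prints ['checkout: deploy 12 min ago']
print(stale)   # prints None
```

Omitting the field is the point: a missing deploy line prompts the on-call to check, while a wrong one sends them down the wrong path.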

Too much data. A page with 40 lines of context is unreadable on a phone. Cap at 5 facts.
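The cap is easy to enforce in code if every fact carries a priority. This sketch assumes each fact arrives as a `(priority, text)` pair where a lower number means more important; that ranking scheme is an illustration, not something the article prescribes.

```python
def top_facts(facts: list[tuple[int, str]], limit: int = 5) -> list[str]:
    """Keep the `limit` highest-priority facts (lower number = more important)."""
    ranked = sorted(facts, key=lambda fact: fact[0])  # stable sort keeps ties in order
    return [text for _priority, text in ranked[:limit]]


facts = [
    (3, "Top endpoint: POST /pay"),
    (1, "Deploy 12 min ago"),
    (2, "2 related alerts firing"),
    (5, "Load: 1.4x normal"),
    (4, "Top customer: acme"),
    (6, "Dashboard: checkout-overview"),
    (7, "Runbook: checkout-errors"),
]
page_facts = top_facts(facts)
print(len(page_facts))  # prints 5
print(page_facts[0])    # prints Deploy 12 min ago
```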

Vendor outages. If Datadog is down, pre-loading fails. Fall back to a basic page; don't block on context.
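In code, the fallback is a try/except around the context loader, so any lookup failure degrades to the unmodified page. The function and field names are hypothetical, and the broad `except Exception` is a deliberate choice here: no context error should stop a page from going out.

```python
def build_page(alert: dict, load_context) -> dict:
    """Attach context when available; otherwise return the basic page."""
    try:
        context = load_context(alert["service"])
    except Exception:
        # Vendor outage or any other lookup failure: page without context.
        return dict(alert)
    return {**alert, "context": context}


def vendor_down(service: str):  # hypothetical loader simulating an outage
    raise ConnectionError("metrics API unreachable")


def vendor_up(service: str):  # hypothetical loader that succeeds
    return ["deploy 12 min ago"]


alert = {"id": "A1", "service": "checkout"}
print(build_page(alert, vendor_down))  # prints {'id': 'A1', 'service': 'checkout'}
print(build_page(alert, vendor_up))
# prints {'id': 'A1', 'service': 'checkout', 'context': ['deploy 12 min ago']}
```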

Get started

Pick your top 3 services. Build a simple webhook that adds "recent deploys" to the alert payload.

Measure mean time to acknowledge (MTTA) and mean time to resolve (MTTR) before and after. Target a 30-second drop in median triage time.
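The comparison itself is a few lines once triage times are exported. The sample durations below are invented for illustration and are not measurements; the sketch assumes you can extract a triage duration in seconds per incident.

```python
from statistics import median


def median_drop_seconds(before: list[int], after: list[int]) -> float:
    """Return how many seconds the median triage time fell (positive = faster)."""
    return median(before) - median(after)


# Made-up triage durations in seconds, for illustration only.
before = [240, 300, 180, 420, 260]  # median 260
after = [200, 230, 150, 380, 250]   # median 230

drop = median_drop_seconds(before, after)
print(drop)  # prints 30
```

Using the median rather than the mean keeps one unusually long incident from hiding or exaggerating the change.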

Iterate per service. Adding context to all services at once is over-investment; pick by page volume.
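Ranking services by page volume takes only a counter over your paging history. The service names and counts here are hypothetical; the input is assumed to be one service name per page.

```python
from collections import Counter


def top_services(paged_services: list[str], n: int = 3) -> list[str]:
    """Return the `n` services that paged most often, busiest first."""
    return [service for service, _count in Counter(paged_services).most_common(n)]


# Hypothetical paging history: one entry per page.
history = ["checkout"] * 5 + ["search"] * 3 + ["auth"] * 4 + ["billing"] * 1
print(top_services(history))  # prints ['checkout', 'auth', 'search']
```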