By Samson Tanimawo, PhD. Published Nov 15, 2025.

Alert Language Clarity

Alerts are written under stress. Clarity matters.

The alert title is the first message

Title format: `[severity] service: symptom`. Example: `[SEV2] checkout-api: error rate > 2% for 5m`.

No internal jargon, no team-specific abbreviations. The alert may page someone outside the owning team.

Title length: under 80 characters. Mobile push notifications truncate long titles, so the symptom must appear before the cut-off.
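The title rules can be checked mechanically. Below is a minimal sketch, assuming titles follow the shape of the example `[SEV2] checkout-api: error rate > 2% for 5m`. The function names, the SEV1 to SEV4 range, and the regular expression are illustrative assumptions, not part of any alerting tool.

```python
import re

MAX_TITLE_LEN = 80  # the limit stated in the text

# Assumed shape: "[SEV<n>] <service>: <symptom>"
TITLE_RE = re.compile(r"^\[SEV[1-4]\] [a-z0-9][a-z0-9-]*: \S.*$")


def build_title(severity: int, service: str, symptom: str) -> str:
    """Assemble a title with severity and service first, so the symptom starts early."""
    return f"[SEV{severity}] {service}: {symptom}"


def title_problems(title: str) -> list[str]:
    """Return human-readable problems; an empty list means the title passes."""
    problems = []
    if not TITLE_RE.match(title):
        problems.append("title does not match '[SEVn] service: symptom'")
    if len(title) >= MAX_TITLE_LEN:
        problems.append(
            f"title is {len(title)} characters; keep it under {MAX_TITLE_LEN}"
        )
    return problems
```

Returning a list of problems rather than a boolean lets a reviewer see every violation in one pass.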

The summary describes the impact

One sentence: "Customer-facing checkouts are failing for X% of users in Y region." Not: "PromQL expression XYZ exceeded 0.02."

Translate the technical condition into a user-impact statement. The on-call needs to be able to tell within 5 seconds whether it's a SEV1.

If you cannot describe the impact in plain English, the alert is mis-scoped. Move it off paging.
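One way to keep summaries in impact language is to render them from a fixed sentence template instead of pasting the query condition. The sketch below assumes the alerting metric already measures the share of affected users as a ratio between 0 and 1; the function and parameter names are hypothetical.

```python
def impact_summary(failing_ratio: float, region: str) -> str:
    """Render the one-sentence impact statement from the measured ratio.

    Assumption: failing_ratio is the fraction of users whose checkouts fail
    (0.02 means 2%). If the metric counts requests rather than users, the
    sentence should say "requests" instead.
    """
    pct = failing_ratio * 100
    return (
        f"Customer-facing checkouts are failing for {pct:.0f}% of users "
        f"in {region}."
    )
```

For example, `impact_summary(0.02, "eu-west-1")` yields a sentence the responder can read without knowing the underlying query.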

The runbook link is mandatory

The runbook is the first link in every alert. Link title: "Runbook: how to respond to checkout-api error rate."

The runbook opens with how to confirm the alert is real, the first 3 commands to run, and the escalation path.

If the runbook is a wiki page that has not been edited in 12 months, the alert is undeployable. Block the alert until the runbook is current.
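The 12-month rule is easy to automate if the wiki exposes a last-edited date. A minimal sketch, assuming that date is already available as a `date` value and approximating 12 months as 365 days; how the date is fetched from the wiki is out of scope here.

```python
from datetime import date, timedelta

# "12 months" approximated as 365 days (an assumption, not a calendar-exact rule)
MAX_RUNBOOK_AGE = timedelta(days=365)


def runbook_is_stale(last_edited: date, today: date) -> bool:
    """True when the runbook has gone longer than MAX_RUNBOOK_AGE without an edit."""
    return today - last_edited > MAX_RUNBOOK_AGE
```

A deploy check could call this for every alert and refuse to ship any whose runbook is stale.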

Include the data the responder needs

Link to the dashboard pre-filtered to the right time range. Link to recent deploys for the affected service. Link to the trace search pre-filtered to errors.
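Pre-filtered links can be generated when the alert fires. The sketch below builds the three links with a time window ending at the firing time. The hostnames, paths, and query parameter names are placeholders; a real dashboard, deploy tracker, or tracing tool has its own URL scheme, so treat this only as the shape of the idea.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Placeholder base URLs; substitute the real tools' link formats.
DASHBOARD_URL = "https://dashboards.example.com/d/service-overview"
DEPLOYS_URL = "https://deploys.example.com/history"
TRACES_URL = "https://traces.example.com/search"


def triage_links(
    service: str,
    fired_at: datetime,
    lookback: timedelta = timedelta(minutes=30),
) -> dict[str, str]:
    """Build dashboard, deploy, and trace links scoped to the alert window."""
    start_ms = int((fired_at - lookback).timestamp() * 1000)
    end_ms = int(fired_at.timestamp() * 1000)
    window = {"from": start_ms, "to": end_ms}
    return {
        "dashboard": DASHBOARD_URL + "?" + urlencode({"service": service, **window}),
        "deploys": DEPLOYS_URL + "?" + urlencode({"service": service, "to": end_ms}),
        "traces": TRACES_URL
        + "?"
        + urlencode({"service": service, "status": "error", **window}),
    }


# Example call with a timezone-aware firing time.
links = triage_links(
    "checkout-api", datetime(2025, 11, 15, 12, 0, tzinfo=timezone.utc)
)
```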

Don't dump 50 labels into the alert. Pick the 5 that distinguish this alert from the others.

Include the PromQL or Datadog query that fired the alert. The on-call may want to tweak it during triage.
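Trimming labels and attaching the query can happen in one step when the alert payload is assembled. A minimal sketch follows; the five label names are a hypothetical choice for a checkout service, and the payload shape is an assumption, not a format required by any alerting tool.

```python
# Hypothetical allowlist: the five labels that distinguish this alert.
DISTINGUISHING_LABELS = ("service", "region", "env", "severity", "endpoint")


def alert_context(labels: dict[str, str], query: str) -> dict[str, object]:
    """Keep only the allowlisted labels and attach the query that fired."""
    kept = {
        name: labels[name] for name in DISTINGUISHING_LABELS if name in labels
    }
    return {"labels": kept, "query": query}
```

Labels outside the allowlist are dropped from the notification only; they remain available in the monitoring system for anyone who needs them during triage.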

How to enforce the standard

A template in the alerting repo. Every alert PR uses the template; no template, no merge.

Linter checks: title format, runbook URL present, dashboard link present, summary length.
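The four linter checks could look roughly like this. It is a sketch under stated assumptions: alerts are plain dictionaries with `title`, `summary`, `runbook_url`, and `dashboard_url` keys (a hypothetical schema), the title regex mirrors the `[SEV2] checkout-api: ...` example, and the 200-character summary limit is an invented number, since the text does not give one.

```python
import re

TITLE_RE = re.compile(r"^\[SEV[1-4]\] [a-z0-9][a-z0-9-]*: \S.*$")
MAX_TITLE_LEN = 80      # from the title-length rule
MAX_SUMMARY_LEN = 200   # assumed limit; the source gives no number


def lint_alert(alert: dict) -> list[str]:
    """Return a list of lint errors for one alert definition (empty means pass)."""
    errors = []

    title = str(alert.get("title", ""))
    if not TITLE_RE.match(title):
        errors.append("title does not match '[SEVn] service: symptom'")
    if len(title) >= MAX_TITLE_LEN:
        errors.append(f"title must be under {MAX_TITLE_LEN} characters")

    # Both links must be present; https is an extra assumption of this sketch.
    for field in ("runbook_url", "dashboard_url"):
        if not str(alert.get(field, "")).startswith("https://"):
            errors.append(f"{field} is missing or not an https URL")

    summary = str(alert.get("summary", ""))
    if not summary:
        errors.append("summary is missing")
    elif len(summary) > MAX_SUMMARY_LEN:
        errors.append(f"summary must be at most {MAX_SUMMARY_LEN} characters")

    return errors
```

Run in CI over every alert file, a non-empty result fails the PR, which is what makes the template requirement enforceable.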

Quarterly audit: pull 10 random alerts and score them on clarity. Publish the worst three, and schedule their fixes for the next sprint.
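The sampling and ranking steps of the audit are simple to script; the scoring itself stays a human judgment. In this sketch the scores are assumed to be integers where a higher number means a clearer alert, and both function names are hypothetical.

```python
import random


def sample_for_audit(alert_names, k=10, seed=None):
    """Pick up to k alerts at random; pass a seed to make the draw repeatable."""
    names = list(alert_names)
    rng = random.Random(seed)
    return rng.sample(names, min(k, len(names)))


def worst_three(scores: dict[str, int]) -> list[str]:
    """Return the three lowest-scoring alerts, lowest first."""
    return sorted(scores, key=lambda name: scores[name])[:3]
```

Recording the seed alongside the audit results lets anyone reproduce which alerts were drawn.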