Status Pages vs Alerts: Coordination

Internal alerts and external status updates serve different audiences, but they have to stay coordinated.

The distinction

Internal alerts page on-call to fix problems; status pages tell customers what’s happening. Different audiences, different language, different cadence. Updating one without the other creates either silent outages (alerts fire, status page green) or fake outages (status page red, no real impact). Wire both together but keep them separately authored.

When to update the status page

The status page lifecycle is well-defined. Customer-facing impact confirmed (more than 5% of users seeing degradation): “investigating”. Cause identified: “identified”. Mitigation deployed: “monitoring”. Confirmed clear: “resolved”. Updates every 30 minutes are fine; faster suggests customer comms should be a person, not a tool.
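The lifecycle above can be sketched as a small forward-only state machine. This is a minimal illustration, not a provider API: the names Status, should_open_incident, and next_status are hypothetical, and the strictly linear ordering is an assumption (real incidents sometimes move backwards, for example from “monitoring” to “investigating”).

```python
from enum import Enum
from typing import Optional

# "More than 5% of users seeing degradation", taken from the text above.
IMPACT_THRESHOLD = 0.05


class Status(Enum):
    INVESTIGATING = "investigating"
    IDENTIFIED = "identified"
    MONITORING = "monitoring"
    RESOLVED = "resolved"


# Assumption: states advance strictly in this order.
_ORDER = [
    Status.INVESTIGATING,
    Status.IDENTIFIED,
    Status.MONITORING,
    Status.RESOLVED,
]


def should_open_incident(affected_fraction: float) -> bool:
    """True once confirmed customer impact exceeds the threshold."""
    return affected_fraction > IMPACT_THRESHOLD


def next_status(current: Optional[Status]) -> Optional[Status]:
    """Return the state that follows `current`, or None after 'resolved'.

    Passing None means no incident has been posted yet.
    """
    if current is None:
        return _ORDER[0]
    position = _ORDER.index(current)
    if position + 1 < len(_ORDER):
        return _ORDER[position + 1]
    return None
```

A caller would check should_open_incident first, then step through next_status as the cause is identified, the mitigation ships, and the all-clear is confirmed.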

Automation boundaries

Automation has clear boundaries. Don’t auto-publish from internal alerts to the public status page, because false positives become public crises. Do auto-create draft incidents on the status page from major alerts, so the on-call engineer edits and publishes when ready. Atlassian Statuspage and Instatus both support draft-then-publish workflows.
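As a rough sketch of that boundary, the code below creates a draft from a major alert and publishes only after a named human approves it. Everything here is hypothetical: StatusPageDrafts is an in-memory stand-in rather than any vendor's client, and the "major" severity label and field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alert:
    name: str
    severity: str  # assumed labels: "major" or "minor"
    summary: str


@dataclass
class DraftIncident:
    title: str
    body: str
    published: bool = False


@dataclass
class StatusPageDrafts:
    """In-memory stand-in for a status page provider (not a real API)."""

    drafts: List[DraftIncident] = field(default_factory=list)

    def create_draft(self, title: str, body: str) -> DraftIncident:
        draft = DraftIncident(title=title, body=body)
        self.drafts.append(draft)
        return draft


def handle_alert(alert: Alert, page: StatusPageDrafts) -> Optional[DraftIncident]:
    """Create a draft for major alerts only; never publish from here."""
    if alert.severity != "major":
        return None
    return page.create_draft(
        title=f"Draft: {alert.name}",
        body=f"[on-call: rewrite in customer language] {alert.summary}",
    )


def publish(draft: DraftIncident, approved_by: str) -> DraftIncident:
    """Publishing is a separate, human-triggered step."""
    if not approved_by.strip():
        raise ValueError("a human approver is required to publish")
    draft.published = True
    return draft
```

Keeping publish as a separate function with a required approver is the design point: the alert path has no way to reach the public page on its own.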

Language

Status page text is customer-facing. “We’re investigating slow checkout” beats “checkout-api p99 latency above 200ms SLO”. Always include impact: who is affected, what they see, and what to do about it (refresh, retry, wait). Update on a clock, because a 10-minute silence reads as panic and even “still investigating” is better than nothing.
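Both habits can be nudged with small helpers. The sketch below is illustrative only: compose_update and update_overdue are hypothetical names, and the 30-minute default interval is an assumption borrowed from the cadence mentioned earlier, so adjust it to your own clock.

```python
from datetime import datetime, timedelta


def compose_update(what: str, who: str, action: str) -> str:
    """Build an update that always carries the three impact fields.

    what:   what is happening, in customer language
    who:    who is affected and what they see
    action: what customers should do (refresh, retry, wait)
    """
    for name, value in (("what", what), ("who", who), ("action", action)):
        if not value.strip():
            raise ValueError(f"missing impact field: {name}")
    return f"{what} {who} {action}"


def update_overdue(
    last_update: datetime,
    now: datetime,
    interval: timedelta = timedelta(minutes=30),  # assumed default cadence
) -> bool:
    """True when the page has been silent for at least `interval`."""
    return now - last_update >= interval
```

For example, compose_update("We’re investigating slow checkout.", "Some customers may see payments take longer than usual.", "Please wait a moment and retry.") yields a single customer-readable paragraph, while an empty field raises instead of letting an incomplete update through.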

Apply this quarter

The application is concrete. Audit your last 5 customer-facing incidents to see whether the status page reflected them in time (within 15 minutes is the bar). Wire your top 3 alert categories into draft creation on the status page, and resist auto-publish. Train on-call on status-page voice now, because the first incident is too late to learn.
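The audit step can be done by hand, but a few lines of code make it repeatable. This is a minimal sketch under assumptions: incidents are supplied as (id, impact start, first status post) tuples that you would export from your own incident tracker, and late_incidents is a hypothetical helper name.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# "Within 15 minutes is the bar", taken from the text above.
BAR = timedelta(minutes=15)

# (incident id, impact start, first status page post or None if never posted)
Incident = Tuple[str, datetime, Optional[datetime]]


def late_incidents(incidents: List[Incident]) -> List[str]:
    """Return ids of incidents whose first post was missing or past the bar."""
    late = []
    for incident_id, impact_start, first_post in incidents:
        if first_post is None or first_post - impact_start > BAR:
            late.append(incident_id)
    return late
```

Feed it the last 5 customer-facing incidents; any id it returns is one where customers waited too long, or never heard anything at all.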