Monitoring the Monitor: Self-Observability
Your monitoring stack can fail like any other system. Three patterns catch that failure: heartbeats, dead-man's switches, and cross-system probes.
Heartbeat
Each component emits a heartbeat metric at a regular interval. A missing heartbeat means the component is down or can no longer report.
Heartbeats are the floor: without them, an absence of alerts can mean either 'all is well' or 'monitoring is broken.'
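A minimal sketch of the staleness check behind a heartbeat, assuming each component reports a timestamp and a checker compares it against a maximum age. The names `HeartbeatTracker`, `beat`, `stale`, and `max_age_s` are hypothetical and not taken from any particular monitoring product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatTracker:
    """Records the last heartbeat time per component (illustrative helper)."""

    max_age_s: float  # assumed threshold: older heartbeats count as missing
    last_seen: Dict[str, float] = field(default_factory=dict)

    def beat(self, component: str, now: float) -> None:
        """Record a heartbeat from `component` at time `now` (seconds)."""
        self.last_seen[component] = now

    def stale(self, expected: List[str], now: float) -> List[str]:
        """Return expected components whose heartbeat is missing or too old."""
        return [
            name
            for name in expected
            # A component that never reported has an infinitely old heartbeat.
            if now - self.last_seen.get(name, float("-inf")) > self.max_age_s
        ]
```

Checking against an explicit `expected` list matters here: a component that never started would otherwise be invisible rather than flagged.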
Dead-man's switch
Schedule an alert that fires when an 'I am alive' message stops arriving.
This inverts the normal alert: it fires on the absence of a signal rather than its presence, so it catches failures of the alerting system itself.
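The inverted logic can be sketched as a small timer, assuming the 'I am alive' message arrives as a periodic check-in and the switch is evaluated by something independent of the alerting system it watches. `DeadMansSwitch`, `checkin`, `should_fire`, and `timeout_s` are hypothetical names used only for illustration.

```python
class DeadMansSwitch:
    """Fires when check-ins stop arriving (illustrative sketch)."""

    def __init__(self, timeout_s: float, armed_at: float) -> None:
        self.timeout_s = timeout_s  # assumed grace period between check-ins
        # Arming counts as the first check-in, so a sender that never
        # starts still trips the switch after one timeout.
        self.last_checkin = armed_at

    def checkin(self, now: float) -> None:
        """Record an 'I am alive' message received at time `now` (seconds)."""
        self.last_checkin = now

    def should_fire(self, now: float) -> bool:
        """True when no check-in has arrived within the timeout."""
        return now - self.last_checkin > self.timeout_s
```

In this sketch the watched alerting pipeline would call `checkin` on a schedule, while a separate process polls `should_fire` and pages through a channel that does not depend on that pipeline.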
Cross-system probes
Send synthetic test events and verify that they reach the monitoring backend within N seconds.
This is an end-to-end check. It catches partial failures, such as data that lands but is delayed.
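A probe of this kind might look like the following sketch. It assumes two caller-supplied functions, `send` (emit a synthetic event into the pipeline) and `seen` (ask the backend whether that event is visible); both are placeholders for whatever ingestion and query paths a real system exposes, and `run_probe`, `ProbeResult`, `deadline_s`, and `poll_s` are hypothetical names. The clock and sleep functions are injectable so the logic can be exercised without waiting in real time.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    probe_id: str
    arrived: bool                # True only if seen within the deadline
    latency_s: Optional[float]   # None when the probe missed the deadline


def run_probe(
    send: Callable[[str], None],
    seen: Callable[[str], bool],
    deadline_s: float,
    poll_s: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> ProbeResult:
    """Send one synthetic event and poll until it is seen or the deadline passes."""
    probe_id = f"probe-{uuid.uuid4()}"  # unique ID so probes never collide
    start = clock()
    send(probe_id)
    while True:
        found = seen(probe_id)
        elapsed = clock() - start
        if found and elapsed <= deadline_s:
            return ProbeResult(probe_id, True, elapsed)
        if elapsed >= deadline_s:
            # Covers both "never arrived" and "arrived too late".
            return ProbeResult(probe_id, False, None)
        sleep(poll_s)
```

Treating a late arrival as a failure is deliberate: it is what lets the probe surface the delayed-data case instead of reporting success once the event eventually lands.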