Alert Integration Catalog
An inventory of every channel alerts are sent through.
Why catalog integrations
Most teams send alerts via 5+ channels: Slack, PagerDuty, Opsgenie, email, ticketing, custom webhooks. Without a catalog, no one knows what fires where, which alerts depend on each channel, or what wiring already exists when onboarding a new tool. The catalog is the inventory that makes the alert layer manageable.
- 5+ channels typical. Slack, PagerDuty, Opsgenie, email, ticketing, custom webhooks; without a catalog no one knows what fires where.
- Channel-break dependency. When PagerDuty or Slack has an incident, responders need to know which alerts depend on it.
- Onboarding awareness. When a new tool comes in, the catalog tells you what wiring already exists.
- Per-channel inventory. The catalog is the inventory that makes the alert layer manageable.
What goes in
The catalog stores channel metadata, route mappings, and cost. Channel metadata covers name, owner team, transport type, secrets store reference, last-tested date, last-fire timestamp, and fallback channel. Route mappings record, for each channel, which alert routes target it and in which config block. Cost per channel is derived from vendor fees, event counts, and retention.
- Channel metadata. Name, owner team, transport, secrets reference, last-tested date, last-fire timestamp, fallback.
- Route mappings. Which alert routes target this channel, in which Alertmanager or Datadog config block.
- Cost per channel. Vendor fees, event counts, retention; supports cost optimisation review.
- Per-entry completeness. Entries with missing fields are flagged so the inventory fills in over time.
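One way to picture an entry is as a small record with a completeness check. The sketch below is illustrative only: the class name `ChannelEntry`, the field names, and the `missing_fields` helper are assumptions chosen to mirror the fields listed above, not a prescribed schema.

```python
from dataclasses import dataclass, field, fields
from datetime import date, datetime
from typing import List, Optional


@dataclass
class ChannelEntry:
    """Hypothetical catalog entry; field names are illustrative."""
    name: str
    owner_team: Optional[str] = None
    transport: Optional[str] = None        # e.g. "slack", "pagerduty", "webhook"
    secrets_ref: Optional[str] = None      # reference into the secrets store, never the secret
    last_tested: Optional[date] = None
    last_fire: Optional[datetime] = None
    fallback: Optional[str] = None         # name of the fallback channel
    routes: List[str] = field(default_factory=list)  # config blocks targeting this channel
    monthly_cost_usd: Optional[float] = None


def missing_fields(entry: ChannelEntry) -> List[str]:
    """Return the names of fields that are still unset or empty."""
    missing = []
    for f in fields(entry):
        value = getattr(entry, f.name)
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing


entry = ChannelEntry(name="pagerduty-primary", owner_team="platform", transport="pagerduty")
print(missing_fields(entry))
```

A nightly job could run `missing_fields` over every entry and report the incomplete ones, rather than rejecting them outright.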
How to keep it current
Drift kills catalog usefulness. Generate the catalog nightly from Alertmanager config and the Datadog monitor APIs so it matches live config. Fire a synthetic test alert through each channel once a week and flag any channel that has not delivered one in 7 days. Pull the owner team from Backstage or another service catalog rather than duplicating ownership data.
- Nightly generation. Generated from Alertmanager config and Datadog monitor APIs; drift between catalog and live is a defect.
- Weekly synthetic test fires. Per channel; a channel that didn’t deliver a test fire in 7 days is flagged.
- Service catalog ownership. Owner team pulled from Backstage; don’t duplicate ownership data.
- Per-drift alert. Drift between catalog and live config alerts the platform team, so stale entries get fixed instead of lingering.
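The drift check and the 7-day test-fire check can both be expressed as small comparisons. The following is a minimal sketch, assuming the catalog and the live config have already been loaded into dictionaries that map a channel name to its route names; the function names and data shapes are hypothetical, and the step that fetches data from Alertmanager or Datadog is deliberately left out.

```python
from datetime import datetime, timedelta


def find_drift(catalog, live):
    """Compare channel -> routes mappings and report per-channel differences."""
    drift = {}
    for channel in sorted(set(catalog) | set(live)):
        cat_routes = set(catalog.get(channel, ()))
        live_routes = set(live.get(channel, ()))
        if cat_routes != live_routes:
            drift[channel] = {
                "missing_from_catalog": sorted(live_routes - cat_routes),
                "stale_in_catalog": sorted(cat_routes - live_routes),
            }
    return drift


def stale_channels(last_delivered, now, max_age=timedelta(days=7)):
    """Return channels whose last delivered test fire is missing or older than max_age."""
    return sorted(
        name
        for name, ts in last_delivered.items()
        if ts is None or now - ts > max_age
    )


catalog = {"slack-ops": ["route-a"], "pagerduty": ["route-b"]}
live = {"slack-ops": ["route-a", "route-c"], "email": ["route-d"]}
print(find_drift(catalog, live))
```

Any non-empty result from `find_drift` is what would page or notify the platform team; `stale_channels` feeds the weekly flag.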
What it unlocks
The catalog unlocks three concrete benefits. Vendor migrations need a list of every config block that references the old provider; compliance audits ask which alerts notify which channels for which severities; incident reviews start with the question of whether the channel was up at all.
- Vendor migrations. Switching from Opsgenie to PagerDuty needs a list of every config block that references the old provider.
- Compliance audits. SOC2 controls ask which alerts notify which channels for which severities.
- Incident reviews. First question after a missed page is whether the channel was up.
- Per-use-case query. The catalog supports each use case via simple queries rather than ad-hoc digging.
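As an illustration of what such simple queries might look like, the sketch below runs two of them over an in-memory catalog. The data shape, the `config_block` labels, and the per-route `severity` field are assumptions made for the example; in particular, it assumes the transport field holds the provider type (for example "opsgenie").

```python
# Hypothetical catalog contents; names and config block labels are made up.
EXAMPLE_CATALOG = [
    {
        "name": "oncall-opsgenie",
        "transport": "opsgenie",
        "routes": [
            {"config_block": "alertmanager.yml:route/db", "severity": "critical"},
            {"config_block": "alertmanager.yml:route/api", "severity": "warning"},
        ],
    },
    {
        "name": "ops-slack",
        "transport": "slack",
        "routes": [
            {"config_block": "alertmanager.yml:route/api", "severity": "warning"},
        ],
    },
]


def blocks_referencing(catalog, transport):
    """Vendor migration: list (channel, config block) pairs for one provider."""
    return sorted(
        (entry["name"], route["config_block"])
        for entry in catalog
        if entry["transport"] == transport
        for route in entry["routes"]
    )


def channels_by_severity(catalog):
    """Compliance audit: map each severity to the channels it notifies."""
    result = {}
    for entry in catalog:
        for route in entry["routes"]:
            result.setdefault(route["severity"], set()).add(entry["name"])
    return {sev: sorted(names) for sev, names in result.items()}


print(blocks_referencing(EXAMPLE_CATALOG, "opsgenie"))
print(channels_by_severity(EXAMPLE_CATALOG))
```

The incident-review question (was the channel up?) maps onto the last-tested and last-fire fields rather than onto a query like these.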
Build it small first
Start small. A 50-line YAML file is fine; skip the SaaS catalog tool until you outgrow YAML. Make ownership the one mandatory field and leave out channels nobody owns. Plan a quarterly cleanup: channels with zero fires in 90 days are removal candidates.
- 50-line YAML first. Skip the SaaS catalog tool until you outgrow YAML; the simple form works.
- Ownership mandatory. Skip channels nobody owns; the catalog only tracks owned channels.
- Quarterly deletion. Channels with zero fires in 90 days are removal candidates, so the catalog is curated rather than left to accumulate.
- Per-quarter cleanup. Regular cleanup keeps the catalog small enough to use.
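The ownership rule and the 90-day rule are both easy to check mechanically. This is a minimal sketch, assuming entries are dictionaries with `name`, `owner_team`, and `last_fire` keys (hypothetical names), and assuming `last_fire` records real alert deliveries rather than synthetic test fires.

```python
from datetime import datetime, timedelta


def unowned(entries):
    """Channels with no owner team; under the ownership rule these do not belong in the catalog."""
    return sorted(e["name"] for e in entries if not e.get("owner_team"))


def removal_candidates(entries, now, window=timedelta(days=90)):
    """Channels that never fired, or last fired more than `window` ago."""
    return sorted(
        e["name"]
        for e in entries
        if e.get("last_fire") is None or now - e["last_fire"] > window
    )


entries = [
    {"name": "ops-slack", "owner_team": "sre", "last_fire": datetime(2024, 5, 20)},
    {"name": "legacy-email", "owner_team": "sre", "last_fire": datetime(2023, 11, 1)},
    {"name": "old-webhook", "owner_team": None, "last_fire": None},
]
now = datetime(2024, 6, 1)
print(unowned(entries))
print(removal_candidates(entries, now))
```

The output is a list of candidates for review, not an automatic delete; a human still decides each quarter which channels actually go.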