Alerts Practical | By Samson Tanimawo, PhD | Published Apr 17, 2026 | 4 min read

Anomaly Detection vs Static Thresholds

Static thresholds and anomaly detection are two approaches to alerting. Which one fits depends on the workload pattern.

Where static thresholds win

Static thresholds win on contractual numbers and stable workloads. SLA values like 99.9% availability or 200ms p99 latency are contractual; the threshold is the contract. On stable workloads, where traffic is predictable to within 20%, static thresholds catch real outliers cheaply, and the on-call engineer understands the trigger without reading ML output.
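As a minimal sketch of why a static rule is cheap to evaluate and easy to read, the snippet below encodes the two SLA examples above as fixed limits. The class name, metric names, and the `explain` helper are illustrative assumptions, not part of any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticThreshold:
    """A fixed limit on one metric (hypothetical helper for illustration)."""
    metric: str
    limit: float
    above: bool = True  # True: fire when value > limit; False: fire when value < limit

    def breached(self, value: float) -> bool:
        return value > self.limit if self.above else value < self.limit


# Limits taken from the SLA examples in the text; metric names are assumed.
P99_LATENCY_MS = StaticThreshold("p99_latency_ms", 200.0)
AVAILABILITY_PCT = StaticThreshold("availability_pct", 99.9, above=False)


def explain(rule: StaticThreshold, value: float) -> str:
    """The whole 'why did this fire?' story fits on one line."""
    state = "FIRING" if rule.breached(value) else "ok"
    op = ">" if rule.above else "<"
    return f"{rule.metric}={value} {state} (rule: {op} {rule.limit})"
```

Calling `explain(P99_LATENCY_MS, 250.0)` yields a message containing both the observed value and the limit, which is all the on-call needs to understand the trigger.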

Where anomaly detection wins

Anomaly detection wins on seasonal traffic, per-tenant variation, and cardinality-heavy metrics. E-commerce during holidays, payroll on the 15th, and weekday-vs-weekend patterns all have a shape that static thresholds cannot capture. Per-tenant variation needs per-series baselines, which anomaly detection produces automatically.
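One way to picture per-series, seasonal baselines is the sketch below. It assumes a simple median and median-absolute-deviation (MAD) baseline keyed by series and a seasonal bucket such as hour-of-week; this is one possible technique chosen for illustration, not the method of any specific product. The function names, the bucket scheme, and the default `k` are all assumptions.

```python
from collections import defaultdict
from statistics import median


def build_baselines(samples):
    """Build {(series, bucket): (median, mad)} from (series, bucket, value) tuples.

    `bucket` is any seasonal key, for example hour-of-week 0..167 (assumed).
    """
    grouped = defaultdict(list)
    for series, bucket, value in samples:
        grouped[(series, bucket)].append(value)

    baselines = {}
    for key, values in grouped.items():
        med = median(values)
        mad = median(abs(v - med) for v in values)
        baselines[key] = (med, mad)
    return baselines


def is_anomalous(baselines, series, bucket, value, k=5.0, min_mad=1e-9):
    """Flag a value more than k MADs from its own series-and-bucket median."""
    key = (series, bucket)
    if key not in baselines:
        return False  # no history yet for this series and bucket: stay quiet
    med, mad = baselines[key]
    return abs(value - med) > k * max(mad, min_mad)


# Two tenants with very different normal levels in the same bucket.
history = [("tenant-a", 10, v) for v in (100, 102, 98, 101, 99)]
history += [("tenant-b", 10, v) for v in (1000, 1020, 980, 1010, 990)]
baselines = build_baselines(history)
```

With this history, a reading of 1040 is within range for `tenant-b` but far outside the range for `tenant-a`, which a single static limit could not express.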

The trade-off

Anomaly detection comes with costs: a higher default false-positive rate unless sensitivity is tuned carefully; harder debugging, because “why did this fire?” needs model output, not just a number; and tooling lock-in, because Datadog Watchdog, Prometheus MAD, and GCP MQL are not interchangeable.
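The first two costs can be made concrete with a small sketch. It assumes a median/MAD score (an illustrative choice, not what any of the tools named above necessarily compute) and shows how the alert count on the same data changes with the sensitivity `k`, and what a minimal explanation payload might contain. All names are hypothetical.

```python
from statistics import median


def robust_score(history, value):
    """Return the pieces needed to answer 'why did this fire?'."""
    med = median(history)
    mad = median(abs(v - med) for v in history)
    deviation = abs(value - med)
    if mad == 0:
        # Flat history: any deviation at all is infinitely surprising.
        score = float("inf") if deviation else 0.0
    else:
        score = deviation / mad
    return {"value": value, "median": med, "mad": mad, "score": score}


def count_alerts(history, stream, k):
    """How many points in `stream` would fire at sensitivity k."""
    return sum(1 for v in stream if robust_score(history, v)["score"] > k)


history = [100, 102, 98, 101, 99, 103, 97]  # made-up healthy traffic
stream = [104, 95, 107, 100, 130]           # only 130 is a real outlier

alerts_by_k = {k: count_alerts(history, stream, k) for k in (2, 3, 5)}
```

On this made-up data, a tight `k` fires on ordinary noise while a looser `k` fires only on the real outlier, and the dictionary returned by `robust_score` is the kind of model output an on-call would need to read to debug the alert.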

Hybrid is usually right

Hybrid alerting is usually the right answer. Use static thresholds on contractual SLAs and known dangerous values (disk > 90%, queue > 10k), and anomaly detection on traffic-shape metrics where the normal range varies hourly or seasonally. Static alerts that already work do not need replacing.
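A hybrid evaluator might route each metric as in the sketch below. The two static limits come from the examples above; the metric names, the minimum-history rule, and the median/MAD fallback are assumptions made for illustration.

```python
from statistics import median

# Known dangerous values from the text; metric names are assumed.
STATIC_LIMITS = {"disk_used_pct": 90.0, "queue_depth": 10_000.0}


def evaluate(metric, value, history=(), k=5.0, min_history=5):
    """Return (technique_used, firing) for one observation."""
    if metric in STATIC_LIMITS:
        # Static rule: no history or model needed.
        return ("static", value > STATIC_LIMITS[metric])

    if len(history) < min_history:
        # Too little data to form a baseline: do not alert.
        return ("insufficient_history", False)

    med = median(history)
    mad = median(abs(v - med) for v in history)
    return ("anomaly", abs(value - med) > k * max(mad, 1e-9))
```

For example, `evaluate("disk_used_pct", 93.0)` takes the static path, while `evaluate("requests_per_s", 400, [100, 102, 98, 101, 99])` falls through to the baseline comparison.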

How to pick per metric

The pick is metric-specific. A metric with a known dangerous value (SLA, capacity limit) gets a static threshold; a metric with strong seasonality gets anomaly detection with a seasonality model; a metric with per-tenant variation gets anomaly detection with per-tenant baselines. Match the technique to the metric’s shape.
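The per-metric rules above can be written down as a small lookup. This is a sketch under stated assumptions: the `MetricProfile` fields are hypothetical flags a team would fill in by hand, and falling back to a static threshold when no flag is set is an assumption based on the stable-workload case.

```python
from dataclasses import dataclass
from enum import Enum


class Technique(Enum):
    STATIC = "static threshold"
    SEASONAL = "anomaly detection with a seasonality model"
    PER_TENANT = "anomaly detection with per-tenant baselines"


@dataclass
class MetricProfile:
    """Hand-labelled shape of a metric (hypothetical fields)."""
    name: str
    has_known_dangerous_value: bool = False
    strongly_seasonal: bool = False
    varies_per_tenant: bool = False


def pick(profile: MetricProfile) -> list:
    """Return every technique that matches the metric's shape."""
    chosen = []
    if profile.has_known_dangerous_value:
        chosen.append(Technique.STATIC)
    if profile.strongly_seasonal:
        chosen.append(Technique.SEASONAL)
    if profile.varies_per_tenant:
        chosen.append(Technique.PER_TENANT)
    # Assumed default: a stable metric with no special shape stays static.
    return chosen or [Technique.STATIC]
```

Returning a list rather than a single value lets one metric carry both a static limit and an anomaly detector, which matches the hybrid recommendation.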