Saturation vs Utilisation Alerts
Two types of resource alerts. Pick by what they catch.
Utilisation: lagging indicator
CPU at 80%, memory at 70%, disk at 60%. Numbers describing what has been used.
Useful for trends, capacity planning, dashboard panels. Less useful for paging: high utilisation either shows up after the problem has started or, on a busy but healthy system, signals no problem at all.
Static thresholds rot. The right threshold for one workload is wrong for another. Alerts on utilisation alone produce noise.
Saturation: leading indicator
Queue depth growing, wait time increasing, throttle events firing. Numbers describing pressure.
Fires earlier than utilisation. CPU at 80% might be fine if queue depth is zero; queue depth growing predicts trouble.
Catches issues before they become user-visible failures, which makes saturation metrics the foundation for predictive alerting.
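One way to turn "queue depth growing" into a number is to fit a straight line to recent depth samples and look at the slope. The sketch below is illustrative only: the function names, the 60-second sampling interval, and the `min_slope` cutoff are assumptions, not values from any particular monitoring system.

```python
from typing import Sequence


def queue_depth_slope(depths: Sequence[float], interval_s: float = 60.0) -> float:
    """Least-squares slope of queue depth, in items per second.

    Assumes samples are evenly spaced, interval_s seconds apart (hypothetical).
    """
    n = len(depths)
    if n < 2:
        return 0.0  # not enough data to estimate a trend
    times = [i * interval_s for i in range(n)]
    mean_t = sum(times) / n
    mean_d = sum(depths) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, depths))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var


def is_queue_growing(
    depths: Sequence[float],
    interval_s: float = 60.0,
    min_slope: float = 0.01,  # illustrative cutoff, items per second
) -> bool:
    """True when the fitted trend exceeds min_slope."""
    return queue_depth_slope(depths, interval_s) > min_slope
```

A flat queue at zero gives a slope of zero regardless of how busy the CPU is, while a queue that gains six items a minute gives a positive slope well before utilisation looks alarming.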
Layer them in alerts
Saturation alerts page first, for example when queue depth stays above its threshold for 5 minutes.
Utilisation alerts go to dashboards or business-hours notifications. Trends, not pages.
Together they catch different failure modes. Saturation for incipient overload; utilisation for sustained capacity issues.
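The layering above can be sketched as a small routing function: a sustained saturation condition pages, while high utilisation alone becomes a business-hours notification. Everything named here (`Sample`, `sustained_above`, `route_alert`, the threshold of 100 queued items, the 0.8 utilisation cutoff) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    timestamp_s: float  # seconds, ascending across the sequence
    value: float


def sustained_above(samples: Sequence[Sample], threshold: float, duration_s: float) -> bool:
    """True if the most recent unbroken run of samples above threshold
    spans at least duration_s seconds."""
    if not samples:
        return False
    newest = samples[-1].timestamp_s
    run_start = None
    for s in reversed(samples):
        if s.value <= threshold:
            break  # the run is broken; older samples do not count
        run_start = s.timestamp_s
    return run_start is not None and newest - run_start >= duration_s


def route_alert(
    queue_depth: Sequence[Sample],
    cpu_utilisation: float,
    *,
    depth_threshold: float = 100.0,  # illustrative value
    duration_s: float = 300.0,       # the 5-minute window
    util_threshold: float = 0.8,     # illustrative value
) -> str:
    """Return "page", "notify", or "none"."""
    if sustained_above(queue_depth, depth_threshold, duration_s):
        return "page"    # saturation: wake someone up
    if cpu_utilisation > util_threshold:
        return "notify"  # utilisation: dashboard or business-hours follow-up
    return "none"
```

Checking saturation first means a box at 90% CPU with an empty queue never pages, which is the point of the layering.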
Concrete examples
Database: connection pool wait time (saturation) versus connection count (utilisation).
Network: packet drop rate (saturation) versus share of link bandwidth in use (utilisation).
Disk: I/O queue depth (saturation) versus device busy time or disk space used (utilisation).
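Taking the network pair as an example, both numbers can be derived from the same kind of interface counters, read twice. This is a minimal sketch under stated assumptions: the function names and parameters are hypothetical, the counters are assumed to be monotonically increasing, and counter wrap or reset is not handled.

```python
def drop_rate(
    drops_before: int, drops_after: int,
    packets_before: int, packets_after: int,
) -> float:
    """Saturation signal: fraction of packets dropped between two readings."""
    packets = packets_after - packets_before
    if packets <= 0:
        return 0.0  # no traffic in the interval, nothing to drop
    return (drops_after - drops_before) / packets


def bandwidth_utilisation(
    bytes_before: int, bytes_after: int,
    elapsed_s: float, link_bps: float,
) -> float:
    """Utilisation signal: fraction of link capacity used between two readings."""
    bits_sent = (bytes_after - bytes_before) * 8
    return bits_sent / (elapsed_s * link_bps)
```

A link can sit at modest average utilisation and still drop packets during short bursts, which is why the drop rate is the signal worth paging on.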