Certificate Rotation Automation

Tools covered: cert-manager, ACM, Vault PKI.

Overview

Certificate rotation automation replaces TLS certificates before they expire, with no human in the loop. Manual rotation produces predictable incidents: someone forgets a renewal, and the on-call engineer discovers the expired certificate at 3 AM. Automation removes that dependency on a person remembering. Five primitives cover most operational use: cert-manager for Kubernetes, ACM for AWS-native workloads, automatic renewal ahead of expiry, in-place service reload so running services pick up the new certificate, and an audit trail of each issuance and rotation.
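The "renew before expiry" primitive comes down to a small date calculation. The following is a minimal Python sketch of that decision, not the logic of any particular tool: the function names (renewal_time, should_renew) are hypothetical, and the default of renewing once one third of the lifetime remains is an illustrative assumption.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before, not_after, renew_before=None):
    """Return the moment at which renewal should start.

    renew_before is how long before expiry to renew. When omitted, this
    sketch assumes one third of the certificate lifetime (an illustrative
    default, not a claim about any specific tool).
    """
    lifetime = not_after - not_before
    if lifetime <= timedelta(0):
        raise ValueError("not_after must be later than not_before")
    if renew_before is None:
        renew_before = lifetime / 3
    if renew_before >= lifetime:
        raise ValueError("renew_before must be shorter than the lifetime")
    return not_after - renew_before


def should_renew(not_before, not_after, now, renew_before=None):
    """True once the renewal window has opened."""
    return now >= renewal_time(not_before, not_after, renew_before)


# Example: a 90-day certificate enters its renewal window on day 60.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
day_59 = issued + timedelta(days=59)
day_61 = issued + timedelta(days=61)
print(should_renew(issued, expires, day_59))  # window not yet open
print(should_renew(issued, expires, day_61))  # window open
```

Renewing well ahead of expiry leaves room for retries: if an issuance attempt fails, there are still days (or, for very short-lived certificates, a proportionate share of the lifetime) before the old certificate stops working.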

The approach

The approach has three parts: platform-native automation (cert-manager for Kubernetes, ACM for AWS, Vault PKI for the internal mesh), expiration monitoring as a backstop for when automation fails, and testing rotation in non-production before it runs in production. The discipline is treating certificate hygiene as a property of the platform rather than a per-certificate calendar reminder.
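Expiration monitoring as a backstop can be sketched with the Python standard library alone. This is a rough illustration under stated assumptions: the helper names (fetch_not_after, expiring_soon), the 14-day warning threshold, and the endpoint names in the example are all hypothetical, and a real deployment would more likely feed these values into an existing alerting system.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port and return the served certificate's expiry (UTC).

    The default context verifies the certificate, so an already-expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    A monitor should treat that exception as an alert in its own right.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    expiry_epoch = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(expiry_epoch, tz=timezone.utc)


def expiring_soon(expiries, now, warn_within=timedelta(days=14)):
    """Return (name, time_remaining) pairs inside the warning window.

    expiries maps a name to its expiry datetime. Results are sorted with
    the most urgent first; a negative time_remaining means already expired.
    """
    flagged = [
        (name, not_after - now)
        for name, not_after in expiries.items()
        if not_after - now <= warn_within
    ]
    return sorted(flagged, key=lambda item: item[1])


# Example with hypothetical endpoints and fixed dates (no network needed).
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
expiries = {
    "api.internal": now + timedelta(days=3),
    "web.internal": now + timedelta(days=40),
    "legacy.internal": now - timedelta(days=1),
}
for name, remaining in expiring_soon(expiries, now):
    print(name, remaining)
```

The warning threshold should sit inside the renewal window: if automation normally renews well before that point, any certificate that shows up here means the automation has failed and a human needs to look.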

Why this compounds

Each automated certificate removes a recurring source of incidents. Engineering time is no longer spent on renewal calendars, and short-lived certificates (90 days for Let's Encrypt, hours for Vault PKI) become practical because automation handles the rotation. By year two, "cert expired" is no longer a category of postmortem.