When the live state of infrastructure diverges from its declared source of truth: the silent precursor to most surprise outages.
Configuration drift is the divergence over time between what a system's declared configuration says (Terraform state, Helm values, Ansible inventory) and what is actually running. Drift accumulates because of manual edits during incident response, cloud-console clicks, autoscaling decisions, expired secrets that someone rotated by hand, and rollouts that partially completed. Detection tools (terraform plan, drift detection in AWS Config, k8s drift detection) compare live state to declared state and flag the deltas.
Drift is the most common reason a 'rebuild from Git' actually fails when you need it most: the production environment relies on changes that aren't in Git. A culture of detecting drift weekly and either (a) reverting the live system to match Git or (b) committing the change to Git is the operational discipline that makes disaster recovery actually work.
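The comparison of live state to declared state described above, narrowed to Terraform as an added example (not code from this page or from any tool it names): it reads the `resource_drift` section of `terraform show -json` output and lists which attributes changed outside Git, so each one can be reverted or committed. The sample plan and its resource names are constructed.

```python
import json

# Constructed sample in the shape of `terraform show -json <planfile>` output,
# trimmed to the fields this example reads. Resource names are made up.
SAMPLE_PLAN = json.loads("""
{"resource_drift": [
  {"address": "aws_security_group.web", "change": {"actions": ["update"],
    "before": {"name": "web", "tags": {"env": "prod"},
               "ingress": [{"from_port": 443, "cidr_blocks": ["10.0.0.0/8"]}]},
    "after":  {"name": "web", "tags": {"env": "prod", "hotfix": "true"},
               "ingress": [{"from_port": 443, "cidr_blocks": ["0.0.0.0/0"]}]}}},
  {"address": "aws_iam_access_key.deploy", "change": {"actions": ["delete"],
    "before": {"status": "Active"}, "after": null}}
]}
""")


def diff_paths(before, after, prefix=""):
    """Return attribute paths whose values differ between two JSON trees."""
    if isinstance(before, dict) and isinstance(after, dict):
        paths = []
        for key in sorted(set(before) | set(after)):
            child = f"{prefix}.{key}" if prefix else key
            paths += diff_paths(before.get(key), after.get(key), child)
        return paths
    if isinstance(before, list) and isinstance(after, list) and len(before) == len(after):
        paths = []
        for i, (b, a) in enumerate(zip(before, after)):
            paths += diff_paths(b, a, f"{prefix}[{i}]")
        return paths
    # Scalars, type changes, resized lists, or a whole object that vanished.
    return [] if before == after else [prefix or "<resource>"]


def drift_report(plan):
    """Map each drifted resource address to (actions, changed attribute paths)."""
    report = {}
    for entry in plan.get("resource_drift", []):
        change = entry["change"]
        if change["actions"] == ["no-op"]:
            continue
        # In resource_drift, "before" is the last recorded state and "after" is what was found live.
        report[entry["address"]] = (
            change["actions"], diff_paths(change.get("before"), change.get("after")))
    return report


if __name__ == "__main__":
    for address, (actions, paths) in drift_report(SAMPLE_PLAN).items():
        print(f"{address}: {'/'.join(actions)} -> {', '.join(paths)}")
```

A plan file for this can be produced with `terraform plan -refresh-only -out=<planfile>`; a non-empty report is the list of items that need the revert-or-commit decision.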