Cluster Naming Convention

Cluster names should be predictable.

Why naming matters

At ten clusters, naming feels like a non-issue. At fifty, ad-hoc names like cluster-prod-2 and main-east become a productivity tax: engineers grep, look things up, and second-guess constantly. A good name encodes context (environment, region, purpose, ownership at a glance), and consistency aids automation: a CI script that targets prod-* matches exactly the prod clusters and nothing else.

The pattern

The pattern is {env}-{region}-{purpose}-{n}. Examples: prod-us-east-1-app-1, staging-eu-west-1-batch-1, dev-shared-1 (the last omits the region segment). The env segment is the environment (prod, staging, qa, dev). The region segment matches the cloud provider’s region label exactly. The purpose segment is one word from a small enumerated set (app, batch, ml, data). The n segment is a numeric suffix that allows capacity expansion.
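A minimal sketch of how the pattern could be checked mechanically. The helper name parse_cluster_name and the ClusterName type are hypothetical, and the region expression assumes AWS-style labels such as us-east-1; another provider would need a different expression. Because the region itself contains hyphens, the name cannot simply be split on "-". Under this strict reading, dev-shared-1 is rejected because it has no region segment.

```python
import re
from typing import NamedTuple, Optional

ENVS = ("prod", "staging", "qa", "dev")
PURPOSES = ("app", "batch", "ml", "data")

# Assumption: AWS-style region labels (e.g. us-east-1, eu-west-1).
REGION = r"[a-z]{2}-[a-z]+-\d+"

NAME_RE = re.compile(
    r"^(?P<env>" + "|".join(ENVS) + r")"
    r"-(?P<region>" + REGION + r")"
    r"-(?P<purpose>" + "|".join(PURPOSES) + r")"
    r"-(?P<n>\d+)$"
)


class ClusterName(NamedTuple):
    env: str
    region: str
    purpose: str
    n: int


def parse_cluster_name(name: str) -> Optional[ClusterName]:
    """Return the parsed segments, or None if the name does not fit the pattern."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    return ClusterName(m["env"], m["region"], m["purpose"], int(m["n"]))
```

Returning None rather than raising keeps the helper usable both as a validator and as a filter over a list of names.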

Beyond the name: tags

Tags carry the metadata the name cannot fit. The tags team, owner, contact, cost-center, and expiry are queryable through cloud APIs. The naming convention plus the tagging convention is the full story: the name is the primary key, and the tags are the metadata. IaC enforces both: the Terraform module rejects launches that lack a conforming name or any required tag, and CI fails the PR if either is missing.
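The required-tag check can be sketched as a small function that a CI step might call. This is an illustration, not the Terraform module itself; the function name missing_tags is hypothetical, and treating an empty or whitespace-only value as missing is an assumption.

```python
from typing import Dict, List

# Tag keys named in this convention.
REQUIRED_TAGS = ("team", "owner", "contact", "cost-center", "expiry")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return required tag keys that are absent or blank, in a stable order."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]
```

A CI job could fail the PR whenever the returned list is non-empty and print the list so the author knows exactly what to add.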

Migration strategy

Existing clusters get renamed at their next replacement. Forcing immediate renames is disruptive, so treat the convention as the standard for new clusters. Document deviations explicitly: a name like cluster-old-prod-2 is allowed with a written exception, so nobody is confused. A quarterly drift report flags clusters that don’t match the convention, and owners either explain or rename.
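One way the drift report could work, as a sketch: sort every cluster name into conforming, covered by a written exception, or drifting. The function name drift_report and the shape of the exceptions mapping (name to written reason) are hypothetical, and the pattern again assumes AWS-style region labels.

```python
import re
from typing import Dict, Iterable, List

# Assumption: AWS-style regions and the enumerated env/purpose sets.
NAME_RE = re.compile(
    r"^(prod|staging|qa|dev)-[a-z]{2}-[a-z]+-\d+-(app|batch|ml|data)-\d+$"
)


def drift_report(
    names: Iterable[str], exceptions: Dict[str, str]
) -> Dict[str, List[str]]:
    """Classify cluster names; only the 'drift' bucket needs owner follow-up."""
    report: Dict[str, List[str]] = {"conforming": [], "excepted": [], "drift": []}
    for name in sorted(names):
        if NAME_RE.match(name):
            report["conforming"].append(name)
        elif name in exceptions:
            report["excepted"].append(name)
        else:
            report["drift"].append(name)
    return report
```

Keeping excepted names in their own bucket keeps the written exceptions visible in every report instead of silently hiding them.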

Scaling considerations

At scale, the convention itself needs structure. Above 100 clusters, even good conventions hit limits, and three things change. First, you add finer-grained purposes (prod-us-east-1-checkout-1 splits from prod-us-east-1-app-1). Second, cluster discovery becomes its own service: an internal tool maps purposes to cluster names. Third, federation matters, because multi-cluster control planes (Karmada, Anthos) use cluster names as identifiers.
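The core of such a discovery tool might be an index from (env, purpose) to cluster names. The sketch below is an assumption about its shape, and build_index is a hypothetical name. It reads the purpose and the number from the right-hand end of the name, so finer-grained purposes such as checkout are indexed without changing the code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_index(names: Iterable[str]) -> Dict[Tuple[str, str], List[str]]:
    """Map (env, purpose) to the sorted cluster names that serve it."""
    index: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for name in names:
        # Split from the right: "<env>-<region>", "<purpose>", "<n>".
        parts = name.rsplit("-", 2)
        if len(parts) != 3 or not parts[2].isdigit():
            continue  # not in the expected shape; left to the drift report
        prefix, purpose, _n = parts
        env = prefix.split("-", 1)[0]
        index[(env, purpose)].append(name)
    return {key: sorted(value) for key, value in index.items()}
```

A lookup such as index[("prod", "checkout")] then answers "which clusters run checkout in prod" without anyone memorizing names.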