By Samson Tanimawo, PhD · Published Aug 25, 2026

Cluster Federation vs Virtual Kubelet

Two ways to run workloads across clusters and clouds: one tries to make N clusters look like 1, the other makes one cluster look like N. Plus the hub-and-spoke Argo pattern that wins in practice.

Why this problem exists

Most companies end up running multiple Kubernetes clusters whether they planned to or not. Different regions for latency. Different clouds for sovereignty or contract reasons. Edge clusters for low-latency endpoints. A central cluster plus a bunch of remote ones for partner integrations. By the time you have five, you have an operations problem.

The naive answer is to manage them all with kubectl: switch contexts, repeat the command. That works for two clusters, breaks at five, fails at twenty. That cross-cluster operations problem is the one federation and virtual kubelet try to solve, with very different approaches.

The simpler framing. Federation tries to give you one logical cluster on top of N physical clusters: deploy once, propagate everywhere. Virtual kubelet does the inverse: it makes one cluster look like it has nodes that aren't real nodes; you deploy a pod, and the "node" is actually another cluster or another runtime entirely. Two opposite directions; both useful in different scenarios.

Cluster Federation (KubeFed v2 and Karmada)

The federation pattern. A control plane (host cluster) that knows about N member clusters. You apply a FederatedDeployment or a PropagationPolicy; the controller fans out the underlying resources to the member clusters according to the policy.

The strengths. One place to define the workload; many places to run it. Policies for placement (run only in EU clusters), weighted distribution (70% in cluster A, 30% in cluster B), failover (move to cluster B if cluster A is unavailable). The control plane gives you a global view.
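As a concrete sketch of both ideas, here is roughly what the fan-out and the weighted placement look like in Karmada. The cluster names, the app name, and the exact weights are placeholders: a PropagationPolicy selects resources that exist on the Karmada control plane and propagates them to member clusters according to the placement rules.

```yaml
# Sketch of a Karmada PropagationPolicy; cluster and app names are illustrative.
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: web-propagation
spec:
  resourceSelectors:              # which control-plane resources to fan out
    - apiVersion: apps/v1
      kind: Deployment
      name: web
  placement:
    clusterAffinity:
      clusterNames:               # placement: only these member clusters
        - eu-west-1
        - eu-central-1
    replicaScheduling:            # weighted distribution of replicas
      replicaSchedulingType: Divided
      replicaDivisionPreference: Weighted
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames: [eu-west-1]
            weight: 7             # roughly 70% of replicas
          - targetCluster:
              clusterNames: [eu-central-1]
            weight: 3             # roughly 30% of replicas
```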

The weak spot. Federation is a heavyweight pattern. The host cluster is a single point of failure for orchestration; member clusters drift if the host loses connection; debugging cross-cluster issues is hard because the actual state is split across clusters. KubeFed v2 has been retired and its project archived; Karmada has taken its place but is still a niche choice.

The reality. Most teams that try federation discover it’s too much machinery for the problem. The use case where it shines is “run the same workload identically across many clusters”, a SaaS provider with one application per region, a CDN-style fleet. For most companies, that’s not the actual problem.

Virtual Kubelet

Virtual Kubelet is a different shape. It's a node agent that registers as a Kubernetes node but doesn't actually run pods on a host. Instead, it forwards pod creation to a different system: a serverless runtime (AWS Fargate, Azure Container Instances), another cluster, an edge device, anything.

The strengths. Your existing cluster gets “burst capacity” that lives elsewhere. Pods scheduled on the virtual node land in Fargate (or wherever); from kubectl’s perspective they’re just pods. Useful for batch workloads, cost optimisation (cheap edge runtime for low-priority work), edge deployments.
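A minimal sketch of what opting a pod onto a virtual node looks like. The exact node label and taint depend on the provider and how the virtual kubelet was deployed; the values below follow the common virtual-kubelet.io convention but are not guaranteed for every installation.

```yaml
# Sketch: a pod opting in to a virtual-kubelet node.
# Label and taint values depend on the provider installation; these are the
# conventional defaults, not guaranteed for every setup.
apiVersion: v1
kind: Pod
metadata:
  name: burst-worker
spec:
  containers:
    - name: worker
      image: busybox
      command: ["sleep", "3600"]
  nodeSelector:
    type: virtual-kubelet               # schedule onto the virtual node only
  tolerations:
    - key: virtual-kubelet.io/provider
      operator: Exists
      effect: NoSchedule                # tolerate the virtual node's taint
```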

The weak spot. The pod abstraction is leaky. Pods running "on" the virtual kubelet inherit the constraints of the underlying runtime: no host networking, limited volume types, different startup latency. Network policies often don't apply. Service discovery may need extra work. The virtual node is convenient but not transparent.

The reality. Virtual kubelet is great for specific use cases: serverless burst capacity, edge offload, batch-to-spot-instances. It's a poor solution for "run my whole stack across regions"; the abstraction breaks down when you need real cluster features on the "other side."

The hub-and-spoke Argo pattern

The pattern that's won in practice for most multi-cluster deployments isn't federation or virtual kubelet; it's GitOps with a central Argo CD or Flux instance.

The shape. One hub cluster runs Argo CD, and the N spoke clusters are registered with it. The git repo has one directory per cluster (or a templating layer that generates them); Argo CD applies the right manifests to the right clusters. Each spoke cluster is independent, with its own control plane, its own etcd, its own everything, but the desired state is centrally defined.
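One way to express "one directory per cluster" in Argo CD is an ApplicationSet with the cluster generator: it stamps out one Application per registered spoke, each pointing at that cluster's directory. The repo URL, path layout, and namespace below are placeholders, a sketch rather than a prescription.

```yaml
# Sketch: one Application per registered spoke cluster, one directory per cluster.
# Repo URL, path layout, and namespace are illustrative.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: app-x
  namespace: argocd
spec:
  generators:
    - clusters: {}                       # one entry per cluster registered with Argo CD
  template:
    metadata:
      name: 'app-x-{{name}}'             # e.g. app-x-eu-west
    spec:
      project: default
      source:
        repoURL: https://github.com/example/fleet-config.git
        targetRevision: main
        path: 'clusters/{{name}}/app-x'  # the per-cluster directory
      destination:
        server: '{{server}}'             # the spoke's API server
        namespace: app-x
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```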

The strengths. Each spoke cluster is independent, so a hub failure doesn't break the spokes. The desired state lives in git, which is auditable and version-controlled. Drift detection per cluster catches manual edits. Adding a cluster is "register with Argo CD; add a directory in the repo."
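Registering a spoke can itself be declarative: Argo CD treats any Secret in its namespace labelled argocd.argoproj.io/secret-type: cluster as a cluster registration (the argocd cluster add command creates the equivalent object for you). A hedged sketch, with placeholder server address and credentials:

```yaml
# Sketch of declarative cluster registration for Argo CD.
# Server URL and credentials are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: cluster-eu-west
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: eu-west
  server: https://eu-west.example.internal:6443
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded-ca-cert>"
      }
    }
```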

The weak spot. There’s no “global view” the way federation gives you. If you want to know “what version of app X is running in all clusters”, you query each cluster (or query Argo CD’s aggregated status). You don’t have a federated kubectl get deployments; you have N independent ones.

The reality. The lack of a global view is a feature, not a bug. Each cluster is its own failure domain; trying to make them act as one introduces coupling that breaks under load. The hub-and-spoke pattern keeps clusters independent at runtime; the only shared thing is the desired-state git repo, which is appropriately decoupled.

When to use which

Use cluster federation (Karmada) if you’re running an identical workload across many clusters and you genuinely want propagation policy as a runtime concept (e.g. “rebalance traffic across clusters based on capacity”). Mostly: SaaS providers with regional clusters running one app.

Use virtual kubelet for specific bursts: serverless capacity (Fargate), edge offload (k3s on a remote box that registers as a node back to the central cluster), batch-to-spot-instance pricing optimisation. Niche, valuable when it fits.

Use hub-and-spoke GitOps for everything else, which is most of the time. Multi-region, multi-cloud, multi-environment, and multi-team clusters all fit this pattern. Independent clusters; central desired state; per-cluster drift detection.

Antipatterns

Trying to federate stateful workloads. A federated database doesn’t make sense; you end up with worse versions of multi-region database problems plus federation complexity. Run stateful workloads per-cluster; let the application handle multi-cluster.

Running federation as a single point of failure. If your federation control plane goes down and your spokes can’t take new deployments, you’ve added a SPOF. Either make the control plane HA, or use the hub-and-spoke pattern where spokes are independent.

Mixing virtual kubelet and federation. The abstractions don’t compose well. Pick one mental model; stick with it.

Treating clusters as cattle. Each cluster has its own version, its own quirks, its own state; they are not interchangeable. Cluster lifecycle is real work; pretending otherwise leads to upgrade pain and incident surprises.

What to do this week

Three moves. (1) Inventory your clusters: how many, what's in each, who owns each. Most multi-cluster teams discover the count is higher than they thought. (2) For each cluster, identify which pattern it fits: hub-and-spoke (most), virtual-kubelet burst (some), federation (rarely). The mismatch between current and ideal is your work backlog. (3) If you don't have GitOps yet, start with one Argo CD instance pointing at one cluster from one repo. The pattern compounds; the marginal cost of adding clusters drops fast once the hub exists.
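A starting point for move (3) might look like the following: a single Application on the hub, deploying to the hub itself, from one directory in one repo. The repo URL, path, and namespace are placeholders; the fleet-wide ApplicationSet sketched earlier is the natural next step once this works.

```yaml
# Sketch: the smallest useful Argo CD setup; one app, one cluster, one repo.
# Repo URL, path, and namespace are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-x
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/fleet-config.git
    targetRevision: main
    path: clusters/hub/app-x
  destination:
    server: https://kubernetes.default.svc   # the hub cluster itself
    namespace: app-x
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```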