DaemonSet: When You Need One

A DaemonSet runs one copy of a pod on every eligible node. This note covers when that is the right primitive, when it is not, and how to design one.

Yes

DaemonSets fit infrastructure agents that need a per-node footprint. Logging, monitoring, networking, and security agents all qualify because they operate on host-level data the workload pods cannot see.

Per-node logging agent. Fluentd, Vector, or Filebeat tails container logs from the node’s filesystem. One agent per node, no exceptions.
Per-node monitoring agent. node-exporter or the Datadog agent collects host-level metrics (CPU, memory, disk, network). Ordinary workload pods do not have access to this layer.
Per-node CNI. Calico, Cilium, or AWS VPC CNI. Pod networking is set up on each node, so the networking daemon has to run on each node.
Per-node security agent. Runtime-security daemons such as Falco or Sysdig watch activity on the host. Coverage has to be per node, which is exactly what a DaemonSet provides.
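
The logging case above can be sketched as a manifest. This is a minimal illustration, not a production configuration: the name log-agent, the namespace logging, and the image example.com/log-agent:1.0 are hypothetical placeholders, and mounting /var/log from the host is an assumption about where the node keeps container logs. The manifest is built as a Python dict and printed as JSON, which kubectl accepts as well as YAML.

```python
import json


def log_agent_daemonset(name="log-agent",
                        namespace="logging",
                        image="example.com/log-agent:1.0"):
    """Return a minimal apps/v1 DaemonSet manifest as a plain dict.

    All default values are hypothetical placeholders.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # No "replicas" field: the pod count follows the node count.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Read-only view of the host log directory (assumed path).
                        "volumeMounts": [{
                            "name": "varlog",
                            "mountPath": "/var/log",
                            "readOnly": True,
                        }],
                    }],
                    "volumes": [{
                        "name": "varlog",
                        "hostPath": {"path": "/var/log"},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # Pipe the output to `kubectl apply -f -` to try it in a test cluster.
    print(json.dumps(log_agent_daemonset(), indent=2))
```

The point to notice is what is missing: there is no replica count to set. The hostPath volume is what gives the agent the host-level view that an ordinary workload pod lacks.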

No

DaemonSets are wrong for application workloads. They scale by node count, not by load, which is the opposite of what application workloads need.

Application workloads. Apps need a replica count driven by load. A DaemonSet's pod count follows the node count, so it cannot follow traffic.
Wastes capacity. A DaemonSet pod on every node consumes resources whether or not that node has any work for it. That is the cost of picking the wrong primitive.
Use Deployment plus HPA. Application workloads belong on a Deployment with a Horizontal Pod Autoscaler. Load-driven scaling matches the access pattern.
Documented anti-pattern. Write “no DaemonSet for apps” into the cluster guidelines, so new engineers see the rule before they make the mistake.
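
To make the scaling difference concrete, here is a small sketch that compares the two rules and builds an HPA manifest for a Deployment. The replica formula is the basic ratio the Horizontal Pod Autoscaler documents (desired = ceil(current replicas x current metric / target metric)); the sketch deliberately ignores the autoscaler's tolerance band and stabilization window. The function names, the Deployment name web, and the 70% CPU target are assumptions made for illustration.

```python
import json
import math


def daemonset_pod_count(eligible_nodes):
    """A DaemonSet runs one pod per eligible node, regardless of load."""
    return eligible_nodes


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         min_replicas, max_replicas):
    """Basic HPA ratio, clamped to the configured bounds.

    Simplified: no tolerance band, no stabilization window.
    """
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))


def cpu_hpa(deployment_name, min_replicas=2, max_replicas=20,
            target_cpu_percent=70):
    """Return an autoscaling/v2 HPA manifest targeting a Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": deployment_name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment_name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_percent,
                    },
                },
            }],
        },
    }


if __name__ == "__main__":
    # 50 nodes, quiet traffic: the DaemonSet still runs 50 pods.
    print("DaemonSet pods:", daemonset_pod_count(50))
    # 4 replicas at 15% CPU against a 60% target: clamped to the minimum of 2.
    print("HPA, quiet:", hpa_desired_replicas(4, 15, 60, 2, 20))
    # 4 replicas at 90% CPU against a 60% target: scales up to 6.
    print("HPA, busy:", hpa_desired_replicas(4, 90, 60, 2, 20))
    print(json.dumps(cpu_hpa("web"), indent=2))
```

With 50 nodes and quiet traffic, the DaemonSet rule yields 50 pods while the load-driven rule settles at the configured minimum. That gap is the wasted capacity described above.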

Design

DaemonSet design has its own discipline. The daemon must work on every node in the cluster, including the constrained ones (control plane, GPU, ARM, small instance types).

Tolerations. The toleration list must cover every class of tainted node (control plane, GPU, dedicated). Without a matching toleration the daemon is not scheduled there, and those are often the nodes that need it most.
Resource requests. Size requests to fit the smallest node in the cluster. The daemon must schedule everywhere; oversized requests leave its pods unschedulable on small nodes.
Priority class. Infrastructure agents use system-cluster-critical, or the higher built-in class system-node-critical for daemons a node cannot function without. Under node pressure, lower-priority pods are evicted first.
Rolling update strategy. Choose OnDelete or RollingUpdate according to how sensitive the daemon is to restarts. CNI and security agents need a careful rollout.
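
The four design points map onto four fields of the manifest. The sketch below shows where each one lives; it is an illustration under stated assumptions, not a recommended configuration. The name node-agent, the image, and the request sizes are hypothetical. The control-plane taint key is the standard one, while the nvidia.com/gpu and dedicated taint keys are assumptions about how a given cluster taints its nodes. The fits_every_node helper is a hypothetical pre-check that compares requests only against each node's allocatable capacity and ignores what other pods already use.

```python
import json


def agent_daemonset(name="node-agent",
                    image="example.com/node-agent:1.0",
                    cpu_request="50m",
                    memory_request="64Mi",
                    priority_class="system-cluster-critical",
                    update_type="RollingUpdate",
                    max_unavailable=1):
    """Return a DaemonSet manifest carrying the four design fields."""
    if update_type not in ("RollingUpdate", "OnDelete"):
        raise ValueError("update_type must be RollingUpdate or OnDelete")

    # Rolling update strategy: OnDelete takes no extra parameters.
    update_strategy = {"type": update_type}
    if update_type == "RollingUpdate":
        update_strategy["rollingUpdate"] = {"maxUnavailable": max_unavailable}

    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": "kube-system"},
        "spec": {
            "selector": {"matchLabels": labels},
            "updateStrategy": update_strategy,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Priority class: keeps the agent ahead of ordinary pods.
                    "priorityClassName": priority_class,
                    # Tolerations: one entry per tainted node class.
                    "tolerations": [
                        {"key": "node-role.kubernetes.io/control-plane",
                         "operator": "Exists", "effect": "NoSchedule"},
                        # Assumed taint keys; match them to your cluster.
                        {"key": "nvidia.com/gpu",
                         "operator": "Exists", "effect": "NoSchedule"},
                        {"key": "dedicated",
                         "operator": "Exists", "effect": "NoSchedule"},
                    ],
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Resource requests: sized for the smallest node.
                        "resources": {
                            "requests": {"cpu": cpu_request,
                                         "memory": memory_request},
                        },
                    }],
                },
            },
        },
    }


def fits_every_node(cpu_millicores, memory_mib, allocatable):
    """True if the request fits each (cpu_millicores, memory_mib) node.

    Simplified: ignores capacity already claimed by other pods.
    """
    return all(cpu_millicores <= cpu and memory_mib <= mem
               for cpu, mem in allocatable)


if __name__ == "__main__":
    print(json.dumps(agent_daemonset(), indent=2))
    # Hypothetical allocatable capacity: one small node, one large node.
    print(fits_every_node(50, 64, [(940, 1500), (3920, 15000)]))
```

Passing update_type="OnDelete" drops the rollingUpdate block, so pods are replaced only when an operator deletes them. That suits daemons where an automated restart is risky.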