How to Use Spot Instances for Kubernetes Workloads

Running Kubernetes workloads on Spot instances trades a small amount of reliability for a large discount. Spot (called Spot VMs on Google Cloud, Spot Instances on AWS, and Spot Virtual Machines on Azure) is spare provider capacity sold cheaply on the understanding that it can be reclaimed. Kubernetes is unusually well suited to it: when a node disappears, controllers such as Deployments recreate the lost pods and the scheduler places them on the remaining capacity.

This how-to is part of our series on Kubernetes and container costs. For the full picture, start with our complete guide to Kubernetes cost optimization.

Step 1: decide which workloads qualify

Spot suits anything that tolerates a node vanishing on short notice: stateless web and API services with multiple replicas, batch and data-processing jobs, CI runners, dev and test environments, and queue consumers that retry. It does not suit single-replica stateful services, workloads with long ungraceful shutdowns, or anything that holds local state it cannot rebuild. The rule of thumb: if losing one pod for a minute is invisible to users, it is a Spot candidate.
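The rule of thumb above can be sketched as a small triage function. This is a minimal illustration, not a definitive policy: the Workload fields, the three labels, and the 30-second default notice window are all assumptions made for the example, so adjust them to your own provider and services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload, for illustration only."""
    name: str
    replicas: int
    holds_local_state: bool        # state it cannot rebuild elsewhere
    retries_on_interruption: bool  # e.g. batch jobs, queue consumers, CI
    shutdown_seconds: int          # time needed for a clean shutdown


def spot_fit(workload: Workload, notice_seconds: int = 30) -> str:
    """Return 'strong', 'weak', or 'avoid' as a rough Spot suitability label.

    notice_seconds is an assumed interruption notice window; check your
    provider's current value before relying on it.
    """
    if workload.holds_local_state:
        return "avoid"
    if workload.shutdown_seconds > notice_seconds:
        # Shutdown would be cut short by the reclamation.
        return "weak"
    if workload.replicas >= 2 or workload.retries_on_interruption:
        return "strong"
    # Single replica with no retry path: no failover headroom.
    return "weak"


if __name__ == "__main__":
    examples = [
        Workload("api", 3, False, False, 10),
        Workload("nightly-batch", 1, False, True, 20),
        Workload("postgres", 1, True, False, 60),
    ]
    for w in examples:
        print(f"{w.name}: {spot_fit(w)}")
```

Treat the output as a starting list for review rather than a verdict; a workload labelled strong still needs the disruption handling described in the later steps.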

Step 2: create a dedicated Spot node pool

Run Spot capacity in its own node pool, separate from the on-demand pool that carries critical workloads. On managed Kubernetes this is a node pool flagged as Spot or preemptible. Keep a smaller on-demand or committed pool for the critical workloads, and let the Spot pool carry the bulk of stateless compute. This split is the foundation of the whole pattern; it also pairs naturally with GKE Autopilot, Spot, and bin packing.
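As one possible illustration on GKE, the sketch below assembles a gcloud command that creates an autoscaled Spot node pool with a label and a taint. The pool name, cluster name, node limits, and the capacity-type key are hypothetical choices for the example, and the script only prints the command rather than running it. Check the flags against the current gcloud reference before use; other providers have their own equivalents.

```python
import shlex


def spot_node_pool_command(cluster: str,
                           pool: str = "spot-pool",
                           key: str = "capacity-type",
                           max_nodes: int = 10) -> list[str]:
    """Build (but do not run) a gcloud command for a Spot node pool.

    The label/taint key and all default values are illustrative assumptions.
    """
    return [
        "gcloud", "container", "node-pools", "create", pool,
        f"--cluster={cluster}",
        "--spot",                               # request Spot VMs for this pool
        f"--node-labels={key}=spot",            # label used later for affinity
        f"--node-taints={key}=spot:NoSchedule", # keep uncleared pods off the pool
        "--enable-autoscaling",
        "--min-nodes=0",
        f"--max-nodes={max_nodes}",
    ]


if __name__ == "__main__":
    # "my-cluster" is a placeholder cluster name.
    print(shlex.join(spot_node_pool_command("my-cluster")))
```

Printing the command first makes it easy to review before anything is created; you would also add a location flag that matches your cluster.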

Step 3: steer pods with taints, tolerations, and affinity

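Taint the Spot nodes so nothing lands there by accident, then add a matching toleration to the workloads you have cleared for Spot. A toleration only permits scheduling onto the tainted nodes; it does not attract pods to them. To direct cleared workloads to the Spot pool, add node affinity (preferred if they may fall back to on-demand, required if they must not) or a nodeSelector, which is always a hard requirement. Workloads without the toleration stay on the on-demand pool. This keeps databases and ingress controllers off Spot while packing batch and stateless services onto it.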
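A minimal sketch of the pod-spec fields involved, built as a Python dictionary and printed as JSON (which kubectl accepts alongside YAML). The capacity-type key and spot value are hypothetical; they must match whatever label and taint your Spot nodes actually carry.

```python
import json

# Hypothetical label/taint key and value; match these to your own node pool.
SPOT_KEY = "capacity-type"
SPOT_VALUE = "spot"


def spot_scheduling(required: bool = False) -> dict:
    """Return pod-spec fields that tolerate the Spot taint and target Spot nodes.

    required=False prefers Spot but allows fallback to other nodes;
    required=True only allows nodes carrying the Spot label.
    """
    toleration = {
        "key": SPOT_KEY,
        "operator": "Equal",
        "value": SPOT_VALUE,
        "effect": "NoSchedule",
    }
    term = {
        "matchExpressions": [
            {"key": SPOT_KEY, "operator": "In", "values": [SPOT_VALUE]}
        ]
    }
    if required:
        node_affinity = {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [term]
            }
        }
    else:
        node_affinity = {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": 100, "preference": term}
            ]
        }
    return {
        "tolerations": [toleration],
        "affinity": {"nodeAffinity": node_affinity},
    }


if __name__ == "__main__":
    # Merge this fragment into spec.template.spec of a Deployment or Job.
    print(json.dumps(spot_scheduling(required=False), indent=2))
```

The preferred form suits services that should keep running on on-demand nodes when Spot is scarce; the required form suits batch work that should simply wait for Spot capacity.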

Step 4: handle disruption gracefully

The provider sends a termination notice before reclaiming a Spot node. The warning window is short, commonly between 30 seconds and two minutes depending on the provider. Make sure workloads handle SIGTERM cleanly, and set terminationGracePeriodSeconds to a value that fits inside that window. Use Pod Disruption Budgets so that a node drain never evicts too many replicas at once; they govern voluntary evictions such as drains, and cannot stop the reclamation itself. Spread replicas across zones and node pools with topology spread constraints, so a single Spot reclamation never takes a whole service down.
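Handling SIGTERM cleanly usually means finishing the current unit of work and then stopping, instead of dying mid-task. The sketch below shows one way a queue consumer might do that in Python. The fetch_job and process_job callables are hypothetical stand-ins for your own queue client and business logic.

```python
import signal
import threading

# Set when the kubelet sends SIGTERM at the start of pod termination.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: record the request and let the loop exit on its own."""
    shutdown_requested.set()


def install_sigterm_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(fetch_job, process_job, poll_interval=1.0):
    """Process jobs until shutdown is requested; return the number completed.

    fetch_job() returns a job or None when the queue is empty.
    A job that has started is always finished before the loop exits, so
    individual jobs should be short enough to fit in the grace period.
    """
    processed = 0
    while not shutdown_requested.is_set():
        job = fetch_job()
        if job is None:
            # Sleep, but wake immediately if shutdown is requested.
            shutdown_requested.wait(poll_interval)
            continue
        process_job(job)
        processed += 1
    return processed


# Typical wiring in a service entry point:
#   install_sigterm_handler()
#   run_worker(fetch_job, process_job)
```

Anything still unprocessed stays in the queue for another replica to pick up, which is why retrying consumers are such a good fit for Spot.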

Want Spot rolled out without the risk?

Our cost audit identifies which of your Kubernetes workloads are safe for Spot, builds the node pools and disruption handling, and measures the savings. On the performance model, you pay only from realized savings. No savings, no fee.

Book a cloud cost audit →

Step 5: blend Spot with on-demand and commitments

The strongest setups run a layered fleet: a committed-use or reserved base for the always-on critical workloads, on-demand for short bursts, and Spot for everything elastic and fault-tolerant. With the right autoscaler configuration, the Spot pool fills first and on-demand acts as the fallback when Spot capacity is unavailable. This blend is where the cluster bill drops the most while reliability holds; for the autoscaling side, see rightsizing node pools and instance types.
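To see how the layers interact, here is a toy model of the blend: the committed base is used first, Spot absorbs the elastic remainder, and on-demand covers whatever Spot cannot supply. It is not how any real autoscaler is implemented, and the rate and discount figures in the example are made-up placeholders, not provider prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetSplit:
    committed: int  # nodes served by the committed/reserved base
    spot: int       # nodes served by Spot
    on_demand: int  # fallback nodes when Spot is unavailable


def split_demand(nodes_needed: int, committed_base: int,
                 spot_available: int) -> FleetSplit:
    """Fill the committed base, then Spot, then fall back to on-demand."""
    if min(nodes_needed, committed_base, spot_available) < 0:
        raise ValueError("node counts must be non-negative")
    committed = min(nodes_needed, committed_base)
    remaining = nodes_needed - committed
    spot = min(remaining, spot_available)
    return FleetSplit(committed, spot, remaining - spot)


def hourly_cost(split: FleetSplit, committed_base: int, on_demand_rate: float,
                committed_discount: float, spot_discount: float) -> float:
    """Hourly cost of the fleet; the committed base is billed even if idle."""
    return on_demand_rate * (
        committed_base * (1 - committed_discount)
        + split.spot * (1 - spot_discount)
        + split.on_demand
    )


if __name__ == "__main__":
    # All numbers below are hypothetical, chosen only to show the arithmetic.
    split = split_demand(nodes_needed=20, committed_base=6, spot_available=10)
    blended = hourly_cost(split, 6, on_demand_rate=1.0,
                          committed_discount=0.30, spot_discount=0.70)
    print(split)
    print(f"blended: {blended:.2f} per hour vs all on-demand: {20 * 1.0:.2f}")
```

Lowering spot_available in the example shows how the bill rises when Spot capacity dries up, which is the scenario the on-demand fallback exists to cover.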

Workload	Spot fit	Why
Stateless web / API (multi-replica)	Strong	Reschedules instantly
Batch and data jobs	Strong	Retries on interruption
CI runners, dev/test	Strong	No user impact
Single-replica stateful services	Weak	No failover headroom
Databases, ingress controllers	Avoid	Reclamation risks outages

Spot product names and behavior above reflect the providers as of May 2026. Verify current interruption notice windows and Spot terms in the provider's documentation before relying on them, as they change.

Go deeper · free guide

The Kubernetes Cost Optimization Handbook includes the Spot node pool patterns and the disruption-handling manifests behind this article. It is the downloadable companion.

The short version

Pick fault-tolerant, multi-replica workloads, run them in a dedicated Spot node pool, steer pods with taints and tolerations, handle the termination signal gracefully, protect replicas with disruption budgets, and blend Spot with a committed base. The discount is large and the risk is manageable when the pattern is right. To allocate the savings back to teams, read Kubernetes cost allocation. When you want Spot rolled out safely across your clusters, that is what our rightsizing and waste elimination service delivers.
