Kubernetes makes it easy to run applications at scale. It also makes it very easy to burn cloud budget at scale. Misconfigured resource requests, over-provisioned node pools, and underutilised reserved capacity are endemic in organisations that adopted Kubernetes quickly without a cost discipline practice to match. Here is how to fix it systematically.
Resource Requests and Limits: Get Them Right
Kubernetes schedules pods based on resource requests, not actual usage. If your requests are set too high, nodes fill up with "scheduled" capacity that is never consumed, and you pay for CPU and memory that sits idle. Run your workloads for at least two weeks, collect Prometheus metrics on actual CPU and memory usage, and compute the p75 and p99 of each. As a starting point, set requests near the p75 value and limits near the p99. Leave extra headroom on memory: a container that exceeds its memory limit is OOM-killed, whereas one that hits its CPU limit is only throttled. Revisit these values quarterly as traffic patterns change.
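The percentile rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming you have already exported usage samples from Prometheus into a list; the function names and the sample values are hypothetical, and the nearest-rank percentile is one of several reasonable definitions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct is an integer 0-100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples, request_pct=75, limit_pct=99):
    """Return (request, limit) derived from observed usage samples."""
    return percentile(samples, request_pct), percentile(samples, limit_pct)


# Hypothetical CPU usage samples in millicores, not real measurements.
cpu_millicores = [120, 135, 150, 160, 180, 200, 210, 240, 400, 650]
cpu_request, cpu_limit = recommend(cpu_millicores)
print(f"cpu request: {cpu_request}m, cpu limit: {cpu_limit}m")
```

With two weeks of real data the sample list would be far longer, and a single outlier would have much less influence on the p99 than it does in this ten-element example.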
Cluster Autoscaler and KEDA
Cluster Autoscaler adds nodes when pods cannot be scheduled and removes nodes that are underutilised. It handles gradual changes in load well but can be slow to react to sudden traffic spikes, because a new node has to be provisioned and join the cluster before pods can start on it. Combine it with KEDA (Kubernetes Event-Driven Autoscaling), which scales pod replicas based on queue depth, HTTP request rate, or custom metrics, and can scale idle workloads down to zero. KEDA matches pod counts to demand and Cluster Autoscaler matches node count to those pods, which keeps your node count tight during off-peak hours and adds capacity when demand surges.
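To make the queue-depth scaling idea concrete, here is a rough Python sketch of the replica calculation. It assumes the average-value rule used by the Horizontal Pod Autoscaler (which KEDA drives): divide the metric by the per-replica target, round up, and clamp to the configured bounds. The function name, parameters, and numbers are hypothetical, and real autoscalers also apply polling intervals, cooldowns, and stabilisation windows that are not modelled here.

```python
import math


def desired_replicas(queue_depth, target_per_replica, min_replicas=0, max_replicas=20):
    """Approximate replica count for a queue-driven workload.

    Computes ceil(metric / target) and clamps the result to
    [min_replicas, max_replicas].
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical queue depths with a target of 10 messages per replica.
for depth in (0, 45, 1000):
    print(depth, desired_replicas(depth, target_per_replica=10))
```

The upper bound matters for cost as much as for stability: it caps how many pods, and therefore how many nodes, a runaway queue can cause the cluster to add.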
The cheapest Kubernetes cluster is not the smallest one — it is the one where every scheduled pod is doing useful work at every moment.
Spot Instances and Preemptible VMs
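For fault-tolerant, interruptible workloads such as batch jobs, CI runners, and machine learning training that checkpoints its progress, Spot instances (AWS) or Preemptible VMs (GCP, which now also offers them as Spot VMs) can reduce compute costs by 60–90% compared with on-demand pricing. Configure dedicated Spot node pools, set appropriate pod disruption budgets, and implement graceful shutdown handlers in your applications: the provider reclaims capacity at short notice (about two minutes on AWS, 30 seconds on GCP), so pods must be able to stop cleanly and resume elsewhere. Use the Spot Instance Advisor (AWS) to choose instance types with low interruption frequency for your region.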
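A graceful shutdown handler can be quite small. The sketch below is one possible shape in Python, assuming the application processes work in discrete items and can hand unfinished items back to a queue; the class and function names are hypothetical. It relies on the fact that Kubernetes sends SIGTERM to a container when its pod is being terminated.

```python
import signal
import threading


class GracefulShutdown:
    """Tracks whether the process has been asked to stop."""

    def __init__(self):
        self._stop = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Signal handlers receive (signum, frame); neither is needed here.
        self._stop.set()

    def install(self):
        # Register for SIGTERM. signal.signal must be called from the main thread.
        signal.signal(signal.SIGTERM, self.request_stop)

    @property
    def stopping(self):
        return self._stop.is_set()


def process_batch(items, shutdown, handle):
    """Handle items one at a time, checking for shutdown between items.

    Returns the items that were not processed so the caller can re-queue them.
    """
    remaining = list(items)
    while remaining and not shutdown.stopping:
        handle(remaining.pop(0))
    return remaining
```

In a real service you would call `install()` once at start-up and make sure the time needed to finish one item fits inside the pod's termination grace period and the provider's interruption notice.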
Visibility with Kubecost
You cannot optimise what you cannot see. Kubecost provides per-namespace, per-deployment, and per-label cost attribution, which shows which team and which service is consuming what budget, provided workloads are labelled consistently. Integrate Kubecost into your engineering team's weekly review process. When engineers can see the cost of their deployment in the same PR that introduces a new service, cost awareness becomes part of the development culture rather than an afterthought.
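The idea behind cost attribution can be illustrated with a deliberately simplified Python sketch. It is not Kubecost's model or API: it splits a single node's hourly price across namespaces in proportion to requested CPU only, and reports unrequested capacity as idle. Real tools account for more than CPU, and every name and figure below is hypothetical.

```python
from collections import defaultdict


def attribute_node_cost(node_hourly_cost, node_cpu_millicores, pods):
    """Split one node's hourly cost across namespaces by requested CPU.

    `pods` is a list of (namespace, cpu_request_millicores) tuples.
    Capacity that no pod has requested is reported under "__idle__".
    """
    costs = defaultdict(float)
    for namespace, cpu_request in pods:
        costs[namespace] += node_hourly_cost * cpu_request / node_cpu_millicores
    # Whatever is left over is paid for but requested by nobody.
    costs["__idle__"] = node_hourly_cost - sum(costs.values())
    return dict(costs)


# Hypothetical node: 4 vCPU at 0.40 per hour, with three pods scheduled on it.
pods = [("checkout", 1000), ("checkout", 500), ("search", 1500)]
for name, cost in attribute_node_cost(0.40, 4000, pods).items():
    print(f"{name}: {cost:.2f} per hour")
```

Reporting idle capacity as its own line is a design choice: it keeps over-provisioning visible instead of spreading its cost silently across teams.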