A growth-stage SaaS company ran a $4.2M-a-year AWS estate on launch-day instance sizing and almost no committed spend. We took it to $2.8M, a 33 percent reduction, by rightsizing EC2 first and then laddering Savings Plans onto the smaller, true footprint.
A growth-stage SaaS platform ran its product, data pipeline and supporting services almost entirely on Amazon EC2, spending about $4.2M a year. The architecture had scaled faster than the cost discipline behind it. Most instances were sized for the launch-day peak and never revisited, several fleets ran a generation or two behind on hardware, non-production environments ran around the clock, and the company held only a thin layer of Reserved Instances bought ad hoc a year earlier. Finance saw the bill climb every quarter and had no model that explained why.
$4.2M annual AWS spend · EC2 sized for launch-day peaks · older instance generations · non-production running 24/7 · a thin, mistimed layer of Reserved Instances · no unit-cost view.
We ran the engagement in our standard order (See, Cut, Lock, Run) and held off on any commitment until the compute footprint was clean. Buying the rate first, on an oversized fleet, is the single most common mistake we unwind, and avoiding it is what made this saving both large and durable. The full version of this approach lives in our complete guide to AWS cost optimization.
We built a single FOCUS-normalized view from the AWS Cost and Usage Report and fixed tagging so every dollar mapped to a service, an environment and an owner. For the first time the team could see that production compute, not data transfer or storage, was the dominant line, and that roughly a third of it was non-production running outside business hours.
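As a rough illustration of what mapping every dollar to a service, an environment and an owner means in practice, the sketch below rolls up a handful of invented billing rows. The column names loosely follow FOCUS (`ServiceName`, `BilledCost`, `Tags`); the tag keys `environment` and `owner`, the sample rows and the function name are assumptions for illustration, not the engagement's actual schema or data.

```python
from collections import defaultdict

# Invented sample rows. Column names loosely follow FOCUS; values are made up.
ROWS = [
    {"ServiceName": "Amazon EC2", "BilledCost": 120.0,
     "Tags": {"environment": "prod", "owner": "platform"}},
    {"ServiceName": "Amazon EC2", "BilledCost": 60.0,
     "Tags": {"environment": "staging", "owner": "platform"}},
    {"ServiceName": "Amazon S3", "BilledCost": 15.0,
     "Tags": {"environment": "prod", "owner": "data"}},
    {"ServiceName": "Amazon EC2", "BilledCost": 5.0, "Tags": {}},
]

# Hypothetical tag keys that every row must carry to be attributable.
REQUIRED_TAGS = ("environment", "owner")


def attribute_costs(rows):
    """Sum cost per (service, environment, owner) and track untagged spend."""
    totals = defaultdict(float)
    unattributed = 0.0
    for row in rows:
        tags = row.get("Tags") or {}
        if any(not tags.get(key) for key in REQUIRED_TAGS):
            # A row missing any required tag cannot be mapped to an owner.
            unattributed += row["BilledCost"]
            continue
        key = (row["ServiceName"], tags["environment"], tags["owner"])
        totals[key] += row["BilledCost"]
    return dict(totals), unattributed


totals, unattributed = attribute_costs(ROWS)
for key, cost in sorted(totals.items()):
    print(key, cost)
print("unattributed:", unattributed)
```

The unattributed total is the number to drive toward zero: until it is small, any per-team or per-environment figure is only a partial view.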
Using AWS Compute Optimizer recommendations cross-checked against two weeks of real utilization, we right-sized the production fleet against the 95th percentile of demand and moved several workloads to current-generation and Graviton families for better price-performance. In parallel we scheduled development and staging environments to run only during working hours. The detailed method is in our walkthrough on EC2 rightsizing with Compute Optimizer. Together, rightsizing and scheduling delivered roughly half of the total saving before any rate change.
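A minimal sketch of the two calculations described above: sizing to the 95th percentile of observed CPU demand, and the share of the week a scheduled environment actually runs. It is not Compute Optimizer's algorithm. The 70 percent target utilization, the list of vCPU sizes, the 12-hour, 5-day schedule and all function names are assumptions chosen for illustration.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def recommend_vcpus(current_vcpus, cpu_pct_samples,
                    target_util=0.70, sizes=(2, 4, 8, 16, 32, 64)):
    """Pick the smallest size whose capacity covers p95 demand at target_util.

    target_util and sizes are assumed values, not figures from the engagement.
    """
    p95 = percentile_nearest_rank(cpu_pct_samples, 95)
    needed = current_vcpus * (p95 / 100) / target_util
    for size in sizes:
        if size >= needed:
            return size
    return sizes[-1]


def scheduled_fraction(hours_per_day, days_per_week):
    """Share of a 168-hour week that a scheduled environment is running."""
    return (hours_per_day * days_per_week) / (24 * 7)


# A 16-vCPU instance that idles near 10% CPU with one brief spike to 90%.
samples = [10] * 19 + [90]
print(recommend_vcpus(16, samples))
print(round(scheduled_fraction(12, 5), 3))
```

Sizing to the 95th percentile rather than the maximum is what lets a single short spike be ignored. Whether that is acceptable depends on the workload, which is why the recommendations were cross-checked against real utilization rather than applied blindly.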
Only after the fleet was right-sized did we buy commitment, and we laddered it rather than buying one big block. We covered the steady production core with Compute Savings Plans for flexibility across families and regions, layered in some EC2 Instance Savings Plans where the workload was stable, and staggered the terms so they renew in tranches rather than all at once. The reasoning behind that mix is in Savings Plans vs Reserved Instances: which to buy and when. Because we committed against the smaller true footprint, every discounted dollar landed on real usage instead of waste.
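To make the laddering idea concrete, here is a small sketch that splits one hourly commitment into equal tranches with staggered start dates, so renewals arrive one at a time. Savings Plans are purchased as an hourly dollar commitment for a one- or three-year term; everything else here (the $40 per hour total, four tranches, quarterly spacing, and the function names) is a hypothetical example, not the ladder used in this engagement.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months; the day is pinned to the 1st for simplicity."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def build_ladder(total_hourly_commit, tranches, first_start,
                 stagger_months, term_months=12):
    """Split a commitment into equal tranches that start, and renew, in sequence."""
    per_tranche = round(total_hourly_commit / tranches, 2)
    ladder = []
    for i in range(tranches):
        start = add_months(first_start, i * stagger_months)
        ladder.append({
            "start": start,
            "renews": add_months(start, term_months),
            "hourly_commit": per_tranche,
        })
    return ladder


# Hypothetical: $40/hour of commitment bought in four quarterly tranches.
ladder = build_ladder(40.0, 4, date(2025, 1, 1), stagger_months=3)
for rung in ladder:
    print(rung["start"], "->", rung["renews"], rung["hourly_commit"])
```

With staggered rungs, only a quarter of the commitment comes up for review at each renewal, so coverage can be trimmed or extended as the footprint changes instead of being re-decided all at once.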
We set AWS Budgets and Cost Anomaly Detection on each major service, and made tagging a deployment gate so untagged resources could not ship. That kept the recovered spend from drifting back as the platform kept growing.
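One way a tagging gate like this can work is a check in the deployment pipeline that refuses any resource missing a required tag. The sketch below is a plain-Python illustration under assumed names: the tag keys (`service`, `environment`, `owner`), the resource names and the functions are hypothetical and do not represent the client's pipeline or any AWS API.

```python
# Hypothetical required tag keys; a real policy would define its own.
REQUIRED_TAGS = ("service", "environment", "owner")


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required keys that are absent or blank on one resource."""
    return [key for key in required if not str(tags.get(key, "")).strip()]


def check_deployment(resources):
    """Return {resource_name: [missing keys]} for every resource failing the gate."""
    violations = {}
    for name, tags in resources.items():
        missing = missing_tags(tags)
        if missing:
            violations[name] = missing
    return violations


# Invented deployment plan: one compliant resource, one not.
planned = {
    "api-asg": {"service": "api", "environment": "prod", "owner": "platform"},
    "scratch-box": {"service": "etl", "owner": ""},
}
violations = check_deployment(planned)
for name, missing in violations.items():
    print(f"blocked: {name} is missing {', '.join(missing)}")
```

In a pipeline, a non-empty `violations` result would fail the build, which is what turns tagging from a guideline into a gate.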
Annual AWS spend fell from $4.2M to $2.8M, a 33 percent reduction, recovering about $1.4M a year. Roughly half came from EC2 rightsizing and non-production scheduling, and half from laddered Savings Plans applied on the clean baseline. Finance gained a unit-cost view that finally tracked with usage.
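The headline figures can be checked with a few lines of arithmetic. The before and after totals come from the engagement; the even split between the two levers is the approximate "roughly half" described above, not an exact accounting.

```python
# Annual spend in thousands of dollars, as reported in the case study.
before_k = 4200
after_k = 2800

saving_k = before_k - after_k
reduction_pct = saving_k / before_k * 100

# "Roughly half" each: an approximate split, not an exact accounting.
rightsizing_and_scheduling_k = saving_k / 2
savings_plans_k = saving_k - rightsizing_and_scheduling_k

print(f"saving: ${saving_k}K a year ({reduction_pct:.1f}% reduction)")
print(f"rightsizing and scheduling: about ${rightsizing_and_scheduling_k:.0f}K")
print(f"laddered Savings Plans: about ${savings_plans_k:.0f}K")
```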
Sequence was the whole game. Had the SaaS team bought Savings Plans before rightsizing, as it had with the earlier ad hoc Reserved Instances, the commitment would have locked onto oversized, older-generation instances for one to three years, and most of the 33 percent would never have materialized. Rightsizing first and committing second is the order that makes the discount stick to real demand and keeps the unit cost falling as the product grows.
We will read your EC2 utilization, find the waste, model the right Savings Plan ladder, and tell you the number. On the performance model, you pay only from realized savings. No savings, no fee.
Book an AWS cost audit →
Figures reflect a real engagement outcome. Client identity withheld for confidentiality.