Rightsizing compute means matching each instance to the capacity its workload actually uses, rather than the size someone picked at launch. It is the first move in any serious cost program because it converts immediately, carries low risk when done carefully, and shrinks the baseline before you buy any commitment. This guide is the step-by-step version of the compute rightsizing we run on every engagement.
It sits inside a larger cluster. For the full picture, start with our complete guide to cloud rightsizing and waste elimination, the pillar this article links up to. Compute rightsizing is the Cut step of our See, Cut, Lock, Run method, and it is where the first chunk of our 31 percent average reduction usually comes from.
Always rightsize before buying a reservation or savings plan. A commitment bought on an oversized fleet locks in the waste for one to three years. Clean the baseline first, then buy the rate against the smaller, true footprint.
Step 1: Gather real utilization, not point-in-time guesses
Rightsizing decisions made on a single afternoon's CPU reading are how outages happen. You need the shape of demand over time. Pull at least two weeks of data, and ideally a full month, so weekly cycles and month-end peaks are visible. For each instance, collect CPU, memory, network throughput and, where relevant, disk IO. Memory is the one teams forget, and it is the one most likely to cause trouble if you cut blind, because the default metrics on several clouds do not include it until you install an agent.
Every provider gives you a starting point. On AWS, Compute Optimizer analyzes CloudWatch metrics and recommends instance families and sizes. On Azure, Advisor surfaces VM rightsizing recommendations. On Google Cloud, the Recommender produces machine-type suggestions. On OCI, you read utilization through the Monitoring service and shape sizing. Treat all of these as inputs, not verdicts: they are useful for finding candidates, but they do not know your latency requirements or your peak-day behavior.
Step 2: Read the percentiles, not the average
Averages hide the spikes that matter. An instance averaging 15 percent CPU sounds idle, but if it hits 95 percent every weekday at 9am, downsizing it will cause a daily slowdown. Work from high percentiles instead. We size against the 95th percentile of utilization over the observation window, with headroom on top. The rule of thumb we use: target a size where the workload's p95 utilization lands around 60 to 70 percent of the new instance's capacity. That leaves room for normal variance and short spikes without paying for headroom you never touch.
This is also where you decide what to leave alone. Some workloads are genuinely peaky and latency-sensitive, and the right answer for them is autoscaling rather than a smaller fixed size. We cover that path in autoscaling done right.
Step 3: Choose the target, and consider a newer family
With a utilization profile in hand, pick the target size. Two moves often beat a simple downsize. First, switch to a newer instance generation: newer families usually deliver better price-performance, so a same-size move to a current generation can cut cost with no capacity loss. Second, on AWS and Google Cloud, evaluate Arm-based instances, which offer materially better price-performance for many general-purpose and compute workloads, provided your software runs on Arm.
Match the instance shape to the bottleneck, not just the total. A workload that is memory-bound on a balanced instance should move to a memory-optimized family, which is often both cheaper and faster for that job than simply buying a bigger balanced instance. Getting the shape right is most of the skill in rightsizing.
Want the rightsizing done for you?
Our cloud cost audit reads utilization across your whole estate, ranks every rightsizing opportunity by dollars, and hands you a safe, prioritized plan. On the performance model, you pay only from realized savings. No savings, no fee.
Book a cloud cost audit →Step 4: Resize safely, in increments
The safest resize is reversible and staged. Change one size step at a time rather than jumping several, deploy to a canary or a single instance in the group first, and watch the same metrics you sized against for at least a full business cycle before rolling out wider. Keep the rollback path obvious: note the original size and the change time so reverting is a one-line action if latency moves.
Sequence matters across an estate. Start with the largest dollar opportunities at the lowest risk, usually oversized non-production and stateless services, and leave databases and stateful systems for a dedicated, more careful pass, which we cover in rightsizing databases without hurting performance. Container workloads are sized differently again, at the pod level, in rightsizing Kubernetes requests and limits.
Step 5: Make it a loop, not a one-time event
Workloads drift. Traffic grows, code changes, and a perfectly sized instance becomes wrong again in a quarter. Rightsizing that runs once decays; rightsizing that runs monthly compounds. Build a recurring review into your operating rhythm so new candidates surface automatically. The systematic version of this is in how to build a continuous waste detection process, and the broader question of where to draw the line between cost and performance is in performance vs cost: finding the right balance.
| Signal | What it tells you | Action |
|---|---|---|
| p95 CPU and memory both low | Genuinely oversized | Downsize one step, canary first |
| Low average, high p95 spikes | Peaky workload | Autoscale rather than fix-size down |
| Memory high, CPU low | Wrong shape, not wrong size | Move to memory-optimized family |
| Current-gen, well utilized | Already right | Leave it, recheck next cycle |
Tool names and instance families above reflect provider offerings as of May 2026. Verify current generations and recommendations in each provider's console before resizing, as families and pricing change.
The Cloud Waste Audit Framework includes the utilization queries and the scoring model we use to rank rightsizing opportunities by dollars. It is the downloadable companion to this method.
The short version
Read at least two weeks of utilization, size against the 95th percentile with headroom, prefer a newer or better-shaped family over a blunt downsize, resize in reversible increments with a canary, and run the whole thing on a loop. Do it before you commit, and the savings carry through into every reservation you buy afterward. When you want it run across the whole estate at once, that is exactly what our rightsizing and waste elimination service delivers.