How Big Should Your Anthropic Committed Spend Be

Almost every meaningful Anthropic API deal turns on a single number: the committed spend you agree to over the term. That number sets the rate you pay per token, because larger commitments move you into better bands. It also sets your downside, because on most agreements the commitment you do not use simply disappears at the end of the period. Get the number right and you capture a real discount with no waste. Get it wrong and you either overpay through an inflated commit or you leave the discount on the table by committing too little. This is how we size it on the buyer side.

Why the commit number carries so much weight

Anthropic prices the API on consumption, per token, by model. Published rates exist, but enterprise buyers do not pay published rates. Instead you commit to a level of spend over a term, usually a year, and in exchange you receive a discount that grows as the commitment grows. The commit bands are the rungs on that ladder. As you move from a smaller annual commitment toward the larger ones, the negotiated rate improves. So the instinct to commit big is not irrational. The problem is the other side of the contract: unused commitment treatment. On a standard agreement, if you commit to a number and consume less, the shortfall is not refunded and it does not roll forward. You paid for capacity you never touched. The commit number therefore sits between two opposing forces, and the right answer balances them rather than maximizing either one.

Start with consumption, not with the discount

The most common mistake is to start from the discount you want and back into a commit that earns it. That gets the logic backward. Start from honest consumption. Pull your actual usage if you have any history, broken down by model, by workload, and by month. If you are pre-launch, build a bottom-up model: estimate the number of calls each feature will make, the average input and output tokens per call, and the model each call will run on. Multiply through to a monthly token figure, then to a monthly dollar figure at the rates you expect. Only once you have a defensible monthly consumption figure should you start thinking about which band it places you in.
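The bottom-up arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: the tier names, the per-million-token rates in ASSUMED_RATES, and the example features are hypothetical placeholders, so substitute the rates you expect to pay and the features in your own product.

```python
from dataclasses import dataclass

# Hypothetical dollars per million tokens. These are placeholders,
# NOT published or negotiated rates; replace them with your own.
ASSUMED_RATES = {
    "large": {"input": 15.0, "output": 75.0},
    "medium": {"input": 3.0, "output": 15.0},
    "small": {"input": 1.0, "output": 5.0},
}


@dataclass
class Feature:
    name: str
    model: str              # key into the rate table
    calls_per_month: int
    avg_input_tokens: int
    avg_output_tokens: int


def monthly_cost(feature, rates=ASSUMED_RATES):
    """Monthly dollar cost of one feature: calls x tokens per call x rate."""
    rate = rates[feature.model]
    input_millions = feature.calls_per_month * feature.avg_input_tokens / 1_000_000
    output_millions = feature.calls_per_month * feature.avg_output_tokens / 1_000_000
    return input_millions * rate["input"] + output_millions * rate["output"]


def monthly_total(features, rates=ASSUMED_RATES):
    """Sum the monthly cost across every feature in the model."""
    return sum(monthly_cost(f, rates) for f in features)


# Illustrative features only; the volumes are made up.
features = [
    Feature("support_chat", "medium", 1_000_000, 2_000, 500),
    Feature("ticket_tagging", "small", 4_000_000, 500, 50),
]
print(f"${monthly_total(features):,.0f} per month")
```

Keeping each feature as its own row makes it easy to see which workloads drive the total and to rerun the figure when a call volume or model assignment changes.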

Separate the floor from the forecast

Your usage has two important numbers, and buyers often conflate them. The first is your floor, the level of consumption you are confident you will reach no matter what. The second is your forecast, the level you expect to reach if things go to plan. The commit should be anchored much closer to the floor than to the optimistic forecast, because the floor is the spend you are certain to incur and therefore certain to consume. Committing at the forecast assumes everything goes right, and product timelines rarely do. If your floor is one number and your stretch case is double that, commit near the floor and negotiate the right to grow into the higher band as your usage proves out, rather than committing to the stretch case on day one.

Model the spend after optimization, not before

Here is the step that most internal teams skip, and it is the one that changes the number the most. Do not size your commit against your current, unoptimized token spend. Size it against what your spend will be once the obvious savings are in place. Routing across Opus, Sonnet, and Haiku so that each task runs on the cheapest model that handles it well can cut aggregate spend dramatically. Prompt caching on stable context can take up to ninety percent off the input cost of the cached portion. Moving asynchronous jobs to batch takes fifty percent off those jobs. Together these levers commonly cut aggregate spend by forty to seventy percent versus running everything on Opus with no caching. If you commit against your pre-optimization spend, you will commit to a number you can no longer reach once you optimize, and the gap becomes pure waste. Optimize first, or at least model the optimized state, then size the commit to that lower, truer number.
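One rough way to model the optimized state is to apply the three levers in sequence. The sketch below assumes the levers combine multiplicatively, that the caching figure is the upper bound quoted above, and that you can estimate the share of spend each lever touches. The function name, parameters, and sample inputs are all illustrative assumptions, not measured results.

```python
def optimized_spend(baseline, routed_cost_ratio, cacheable_share, batch_share,
                    cache_discount=0.90, batch_discount=0.50):
    """Estimate monthly spend after routing, caching, and batch are applied.

    baseline          -- current unoptimized monthly spend in dollars
    routed_cost_ratio -- assumed spend after model routing, as a fraction of baseline
    cacheable_share   -- assumed share of post-routing spend that is cacheable input
    batch_share       -- assumed share of the remaining spend that can move to batch
    cache_discount    -- upper-bound saving on the cached portion (assumption)
    batch_discount    -- saving on jobs moved to batch
    """
    after_routing = baseline * routed_cost_ratio
    after_caching = after_routing * (1 - cacheable_share * cache_discount)
    return after_caching * (1 - batch_share * batch_discount)


# Hypothetical inputs: substitute estimates from your own workloads.
baseline = 300_000
optimized = optimized_spend(
    baseline, routed_cost_ratio=0.6, cacheable_share=0.3, batch_share=0.2
)
reduction = 1 - optimized / baseline
print(f"optimized: ${optimized:,.0f}, reduction: {reduction:.0%}")
```

With these made-up inputs the reduction lands inside the forty to seventy percent range described above, but the point of the exercise is the sensitivity: change one share and see how far the commit target moves.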

Build in the right buffer, in the right direction

Buyers ask how much buffer to add. The answer is that the buffer should protect you against the expensive failure, not the cheap one. The expensive failure is overcommitting and losing unused spend. The cheap failure is undercommitting and paying overage on the excess. As long as you have negotiated the overage to be charged at your committed rate rather than at a punitive list price, undercommitting costs you almost nothing: you simply pay the same rate on the tokens above your commit. That asymmetry should shape the buffer. Commit conservatively, protect the overage rate, and let growth flow over the top at the rate you already negotiated. A buyer who does this captures the band discount on the spend they are sure of, and pays a fair rate on the upside, with no dead capacity.
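The asymmetry is easy to see with numbers. The sketch below values both the commit and the usage in dollars at the committed rate, assumes unused commitment is forfeited, and treats the overage rate as a multiple of the committed rate. The function names and the 1.25 multiplier are hypothetical, chosen only to illustrate a less favorable overage term, and the model ignores any difference in band discount between the two commit levels.

```python
def period_cost(commit, usage, overage_multiplier=1.0):
    """Dollars paid for one period.

    commit and usage are both valued at the committed rate. Overage is
    billed at overage_multiplier times that rate (1.0 = committed rate).
    """
    overage = max(0.0, usage - commit)
    return commit + overage * overage_multiplier


def wasted(commit, usage):
    """Commitment paid for but never consumed (assumes no roll-forward)."""
    return max(0.0, commit - usage)


usage = 100_000  # hypothetical actual consumption for the period

# Undercommit with overage at the committed rate: no penalty at all.
print(period_cost(80_000, usage), wasted(80_000, usage))
# Undercommit with overage at 1.25x: a modest premium on the excess only.
print(period_cost(80_000, usage, overage_multiplier=1.25))
# Overcommit: the whole shortfall is paid for and lost.
print(period_cost(150_000, usage), wasted(150_000, usage))
```

Under these assumptions, the undercommit costs nothing extra when the overage rate is protected, while the overcommit loses the full shortfall.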

Use a ramp when your usage is still climbing

If your usage is growing through the year, a flat annual commit forces you to average a rising curve into one number, which means you overcommit early and undercommit late. A ramped commit fixes this. You agree to a lower commitment in the early periods and a higher one later, matching the curve of your adoption. This lets you capture the volume discount tied to your end state without paying for that level of consumption before you reach it. Anthropic account teams can structure ramps, and asking for one is normal rather than aggressive. The key is to tie each step of the ramp to a realistic date and to keep the early steps near your floor.
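To compare a flat commit with a ramp, it helps to lay the commitment next to the projected usage period by period. The sketch below assumes each period is measured on its own with no roll-forward. The quarterly figures are hypothetical and in thousands of dollars.

```python
def unused_by_period(commits, usage):
    """Unused commitment in each period, assuming no roll-forward."""
    if len(commits) != len(usage):
        raise ValueError("commits and usage must cover the same periods")
    return [max(0, c - u) for c, u in zip(commits, usage)]


# Hypothetical quarterly figures, in thousands of dollars.
projected_usage = [200, 300, 400, 500]

# Flat: the rising curve averaged into one number per quarter.
flat = [sum(projected_usage) // len(projected_usage)] * len(projected_usage)
# Ramp: steps that follow the curve and sit slightly under it.
ramp = [180, 270, 360, 450]

print("flat unused:", unused_by_period(flat, projected_usage))
print("ramp unused:", unused_by_period(ramp, projected_usage))
```

In this made-up case the flat commit strands spend in the first two quarters, while the ramp strands none and leaves the excess in every quarter to flow over as overage.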

The questions that set the right number

What is our consumption floor, the spend we will reach no matter what happens to the roadmap?
What will that spend look like after routing, caching, and batch are in place, not before?
What is the overage rate above the commit, and is it the committed rate or a higher list price?
How is unused commitment treated at period end, and does any of it roll forward?
Is our usage flat or climbing, and would a ramped commit fit the curve better than a flat one?

A simple worked example

Suppose your honest, optimized floor is around one hundred thousand dollars a month, and your stretch forecast is double that if a new product takes off mid-year. The wrong move is to commit to the stretch case across twelve months, because if the product slips you lose the difference. The right move is to commit near the floor, secure the band that level earns, protect the overage rate so the upside is charged fairly, and structure a ramp so the commitment steps up only when the new product actually ships. You capture the discount on the spend you are sure of, you pay a fair rate on the growth, and you carry no dead commitment. That is the whole game.
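The same example can be run as a small scenario table. The sketch assumes annual measurement, forfeited unused commitment, and overage billed at the committed rate, and it ignores any difference in band discount between the two commit levels. The two scenarios, the product slipping all year or shipping at month seven, are illustrative.

```python
FLOOR = 100_000    # optimized monthly floor from the example
STRETCH = 200_000  # monthly stretch case from the example


def annual_outcome(monthly_commits, monthly_usage):
    """Return (total paid, dead commitment) for a year measured annually.

    Assumes overage is billed at the committed rate and unused
    commitment is forfeited at year end.
    """
    commit = sum(monthly_commits)
    usage = sum(monthly_usage)
    return max(commit, usage), max(0, commit - usage)


scenarios = {
    "product slips": [FLOOR] * 12,
    "ships at month 7": [FLOOR] * 6 + [STRETCH] * 6,
}
options = {
    "commit to stretch": [STRETCH] * 12,
    "commit near floor": [FLOOR] * 12,
}

for option, commits in options.items():
    for scenario, usage in scenarios.items():
        paid, dead = annual_outcome(commits, usage)
        print(f"{option:18} | {scenario:16} | paid ${paid:>9,} | dead ${dead:>9,}")
```

Under these assumptions the stretch commit carries dead commitment in both scenarios, while the floor commit carries none in either and simply pays for the growth when it arrives.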

What good looks like

A well-sized commit has three properties. It sits close to the consumption you are confident you will reach, measured after optimization. It is paired with an overage rate equal to your committed rate, so growth is never penalized. And it reflects the shape of your usage over the term, through a ramp if your usage is climbing. A buyer who signs a commit with those three properties has earned the volume discount without taking on the risk that usually rides alongside it. That is the difference between a commit that works for you and one that quietly works against you.

The data you need before you can size anything

A commit sized on intuition is a commit sized wrong. Before you can pick a number, you need a small set of facts in front of you, and gathering them is the most valuable work in the whole exercise. You want at least two months of usage broken down by model, so you can see how much of your spend is Opus versus Sonnet versus Haiku and where the mix could shift cheaper. You want consumption by workload, so you can tell which features drive the bill and which are negligible. You want the month-to-month variance, because a lumpy pattern argues for an annual measurement period rather than a monthly one. And you want a clear view of which workloads are still growing and which have plateaued. With those facts, the commit number almost picks itself. Without them, you are guessing, and the seller is the only party in the room who is not.
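Two of these facts, the model mix and the month-to-month variance, take only a few lines to compute once the usage export is in hand. The sketch below uses the coefficient of variation as one possible measure of lumpiness. That choice, the function names, and the sample figures are assumptions for illustration.

```python
from statistics import mean, pstdev


def model_mix(spend_by_model):
    """Share of total spend attributable to each model."""
    total = sum(spend_by_model.values())
    return {model: spend / total for model, spend in spend_by_model.items()}


def coefficient_of_variation(monthly_totals):
    """Standard deviation of monthly spend divided by its mean.

    A higher value means lumpier usage, which argues for a longer
    measurement period.
    """
    return pstdev(monthly_totals) / mean(monthly_totals)


# Hypothetical figures in dollars; replace with your own usage export.
spend_by_model = {"opus": 60_000, "sonnet": 30_000, "haiku": 10_000}
monthly_totals = [80_000, 120_000, 95_000, 105_000]

for model, share in model_mix(spend_by_model).items():
    print(f"{model}: {share:.0%}")
print(f"variation: {coefficient_of_variation(monthly_totals):.2f}")
```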

If you are pre-launch and have no history, build the bottom-up model and then treat it with suspicion. New products almost never ramp on schedule, so the honest version of a pre-launch forecast has a wide range, and the commit should sit near the bottom of that range rather than the middle. You can always grow into a higher band as the usage proves out. You cannot easily claw back a commitment you oversized on launch optimism.

The sizing mistakes we see most often

Three mistakes account for most oversized commits. The first is committing to the roadmap rather than the run rate, where the number reflects the product that is supposed to launch instead of the usage that exists. The second is committing against the unoptimized bill, where the number is built on Opus-heavy, uncached, real-time spend that will fall sharply once the obvious savings land. The third is reaching for a band, where the commit is nudged upward to cross a discount threshold, and the spend added to reach it is spend that will never be used. Each mistake feels prudent in the moment and each one converts directly into dead capacity. The defense against all three is the same: anchor to the optimized floor, prove the floor with data, and let everything above it flow over as overage at your protected rate.
