Independent buyer side advisory · Anthropic only · New York · London
AI Cost Governance

Policy for approving new AI workloads.

Buyer side guide · 9 minute read

Claude spend most often slips out of control not in the workloads you already run but in the new ones nobody reviewed. A team ships a feature that calls the API, it works, usage grows, and three months later it is a meaningful line on the bill that no one sized, optimized, or approved. Multiply that across an organization where any team can integrate the model, and the aggregate bill drifts upward through a hundred small, unexamined decisions. The fix is not heavy bureaucracy that slows teams down. It is a lightweight approval policy for new AI workloads that catches the cost and risk questions early, while they are still cheap to answer. This guide shows how to design one that controls spend without becoming the thing engineers route around.

Why new workloads are the leak

Existing workloads are visible. They appear in your attribution, they have owners, and they get optimized because someone is watching them. New workloads start invisible. They are born inside a team's normal development, they do not show up as a budget request because the marginal API call seems trivial, and by the time they are large enough to notice, they have already shipped with whatever model, context size, and architecture the team happened to choose under deadline. The cost was set at the moment of design, and design is exactly the moment no governance was present. That is why new workloads are the leak: the most consequential decisions about a workload's cost are made before anyone with a cost mandate ever sees it.

An approval policy works because it inserts a small checkpoint at that decision moment. It does not have to be slow or heavy. It has to be present, early, and focused on the few questions that actually determine whether a workload will be efficient and safe. Catching a workload at design and asking whether it is using the right model, caching what it should, and handling sensitive data correctly costs a short conversation. Catching the same questions after the workload has scaled costs a refactor, a renegotiation, or a compliance scramble. The policy trades a few minutes early for large savings later, which is the best trade in cost governance.

The questions a good policy asks

A useful approval policy is a short, consistent set of questions, not a lengthy form. The first is purpose and expected volume: what does the workload do, and roughly how many calls and how much token volume will it generate at expected scale. This forces a forecast, however rough, which is the single most clarifying thing a team can produce, because a workload nobody has sized is a workload nobody can govern. The second is model choice: which model does it use, and has the team confirmed that a cheaper model would not handle the work well enough. The default of reaching for the most capable model is the most common source of avoidable cost, and simply asking the question redirects a large share of workloads to the right tier.
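The forecast the first two questions ask for can be very rough and still be useful. The sketch below shows one way to turn expected call volume and tokens per call into a monthly figure and compare two model tiers. The names `ModelPrice`, `monthly_cost`, `LARGER_TIER`, and `SMALLER_TIER` are illustrative, and the prices are placeholders, not real rates; substitute the rates in your own agreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Price in dollars per million tokens (placeholder values, not real rates)."""
    input_per_mtok: float
    output_per_mtok: float


def monthly_cost(calls_per_month: int, input_tokens_per_call: int,
                 output_tokens_per_call: int, price: ModelPrice) -> float:
    """Rough monthly cost: token volume at expected scale times price per token."""
    input_mtok = calls_per_month * input_tokens_per_call / 1_000_000
    output_mtok = calls_per_month * output_tokens_per_call / 1_000_000
    return input_mtok * price.input_per_mtok + output_mtok * price.output_per_mtok


# Hypothetical price points for two model tiers; substitute your contracted rates.
LARGER_TIER = ModelPrice(input_per_mtok=10.0, output_per_mtok=50.0)
SMALLER_TIER = ModelPrice(input_per_mtok=1.0, output_per_mtok=5.0)

if __name__ == "__main__":
    # Example forecast: 1M calls a month, 2,000 input and 500 output tokens per call.
    for name, price in (("larger tier", LARGER_TIER), ("smaller tier", SMALLER_TIER)):
        cost = monthly_cost(1_000_000, 2_000, 500, price)
        print(f"{name}: ${cost:,.0f} per month")
```

Putting the two tiers side by side is the point: the model choice question is much easier to answer honestly once the difference is a number rather than a feeling.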

The third is the optimization check: does the workload reuse context that could be cached, and could any part of it run in batch rather than real time. Caching shared context can reduce the cost of that portion by up to ninety percent, and batch runs asynchronous work at roughly half the real time cost, so a workload that ignores both is leaving its largest savings untouched. The fourth is data and compliance: what data flows to the model, what classification it carries, and whether that raises any handling, residency, or retention requirement. The fifth is ownership: which team owns the cost and will be accountable for it. Five questions, asked consistently, catch nearly all of the cost and risk that new workloads otherwise smuggle in.
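The optimization check can be put into numbers with a short calculation. This is a minimal sketch, not a billing model: it takes the two figures quoted above (up to ninety percent off the cached portion, roughly half price for batch) and assumes, for illustration only, that the two discounts stack multiplicatively and that shares are expressed as fractions of baseline cost. It also ignores any premium charged for writing to the cache. The function name `optimized_cost` and its parameters are hypothetical.

```python
CACHE_MAX_SAVING = 0.90   # "up to ninety percent" in the text: an upper bound
BATCH_COST_FACTOR = 0.50  # "roughly half the real time cost" in the text


def optimized_cost(baseline: float, cacheable_share: float, batchable_share: float,
                   cache_saving: float = CACHE_MAX_SAVING,
                   batch_factor: float = BATCH_COST_FACTOR) -> float:
    """Best-case cost after caching and batch.

    cacheable_share: fraction of baseline cost spent on context that could be cached.
    batchable_share: fraction of the workload that could run asynchronously.
    Assumption: the two discounts are independent and multiply together.
    """
    for name, value in (("cacheable_share", cacheable_share),
                        ("batchable_share", batchable_share)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    after_cache = 1.0 - cacheable_share * cache_saving
    after_batch = 1.0 - batchable_share * (1.0 - batch_factor)
    return baseline * after_cache * after_batch


if __name__ == "__main__":
    baseline = 10_000.0  # hypothetical monthly cost with no optimization
    print(optimized_cost(baseline, cacheable_share=0.5, batchable_share=0.0))
    print(optimized_cost(baseline, cacheable_share=0.0, batchable_share=1.0))
```

Because the caching figure is an upper bound, the result is best read as a floor on cost, useful for deciding whether the optimization is worth a closer look rather than for setting a budget.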

Tiering the policy so it stays light

The fastest way to kill an approval policy is to apply the same weight to a tiny experiment and a major production feature. Teams will rightly resent a heavy review for a workload that will cost very little, and they will route around the process, which leaves you worse off than no policy at all. The answer is to tier the policy by expected impact. A small, low volume, low risk workload should clear with a quick self service check, perhaps a lightweight record that captures the five questions and requires no sign off. A large, high volume, or sensitive workload warrants a real review with someone from cost governance and, where data is involved, security. The threshold between tiers should be set so that the heavy path applies only where the stakes justify it.
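Tiering of this kind can be expressed as a small routing rule over the five-question record. The sketch below is one possible shape, under stated assumptions: the field names, the `DataClass` categories, the reviewer labels, and the `SELF_SERVICE_COST_CEILING` threshold are all hypothetical and should be set to match your own stakes.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    SENSITIVE = "sensitive"


@dataclass(frozen=True)
class WorkloadRequest:
    """The five questions, captured as a lightweight record."""
    purpose: str
    est_monthly_cost: float          # from the volume forecast
    model: str
    cheaper_model_evaluated: bool
    uses_caching: bool
    uses_batch: bool
    data_class: DataClass
    owner_team: str


@dataclass(frozen=True)
class ReviewDecision:
    tier: str                        # "self_service" or "full_review"
    reviewers: tuple


# Hypothetical threshold: below this, the record alone is enough.
SELF_SERVICE_COST_CEILING = 1_000.0


def route(request: WorkloadRequest) -> ReviewDecision:
    """Send only large or sensitive workloads down the heavy path."""
    reviewers = []
    if request.est_monthly_cost > SELF_SERVICE_COST_CEILING:
        reviewers.append("cost_governance")
    if request.data_class is DataClass.SENSITIVE:
        # Sensitive data always gets cost governance plus security.
        if "cost_governance" not in reviewers:
            reviewers.append("cost_governance")
        reviewers.append("security")
    if reviewers:
        return ReviewDecision("full_review", tuple(reviewers))
    return ReviewDecision("self_service", ())
```

The small tier returns no reviewers at all, which mirrors the idea that a low stakes workload should clear on the record alone.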

This tiering keeps the policy proportionate, which is what makes it survive. Engineers accept governance that matches the stakes and reject governance that does not, so a policy that gets out of the way for small things and engages seriously for large ones earns the cooperation it needs. The goal is not to review everything equally. It is to make sure nothing large or risky ships without the cost and data questions being answered, while letting small experiments move freely. A policy that achieves that is one teams will actually follow rather than evade.

Connecting policy to the commitment

An approval policy is not only an operational control; it also protects the commercial position you negotiated with Anthropic. Your commitment was sized against a forecast, and that forecast assumed a certain trajectory of workloads. Every new workload that ships unreviewed and unoptimized pushes actual consumption away from the forecast, either toward overage you did not plan for or toward a renewal where usage has ballooned and the vendor holds the leverage. A policy that ensures new workloads are sized and optimized before they ship keeps actual consumption aligned with the commitment, which is what keeps the deal you negotiated intact across its term.
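Whether consumption is staying aligned with the commitment can be checked with a simple run rate projection. The sketch below is illustrative only: it assumes a commitment expressed as total spend over a fixed term, extends the average of the most recent months over the remaining term, and uses a hypothetical ten percent tolerance. The function names and status labels are not from any contract or API.

```python
def projected_term_spend(monthly_spend_to_date: list, term_months: int,
                         window: int = 3) -> float:
    """Spend so far plus a naive run rate (average of the last `window` months)."""
    elapsed = len(monthly_spend_to_date)
    if elapsed == 0:
        raise ValueError("need at least one month of actual spend")
    if elapsed > term_months:
        raise ValueError("more months of spend than months in the term")
    recent = monthly_spend_to_date[-window:]
    run_rate = sum(recent) / len(recent)
    return sum(monthly_spend_to_date) + run_rate * (term_months - elapsed)


def commitment_status(projected: float, commitment: float,
                      tolerance: float = 0.10) -> str:
    """Classify projected spend against the commitment (tolerance is hypothetical)."""
    ratio = projected / commitment
    if ratio > 1.0 + tolerance:
        return "overage_risk"
    if ratio < 1.0 - tolerance:
        return "under_consumption_risk"
    return "aligned"


if __name__ == "__main__":
    spend = [100_000.0, 150_000.0, 200_000.0]  # hypothetical first three months
    projected = projected_term_spend(spend, term_months=12)
    print(projected, commitment_status(projected, commitment=1_200_000.0))
```

A run rate is a blunt instrument, but a drift it can detect in month three is far cheaper to address than one discovered at renewal.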

This is the link between governance and negotiation that buyers most often miss. The commitment is not a one time event that ends at signing. It is a number you have to live within, and living within it depends on controlling the workloads that consume against it. An organization with no approval policy is one where consumption drifts unpredictably, the commitment is either overshot or badly sized at renewal, and every negotiation starts from a position of not understanding its own demand. An organization with a working policy brings disciplined, optimized, well forecast demand to every commercial conversation, which is exactly the position of strength a buyer wants. The policy you run day to day shapes the deal you sign at renewal.

Making it real without the bureaucracy

The practical path to a working policy is to start small and embed it where teams already work. Build the five questions into the design or launch process teams already follow rather than creating a separate gate they have to remember. Make the small tier genuinely self service so it adds seconds, not days. Reserve human review for the workloads whose size or sensitivity earns it. And close the loop with attribution, so that approved workloads are tagged and tracked, and the policy can be checked against reality rather than taken on trust. A policy that is embedded, tiered, and connected to attribution becomes part of how the organization works rather than an obstacle bolted on top of it.
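Closing the loop with attribution amounts to reconciling what was approved against what is actually being spent. The sketch below assumes two hypothetical inputs: a registry mapping each approved workload tag to its forecast monthly cost, and actual monthly spend grouped by the same tags. The `reconcile` function, the tag names, and the 25 percent tolerance are illustrative assumptions.

```python
def reconcile(approved_forecast: dict, actual_spend: dict,
              tolerance: float = 0.25) -> dict:
    """Compare actual spend per tag with the approved forecast.

    Returns tags with spend but no approval record, and approved tags whose
    spend exceeds their forecast by more than `tolerance`.
    """
    unapproved = sorted(tag for tag in actual_spend if tag not in approved_forecast)
    over_forecast = sorted(
        tag for tag, spend in actual_spend.items()
        if tag in approved_forecast
        and spend > approved_forecast[tag] * (1.0 + tolerance)
    )
    return {"unapproved": unapproved, "over_forecast": over_forecast}


if __name__ == "__main__":
    # Hypothetical registry and attribution data.
    approved = {"support-assistant": 20_000.0, "doc-search": 5_000.0}
    actual = {"support-assistant": 31_000.0, "doc-search": 4_800.0,
              "untagged": 7_500.0}
    print(reconcile(approved, actual))
```

Anything in the unapproved list is a workload that skipped the policy, and anything over forecast is a workload whose sizing needs a second look.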

Most organizations discover they need this policy only after a new workload has already surprised them on the bill or in a renewal. If you are sizing a commitment, defending a renewal, or simply watching your Claude spend climb faster than you can explain, an approval policy for new workloads is one of the highest leverage controls you can put in place, and it pairs directly with the commitment terms we help buyers negotiate. Get a quote and we will help you connect the governance you run internally to the commercial deal you sign with Anthropic, so the two reinforce each other instead of drifting apart. Our token optimization playbook covers the optimization checks the policy should enforce, so every new workload ships efficient from the start.

Stop the bill from growing in the dark.

Get a quote and we will connect your internal approval policy to the Anthropic commitment it is meant to protect.

Not affiliated with Anthropic PBC. Independent buyer side advisory only.