Independent buyer side advisory · Anthropic only · New York · London

Tagging and attribution for AI cost.

Buyer side guide · 11 minute read

Every effort to govern Claude spend runs into the same wall sooner or later: you cannot manage what you cannot attribute. A finance team that receives a single large invoice from Anthropic, with no breakdown of which team, application, or feature drove it, is governing a number rather than a system, and a number cannot be optimized. The work that turns an opaque token bill into something governable is tagging and attribution, the discipline of labeling every call with enough context to trace the cost back to its source. It is unglamorous plumbing, but it is the foundation on which every other piece of AI cost governance rests, and the organizations that get it right early are the ones that never lose control of the bill.

The attribution problem in plain terms

The core problem is that a token invoice is aggregated. The provider tells you the total spend, perhaps split by model, but it does not and cannot tell you that a particular team's document summarization feature consumed a third of it, or that a single poorly bounded workload doubled month over month. That information lives in your application, in the decisions about which service made which call for which purpose, and if you do not capture it at the moment of the call, it is gone. You are left trying to reverse engineer responsibility from a total, which is impossible at any real scale.

This is precisely the problem cloud computing faced, and the solution there was tagging: attaching metadata to every resource so that cost could be sliced by team, project, environment, and purpose. Token spend needs the same approach, adapted to the fact that the unit of spend is a call rather than a running resource. The principle is identical. Capture the context at the point of consumption, carry it through to a place where cost can be aggregated against it, and the opaque total becomes a detailed map of where the money goes. Without that map, every governance conversation is a guess and every optimization is aimed blind.

What to tag: the dimensions that matter

Effective attribution depends on capturing the right dimensions, and a handful of them do most of the work. The first is team or owner, because accountability ultimately has to land with the people who control the spend, and you cannot allocate cost to a team you cannot identify on each call. The second is application or feature, because spend is most usefully understood at the level of what it produces, and knowing that a specific feature is the largest or fastest growing cost is what tells you where to optimize. The third is environment, separating production from development and testing, because spend that is acceptable in production may be pure waste in a test harness left running.

Beyond those, the model used on each call is essential, because model choice is the single largest cost lever and you cannot reason about routing opportunities without seeing the model split per workload. The token counts themselves, input, output, and cached, complete the picture, because they reveal where context is bloated, where output is unbounded, and where caching is or is not working. Some organizations add finer dimensions such as customer, request type, or business unit, and these can be valuable, but the core set of team, application, environment, model, and token detail is enough to make the bill governable. The discipline is to decide the tagging schema deliberately rather than letting each team invent its own, because attribution only works if the labels are consistent across the organization.
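As a minimal sketch of what a deliberate schema could look like, the core dimensions above can be written down as a single record type that rejects incomplete or inconsistent labels. The class name, field names, and the three environment values are illustrative assumptions, not a prescribed standard, and the sketch assumes cached tokens are counted separately from input tokens.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(str, Enum):
    # Assumed set of environments; extend to match your own estate.
    PRODUCTION = "production"
    DEVELOPMENT = "development"
    TESTING = "testing"


@dataclass(frozen=True)
class UsageRecord:
    """One record per model call, carrying the core attribution dimensions."""
    team: str
    application: str
    environment: Environment
    model: str
    input_tokens: int
    output_tokens: int
    cached_tokens: int = 0

    def __post_init__(self):
        # Owner, feature, and model labels must always be present.
        for name in ("team", "application", "model"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} tag must not be empty")
        # Accept plain strings such as "production"; unknown labels raise ValueError.
        object.__setattr__(self, "environment", Environment(self.environment))
        for name in ("input_tokens", "output_tokens", "cached_tokens"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")


record = UsageRecord("search", "doc-summarizer", "production", "model-small", 1200, 300, 800)
```

Keeping the schema in one shared definition is what stops each team from inventing its own labels: a call tagged "prod" or left without an owner fails at the point of capture rather than surfacing as an unallocated line later.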

How to capture the tags

The mechanics of capturing tags are not difficult, but they have to be built in rather than bolted on. The cleanest approach is to route all calls through a shared internal layer, a gateway or wrapper, that requires the relevant metadata on every request and records it alongside the usage. When every call passes through a common path, you can enforce that the team, application, and environment are always present, attach them to a usage record, and guarantee consistency without relying on each engineer to remember. This central path is also where you can later enforce policy, apply routing logic, and measure caching, so the investment pays off well beyond attribution.
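The shared layer described above can be sketched in a few lines, under stated assumptions. The class `TaggedGateway`, the `send` callable, and the shape of the usage dictionary are all hypothetical stand-ins for whatever client and logging store you actually use; no real provider API is shown here.

```python
from typing import Callable, Dict, List, Tuple

REQUIRED_TAGS = ("team", "application", "environment")

# Hypothetical signature for the provider client: takes (model, prompt) and
# returns (response_text, usage), where usage holds the token counts.
SendFn = Callable[[str, str], Tuple[str, Dict[str, int]]]


class TaggedGateway:
    """Single path for model calls: refuses untagged requests and logs usage."""

    def __init__(self, send: SendFn):
        self._send = send
        self.usage_log: List[dict] = []  # in practice, a durable store

    def call(self, model: str, prompt: str, tags: Dict[str, str]) -> str:
        # Enforce the schema before any tokens are spent.
        missing = [name for name in REQUIRED_TAGS if not tags.get(name)]
        if missing:
            raise ValueError("missing required tags: " + ", ".join(missing))
        text, usage = self._send(model, prompt)
        # Record the tags, the model, and the token counts together.
        record = {name: tags[name] for name in REQUIRED_TAGS}
        record["model"] = model
        record.update(usage)
        self.usage_log.append(record)
        return text


def fake_send(model: str, prompt: str) -> Tuple[str, Dict[str, int]]:
    # Stand-in for a real client so the sketch runs without network access.
    usage = {"input_tokens": len(prompt.split()), "output_tokens": 1, "cached_tokens": 0}
    return "ok", usage


gateway = TaggedGateway(fake_send)
reply = gateway.call(
    "model-small",
    "summarize this document",
    {"team": "search", "application": "doc-summarizer", "environment": "production"},
)
```

The design choice worth noting is that the check runs before the call is sent, so an untagged request costs nothing and never produces an unattributable usage record.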

Where a shared layer is not yet in place, the next best option is to standardize the metadata at the application level and ensure each service records its own usage with the agreed tags. This is more fragile, because it depends on every team applying the schema correctly, but it is far better than nothing and can be a stepping stone toward a shared gateway. Whatever the mechanism, the aim is the same: every call leaves a trace that says who made it, for what, in which environment, with which model, and at what token cost. That trace is the raw material of attribution, and the quality of your cost governance is capped by the quality of that trace.

From tags to an attribution model

Captured tags are data, not yet insight. The next step is an attribution model that turns the tagged usage into allocated cost, applying the provider's pricing to the recorded token counts and rolling the result up by team, application, and the other dimensions. This is where the invoice total finally reconciles into a structure that finance and engineering can both use. The model should answer the questions that drive decisions: what does each team spend, which applications are the largest and fastest growing, how does the model mix break down per workload, and where is the spend trending. A good attribution model is the difference between knowing you spent a large sum and knowing exactly where it went and why.
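To make the roll-up concrete, here is a minimal sketch that prices each tagged record and sums the result along whichever dimensions you ask for. The model names and the per-million-token prices are placeholders invented for illustration, not real provider pricing, and the record layout is assumed to match whatever your capture layer writes.

```python
from collections import defaultdict

# Placeholder prices in USD per million tokens; NOT real provider pricing.
PRICES = {
    "model-large": {"input": 10.0, "output": 50.0, "cached": 1.0},
    "model-small": {"input": 1.0, "output": 5.0, "cached": 0.1},
}


def call_cost(record):
    """Cost of one call: token counts multiplied by the model's price list."""
    price = PRICES[record["model"]]
    return (
        record["input_tokens"] * price["input"]
        + record["output_tokens"] * price["output"]
        + record["cached_tokens"] * price["cached"]
    ) / 1_000_000


def roll_up(records, *dimensions):
    """Sum call costs grouped by the given tag names, e.g. 'team', 'model'."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += call_cost(record)
    return dict(totals)


records = [
    {"team": "search", "application": "summarizer", "model": "model-large",
     "input_tokens": 1_000_000, "output_tokens": 100_000, "cached_tokens": 0},
    {"team": "search", "application": "chat", "model": "model-small",
     "input_tokens": 2_000_000, "output_tokens": 200_000, "cached_tokens": 1_000_000},
    {"team": "support", "application": "chat", "model": "model-small",
     "input_tokens": 1_000_000, "output_tokens": 0, "cached_tokens": 0},
]

by_team = roll_up(records, "team")
by_app_and_model = roll_up(records, "application", "model")
```

Because the grouping keys are just tag names, the same records answer the team question, the application question, and the model mix question without a separate pipeline for each.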

The attribution model also has to handle the awkward edges honestly. Shared infrastructure, internal tooling that serves many teams, and experimental work that has no clear owner all need an allocation rule, and the rule should be agreed openly rather than defaulting to whichever team is easiest to charge. Cached tokens need careful treatment too, because the saving from caching should be visible and credited to the workload that achieved it, not hidden in an aggregate. The point of the model is fairness and clarity: every team should be able to see its allocation, understand how it was derived, and trust that it reflects real consumption. Attribution that teams do not trust will not drive the behavior change that governance depends on.
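One possible allocation rule, shown here only as an example of making the rule explicit, is to spread shared cost in proportion to each team's direct spend. The second function shows one way to surface a caching saving per workload. Both function names are hypothetical, the proportional rule is an assumption rather than a recommendation from this guide, and prices are again expressed per million tokens.

```python
def allocate_shared(direct_costs, shared_cost):
    """Add a share of shared_cost to each team, weighted by direct spend."""
    if not direct_costs:
        raise ValueError("no teams to allocate to")
    total = sum(direct_costs.values())
    if total <= 0:
        # No direct spend to weight by: fall back to an even split.
        share = shared_cost / len(direct_costs)
        return {team: cost + share for team, cost in direct_costs.items()}
    return {
        team: cost + shared_cost * cost / total
        for team, cost in direct_costs.items()
    }


def caching_saving(cached_tokens, input_price, cached_price):
    """Saving versus sending the same tokens at the uncached input price."""
    return cached_tokens * (input_price - cached_price) / 1_000_000


allocated = allocate_shared({"search": 30.0, "support": 10.0}, shared_cost=8.0)
saving = caching_saving(1_000_000, input_price=10.0, cached_price=1.0)
```

Whatever rule you choose, writing it as a small, readable function is what lets a team check how its allocation was derived instead of taking the number on trust.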

What attribution unlocks

Once attribution is in place, the rest of AI cost governance becomes possible. Showback and chargeback need it, because you cannot put spend in front of a team or bill it to their budget without knowing their share. Optimization needs it, because you cannot prioritize the largest and most wasteful workloads without seeing them ranked. Budgeting and forecasting need it, because a credible forecast is built from the trends of individual workloads, not from a single line that mixes everything together. And policy needs it, because rules about which workloads may use which models, or which require approval, can only be enforced and audited when usage is attributed. Attribution is the keystone that the rest of the practice rests on.
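The ranking and trend view that optimization and forecasting rely on can be sketched as follows, assuming you already have attributed cost per workload for two consecutive months. The function name and the sample figures are illustrative only.

```python
def rank_workloads(current, previous):
    """Rank workloads by current cost, with month over month growth.

    Growth is a fraction (1.0 means the cost doubled) and is None for
    workloads with no spend recorded in the previous month.
    """
    rows = []
    for workload, cost in current.items():
        prior = previous.get(workload)
        growth = (cost - prior) / prior if prior else None
        rows.append((workload, cost, growth))
    # Largest current spend first.
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_workloads(
    current={"chat": 400.0, "summarizer": 900.0, "new-tool": 50.0},
    previous={"chat": 500.0, "summarizer": 450.0},
)
```

Sorting by absolute cost and reporting growth alongside it keeps both questions in view: which workload is largest today, and which one is on course to become the largest.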

Attribution also changes behavior on its own, before any formal chargeback is applied. The simple act of showing a team its own spend, broken down by feature and model, surfaces waste that nobody had a reason to look for, because the cost was invisible. Teams discover the test workload left running, the unbounded output, the expensive model used where a cheaper one would do, and the bloated context that never needed to be sent. Visibility into attributed cost is frequently the cheapest optimization there is, because it costs only the plumbing and returns the savings of every obvious inefficiency it reveals.

The negotiation payoff

There is a commercial reason to invest in attribution that goes beyond day to day governance. A buyer who can attribute spend precisely walks into an Anthropic negotiation knowing exactly what they consume, where it comes from, and how it is trending, and that knowledge is leverage. It lets you size a commitment against real, understood usage rather than a guess, forecast with confidence, and identify the workloads you can optimize to shrink the commitment before you sign. A buyer who cannot attribute their own spend is negotiating against a vendor who understands consumption patterns better than they do, which is the weakest position to be in. Attribution closes that gap and puts the information advantage back on your side of the table.

Tagging and attribution are the unglamorous foundation that makes everything else in AI cost governance work, from showback to optimization to a stronger negotiating position. The organizations that build it early govern their Claude spend with precision, and the ones that skip it spend years fighting an opaque bill they can never quite explain. Our token optimization playbook covers the levers that attribution lets you target, with the math to size their impact, so the visibility you build translates directly into a smaller bill and a smaller commitment.


Not affiliated with Anthropic PBC. Independent buyer side advisory only.