Independent buyer side advisory · Anthropic onlyNew York · London
Token optimization

Cut the spend under your Claude contract.

A negotiated rate only sets your price per token. What you do with those tokens decides the invoice. We run an engineering led review of your Claude workloads and rebuild them around routing, caching, and batch, so the same product costs far less to run.

34%
Average reduction in Claude spend
$40M+
Anthropic commitments advised
100%
Anthropic focus, no other vendor
The four levers
Where the money actually leaks.
01

Model routing

Sending every request to Opus is the most common and most expensive mistake. Routing across Opus, Sonnet, and Haiku by task difficulty typically cuts aggregate spend 40 to 70 percent versus uniform Opus use.

02

Prompt caching

Stable context such as a system prompt, a tool schema, or a long document can be cached and reused at up to 90 percent off the input token rate. Most teams leave this lever untouched.

03

Batch processing

Any job that does not need an answer in the same second belongs in batch, which runs at 50 percent off. Classification, enrichment, scoring, and overnight pipelines are the usual candidates.

04

Token discipline

Long system prompts, bloated context, and silent retries multiply your bill. We trim the prompt surface, cap context, and remove the retries that quietly bill twice.

An engineering review a procurement leader can trust

We start by reading your real usage. That means the token logs, the model mix, the cache hit rate, and the share of traffic that runs synchronously when it does not need to. From there we build a model of where every dollar goes, by workload, so you can see which features carry the cost and which optimizations pay back first.

The output is not a slide that says use a cheaper model. It is a concrete plan: which calls move to Sonnet or Haiku, which context blocks become cache reads, which jobs shift to batch, and what each change is worth in committed spend terms. We write it so an engineering leader can implement it and a procurement leader can defend it.

Optimization is the leverage in the negotiation

Here is the part most vendors will not tell you. The lower your true cost to serve, the smaller the commitment you need to sign, and the more room you have to walk. A workload that has been routed, cached, and batched needs a smaller commit band, which means you carry less unused commitment risk and you negotiate from strength rather than dependence. Optimization and negotiation are the same project. We run both.

If you are weighing a new commitment or a renewal, start with the spend you can remove. Review our two pricing models, then tell us what you are optimizing and we will scope the review.

How we are paid
Two ways to engage, no downside.
Engagement A

Fixed Fee

From $18,000. Scope and price agreed up front.
  • One predictable number
  • Best when the scope is known
  • Negotiation and optimization included
Engagement B

Gainshare

A share of verified savings. Zero retainer.
  • You pay only from real savings
  • Find nothing, owe nothing
  • No risk to you, by design

Stop paying Opus prices for Haiku work.

We will model your Claude spend by workload and show you what comes off the invoice.

Get a Quote
Get started
Tell us what you are negotiating.

The Counteroffer

Weekly intelligence on Anthropic pricing moves and the buyer side counters that work.

Get a Quote · Book a Strategy Call · The Counteroffer · New York · London Not affiliated with Anthropic PBC. Independent buyer side advisory only.