Seasonality in AI consumption planning.

Buyer side guide · 11 minute read · By Fredrik Filipsson · Published May 29, 2026 · Updated June 12, 2026

Most consumption plans treat a year of Claude usage as a single rising line, and most of them are wrong because of it. Real usage breathes. It climbs through a product launch, plateaus over a slow summer, spikes when a customer facing feature hits its busy quarter, and dips when your own teams are on holiday and shipping less. When you size a committed spend against a flat or smoothly rising forecast, you are sizing it against an average that almost never occurs in any given month. The peaks push you into overage at the worst possible time, and the troughs leave committed dollars expiring unused. This guide explains how to model seasonality into your plan so the commitment you sign fits the shape of your year, not just its total.

Why seasonality is invisible until it hurts

Seasonality hides in two places that most buyers do not look. The first is the demand side: the business that consumes Claude has its own calendar. A retailer runs hot in the fourth quarter, a tax product peaks in spring, a travel app surges before summer, an education tool empties out when schools close. If Claude powers a feature that customers use, the feature inherits the customer calendar, and so does your token bill. The second is the supply side, meaning your own engineering and operations cadence. Code review and agent usage rise during heavy build periods and fall during freezes. Batch and evaluation runs cluster around release milestones. Internal seat usage drops when half the company is out.

Neither of these shows up in an annual total, which is exactly why annual totals are dangerous to commit against. A commitment is consumed monthly, not annually. If your year totals twelve million dollars of usage but four of those months run at double the average and three run at half, a commitment sized to the smooth average will be wrong in at least seven months out of twelve. The total can be perfectly accurate and the monthly fit still poor. Seasonality planning is the discipline of forecasting the shape, not just the size.
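To make the gap between an accurate total and a poor monthly fit concrete, here is a minimal sketch. The monthly figures are hypothetical and chosen only for illustration, and `monthly_fit` is a helper name invented for this example.

```python
# Hypothetical monthly usage in millions of dollars (illustrative figures only).
# Six quiet months, three average months, three peak months; the year totals 12.0.
monthly_usage = [0.5, 0.5, 0.5, 1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 2.0, 2.0, 2.0]


def monthly_fit(usage, commit):
    """Return (overage, unused, months_off) for a flat monthly commitment."""
    overage = sum(max(u - commit, 0.0) for u in usage)
    unused = sum(max(commit - u, 0.0) for u in usage)
    months_off = sum(1 for u in usage if u != commit)
    return overage, unused, months_off


annual_total = sum(monthly_usage)
flat_commit = annual_total / 12  # the "smooth average"
overage, unused, months_off = monthly_fit(monthly_usage, flat_commit)

print(f"annual total: {annual_total:.1f}")
print(f"flat commit:  {flat_commit:.1f} per month")
print(f"overage:      {overage:.1f}")
print(f"unused:       {unused:.1f}")
print(f"months off:   {months_off} of 12")
```

In this made-up year the annual total matches the commitment exactly, yet nine of twelve months miss it: 3.0 of usage lands in overage while 3.0 of commitment goes unused.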

Map the demand calendar to your token calendar

Start by listing every Claude powered workload that touches a seasonal driver. For each one, write down what the driver is and when it peaks and troughs. A support summarization feature peaks when ticket volume peaks. A document processing pipeline peaks at quarter end or year end close. A consumer assistant peaks on the same days the consumer product peaks. Once you have the driver, you can borrow the seasonality curve that already exists in the business, because someone in finance or operations has already modeled when tickets, transactions, or active users rise and fall. You do not need to invent the curve. You need to attach your token consumption to it.

Then translate the curve into tokens. A workload that consumes a roughly fixed number of tokens per unit of demand will track the demand curve closely. A workload with a fixed internal component plus a variable customer component will track it only partly. Separating the fixed base from the seasonal variable portion is the key move, because the base is what you can safely commit against month after month, and the variable portion is what you need flexibility to absorb. A plan that distinguishes the two gives you a floor you can commit to and a swing you can structure around.
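One way to separate the fixed base from the variable portion, sketched here under the assumption that tokens scale roughly linearly with the demand driver, is an ordinary least squares fit of monthly tokens against monthly demand. The function name and the data are hypothetical, and a real workload may need a richer model.

```python
def fit_base_and_variable(demand, tokens):
    """Least squares fit of tokens = base + per_unit * demand.

    Assumes a linear relationship, which is a simplification.
    """
    n = len(demand)
    mean_d = sum(demand) / n
    mean_t = sum(tokens) / n
    cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(demand, tokens))
    var = sum((d - mean_d) ** 2 for d in demand)
    per_unit = cov / var
    base = mean_t - per_unit * mean_d
    return base, per_unit


# Hypothetical history: support tickets (thousands) and tokens (millions) per month.
demand = [10, 20, 30, 40]
tokens = [70, 90, 110, 130]

base, per_unit = fit_base_and_variable(demand, tokens)
print(f"fixed base: {base:.1f}M tokens per month")
print(f"variable:   {per_unit:.1f}M tokens per thousand tickets")
```

The intercept is the floor you can commit against every month; the slope multiplied by the demand curve is the swing.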

Layer in your own operational cadence

The supply side calendar matters just as much and is easier to forget because it feels like noise rather than seasonality. Engineering usage of Claude, through coding agents, code review, and evaluation runs, follows the release calendar. A team driving toward a major launch will burn far more tokens in the weeks before it than in the quiet weeks after. Code freezes push engineering usage down sharply. Hiring waves add seats and usage in steps. End of year holidays drop internal consumption across the board for several weeks.

These patterns are predictable if you look at a full year of history rather than a recent sample. The mistake is to forecast from a three month window that happened to fall during a build heavy period, which projects a high run rate into months that will actually be quiet, or to forecast from a quiet window and undershoot the busy quarters. Pull at least a year of usage data, mark the launches, freezes, and holidays against it, and the operational seasonality becomes obvious. Once it is visible, you can plan around it instead of being surprised by it.
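The window bias is easy to see with numbers. The sketch below uses a hypothetical twelve months of engineering spend and annualizes two different three month windows; `annualized_run_rate` is a name made up for this example.

```python
# Hypothetical engineering spend in thousands of dollars, January to December.
# March to May is a launch build-up; June to August is quiet.
history = [40, 45, 90, 100, 95, 50, 35, 30, 60, 70, 65, 20]


def annualized_run_rate(window):
    """Project a year of spend from the average of a short window."""
    return sum(window) / len(window) * 12


actual = sum(history)
from_build_window = annualized_run_rate(history[2:5])  # March to May
from_quiet_window = annualized_run_rate(history[5:8])  # June to August

print(f"actual year:            {actual}")
print(f"build window projects:  {from_build_window:.0f}")
print(f"quiet window projects:  {from_quiet_window:.0f}")
```

With these illustrative figures the build heavy window overshoots the real year and the quiet window undershoots it, which is the reason to forecast from the full year.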

Build the monthly shape, then the bands

With the demand calendar and the operational cadence both mapped, build a month by month consumption profile rather than a single annual figure. For each month, sum the fixed base across workloads and add the seasonal variable portion driven by that month's position in each curve. The result is a twelve month shape with identifiable peaks and troughs, and that shape is what you size the commitment against. You are no longer asking how much you will spend this year. You are asking what the highest sustained month looks like, what the lowest looks like, and how wide the gap is.
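The monthly build can be sketched as follows. The two workloads, their dollar figures, and the seasonal index values are all hypothetical; the index stands in for whatever curve finance or operations already maintains.

```python
MONTHS = 12

# Hypothetical workloads, in thousands of dollars per month.
# "index" is each month's position on the borrowed seasonality curve
# (0 = trough, 1 = peak).
workloads = {
    "support_summaries": {
        "base": 20.0,
        "variable_at_peak": 40.0,
        "index": [0.5] * 9 + [1.0] * 3,  # busiest in the fourth quarter
    },
    "document_pipeline": {
        "base": 10.0,
        "variable_at_peak": 20.0,
        "index": [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1],  # quarter end close
    },
}


def monthly_profile(workloads):
    """Sum fixed base plus seasonal variable portion for each month."""
    profile = []
    for m in range(MONTHS):
        total = sum(
            w["base"] + w["variable_at_peak"] * w["index"][m]
            for w in workloads.values()
        )
        profile.append(total)
    return profile


profile = monthly_profile(workloads)
peak, trough = max(profile), min(profile)
print("profile:", profile)
print(f"peak {peak}, trough {trough}, gap {peak - trough}")
```

The three numbers printed on the last line are the ones the sizing conversation starts from: the highest month, the lowest month, and the gap between them.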

Then express each month as a range, not a point, because the seasonal curves themselves carry uncertainty. A conservative case, an expected case, and an aggressive case for each month give you a band, and the width of that band in the peak months is what tells you how much overage protection you need. The width in the trough months tells you how much unused commitment risk you carry. A plan built this way turns seasonality from a hidden hazard into a set of explicit numbers you can negotiate around.
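A minimal sketch of the banding step, assuming the conservative and aggressive cases are simple multiples of the expected case. The multipliers, the expected profile, and the commitment level are hypothetical; in practice each month may deserve its own spread.

```python
# Hypothetical expected monthly spend in thousands of dollars.
expected = [50, 50, 70, 50, 50, 70, 50, 50, 70, 70, 70, 90]

# Assumed spreads: conservative is 20% below expected, aggressive is 25% above.
LOW_FACTOR = 0.80
HIGH_FACTOR = 1.25


def build_bands(expected, low_factor, high_factor):
    """Return a (conservative, expected, aggressive) tuple for each month."""
    return [(e * low_factor, e, e * high_factor) for e in expected]


def exposure(bands, monthly_commit):
    """Overage exposure in the aggressive case, unused risk in the conservative case."""
    overage = sum(max(high - monthly_commit, 0.0) for _, _, high in bands)
    unused = sum(max(monthly_commit - low, 0.0) for low, _, _ in bands)
    return overage, unused


bands = build_bands(expected, LOW_FACTOR, HIGH_FACTOR)
overage_exposure, unused_risk = exposure(bands, monthly_commit=60)

print(f"overage exposure (aggressive case): {overage_exposure:.1f}")
print(f"unused risk (conservative case):    {unused_risk:.1f}")
```

The two outputs are the explicit numbers to negotiate around: how much overage protection the peak months need and how much commitment could go unused in the troughs.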

Structure the commitment to fit the shape

Seasonality is not just a forecasting problem. It is a contract structure problem, and the structure is where a buyer side advisor earns the fee. A flat monthly commitment fits a flat usage curve and fights a seasonal one. If your usage triples in the fourth quarter and halves in the summer, a flat commitment forces you to either size to the peak, wasting committed dollars for the other nine months, or size to the average and pay overage during the peak. Neither is the right answer, and both are avoidable.

The better structures absorb the shape directly. An annual commitment measured against annual consumption, rather than a monthly minimum, lets the peaks and troughs offset each other inside the year so a strong quarter covers a weak one. A ramped commitment that steps up over the term fits a business that grows into its seasonality. A protected overage rate means that when the peak does push past the commitment, the excess is billed at your committed rate rather than at list, which removes the penalty from the busy season. And negotiated unused commitment treatment means a soft trough does not strand spend permanently. The forecast tells you which of these you need. The negotiation gets them into the contract.
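The difference between these structures can be sketched with a toy cost model. Everything here is an assumption for illustration: the usage figures, the idea that list price sits 25% above the committed rate, and the two function names. Real contract terms vary and should be modeled from the actual agreement.

```python
# Hypothetical monthly usage in millions of dollars, valued at the committed rate.
usage = [0.5, 0.5, 0.5, 1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 2.0, 2.0, 2.0]

LIST_MULTIPLIER = 1.25       # assumed: overage at list costs 25% more
PROTECTED_MULTIPLIER = 1.0   # protected overage: excess billed at the committed rate


def cost_monthly_minimum(usage, monthly_commit, overage_multiplier):
    """Each month pays the full commitment, plus any excess at the multiplier."""
    return sum(
        monthly_commit + max(u - monthly_commit, 0.0) * overage_multiplier
        for u in usage
    )


def cost_annual_commit(usage, annual_commit, overage_multiplier):
    """Peaks and troughs offset inside the year; only the annual excess is overage."""
    total = sum(usage)
    return annual_commit + max(total - annual_commit, 0.0) * overage_multiplier


flat_at_list = cost_monthly_minimum(usage, 1.0, LIST_MULTIPLIER)
flat_protected = cost_monthly_minimum(usage, 1.0, PROTECTED_MULTIPLIER)
annual = cost_annual_commit(usage, 12.0, LIST_MULTIPLIER)

print(f"flat monthly, overage at list:   {flat_at_list:.2f}")
print(f"flat monthly, protected overage: {flat_protected:.2f}")
print(f"annual commitment:               {annual:.2f}")
```

Under these assumptions the same twelve million of usage costs 15.75 on a flat monthly minimum with list overage, 15.00 with protected overage, and 12.00 when measured annually.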

Pair seasonality planning with optimization

The most overlooked lever is that optimization changes the shape, not just the size, of the curve. The peaks are where optimization pays the most, because they are where volume is highest. Routing across Opus, Sonnet, and Haiku so that only the work that needs the top model gets it can take aggregate spend forty to seventy percent below a uniform top model assumption, and that reduction is largest exactly when volume peaks. Prompt caching, at up to ninety percent off cached reads of shared context, flattens the cost of high frequency seasonal workloads. Batch processing at half rate for bulk seasonal jobs, such as a quarter end document run, cuts the cost of the predictable peaks.

This means the right sequence is to forecast the seasonal shape, apply the optimization you will actually run, and then size the commitment against the optimized seasonal curve. A buyer who sizes against the unoptimized peak commits far too large and pays for capacity the optimization will erase. A buyer who optimizes first commits to a smaller, better shaped number and keeps the savings. Seasonality and optimization are the same planning exercise viewed from two angles, and treating them together is what produces a commitment that fits both the size and the shape of a real year. Our token optimization playbook includes the seasonality method alongside the optimization levers it depends on, with the numbers behind each.
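The sequence of forecast, optimize, then size can be sketched in a few lines. The usage shape is hypothetical, and the uniform 40% reduction is a simplifying assumption taken from the low end of the forty to seventy percent range quoted above; a real plan would apply each lever to the workloads it actually affects.

```python
# Step 1: hypothetical unoptimized seasonal forecast, millions of dollars per month.
unoptimized = [0.5, 0.5, 0.5, 1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 2.0, 2.0, 2.0]

# Step 2: apply the optimization you expect to run (assumed uniform 40% here).
ASSUMED_REDUCTION = 0.40
optimized = [u * (1 - ASSUMED_REDUCTION) for u in unoptimized]

# Step 3: size the commitment against the optimized curve, not the raw one.
commit_unoptimized = sum(unoptimized)
commit_optimized = sum(optimized)
oversized_by = commit_unoptimized - commit_optimized

print(f"sized on unoptimized curve: {commit_unoptimized:.1f}")
print(f"sized on optimized curve:   {commit_optimized:.1f}")
print(f"capacity optimization would erase: {oversized_by:.1f}")
print(f"peak month falls from {max(unoptimized):.1f} to {max(optimized):.1f}")
```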

Not affiliated with Anthropic PBC. Independent buyer side advisory only.