When AI spend surfaces as a material cost in the CFO conversation, the first organizational move is usually to hand it to the cloud FinOps team. The logic is straightforward: they understand cost management, they have relationships with vendors, they know how to run chargeback, and they've already built the tooling infrastructure for cloud cost governance.

The problem is that cloud FinOps teams arrive at AI spend with a framework that was built for a fundamentally different cost structure. Some of what they know transfers. A lot of it doesn't. The result is five predictable mistakes that leave organizations with the appearance of AI cost governance but not the substance of it.

Mistake 1: Treating Tokens Like Compute Units

The foundational mental model of cloud FinOps is compute: you provision resources, you run workloads on them, you pay for what you use. The optimization moves in cloud cost management — right-sizing, reserved instances, spot pricing — are all predicated on the idea that cost is attached to a resource that someone provisioned.

Tokens don't work like this. You don't provision tokens. There's no "token instance" running in the background accruing charges. Tokens are consumed per request, and the consumption varies based on inputs and outputs that are only known at request time. A single API call might cost $0.0001 or $0.10 depending on the prompt length, the model selected, and the response length. None of these is fixed at provisioning time the way a VM size is.

This changes the optimization question entirely. In cloud FinOps, you optimize by changing what you've provisioned. In AI FinOps, you optimize by changing what you're sending — the length of your prompts, the model you're calling, whether you're caching responses, whether you're batching requests. These are application-level decisions, not infrastructure decisions.
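To make the per-request arithmetic concrete, here is a minimal sketch of how a single call's cost falls out of the model choice and the token counts. The model names and per-million-token prices are illustrative assumptions, not any provider's actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_million: float   # USD per 1M input tokens (illustrative value)
    output_per_million: float  # USD per 1M output tokens (illustrative value)


# Hypothetical price table; real prices vary by provider and change over time.
PRICES = {
    "small-model": ModelPrice(input_per_million=0.15, output_per_million=0.60),
    "frontier-model": ModelPrice(input_per_million=2.50, output_per_million=10.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request from its model and token counts."""
    price = PRICES[model]
    return (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000


# A short prompt on the small model versus a long prompt on the frontier model.
cheap = request_cost("small-model", input_tokens=500, output_tokens=100)
expensive = request_cost("frontier-model", input_tokens=20_000, output_tokens=5_000)
```

Under these assumed prices the two calls land near the $0.0001 and $0.10 ends of the range. Every input to the function is an application-level choice, which is why the optimization work sits with engineering.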

Cloud FinOps teams that apply the compute mental model to AI spend will focus on negotiating better rates with providers, auditing for unused API keys, and trying to apply reserved capacity models where they exist. These are legitimate activities, but they're not where the material optimization opportunity lies. The real opportunity is in prompt engineering, model selection, and application architecture, and those require a different set of skills and a much closer relationship with engineering.

Mistake 2: Assuming Tagging Can Replace Instrumentation

In cloud cost management, attribution works through tagging. You tag your infrastructure resources — EC2 instances, S3 buckets, RDS clusters — with team, product, and environment labels, and your cost management platform reads those tags to produce attribution reports. It's imperfect but it's operational: you can get 80-90% of spend attributed with a reasonable tagging effort.

AI API calls cannot be tagged this way. There is no infrastructure resource to attach a tag to. An API call is a network request that returns a response. The provider's billing system records token consumption, typically broken down no further than the account, project, or API key, not by where in your product the request came from or why it was made.

Attribution for AI spend requires instrumentation: you have to tag the API call itself, either by encoding metadata in request headers (where providers support it), by using separate API keys for separate applications, or by collecting attribution metadata in your own observability layer as requests are made. None of this is a configuration change in a tagging policy. It requires engineering work in every application that makes AI API calls.
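The third option, collecting attribution metadata in your own observability layer, can be sketched as a thin wrapper around whatever client an application already uses. This is a minimal illustration under stated assumptions: `call_model`, `fake_model`, and the in-memory `USAGE_LOG` are hypothetical stand-ins, and a real system would send records to a metrics or logging pipeline instead.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    team: str
    feature: str
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical in-memory sink; stands in for a real observability backend.
USAGE_LOG: list[UsageRecord] = []


def instrumented_call(call_model, *, team: str, feature: str, model: str, prompt: str) -> str:
    """Call the model and record who made the request and what it consumed.

    `call_model` is an assumed callable returning
    (response_text, input_tokens, output_tokens).
    """
    text, input_tokens, output_tokens = call_model(model, prompt)
    USAGE_LOG.append(UsageRecord(team, feature, model, input_tokens, output_tokens))
    return text


def tokens_by_feature(log: list[UsageRecord]) -> dict:
    """Total tokens per (team, feature) pair, the granularity tagging cannot reach."""
    totals = defaultdict(int)
    for record in log:
        totals[(record.team, record.feature)] += record.input_tokens + record.output_tokens
    return dict(totals)


def fake_model(model: str, prompt: str):
    """Stand-in for a provider client: counts whitespace-separated words as tokens."""
    return "ok", len(prompt.split()), 1
```

The wrapper has to be adopted in each application that makes AI calls, which is the engineering work that a tagging policy cannot substitute for.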

Cloud FinOps teams that try to solve AI attribution through tagging will get close but not close enough. They'll be able to attribute spend to a cloud account or a billing organization but not to the specific feature or team that drove it. That's the attribution granularity that matters for actual cost management, and it lives one level below where tagging can reach.

Mistake 3: Ignoring Model Selection as a Cost Lever

In cloud FinOps, the "product" (compute, storage, network) is relatively homogeneous within a tier. One m5.xlarge is the same as any other m5.xlarge. Cost optimization is about using less of it or using it more efficiently, not about choosing between fundamentally different products with 10x cost differences.

In AI, model selection is the single largest cost lever available. The cost difference between a frontier model and a smaller, task-appropriate model for the same workload can be 10-50x. A team running GPT-4o for a classification task that GPT-4o mini handles equally well is burning 10-20x more per request than necessary. A team using Claude Opus for document summarization that Claude Haiku handles adequately is similarly miscalibrated.

Model selection decisions are made by engineering, not by FinOps. But FinOps teams that don't flag model selection as a cost lever miss the most impactful optimization available. The cloud FinOps playbook has no direct equivalent. Right-sizing adjusts how much of the same product you buy, while model selection chooses between different products with different capabilities, so a "right-size your model" practice cannot be run from a utilization report alone.

An effective AI FinOps practice has to develop the capability to audit model selection across the organization's AI usage: where are frontier models being used for tasks that smaller models can handle? That requires both technical knowledge of model capabilities and access to usage data at the model-and-use-case level. Cloud FinOps teams typically have neither.
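A model selection audit of this kind can be sketched as a pass over usage data at the model-and-use-case level. Everything specific here is an assumption for illustration: the row fields, the set of task types treated as candidates for a smaller model, and the blended per-million-token prices. The output is a list of candidates to review with engineering, not a verdict that a smaller model is good enough.

```python
# Hypothetical classification of models and tasks; a real audit would maintain
# these lists with input from the teams that own each use case.
FRONTIER_MODELS = {"frontier-model"}
SIMPLE_TASKS = {"classification", "summarization"}

# Illustrative blended USD price per 1M tokens; not real provider pricing.
PRICE_PER_MILLION = {"frontier-model": 5.00, "small-model": 0.30}


def audit_model_fit(usage_rows, cheaper_model="small-model"):
    """Flag use cases running simple tasks on frontier models.

    Each row is assumed to be a dict with keys:
    use_case, task, model, monthly_tokens.
    Results are sorted by estimated monthly savings, largest first.
    """
    findings = []
    for row in usage_rows:
        if row["model"] in FRONTIER_MODELS and row["task"] in SIMPLE_TASKS:
            millions = row["monthly_tokens"] / 1_000_000
            current = millions * PRICE_PER_MILLION[row["model"]]
            candidate = millions * PRICE_PER_MILLION[cheaper_model]
            findings.append(
                {
                    "use_case": row["use_case"],
                    "current_usd": current,
                    "candidate_usd": candidate,
                    "estimated_savings_usd": current - candidate,
                }
            )
    return sorted(findings, key=lambda f: f["estimated_savings_usd"], reverse=True)


example_rows = [
    {"use_case": "ticket-routing", "task": "classification",
     "model": "frontier-model", "monthly_tokens": 200_000_000},
    {"use_case": "code-review", "task": "code_generation",
     "model": "frontier-model", "monthly_tokens": 50_000_000},
]
findings = audit_model_fit(example_rows)
```

Only the classification workload is flagged in this example. Whether the smaller model actually handles it equally well still has to be confirmed by a quality evaluation, which is where the technical knowledge of model capabilities comes in.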

Mistake 4: Applying Cloud Budgeting Timelines to AI Spend

Cloud spend is relatively predictable on a monthly basis. You know what you've provisioned. You can forecast within reasonable bands. Monthly budget reviews work because the variance between months is typically explained by predictable events: a new service launched, a workload migrated, a contract renegotiated.

AI spend can move 10x in a day. A feature that goes to production, a retry loop that runs unchecked, a viral moment that sends unexpected traffic through an LLM-backed endpoint: any of these can create a cost spike that unfolds in hours, far faster than a monthly budget review cycle can catch.

Cloud FinOps teams that monitor AI spend on monthly or even weekly cycles will consistently discover problems after they've already happened. AI spend should be reviewed daily at minimum, with real-time alerting for anomalous velocity. This is a different operating model from cloud cost management, which tolerates longer review cycles because the cost surface is more stable.
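A velocity alert can be as simple as comparing the most recent hour of spend with a trailing baseline. The sketch below is one possible rule, not a recommended threshold: the 24-hour window, the 5x multiplier, and the minimum baseline are all assumed values that would need tuning against real traffic.

```python
from statistics import median


def velocity_alert(hourly_spend, window=24, multiplier=5.0, min_baseline=1.0):
    """Return True when the latest hour's spend is anomalously high.

    hourly_spend: list of USD spend per hour, oldest first, latest last.
    The baseline is the median of up to `window` hours before the latest one,
    floored at `min_baseline` so near-zero history does not trigger on noise.
    """
    if len(hourly_spend) < 2:
        return False  # not enough history to form a baseline
    *history, latest = hourly_spend
    baseline = max(median(history[-window:]), min_baseline)
    return latest > multiplier * baseline


steady_day = [10.0] * 24
runaway = velocity_alert(steady_day + [120.0])  # e.g. an unchecked retry loop
normal = velocity_alert(steady_day + [12.0])
```

A median baseline is used here so that one earlier spike does not inflate the baseline and mask the next one. A check like this runs on an hourly or streaming cadence, which is the operating-model change the monthly review cycle cannot provide.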

This isn't a criticism of cloud FinOps teams — it's a consequence of them being given a tool designed for monthly variance and asked to apply it to daily variance. The tooling and processes need to change, not just the team.

Mistake 5: Treating AI Spend as a Single Vendor Category

In cloud FinOps, vendor relationships are relatively concentrated. Most organizations have a primary cloud provider and maybe one or two secondary ones. Vendor management is about negotiating enterprise agreements, managing commit programs, and understanding the billing structure of a small number of large relationships.

AI spend is structurally fragmented. A single organization might be using OpenAI for some applications, Anthropic for others, Mistral for lower-cost tasks, Amazon Bedrock for enterprise security reasons, and a handful of specialized providers for specific capabilities. These relationships can exist simultaneously in different teams, at different pricing tiers, with different billing structures and API conventions.

Cloud FinOps teams that approach AI spend as a vendor management problem will negotiate a better OpenAI contract while missing that 40% of the organization's AI spend is going through three other providers they didn't know existed. Effective AI spend management requires a cross-provider view, and building that view requires aggregating billing data from multiple sources that don't naturally connect.

The tooling problem here is real. Cloud cost management platforms understand cloud. Purpose-built AI cost management platforms like Oberhahn understand the fragmented AI vendor landscape and are designed to normalize spend data across providers into a single view. Without that normalization layer, even a capable FinOps team is working from an incomplete picture.
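The normalization step described above can be sketched as a set of per-provider adapters that map each billing export into one common schema. This is a simplified illustration: the provider names, field names, and units are hypothetical, and real exports differ in structure, currency handling, and granularity.

```python
from collections import defaultdict


def normalize_provider_a(row):
    # Hypothetical export shape: cost already in USD, grouped by "project".
    return {"provider": "provider-a", "team": row["project"], "usd": row["cost_usd"]}


def normalize_provider_b(row):
    # Hypothetical export shape: cost in cents, grouped by "workspace".
    return {"provider": "provider-b", "team": row["workspace"], "usd": row["amount_cents"] / 100}


NORMALIZERS = {
    "provider-a": normalize_provider_a,
    "provider-b": normalize_provider_b,
}


def aggregate_spend(exports):
    """Sum normalized USD spend per provider.

    exports: dict mapping provider name to a list of raw billing rows.
    """
    totals = defaultdict(float)
    for provider, rows in exports.items():
        normalize = NORMALIZERS[provider]
        for row in rows:
            record = normalize(row)
            totals[record["provider"]] += record["usd"]
    return dict(totals)


def share_by_provider(totals):
    """Each provider's fraction of total spend (empty dict if there is no spend)."""
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {provider: usd / grand_total for provider, usd in totals.items()}


example_exports = {
    "provider-a": [{"project": "search", "cost_usd": 600.0}],
    "provider-b": [{"workspace": "support", "amount_cents": 40_000}],
}
totals = aggregate_spend(example_exports)
shares = share_by_provider(totals)
```

In this example the second provider accounts for 40% of spend, a share that would be invisible to a team looking only at the first provider's invoice.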

What Cloud FinOps Teams Actually Get Right

This isn't an argument that cloud FinOps teams have nothing to contribute to AI cost management. They bring capabilities that are genuinely valuable: they understand financial governance processes, they have credibility with finance and procurement, they know how to build chargeback frameworks, and they have experience with the change management required to shift engineering behavior around cost.

The organizational path that works best is not to replace cloud FinOps teams with AI specialists, but to extend them — either by embedding AI-knowledgeable engineers into the FinOps function or by creating a close collaboration between the AI platform team (which has the technical knowledge) and the FinOps team (which has the financial governance knowledge).

What has to change is the tooling, the processes, and the mental models. Cloud FinOps teams that bring intellectual humility to AI spend — acknowledging that their playbook doesn't transfer cleanly — will move faster and produce better outcomes than teams that try to force-fit AI into their existing framework.

The Five Gaps, Summarized

| Cloud FinOps Assumption | Why It Breaks in AI | What to Do Instead |
| --- | --- | --- |
| Cost attaches to provisioned resources | Tokens are consumed per request, not provisioned | Instrument at the application layer, not the resource layer |
| Tags drive attribution | API calls can't be tagged through infrastructure config | Require per-application keys and request-level instrumentation |
| Optimization means right-sizing | Model selection is the dominant cost lever | Audit model-to-task fit across all AI usage |
| Monthly review cycles are sufficient | AI spend can spike 10x in hours | Daily monitoring, real-time velocity alerts |
| Vendor concentration enables management | AI spend is structurally fragmented across providers | Build or buy a cross-provider aggregation layer |

The organizations that get AI cost management right will be the ones that adapt their FinOps practice to AI's actual characteristics rather than forcing AI spend into a framework built for a different cost structure. Cloud FinOps teams are a natural starting point — but the destination requires building something new.