Flagship framework

AI TCO Framework

A practical framework for understanding the full cost of enterprise AI across model access, infrastructure, governance, operations, and portfolio oversight.

AI total cost of ownership is not a procurement detail. It is the operating architecture of the capability itself.

Enterprise framework, Technology, Finance, FinOps, TBM

Framework visual

7 layers

L7 Portfolio oversight
L6 Operations and support
L5 Risk and governance
L4 Application orchestration
L3 Data and retrieval
L2 Model access and hosting
L1 Infrastructure and compute

The higher the layer, the easier it is to forget in business cases. The lower the layer, the easier it is to under-price in platform decisions.

Why this matters

AI cost decisions are rarely isolated technical choices. They reshape platform burden, control obligations, and the economics of scaling demand.

  • Model price alone does not explain enterprise AI cost.
  • Deployment choice determines where hidden burden accumulates.
  • TCO becomes useful when cost layers can be linked to portfolio decisions.

The cost model problem

The model invoice is typically 10–20% of true enterprise AI cost.

For an enterprise running production AI use cases on commercial foundation models and cloud infrastructure, the costs of infrastructure, people, integration, and governance routinely outweigh the vendor bill that dominates internal cost discussions. IDC warns that large enterprises risk materially underestimating AI infrastructure spend as estates scale. The same pattern appears in FinOps Foundation data: AI cost governance has expanded rapidly, but forecasting and allocation, the disciplines that require a full cost model, remain less mature than basic spend reporting.
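The 10% to 20% figure implies a simple multiplier. As a rough sketch, assuming the model invoice is the only number known and that its share of total cost falls within that range, the implied total can be bracketed as below. The function name and the example invoice amount are illustrative assumptions, not figures from any source.

```python
def implied_tco_range(model_invoice, low_share=0.10, high_share=0.20):
    """Bracket total AI TCO from the model invoice alone.

    Assumes the invoice is between low_share and high_share of total cost.
    Returns (lower_bound, upper_bound) in the same currency as the invoice.
    """
    if not 0 < low_share <= high_share <= 1:
        raise ValueError("shares must satisfy 0 < low_share <= high_share <= 1")
    # A larger share means the invoice explains more of the total,
    # so the highest share gives the lowest implied total.
    return model_invoice / high_share, model_invoice / low_share


# Hypothetical annual model invoice of 500,000 (any currency).
low, high = implied_tco_range(500_000)
print(f"Implied total TCO: {low:,.0f} to {high:,.0f}")
```

Put differently, a vendor bill in this share range corresponds to a total cost roughly five to ten times larger.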

Why enterprise AI TCO needs a wider frame

The model invoice is visible. The operating system around the model is where enterprise economics usually become distorted.

Most AI business cases are built on what is easiest to price: the API rate card, the SaaS premium, or the hosting bill. Those figures are real, but they describe only the model layer — one part of a seven-layer cost structure. IDC reported in 2025 that more than 56% of AI tool spending sits outside formal IT budgets. As estates scale, that invisible spend creates cost curves that finance teams did not model and cannot explain.

The consequence is predictable. Organisations approve investment based on narrow cost estimates, then discover in year two that infrastructure, people, governance, and integration obligations dwarf the original vendor invoice. The TCO framework exists to stop that pattern before it starts: to give leaders a complete picture of what a production AI capability actually costs, across all seven layers, before capital is committed.

The 7-layer AI cost stack

The stack shows where burden sits, how it grows, and why two deployments with similar model cost can have very different enterprise economics.

Layer 1

Infrastructure

Illustrative share: 30% to 45% of total AI TCO for a cloud-heavy enterprise estate. This layer expands quickly with concurrency, sovereignty, resilience, and GPU intensity.

Includes

  • accelerated compute and GPU capacity
  • storage, networking, and runtime services
  • clusters, containers, and platform environments
  • regional redundancy and resilience

Layer 2

Data and context

Illustrative share: 10% to 18% of total AI TCO. This layer often grows when retrieval, metadata quality, and refresh requirements become serious.

Includes

  • data ingestion and transformation
  • retrieval pipelines and vector systems
  • indexing, metadata, and provenance
  • refresh and context-quality controls

Layer 3

Models

Illustrative share: 10% to 20% of total AI TCO. This is the most visible cost layer, but often not the dominant one.

Includes

  • API tokens or requests
  • SaaS AI licence premiums
  • reserved capacity or committed spend
  • fine-tuning or model adaptation usage

Layer 4

Integration and workflow redesign

Illustrative share: 10% to 20% of total AI TCO. This is where a useful demo becomes a real operating capability.

Includes

  • application design and product changes
  • system integration and API work
  • identity, access, and policy enforcement
  • workflow redesign and adoption effort

Layer 5

People and capability

Illustrative share: 25% to 35% of total AI TCO. AI-specific labour often remains the single most undercounted cost category.

Includes

  • platform engineers and product engineers
  • data and ML specialists
  • AI governance and assurance labour
  • training, enablement, and operating support staff

Layer 6

Governance, safety, and compliance

Illustrative share: 8% to 15% of total AI TCO, often higher in regulated settings.

Includes

  • evaluation and benchmark design
  • red teaming and human review
  • policy, auditability, and compliance support
  • security, legal, and risk coordination

Layer 7

Operations and portfolio oversight

Illustrative share: 5% to 10% of total AI TCO. This layer determines whether AI remains governable as a service and a portfolio.

Includes

  • monitoring and observability
  • incident response and vendor management
  • allocation, showback, and chargeback logic
  • portfolio review and value assurance
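The illustrative share ranges above are not meant to be added up directly: their low ends sum to 98% and their high ends to 163%, so not every layer can sit in the middle of its range at once. One way to turn them into a working starting profile, sketched here under the assumption that each range's midpoint is a fair central estimate, is to rescale the midpoints so they total 100%. The dictionary and function names are illustrative.

```python
# Illustrative share ranges (percent of total AI TCO) as listed for each layer.
LAYER_SHARE_RANGES = {
    "Infrastructure": (30, 45),
    "Data and context": (10, 18),
    "Models": (10, 20),
    "Integration and workflow redesign": (10, 20),
    "People and capability": (25, 35),
    "Governance, safety, and compliance": (8, 15),
    "Operations and portfolio oversight": (5, 10),
}


def normalised_midpoints(ranges):
    """Scale each range's midpoint so that all shares sum to 100."""
    midpoints = {layer: (low + high) / 2 for layer, (low, high) in ranges.items()}
    total = sum(midpoints.values())
    return {layer: 100 * mid / total for layer, mid in midpoints.items()}


shares = normalised_midpoints(LAYER_SHARE_RANGES)
for layer, share in shares.items():
    print(f"{layer}: {share:.1f}%")
```

A real estate should replace these midpoints with measured figures as soon as they exist; the rescaling only keeps an early estimate internally consistent.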

Deployment models and how the stack changes

No deployment model removes the stack. It only redistributes who carries it and which layers dominate.

Comparison of deployment models showing which cost layers dominate the TCO profile

SaaS AI

  • Dominant cost layers: Models, integration, people, governance
  • Primary governance disciplines: Procurement, ITFM, TBM, risk
  • Typical economic pattern: Fastest route to value, but licence premiums and workflow sprawl can accumulate quietly
  • Best fit: Packaged use cases where speed and adoption support matter more than deep control

API consumption

  • Dominant cost layers: Infrastructure, models, integration, operations
  • Primary governance disciplines: FinOps, engineering, product, security
  • Typical economic pattern: Flexible but highly sensitive to usage behaviour, routing, and orchestration design
  • Best fit: Differentiated application workflows where the organisation wants product control

Fine-tuned commercial

  • Dominant cost layers: Models, data and context, evaluation, people
  • Primary governance disciplines: Procurement, FinOps, engineering, risk
  • Typical economic pattern: Higher stewardship cost justified only where domain adaptation materially changes value
  • Best fit: Sensitive or domain-specific use cases with strong quality requirements

Open-source self-hosted

  • Dominant cost layers: Infrastructure, people, operations, governance
  • Primary governance disciplines: FinOps, TBM, engineering, architecture
  • Typical economic pattern: Lower vendor dependence but significantly higher internal platform burden
  • Best fit: Scale, sovereignty, or control requirements with strong platform maturity

Custom-built

  • Dominant cost layers: Infrastructure, people, data, evaluation, governance
  • Primary governance disciplines: SPM, TBM, CFO, engineering leadership
  • Typical economic pattern: Strategic posture rather than a convenience choice; highest long-run burden
  • Best fit: Only where differentiation or control justifies sustained capital and talent commitment

Governance mapping by layer

Each cost layer has a natural primary discipline, but no single discipline can govern the whole stack alone.

  • Layer 1 Infrastructure: FinOps is primary for live consumption governance; TBM helps allocate and translate shared platform cost.
  • Layer 2 Data and context: Engineering and architecture lead design; TBM and ITFM help model the recurring service burden.
  • Layer 3 Models: Procurement and FinOps should jointly govern rate structures, committed use, caching, and provider choice.
  • Layer 4 Integration and workflow redesign: Product, engineering, and SPM govern whether the capability is worth embedding.
  • Layer 5 People and capability: ITFM and SPM are essential for labour planning, staffing, and portfolio sequencing.
  • Layer 6 Governance, safety, and compliance: Risk, security, legal, and service owners govern the control burden.
  • Layer 7 Operations and portfolio oversight: FinOps, TBM, service management, and portfolio leadership need a shared review cadence.
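The allocation role that TBM plays for shared platform cost can be illustrated with a small showback calculation. This is a minimal sketch that assumes shared cost is split in proportion to a single usage driver (GPU-hours here); real allocation models often blend several drivers. The use-case names and all amounts are hypothetical.

```python
def allocate_shared_cost(shared_cost, usage_by_use_case):
    """Split a shared platform cost across use cases in proportion to usage."""
    total_usage = sum(usage_by_use_case.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        name: shared_cost * usage / total_usage
        for name, usage in usage_by_use_case.items()
    }


def fully_loaded_cost(direct_costs, shared_cost, usage_by_use_case):
    """Add each use case's allocated share of platform cost to its direct cost."""
    allocated = allocate_shared_cost(shared_cost, usage_by_use_case)
    return {name: direct_costs[name] + allocated[name] for name in direct_costs}


# Hypothetical annual figures: direct spend per use case and GPU-hours consumed.
direct = {"support_assistant": 120_000, "contract_review": 80_000}
gpu_hours = {"support_assistant": 3_000, "contract_review": 1_000}

loaded = fully_loaded_cost(direct, shared_cost=400_000, usage_by_use_case=gpu_hours)
for name, cost in loaded.items():
    print(f"{name}: direct {direct[name]:,} -> fully loaded {cost:,.0f}")
```

The point of the exercise is comparison: a use case that looks inexpensive on direct spend can look very different once its share of the platform is attached.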

TCO traps practitioners recognise

The most common traps are not arithmetic mistakes. They are recurring structural misunderstandings.

  1. The Jevons paradox: as per-token cost falls, consumption often rises faster than the price declines, so total spend can still grow. This is especially true of agentic systems, which can consume five to thirty times more tokens per task.
  2. Shadow AI: IDC reports that more than half of AI tool spending sits outside formal IT budgets, which makes enterprise TCO systematically incomplete.
  3. The inference iceberg: at scale, cumulative inference cost can quickly overtake training cost, and the ongoing service burden becomes the main cost driver.
  4. Shared platform cost socialisation: foundational platform spend is spread across the estate in ways that make individual use cases look cheaper than they are.
  5. Governance overhead blindness: red teaming, security review, human oversight, and compliance work are real cost layers, not optional extras.
  6. False build-versus-buy comparisons: leaders compare vendor price to infrastructure price while ignoring people, operations, and control requirements.
  7. Pilot-era denominators: business cases stay anchored to early assumptions long after the service has acquired production obligations.
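The first trap is easy to check with arithmetic. The sketch below assumes spend per task is simply per-token price multiplied by tokens consumed, and uses a hypothetical 80% price fall alongside the five-to-thirty-times token range quoted above. The function names are illustrative.

```python
def spend_ratio(price_drop, token_multiplier):
    """Return new spend divided by old spend for one task.

    price_drop: fractional fall in per-token price (0.8 means 80% cheaper).
    token_multiplier: how many times more tokens the task now consumes.
    """
    if not 0 <= price_drop < 1:
        raise ValueError("price_drop must be in [0, 1)")
    if token_multiplier <= 0:
        raise ValueError("token_multiplier must be positive")
    return (1 - price_drop) * token_multiplier


def break_even_price_drop(token_multiplier):
    """Price fall needed for spend per task to stay flat."""
    if token_multiplier < 1:
        raise ValueError("token_multiplier must be at least 1")
    return 1 - 1 / token_multiplier


# Hypothetical 80% price fall, tested against the quoted token range.
for multiplier in (5, 30):
    ratio = spend_ratio(0.80, multiplier)
    needed = break_even_price_drop(multiplier)
    print(f"{multiplier}x tokens: spend x{ratio:.1f}, break-even price fall {needed:.1%}")
```

Under the assumed 80% price fall, a fivefold rise in tokens leaves spend per task flat, while a thirtyfold rise multiplies it by six.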

What this means for enterprise decision-makers

TCO should change how leaders ask questions, not merely how they total numbers.

For CIOs, the implication is direct: architecture choices are economic choices. Selecting a deployment model, a platform vendor, or a context architecture without modelling layers 1 through 7 produces a cost structure that surprises the organisation rather than serving it.

For CFOs, the implication is that AI cannot be governed as a single spend category. It requires a cost model across infrastructure, integration, people, and governance before it can be compared against other investment options in credible terms.

For Heads of Engineering, the message is that product and workflow design are now cost design. Token consumption, context length, fallback patterns, and model routing are cost decisions with material budget consequences at scale.
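To show how those design parameters translate into budget, here is a minimal estimator. It assumes per-million-token pricing with separate input and output rates, and models a fallback as one full repeat of the request on the same model. The prices, model names, and volumes are hypothetical and not taken from any vendor rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical prices per million tokens."""
    input_per_million: float
    output_per_million: float


def monthly_cost(requests, input_tokens, output_tokens, routing, prices,
                 fallback_rate=0.0):
    """Estimate monthly model spend.

    routing maps model name -> fraction of requests sent to it (must sum to 1).
    fallback_rate is the fraction of requests that are repeated once in full.
    """
    if abs(sum(routing.values()) - 1.0) > 1e-9:
        raise ValueError("routing fractions must sum to 1")
    total = 0.0
    for model, fraction in routing.items():
        price = prices[model]
        per_request = (
            input_tokens * price.input_per_million
            + output_tokens * price.output_per_million
        ) / 1_000_000
        total += requests * fraction * (1 + fallback_rate) * per_request
    return total


PRICES = {
    "small": ModelPrice(input_per_million=0.5, output_per_million=1.5),
    "large": ModelPrice(input_per_million=5.0, output_per_million=15.0),
}

all_large = monthly_cost(1_000_000, 2_000, 500, {"large": 1.0}, PRICES)
routed = monthly_cost(1_000_000, 2_000, 500, {"small": 0.8, "large": 0.2}, PRICES)
long_context = monthly_cost(1_000_000, 4_000, 500, {"large": 1.0}, PRICES)

print(f"All requests on large model: {all_large:,.0f}")
print(f"80/20 routing small/large:   {routed:,.0f}")
print(f"Doubled context, large only: {long_context:,.0f}")
```

Varying one parameter at a time in this way makes it visible which design decision, routing, context length, or fallback behaviour, moves the budget most for a given workload.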

For FinOps, TBM, ITFM, and SPM teams, the practical task is to connect variable consumption, shared capability investment, and portfolio proof in one management model. No single discipline spans all seven layers. Collaboration across them is the governance design choice, not an optional alignment exercise.
