Rent, Reserve or Own Intelligence?

The enterprise infrastructure question is no longer simply cloud or on-premises. It is how much intelligence capacity to rent, reserve, manage or own for each workload.

Figure: Sourcing modes compared across decision factors. The unit of optimisation is the workload.

From software sourcing to intelligence sourcing

Enterprises once chose whether to buy software or build it.

They now choose how to source machine intelligence:

Packaged SaaS
Public API
Reserved model or throughput capacity
Hyperscaler managed service
Neocloud capacity
Private cloud
Colocation
Owned AI factory
Edge or device inference
Hybrid combinations

The same model capability can arrive through radically different economic structures.

Four sourcing modes

Rent

Pay per token, request, action or usage.

Best for:

Pilots
Uncertain demand
Low volume
Rapid access
Model experimentation
Variable workloads

Risks:

Price exposure
Limited infrastructure control
Opacity
Vendor dependency
Data and residency constraints

Reserve

Commit to capacity, throughput or spend for a period.

Best for:

Growing but still variable workloads
Predictable base demand plus bursts
Need for discount without infrastructure ownership
Enterprise service levels

Risks:

Commitment waste
Forecast error
Model change during contract
Lock-in
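
The trade-off between discount and commitment waste can be illustrated with a small sketch. The function name, the prices and the rule that usage above the commitment is billed at the on-demand rate are all assumptions made for illustration; real contracts differ.

```python
def reserved_month(committed_m: float, used_m: float,
                   reserved_price_per_m: float,
                   on_demand_price_per_m: float) -> dict:
    """Compare one month of reserved spend with pure pay-per-token spend.

    Volumes are in millions of tokens, prices per million tokens.
    Assumption: usage above the commitment is billed at the on-demand price.
    """
    overage_m = max(0.0, used_m - committed_m)
    unused_m = max(0.0, committed_m - used_m)
    return {
        "reserved_cost": committed_m * reserved_price_per_m
                         + overage_m * on_demand_price_per_m,
        "on_demand_cost": used_m * on_demand_price_per_m,
        # Spend on committed capacity that was never used.
        "commitment_waste": unused_m * reserved_price_per_m,
    }


# Illustrative figures only: 1,000M tokens committed at 3.0 versus 5.0 on demand.
low_usage = reserved_month(1000, 500, 3.0, 5.0)
high_usage = reserved_month(1000, 1200, 3.0, 5.0)
```

Under these assumed prices the commitment only pays off once usage passes 60% of the committed volume; below that, forecast error turns the discount into waste.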

Managed capacity

Use dedicated or specialised infrastructure operated by a provider.

Best for:

Performance-sensitive workloads
Regional or security requirements
Elastic access with more control
Organisations lacking facilities capability

Risks:

Shared responsibility complexity
Provider concentration
Contract and capacity constraints
Migration difficulty

Own

Acquire and operate infrastructure directly or through colocation.

Best for:

Large predictable demand
High utilisation
Material latency or sovereignty needs
Stable model strategy
Strong platform capability

Risks:

Capital
Stranded capacity
Rapid obsolescence
Power and cooling
Staffing
Integration
Slower deployment
Refresh cycles

What Deloitte's model shows

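Deloitte models API, neocloud and self-hosted AI-factory economics across rising token volume.

The useful lesson is the shape of the curve: pay-per-token pricing is cheapest at low volume, while capacity with a high fixed cost can become cheaper per token once volume and utilisation are high enough.

The dangerous lesson is treating the model's 84 billion token crossover as a universal threshold.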
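
A crossover of this kind can be sketched with two cost lines: a pay-per-token line and a fixed-cost line with a smaller marginal cost. The helper `crossover_tokens` and every figure below are illustrative assumptions, not values from Deloitte's model.

```python
def crossover_tokens(fixed_monthly_cost: float,
                     api_price_per_m: float,
                     owned_marginal_per_m: float) -> float:
    """Monthly token volume at which owned capacity matches API spend.

    Solves api_price * v = fixed + owned_marginal * v for v.
    Prices are per million tokens; the result is in tokens.
    Returns infinity when owning never becomes cheaper.
    """
    saving_per_m = api_price_per_m - owned_marginal_per_m
    if saving_per_m <= 0:
        return float("inf")
    return fixed_monthly_cost / saving_per_m * 1_000_000


# Illustrative assumptions: 200,000 fixed per month, 1.0 marginal cost per million.
base = crossover_tokens(200_000, 5.0, 1.0)         # API at 5.0 per million
cheaper_api = crossover_tokens(200_000, 3.0, 1.0)  # API at 3.0 per million
```

With these assumed inputs, a drop in API price from 5.0 to 3.0 per million tokens doubles the crossover from 50 billion to 100 billion tokens a month. Any single published threshold is only as stable as the inputs behind it.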

Why the threshold moves

Model and quality

A smaller open model and a frontier reasoning model do not provide equivalent work.

Input and output mix

Output tokens are typically priced differently from input and cached tokens, so two workloads with the same total token volume can have very different costs.
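
A minimal sketch of the effect, using hypothetical per-million prices for input, output and cached tokens. The price levels and the two workload mixes are assumptions chosen only to show the arithmetic.

```python
def blended_cost(input_m: float, output_m: float, cached_m: float,
                 price_in: float, price_out: float, price_cached: float) -> float:
    """Total cost for a token mix; volumes in millions, prices per million."""
    return input_m * price_in + output_m * price_out + cached_m * price_cached


# Hypothetical prices per million tokens.
PRICE_IN, PRICE_OUT, PRICE_CACHED = 3.0, 15.0, 0.5

# Two workloads, each 100 million tokens in total.
summarisation = blended_cost(60, 10, 30, PRICE_IN, PRICE_OUT, PRICE_CACHED)
generation = blended_cost(10, 90, 0, PRICE_IN, PRICE_OUT, PRICE_CACHED)
```

Under these assumed prices the output-heavy workload costs four times as much as the input-heavy one (1,380 versus 345) for the same token count.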

Utilisation

Owned infrastructure wins only if it is productively used.

Demand shape

Steady demand supports capacity ownership. Spiky demand supports renting.
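
Utilisation and demand shape are linked: capacity sized for the peak sits idle whenever demand is below it. The sketch below assumes capacity is sized exactly to peak demand and that all owned cost is fixed; the demand profiles and cost figures are illustrative.

```python
def utilisation_if_sized_for_peak(demand: list[float]) -> float:
    """Average utilisation when capacity equals the highest observed demand."""
    peak = max(demand)
    if peak <= 0:
        raise ValueError("demand must contain a positive peak")
    return sum(demand) / len(demand) / peak


def owned_cost_per_m(fixed_monthly_cost: float, capacity_m_tokens: float,
                     utilisation: float) -> float:
    """Effective cost per million tokens actually served."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return fixed_monthly_cost / (capacity_m_tokens * utilisation)


# Illustrative demand profiles, in arbitrary units per period.
steady = [80, 100, 90, 100, 80, 90]
spiky = [10, 10, 10, 100, 10, 10]

# Assumed: 200,000 fixed cost per month, capacity of 100,000M tokens per month.
steady_cost = owned_cost_per_m(200_000, 100_000, utilisation_if_sized_for_peak(steady))
spiky_cost = owned_cost_per_m(200_000, 100_000, utilisation_if_sized_for_peak(spiky))
```

The same hardware serves the steady profile at 90% utilisation and the spiky one at 25%, so the spiky workload pays well over three times as much per token served.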

Hardware generation

Performance and energy efficiency change rapidly.

Contract terms

Reserved API or neocloud pricing can change the comparison.

Full cost

Deloitte's model excludes storage and egress, security, one-time integration and staffing. An enterprise decision must include them.

Risk and sovereignty

Control may justify higher nominal cost.

Exit and optionality

The ability to switch models or providers has economic value.

Workload decision profile

For each material workload, record:

Business outcome
Criticality
Volume
Growth
Seasonality
Context size
Reasoning depth
Latency
Quality threshold
Data sensitivity
Residency
Integration
Availability
Model portability
Human review
Expected lifespan
Full cost
Exit requirement
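
One way to keep these records consistent is a typed structure with one field per item in the list above. The field names, types and units here are assumptions for illustration; an enterprise would substitute its own scales and vocabularies.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """One record per material workload; units and scales are assumed."""
    business_outcome: str
    criticality: str              # e.g. "low", "medium", "high"
    monthly_tokens_m: float       # volume, in millions of tokens
    annual_growth: float          # 0.5 means +50% a year
    seasonality: str
    context_size_tokens: int
    reasoning_depth: str
    latency_ms: int               # target response latency
    quality_threshold: float      # minimum acceptable success rate
    data_sensitivity: str
    residency: str
    integration: str
    availability: float           # e.g. 0.999
    model_portability: str
    human_review: bool
    expected_lifespan_months: int
    full_cost_monthly: float
    exit_requirement: str


# Hypothetical example workload.
claims_triage = WorkloadProfile(
    business_outcome="Faster claims triage", criticality="high",
    monthly_tokens_m=1200.0, annual_growth=0.4, seasonality="winter peak",
    context_size_tokens=16_000, reasoning_depth="medium", latency_ms=2000,
    quality_threshold=0.95, data_sensitivity="personal data", residency="EU",
    integration="claims platform", availability=0.999,
    model_portability="prompt-level", human_review=True,
    expected_lifespan_months=36, full_cost_monthly=9000.0,
    exit_requirement="90 days",
)
```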

Decision matrix

Sourcing mode comparison across key decision factors
Factor	Rent	Reserve	Managed capacity	Own
Demand uncertainty	Strongest	Medium	Medium	Weakest
Speed to start	Strongest	Strong	Strong	Weakest
Unit cost at high utilisation	Weakest	Medium	Strong	Potentially strongest
Capital exposure	Lowest	Low	Low-medium	Highest
Control	Lowest	Medium	High	Highest
Model flexibility	High initially	Medium	Medium	Depends on stack
Sovereignty	Low-medium	Medium	High	Highest
Obsolescence risk	Provider	Shared	Provider/shared	Enterprise
Skills burden	Lowest	Low	Medium	Highest

A hybrid portfolio

A sensible enterprise portfolio may contain:

SaaS for commoditised capabilities
APIs for exploration
Reserved capacity for scaled shared services
Managed regional infrastructure for sensitive workloads
Owned capacity for stable, material workloads
Edge models for latency and privacy

The value-aware crossover

A cost-only crossover asks:

"When is owned capacity cheaper per token?"

A value-aware crossover asks:

"When does a sourcing model produce the best cost per successful outcome, at the required quality, latency, control and resilience?"

A cheaper stack may deliver:

Lower quality
Slower iteration
More engineering burden
Less model choice
Delayed market entry

A more expensive API may create strategic value through speed and optionality.
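
The difference between the two questions can be made concrete by dividing cost per task by the rate at which tasks succeed at the required quality. The function and the figures are illustrative assumptions, and the sketch ignores latency, control and resilience.

```python
def cost_per_successful_outcome(cost_per_task: float, success_rate: float) -> float:
    """Cost of one task that meets the quality bar, counting failed attempts."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


# Illustrative figures: the owned stack is cheaper per task but succeeds less often.
owned_stack = cost_per_successful_outcome(cost_per_task=0.02, success_rate=0.5)
premium_api = cost_per_successful_outcome(cost_per_task=0.03, success_rate=0.9)
```

Here the stack that is a third cheaper per task ends up more expensive per successful outcome, because half of its attempts have to be redone or discarded.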

Governance

Quarterly capacity review

Demand versus forecast
Utilisation
Unit cost
Quality
Workload placement
Vendor changes
Obsolescence
Sovereignty
Exit readiness

Trigger events

Sustained utilisation threshold
Cost crossover
Material price change
Model change
Regulation
Acquisition
New data class
Service incident
Major demand expansion
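
A trigger such as "sustained utilisation threshold" needs a precise definition before it can be monitored. One possible definition, assumed here, is utilisation at or above a threshold for a set number of consecutive months; the threshold and window are hypothetical.

```python
def sustained_above(values: list[float], threshold: float, months: int) -> bool:
    """True if `values` stays at or above `threshold` for `months` readings in a row."""
    run = 0
    for value in values:
        run = run + 1 if value >= threshold else 0
        if run >= months:
            return True
    return False


# Illustrative monthly utilisation readings.
rising = [0.50, 0.70, 0.75, 0.80]
volatile = [0.80, 0.50, 0.80, 0.50]

review_rising = sustained_above(rising, threshold=0.70, months=3)
review_volatile = sustained_above(volatile, threshold=0.70, months=3)
```

Requiring consecutive months keeps a single spike from triggering a sourcing review, while a genuine upward trend does.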

Future tool

A "Rent, Reserve or Own Intelligence" calculator should include:

Inputs:

Monthly input, output and cached tokens
Growth
Peak-to-average ratio
Model classes
Target utilisation
API and capacity price
Hardware and facilities
Staffing
Storage and network
Security and compliance
Capital cost
Refresh period
Migration and exit cost
Quality adjustment

Outputs:

Three-year TCO range
Crossover range
Utilisation sensitivity
Risk-adjusted recommendation
Caveats
Hybrid split
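
A heavily simplified core of such a calculator might look like the sketch below. It compares only rent and own, uses a single token price, and leaves out staffing, refresh, migration and quality adjustment; all names and figures are assumptions for illustration.

```python
def three_year_tco(monthly_tokens_m: float, monthly_growth: float,
                   api_price_per_m: float, capex: float,
                   fixed_monthly_opex: float,
                   owned_marginal_per_m: float) -> tuple[float, float]:
    """Return (rent_tco, own_tco) over 36 months with compounding demand growth."""
    rent = 0.0
    own = float(capex)
    volume = monthly_tokens_m
    for _ in range(36):
        rent += volume * api_price_per_m
        own += fixed_monthly_opex + volume * owned_marginal_per_m
        volume *= 1 + monthly_growth
    return rent, own


# Illustrative inputs; running a low and a high growth case gives a range.
flat_rent, flat_own = three_year_tco(1000, 0.0, 5.0, 100_000, 2_000, 1.0)
growth_rent, growth_own = three_year_tco(1000, 0.05, 5.0, 100_000, 2_000, 1.0)
```

With flat demand, renting is cheaper over three years under these assumptions (180,000 versus 208,000); with 5% monthly growth the ordering reverses. Reporting both cases, rather than a single number, is what turns the output into a range.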

Conclusion

The right question is not whether owned AI is cheaper than an API.

It is which sourcing model best matches the workload's economics, risk and strategic importance, and how that answer changes as demand evolves.

Sources and further reading

Deloitte, "The pivot to tokenomics: Navigating AI's new spend dynamics", pp. 6-19; model limitations, pp. 25-26
FinOps Foundation, "GenAI FinOps: How Token Pricing Really Works"
European Commission, "AI Factories"
AI TCO Framework
Token Economics
Inference Cost Crisis
