A field sequence, not a framework. Assumes a mid-to-large organisation with AI spend already running and nobody formally owning its economics.
The urgency is real.
Week 1: Find the money
Day 1-2. Name an owner.
Day 2-5. Build the spend inventory. Every AI cost, in one register, in four categories:
- Direct consumption - API and token spend, by provider and team
- Licensed seats - copilots, coding assistants, AI tools with per-seat pricing
- Embedded tiers - AI components inside SaaS contracts (ask vendors for the AI-attributable share in writing; log refusals as "opaque")
- Infrastructure - GPU, hosting, model platform costs inside the cloud bill
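A minimal sketch of such a register, assuming one record per cost line and a monthly cost figure. The field names, the `summarise` helper and all amounts are hypothetical; the four categories and the "opaque" flag for vendor refusals follow the list above.

```python
from dataclasses import dataclass
from typing import Optional

# The four register categories named in the text.
CATEGORIES = (
    "direct_consumption",
    "licensed_seats",
    "embedded_tiers",
    "infrastructure",
)


@dataclass
class RegisterLine:
    name: str
    category: str
    team: str
    # None means the vendor would not state the AI-attributable share ("opaque").
    monthly_cost: Optional[float]


def summarise(lines):
    """Return (total known spend per category, names of opaque lines)."""
    totals = {category: 0.0 for category in CATEGORIES}
    opaque = []
    for line in lines:
        if line.category not in totals:
            raise ValueError(f"unknown category: {line.category}")
        if line.monthly_cost is None:
            opaque.append(line.name)
        else:
            totals[line.category] += line.monthly_cost
    return totals, opaque


# Illustrative figures only; none come from a real organisation.
register = [
    RegisterLine("LLM API", "direct_consumption", "platform", 42000.0),
    RegisterLine("Coding assistant seats", "licensed_seats", "engineering", 15000.0),
    RegisterLine("CRM AI add-on", "embedded_tiers", "sales", None),
    RegisterLine("GPU cluster", "infrastructure", "ml", 30000.0),
]

totals, opaque = summarise(register)
known_total = sum(totals.values())
print(totals, opaque, known_total)
```

Keeping opaque lines in the register with no amount, rather than dropping them, makes the size of the unknown visible next to the known total.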
Real-world pattern: Marc Benioff said Salesforce expected to spend about $300 million on Anthropic tokens in 2026, largely for coding-related work.
Day 5. Pull the consumption telemetry. Provider dashboards, seat-activity reports, cloud cost tags.
Week 2: Find the usage and the owners
Day 6-8. Map spend to teams and use cases. For each register line: who uses it, for what workflow, and who claimed it would deliver what. Where there was a business case, attach it.
Day 8-10. Measure real adoption.
Adoption reality check: Microsoft reported that enterprise Copilot subscription cancellations reached significant levels in early 2026, with organisations citing low adoption and unclear value as primary reasons. The pattern: seats purchased on potential, cancelled on measured reality.
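One way to put a number on adoption versus seats, as a rough sketch. The definition of an "active" seat (for example, used at least once in the last 30 days) is an assumption each organisation has to set for itself, and the function name and figures are hypothetical.

```python
def adoption_summary(paid_seats, active_seats, seat_price):
    """Compare paid seats with seats that show real activity."""
    if paid_seats <= 0:
        raise ValueError("paid_seats must be positive")
    if not 0 <= active_seats <= paid_seats:
        raise ValueError("active_seats must be between 0 and paid_seats")
    idle_seats = paid_seats - active_seats
    return {
        "adoption_rate": active_seats / paid_seats,
        "idle_seats": idle_seats,
        # Monthly spend on seats with no measured activity.
        "idle_monthly_cost": idle_seats * seat_price,
    }


# Illustrative figures: 400 paid seats, 140 active, 30.0 per seat per month.
summary = adoption_summary(paid_seats=400, active_seats=140, seat_price=30.0)
print(summary)
```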
Day 10. Identify the overlap. Tools answering the same job (general assistant ×2, coding assistant ×2, embedded copilot doing both).
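A sketch of the overlap check, assuming each tool has been tagged with the job it answers. The job labels, tool names and costs are made up, and the duplicate-spend estimate uses one crude assumption: within each overlapping job, only the highest-cost tool would be kept.

```python
from collections import defaultdict


def find_overlaps(tools):
    """tools: iterable of (name, job, monthly_cost) tuples."""
    by_job = defaultdict(list)
    for name, job, cost in tools:
        by_job[job].append((name, cost))

    overlaps = {}
    for job, entries in by_job.items():
        if len(entries) > 1:
            costs = [cost for _, cost in entries]
            overlaps[job] = {
                "tools": sorted(name for name, _ in entries),
                # Assumption: everything except the largest line is duplicate spend.
                "duplicate_estimate": sum(costs) - max(costs),
            }
    return overlaps


# Illustrative data only.
tools = [
    ("Assistant A", "general assistant", 12000.0),
    ("Assistant B", "general assistant", 8000.0),
    ("Coder A", "coding assistant", 15000.0),
    ("Coder B", "coding assistant", 5000.0),
    ("Doc pipeline", "document processing", 3000.0),
]

overlaps = find_overlaps(tools)
duplicate_total = sum(o["duplicate_estimate"] for o in overlaps.values())
print(overlaps, duplicate_total)
```

The estimate is an upper bound for discussion, not a savings figure: consolidation rarely removes a whole tool's cost at once.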
Week 3: Baselines and unit costs
Day 11-15. Capture baselines for the top three use cases by spend. Whatever pre-AI history exists - cycle times, tickets resolved per FTE, drafting turnaround - capture it now, before it ages out of systems.
Day 15-18. Define one unit cost per major use case. Examples: cost per accepted code change; cost per resolved ticket (resolved, not deflected); cost per document processed, fully loaded with review time; cost per workflow completed for agentic processes.
The hidden costs: At Uber, the visible API invoice ran $500-$2,000 per engineer per month. The hidden costs - engineers' time reviewing agent-generated code, rework when agent-committed code needs correction, orchestration engineering, the opportunity cost of attention shifting to prompt-wrangling - never appeared on the Anthropic invoice but belong in the cost per unit of shipped work.
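To make the gap between invoice cost and fully loaded cost concrete, here is a small worked sketch for cost per accepted code change. Every input is an illustrative assumption (the invoice figure merely sits inside the range quoted above), and opportunity cost is left out because it is hard to price.

```python
def cost_per_unit(invoice, review_hours, rework_hours, hourly_rate,
                  orchestration_cost, units):
    """Fully loaded monthly cost divided by units of shipped work."""
    if units <= 0:
        raise ValueError("units must be positive")
    hidden = (review_hours + rework_hours) * hourly_rate + orchestration_cost
    return (invoice + hidden) / units


# Illustrative month for one engineer: 1200.0 invoice, 40 accepted changes.
accepted_changes = 40
visible_only = 1200.0 / accepted_changes
fully_loaded = cost_per_unit(
    invoice=1200.0,
    review_hours=6.0,          # time spent checking agent-generated code
    rework_hours=3.0,          # time spent correcting agent-committed code
    hourly_rate=100.0,         # assumed loaded engineering rate
    orchestration_cost=200.0,  # assumed share of orchestration engineering
    units=accepted_changes,
)
print(visible_only, fully_loaded)
```

With these inputs the invoice-only figure is 30.0 per accepted change and the fully loaded figure is 57.5, which is the kind of difference the unit cost is meant to expose.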
Day 18-20. Compute attribution coverage, version one. Classify every register line: outcome-linked (a measured result exists), activity-linked (usage data only), unattributed (neither).
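A sketch of the version-one calculation. It assumes coverage is weighted by spend rather than by number of register lines, which the text does not specify; the helper names and figures are hypothetical.

```python
def classify(has_measured_result, has_usage_data):
    """Apply the three classes in order of evidence strength."""
    if has_measured_result:
        return "outcome-linked"
    if has_usage_data:
        return "activity-linked"
    return "unattributed"


def attribution_coverage(lines):
    """lines: iterable of (name, has_measured_result, has_usage_data, cost)."""
    spend = {"outcome-linked": 0.0, "activity-linked": 0.0, "unattributed": 0.0}
    for _name, has_result, has_usage, cost in lines:
        spend[classify(has_result, has_usage)] += cost
    total = sum(spend.values())
    if total == 0:
        raise ValueError("register has no spend to attribute")
    # Share of total spend in each class.
    return {label: amount / total for label, amount in spend.items()}


# Illustrative register lines only.
lines = [
    ("Coding assistant", True, True, 20000.0),
    ("General assistant seats", False, True, 50000.0),
    ("CRM AI add-on", False, False, 30000.0),
]

coverage = attribution_coverage(lines)
print(coverage)
```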
Week 4: Forecast, controls, and the first decisions
Day 21-24. Build the first consumption forecast. Monthly, by category, ninety days out, with stated assumptions about adoption and agentic intensity.
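A first forecast can be as simple as compounding a stated monthly growth assumption per category. The sketch below does only that; the growth rates and baselines are placeholders to be replaced with the organisation's own assumptions about adoption and agentic intensity.

```python
def forecast(baseline, monthly_growth, months=3):
    """Project monthly spend per category with compound growth.

    baseline: current monthly spend per category.
    monthly_growth: assumed growth rate per category (0.10 means +10% a month).
    """
    projection = {}
    for category, base in baseline.items():
        growth = monthly_growth.get(category, 0.0)
        projection[category] = [
            round(base * (1 + growth) ** month, 2)
            for month in range(1, months + 1)
        ]
    return projection


# Illustrative baselines and assumptions only.
baseline = {"direct_consumption": 42000.0, "licensed_seats": 15000.0}
assumptions = {"direct_consumption": 0.10, "licensed_seats": 0.0}

ninety_day = forecast(baseline, assumptions, months=3)
print(ninety_day)
```

Writing the growth rates down as explicit inputs is the point: when actuals diverge, the review can ask which assumption was wrong.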
Day 24-26. Set guardrails where consumption is unbounded.
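One possible shape for such a guardrail, sketched as a monthly budget with a soft alert and a hard cap. The 80% alert ratio, the three states and the function name are assumptions for illustration, not a prescribed policy.

```python
def check_budget(spent, monthly_cap, alert_ratio=0.8):
    """Return 'ok', 'alert' or 'block' for a team's or agent's monthly spend."""
    if monthly_cap <= 0:
        raise ValueError("monthly_cap must be positive")
    if spent >= monthly_cap:
        return "block"   # hard cap reached: stop or require approval
    if spent >= alert_ratio * monthly_cap:
        return "alert"   # soft threshold: notify the owner
    return "ok"


# Illustrative cap of 1000.0 per month.
for spent in (700.0, 850.0, 1000.0):
    print(spent, check_budget(spent, monthly_cap=1000.0))
```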
The leaderboard trap: Uber ran internal leaderboards ranking AI coding tool usage competitively. Once usage is ranked, usage data stops telling you about productivity, which is exactly the data Uber then needed to justify the spend it had encouraged. Some unknown share of consumption became performative, buying status rather than value.
Day 26-28. Fix the procurement posture. Standard clauses from now on: AI-attributable price component stated in every SaaS contract; notice period for pricing-model changes (Cursor's June 2025 repricing is the cautionary precedent); usage data export rights;
Day 28-30. Hold the first value review and publish the pack. One hour, CFO or delegate in the room. Contents: total AI spend and trend; adoption versus seats; duplicate-spend estimate; attribution coverage v1; top-three unit costs; forecast with triggers; three decisions (typically: one tool to consolidate, one use case to baseline properly, one contract to renegotiate).
What you have at day 30
A named owner. A total spend number. An adoption-versus-paid map. Baselines for the biggest use cases. Three unit costs. Attribution coverage, version one. A forecast with tripwires. Agent guardrails. Procurement clauses. A monthly review with decision authority.
What this deliberately leaves for later
The three failure modes of month one
- The inventory stalls in pursuit of completeness. Ship the 80% version at day 5; the register is a living document.
- The exercise becomes a policing project. The owner's posture is economist, not auditor - teams that fear the data will starve it.
- Measurement becomes the deliverable. The day-30 review must make three real decisions, or the organisation learns that AI Value Management produces packs rather than consequences.
Real-world precedents
When AI agents fail economics: Starbucks retired its inventory management AI agent after discovering the system's recommendations were being routinely overridden by store managers who understood local context the model missed. The agent consumed resources but delivered no measurable improvement over human judgment. The retirement decision came only after establishing baseline performance metrics.
When trivial tasks game the system: Amazon's internal AI usage reportedly included significant volumes of trivial-task gaming, where employees used AI tools for low-value work to meet adoption targets or demonstrate engagement. The pattern emerged only after usage telemetry was analysed by task type and outcome, not just volume.
Where this fits in the broader discipline
References and further reading
- BCG, The Widening AI Value Gap: Build for the Future, 2025
- BCG, From Potential to Profit: Closing the AI Impact Gap, 2024
- FinOps Foundation, FinOps for AI: Scopes and Capabilities, 2025
- AWS, Closing the AI Value Gap, 2024
- UK Government Digital Service, Microsoft 365 Copilot Experiment: Cross-Government Findings Report, 2024, https://www.gov.uk/government/publications/microsoft-365-copilot-experiment-cross-government-findings-report
- NIST, AI Risk Management Framework, 2023
- Fortune, "Uber COO: AI spending on tokens like Claude Code is hard to justify", 26 May 2026, https://fortune.com/2026/05/26/uber-coo-ai-spending-tokens-claude-code/
- Wall Street Journal, "Corporate America Is Starting to Ration AI as Cost Skyrockets", May 2026