The First 30 Days of AI Value Management

A field sequence, not a framework. Assumes a mid-to-large organisation with AI spend already running and nobody formally owning its economics. Everything here is doable with finance extracts, vendor portals and a spreadsheet; no tooling purchase required in month one.

The urgency is real. Uber reported exhausting its annual AI coding budget within four months, and the COO said the company could not yet draw a clear line between usage and value. The Wall Street Journal separately reported enterprise AI rationing as costs rose. The organisations that avoid this pattern will be those that built the value meter while the cost meter was still small.

Week 1: Find the money

Day 1-2. Name an owner. One person accountable for AI economics across the estate, with a mandate letter from the CFO or CIO. Not a committee. The rest of this playbook fails without this step, because every later action needs someone empowered to ask awkward questions across budget lines.

Day 2-5. Build the spend inventory. Every AI cost, in one register, in four categories:

Direct consumption - API and token spend, by provider and team
Licensed seats - copilots, coding assistants, AI tools with per-seat pricing
Embedded tiers - AI components inside SaaS contracts (ask vendors for the AI-attributable share in writing; log refusals as "opaque")
Infrastructure - GPU, hosting, model platform costs inside the cloud bill

Expect this to be harder than it sounds and revealing in itself: duplicate tools, departmental contracts finance has never categorised as AI, embedded tiers nobody chose. The total is your first published number. Most organisations have never seen it.
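As a rough illustration, the register can start life as a handful of typed rows before it ever reaches a spreadsheet. The sketch below assumes a minimal record shape (RegisterLine, Category and summarise are hypothetical names, and every figure is invented); it keeps opaque embedded tiers visible rather than silently counting them as zero.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Category(Enum):
    DIRECT_CONSUMPTION = "direct consumption"
    LICENSED_SEATS = "licensed seats"
    EMBEDDED_TIERS = "embedded tiers"
    INFRASTRUCTURE = "infrastructure"


@dataclass
class RegisterLine:
    vendor: str                    # hypothetical vendor label
    team: str
    category: Category
    monthly_cost: Optional[float]  # None = vendor would not state the AI share


def summarise(register):
    """Return (spend by category, total known spend, opaque vendors)."""
    by_category = defaultdict(float)
    opaque = []
    for line in register:
        if line.monthly_cost is None:
            opaque.append(line.vendor)  # logged as "opaque", excluded from totals
        else:
            by_category[line.category] += line.monthly_cost
    return dict(by_category), sum(by_category.values()), opaque


# Illustrative figures only, not taken from any real organisation.
register = [
    RegisterLine("ProviderA API", "Platform", Category.DIRECT_CONSUMPTION, 42000.0),
    RegisterLine("ProviderB API", "Support", Category.DIRECT_CONSUMPTION, 8000.0),
    RegisterLine("CodeAssist", "Engineering", Category.LICENSED_SEATS, 18000.0),
    RegisterLine("CRM suite AI tier", "Sales", Category.EMBEDDED_TIERS, None),
    RegisterLine("Cloud GPU", "Data", Category.INFRASTRUCTURE, 25000.0),
]

by_category, total, opaque = summarise(register)
print(f"Known monthly AI spend: {total:,.0f}")
print(f"Opaque lines: {opaque}")
```

The total here is a floor, not the true figure: every opaque line is spend that exists but cannot yet be sized, which is worth stating next to the published number.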

Real-world pattern: In 2026, Marc Benioff said Salesforce expected to spend about $300 million on Anthropic usage, largely for coding-related work.

Day 5. Pull the consumption telemetry. Provider dashboards, seat-activity reports, cloud cost tags. You are establishing what is measurable today, not building anything yet.

Week 2: Find the usage and the owners

Day 6-8. Map spend to teams and use cases. For each register line: who uses it, for what workflow, and who claimed it would deliver what. Where there was a business case, attach it. Where there wasn't, record "no business case" without ceremony - the gaps are findings, not accusations.

Day 8-10. Measure real adoption. Active users versus paid seats per tool. Repeat use, not log-ins. Flag every contract where adoption is under 40% of seats - renewal leverage later.
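One way to make "repeat use, not log-ins" concrete is sketched below. The 40% floor comes from the step above; the definition of a repeat user (active on four or more days in the period) is an assumption chosen for illustration, as are the tool names and activity counts.

```python
ADOPTION_FLOOR = 0.40  # from the playbook: flag contracts under 40% of seats
MIN_ACTIVE_DAYS = 4    # assumption: "repeat use" means active on 4+ days in the period


def adoption_rate(paid_seats, active_days_per_user, min_active_days=MIN_ACTIVE_DAYS):
    """Share of paid seats held by repeat users (denominator is seats, not log-ins)."""
    if paid_seats <= 0:
        raise ValueError("paid_seats must be positive")
    repeat_users = sum(1 for days in active_days_per_user if days >= min_active_days)
    return repeat_users / paid_seats


def flag_low_adoption(tools, floor=ADOPTION_FLOOR):
    """Return {tool: adoption rate} for every tool below the floor."""
    flagged = {}
    for name, (paid_seats, active_days) in tools.items():
        rate = adoption_rate(paid_seats, active_days)
        if rate < floor:
            flagged[name] = rate
    return flagged


# Hypothetical tools: (paid seats, active days per seat holder in the period).
tools = {
    "AssistantX": (10, [0, 1, 12, 15, 2, 0, 9, 0, 0, 3]),
    "CoderY": (5, [20, 18, 3, 10, 7]),
}

flagged = flag_low_adoption(tools)
print(flagged)
```

Dividing by paid seats rather than by users who logged in at least once is the point of the exercise: a seat nobody opened still counts against the contract.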

Adoption reality check: Microsoft reported continued growth in paid Microsoft 365 Copilot seats in early 2026, so the issue is not commercial collapse but value realisation. Independent evaluations, including the UK government trials, found self-reported time savings that did not translate into measured productivity. Seats are bought on potential and kept, or dropped, on measured reality.

Day 10. Identify the overlap. Tools answering the same job (general assistant ×2, coding assistant ×2, embedded copilot doing both). Don't consolidate yet; just establish the duplicate-spend number.
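The duplicate-spend number needs a definition before it can be computed. The sketch below uses one possible convention, stated as an assumption: within each job, everything spent beyond the single largest tool counts as duplicate. It also assumes each tool is tagged with exactly one job, which an embedded copilot covering two jobs would break. Tool names and costs are invented.

```python
from collections import defaultdict


def duplicate_spend(tools):
    """tools: iterable of (name, job, monthly_cost). Returns {job: duplicate spend}."""
    costs_by_job = defaultdict(list)
    for _name, job, cost in tools:
        costs_by_job[job].append(cost)
    # Assumption: spend beyond the largest tool in each job is treated as duplicate.
    return {
        job: sum(costs) - max(costs)
        for job, costs in costs_by_job.items()
        if len(costs) > 1
    }


# Hypothetical register extract.
tools = [
    ("AssistantX", "general assistant", 12000),
    ("AssistantZ", "general assistant", 7000),
    ("CoderY", "coding assistant", 18000),
    ("CoderW", "coding assistant", 5000),
    ("TranscribeQ", "meeting notes", 3000),
]

duplicates = duplicate_spend(tools)
print(duplicates, sum(duplicates.values()))
```

This is an estimate for the review pack, not a consolidation decision; the smaller tool may well be the better one.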

Week 3: Baselines and unit costs

Day 11-15. Capture baselines for the top three use cases by spend. Whatever pre-AI history exists - cycle times, tickets resolved per FTE, drafting turnaround - capture now, before it ages out of systems. For use cases already past rollout with no baseline, record that the productivity claim is untestable and decide whether a control-group comparison is still possible.

Without a pre-rollout baseline, "maybe implicitly there's more that is getting shipped" - the Uber COO's reported phrasing - becomes the strongest claim available. A company that measures almost everything about a ride still struggled to measure this because the value instrumentation was not designed in.

Day 15-18. Define one unit cost per major use case. Examples: cost per accepted code change; cost per resolved ticket (resolved, not deflected); cost per document processed, fully loaded with review time; cost per workflow completed for agentic processes. Imperfect definitions are fine; consistency matters more than precision in month one.
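A fully loaded unit cost is simple arithmetic once the inputs are agreed. The sketch below works through cost per accepted code change, adding human review time to the tool invoice; the function name and every figure are hypothetical, and the hourly rate is an assumed blended cost, not a benchmark.

```python
def cost_per_unit(tool_spend, review_hours, hourly_rate, units_completed):
    """Fully loaded cost per unit: tool invoice plus human review time."""
    if units_completed <= 0:
        raise ValueError("units_completed must be positive")
    return (tool_spend + review_hours * hourly_rate) / units_completed


# Illustrative month for one team (all values invented).
tool_spend = 30000.0      # coding-assistant invoice
review_hours = 400.0      # engineer hours spent reviewing agent output
hourly_rate = 75.0        # assumed blended engineer cost per hour
accepted_changes = 1200   # code changes accepted, not merely generated

invoice_only = tool_spend / accepted_changes
fully_loaded = cost_per_unit(tool_spend, review_hours, hourly_rate, accepted_changes)
print(f"Invoice-only: {invoice_only:.2f}, fully loaded: {fully_loaded:.2f}")
```

In this invented case review time doubles the unit cost. The exact ratio matters less than applying the same definition every month.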

The hidden costs: At Uber, the visible invoice reportedly ran about $500-$2,000 per engineer per month. The hidden costs never appeared on the vendor invoice but belong in cost per unit of shipped work: engineers' time reviewing agent-generated code, rework when agent-committed code needs correction, orchestration engineering, and the opportunity cost of attention shifting to prompt-wrangling.

Day 18-20. Compute attribution coverage, version one. Classify every register line: outcome-linked (a measured result exists), activity-linked (usage data only), unattributed (neither). Publish the three percentages. This is the headline metric the whole discipline improves over time, and version one is allowed to be embarrassing.
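The three percentages can be computed directly from the register once each line carries a classification. The sketch below weights by spend, which is an assumption; counting lines instead is an equally defensible reading, and the two can give very different headlines. Figures are invented.

```python
LINKS = ("outcome-linked", "activity-linked", "unattributed")


def attribution_coverage(lines):
    """lines: iterable of (monthly_cost, link). Returns spend-weighted percentages."""
    totals = {link: 0.0 for link in LINKS}
    for cost, link in lines:
        if link not in totals:
            raise ValueError(f"unknown classification: {link!r}")
        totals[link] += cost
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("register has no spend to classify")
    return {link: round(100 * amount / grand_total, 1) for link, amount in totals.items()}


# Hypothetical classified register lines.
lines = [
    (20000.0, "outcome-linked"),   # a measured result exists
    (42000.0, "activity-linked"),  # usage data only
    (8000.0, "activity-linked"),
    (30000.0, "unattributed"),     # neither
]

coverage = attribution_coverage(lines)
print(coverage)
```

Whichever weighting is chosen for version one, keep it for later versions so that the trend is comparable.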

Attribution coverage is the lead metric because it measures the organisation's ability to connect AI spend to measured outcomes, and the AI value gap is precisely the shortfall in that ability.

Week 4: Forecast, controls, and the first decisions

Day 21-24. Build the first consumption forecast. Monthly, by category, ninety days out, with stated assumptions about adoption and agentic intensity. Set variance triggers: at 15% over forecast, the owner investigates; at 30%, spend approval tightens automatically.
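The two triggers translate into a small check that can run against each month's actuals. This is a sketch only: the thresholds are the ones stated above, while the function name, the action labels and the treatment of the boundary (a trigger fires at exactly 15% or 30%) are assumptions.

```python
INVESTIGATE_AT = 0.15  # from the playbook: owner investigates at 15% over forecast
TIGHTEN_AT = 0.30      # from the playbook: spend approval tightens at 30%


def variance_action(forecast, actual):
    """Map a month's actual spend against forecast to the agreed action."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    overrun = (actual - forecast) / forecast
    if overrun >= TIGHTEN_AT:
        return "tighten approvals"
    if overrun >= INVESTIGATE_AT:
        return "owner investigates"
    return "within forecast"


# Illustrative months against a forecast of 100,000.
for actual in (110000, 120000, 135000):
    print(actual, variance_action(100000, actual))
```

Underspend is deliberately not flagged here; whether a large shortfall against forecast also deserves a trigger is a choice for the owner.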

Agentic workflows multiply tokens per task - an agent iterating on a codebase resends growing context with every step, so the marginal step costs more than the average step and the cost of a task grows faster than its length. This is why static annual budgets fail: per-token prices may fall, but agentic systems consume so many more tokens per task that total spend rises anyway.
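The growing-context effect is easy to see with toy numbers. The sketch below assumes a deliberately simple model, not any vendor's actual billing: the agent resends its whole context on every step, and each step adds a fixed number of tokens. Base context, tokens added per step and step count are all invented.

```python
def input_tokens_per_step(base_context, added_per_step, steps):
    """Input tokens sent at each step when the full context is resent every time."""
    return [base_context + i * added_per_step for i in range(steps)]


# Assumed toy figures: 10k-token starting context, 2k tokens added per step.
per_step = input_tokens_per_step(base_context=10_000, added_per_step=2_000, steps=20)

total_tokens = sum(per_step)
average_step = total_tokens / len(per_step)
last_step = per_step[-1]

print(f"Total input tokens over the run: {total_tokens:,}")
print(f"Average step: {average_step:,.0f}, final step: {last_step:,}")
```

Under this model total input tokens grow roughly with the square of the step count, so doubling the length of a run far more than doubles its cost. Provider features such as prompt caching can soften this, which is one more reason to forecast from measured consumption rather than from list prices.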

Day 24-26. Set guardrails where consumption is unbounded. Per-task and per-run budgets on any autonomous agent, with automatic halt. Per-user anomaly alerts on coding tools (investigate the top decile before capping it - your best users and your wasteful ones both live there).
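A per-run budget with automatic halt can be very small. The sketch below assumes the cost of each agent step can be estimated before it runs, which may not hold for every platform; RunBudget, BudgetExceeded and run_agent are hypothetical names, and amounts are held in integer cents to avoid rounding surprises.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a step would take the run past its budget."""


class RunBudget:
    def __init__(self, limit_cents):
        self.limit_cents = limit_cents
        self.spent_cents = 0

    def charge(self, cost_cents):
        # Refuse the step before spending, so the run never exceeds its limit.
        if self.spent_cents + cost_cents > self.limit_cents:
            raise BudgetExceeded(
                f"step of {cost_cents} would exceed limit of {self.limit_cents}"
            )
        self.spent_cents += cost_cents


def run_agent(step_costs_cents, budget):
    """Run steps until done or halted. Returns (steps completed, halted?)."""
    completed = 0
    for cost in step_costs_cents:
        try:
            budget.charge(cost)
        except BudgetExceeded:
            return completed, True  # automatic halt
        completed += 1              # the real agent step would execute here
    return completed, False


# Illustrative run: a 2.00 budget and steps that get more expensive.
budget = RunBudget(limit_cents=200)
completed, halted = run_agent([40, 60, 90, 120], budget)
print(completed, halted, budget.spent_cents)
```

A halt should be logged with the task and owner attached; a run that keeps hitting its ceiling is a finding for the monthly review, not just an error.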

A standing rule: no consumption-based leaderboards or usage targets, anywhere, ever.

The leaderboard trap: Uber reportedly ran internal leaderboards ranking AI coding tool usage competitively. Once usage is ranked, people optimise for the ranking, and usage data stops telling you much about productivity. Yet usage data was exactly what the company then needed to justify the spend it had encouraged. Some share of consumption may have become performative, buying status rather than value.

Day 26-28. Fix the procurement posture. Standard clauses from now on: AI-attributable price component stated in every SaaS contract; notice period for pricing-model changes (Cursor's June 2025 repricing is the cautionary precedent); usage data export rights; no auto-renewal of AI tiers with sub-40% adoption.

Day 28-30. Hold the first value review and publish the pack. One hour, CFO or delegate in the room. Contents: total AI spend and trend; adoption versus seats; duplicate-spend estimate; attribution coverage v1; top-three unit costs; forecast with triggers; three decisions (typically: one tool to consolidate, one use case to baseline properly, one contract to renegotiate).

Schedule it monthly. The pack is the discipline; the meeting is just where it becomes visible.

What you have at day 30

A named owner. A total spend number. An adoption-versus-paid map. Baselines for the biggest use cases. Three unit costs. Attribution coverage, version one. A forecast with tripwires. Agent guardrails. Procurement clauses. A monthly review with decision authority.

This is not AI Value Management complete. This is AI Value Management started, with the minimum viable governance to prevent the Uber pattern while building the data foundation for everything that follows.

What this deliberately leaves for later

Chargeback and showback models (month 2-3, once allocation data is trustworthy). Tooling selection for AI cost management (buy after you know your requirements, not before). Portfolio-level investment reprioritisation (needs two or three review cycles of data). Value realisation audits of pre-existing business cases (month 3, with the baseline question settled).

The three failure modes of month one

The inventory stalls in pursuit of completeness. Ship the 80% version at day 5; the register is a living document.
The exercise becomes a policing project. The owner's posture is economist, not auditor - teams that fear the data will starve it.
Measurement becomes the deliverable. The day-30 review must make three real decisions, or the organisation learns that AI Value Management produces packs rather than consequences.

Real-world precedents

When AI agents fail economics: Starbucks retired or replaced its inventory management AI agent after reported accuracy problems including miscounts and mislabelling. AI Economics Hub reads this as an economic exit: a capability retired once output quality no longer justified its operating burden. There is no public confirmation of routine manager overrides or of a formal ROI analysis, so any account that includes them should be treated as inference, not fact.

When trivial tasks game the system: Usage telemetry analysed by task type and outcome, not just volume, can reveal patterns where AI tools are used for low-value work to meet adoption targets or demonstrate engagement. This is a known risk in any usage-incentivised system.

Where this fits in the broader discipline

This 30-day sequence addresses the bottom two rungs of the five-level distinction ladder: activity (what is being used) and adoption (who is using it, how much). Productivity measurement, value realisation, and strategic impact assessment require the data foundation this playbook builds, but they are month 2-6 work, not month 1.

The FinOps Foundation's 2025 guidance on AI scopes emphasises that AI cost management must extend beyond infrastructure to include model and inference costs, data and pipeline costs, and governance costs. This playbook's four-category spend inventory maps directly to that expanded scope.

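The NIST AI Risk Management Framework makes governance one of its core functions and expects accountability for AI to be assigned at the organisational level. Read that way, the economic sustainability of AI is a governance concern, not just a finance concern. A named owner with CFO or CIO mandate (day 1-2) positions AI value management as governance, which is where it belongs.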

References and further reading

BCG, The Widening AI Value Gap: Build for the Future, 2025
BCG, From Potential to Profit: Closing the AI Impact Gap, 2024
FinOps Foundation, FinOps for AI: Scopes and Capabilities, 2025
AWS, Closing the AI Value Gap, 2024
UK Government Digital Service, Microsoft 365 Copilot Experiment: Cross-Government Findings Report, 2024, https://www.gov.uk/government/publications/microsoft-365-copilot-experiment-cross-government-findings-report
NIST, AI Risk Management Framework, 2023
Fortune, "Uber COO: AI spending on tokens like Claude Code is hard to justify", 26 May 2026, https://fortune.com/2026/05/26/uber-coo-ai-spending-tokens-claude-code/
Wall Street Journal, "Corporate America Is Starting to Ration AI as Cost Skyrockets", May 2026