When to Stop: The AI Initiative Autopsy

The asymmetry nobody talks about

Every serious AI investment in a large enterprise goes through some form of business case approval. The CFO asks about ROI. The CIO signs off on architecture. Procurement reviews vendor terms. A stage gate may exist before a pilot becomes a production deployment.

What happens when that initiative fails to deliver? In most organisations, the honest answer is: not very much.

This is not primarily a measurement problem. It is an incentive problem. The people who sponsor AI investments are the same people who evaluate whether those investments are working. The teams that build AI systems have professional reputations tied to their success. The governance structures designed to challenge poor-performing investments are the same ones that approved them in the first place. Stopping an AI initiative requires someone in authority to say, in effect, that a prior decision was wrong. Very few governance models create real incentive to do that.

The result is an enterprise AI portfolio that accumulates rather than curates. Organisations get better at adding AI investments than removing them. The good cases are diluted by the persistent ones.

What the economics of inaction actually look like

Consider a realistic pattern from a large financial services group. Over 18 months, the organisation approved 34 AI initiatives across customer service, risk, compliance, and back-office operations. By month 18, four had been formally concluded — two as successes, two as completed pilots that were not scaled. The remaining 30 were all described as "in progress," "building toward value," or "awaiting improved data quality."

A conservative estimate of the annual operating cost of those 30 initiatives — including allocated platform time, engineering support, governance overhead, and model consumption — is between £1.8M and £2.4M per year, excluding the original development investment. The question is not whether any of those 30 initiatives might eventually succeed. The question is whether they all deserve continued capital at the same rate, indefinitely, without an explicit decision point.

At Level 3 maturity and below — which is where most enterprises currently sit — the answer is effectively yes, because there is no portfolio mechanism for comparative challenge. Each initiative defends itself against its own original narrative rather than against competing uses of the same budget.

This is the cost of missing exit governance. It is not dramatic. There is no single large failure event. It is the quiet accumulation of small bad decisions sustained past the point where evidence would justify them.

Real-world examples of successful exits

McDonald's ended its AI drive-thru partnership with IBM in 2024 after a pilot that began in 2021. The voice-ordering system showed technical promise but reportedly failed to demonstrate sufficient accuracy and customer satisfaction improvements to justify scaling. The decision to stop was a governance success, not a failure: it prevented years of additional investment in a capability that wasn't reaching its value threshold.

Starbucks discontinued its AI-powered inventory management system after determining that the complexity and ongoing maintenance costs exceeded the operational benefits. The company explicitly stated that the decision was based on ROI analysis, demonstrating mature exit governance that compared actual results against original business case assumptions.

Three failure patterns that survive because they look like something else

The sunk-cost trap

An AI initiative has consumed significant investment in platform setup, data preparation, and integration work. The workflow is live but under-adopted, and early performance data is mixed. The business case assumed 40% productivity improvement in a loan-processing function; actual improvement is running at around 8-12% and appears to be plateauing.

In most governance reviews, this case survives because the sunk cost (the platform work, the integration) is treated as a reason to continue rather than as a measurement of what has already been spent. The argument becomes: "We've invested too much to stop now." This is a logical fallacy, but it is also an institutionally convenient one because it avoids the decision.

The correct question is not what has been spent. It is what additional capital, at what probability of reaching the original value threshold, would be required to continue — and whether that expected value exceeds the next-best use of the same resource.
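A minimal sketch of that comparison, in Python. It assumes a single estimated probability of reaching the original value threshold and a single payoff if it is reached, which is a deliberate simplification. The class, the function names, and every figure are hypothetical illustrations, not data from the loan-processing case above.

```python
from dataclasses import dataclass


@dataclass
class ContinuationCase:
    """Forward-looking view of one initiative (all fields are illustrative)."""
    sunk_cost: float           # already spent; recorded but never used in the decision
    additional_capital: float  # further spend required to reach the value threshold
    p_reach_threshold: float   # estimated probability of reaching that threshold
    value_if_reached: float    # value realised if the threshold is reached


def expected_net_value(case: ContinuationCase) -> float:
    """Expected value of continuing, net of the additional capital required."""
    return case.p_reach_threshold * case.value_if_reached - case.additional_capital


def should_continue(case: ContinuationCase, next_best_net_value: float) -> bool:
    """Continue only if continuing beats the next-best use of the same resource."""
    return expected_net_value(case) > next_best_net_value


# Hypothetical figures for an under-performing workflow.
loan_processing = ContinuationCase(
    sunk_cost=1_200_000,
    additional_capital=400_000,
    p_reach_threshold=0.25,
    value_if_reached=1_500_000,
)

print(expected_net_value(loan_processing))
print(should_continue(loan_processing, next_best_net_value=100_000))
```

With these assumed figures the expected net value of continuing is negative, so the case fails the comparison however large `sunk_cost` is. The field is carried only to make the point that it never enters the calculation.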

The narrative substitution trap

A generative AI implementation was deployed to improve the quality and speed of analyst report preparation. Adoption is measurable but the original economic case — 30% reduction in preparation time per report — has not been demonstrated after nine months. In response, the business sponsor has shifted the narrative. The initiative is now described as improving "output quality and consistency," "analyst capability development," and "competitive differentiation."

These may be real benefits. But none of them were in the original business case, none are being measured, and none of them justify the original investment level. The narrative has been substituted to make the initiative look successful without actually demonstrating the return that was promised.

Narrative substitution is the most socially difficult failure pattern to challenge because it requires someone in the governance chain to say explicitly: "That is not what we agreed to measure." This puts the governance function in confrontation with the business sponsor — which most governance functions are designed to avoid.

The shared-platform alibi

A customer-facing AI workflow has weak adoption and unclear ROI. The team argues that the value should not be evaluated at the use-case level because the initiative helped establish shared platform capabilities, common integration patterns, and reusable governance infrastructure. These platform benefits, the argument goes, will support multiple future use cases.

This argument is sometimes valid. Shared platform investment does create option value. But it is also the most durable alibi for underperforming initiatives, precisely because platform value is diffuse and hard to attribute cleanly. When every underperforming use case claims credit for platform-building, the portfolio governance function cannot assess where platform investment is genuinely creating reuse and where it is creating an excuse.

Distinguishing governance immaturity from structural value failure

The most important judgment call in AI portfolio management is whether an underperforming initiative is failing because of governance immaturity — weak measurement, poor adoption infrastructure, insufficient baseline — or because the value case is structurally weak.

These are different problems with different correct responses. Governance immaturity might be worth investing through. Structural failure should be stopped.

Four signals point toward structural failure rather than governance immaturity:

1. The value mechanism has not changed despite repeated redesign. If the initiative has been redesigned, reframed, or re-baselined more than once without improving the core value-creation signal, this is more consistent with structural weakness than with governance immaturity. Governance problems respond to governance investment. Structural problems do not.

2. The theoretical value has been demonstrated but not absorbed. Some AI systems demonstrably improve the speed or accuracy of individual tasks, but the business never realises the productivity gain because workflow, headcount, or decision-making structures did not change around the AI capability. This is not a measurement failure. It is a value-capture failure. The technology worked; the operating model did not adapt. This pattern rarely improves without a fundamental redesign of the surrounding process — which is a different and more expensive investment than continuing to fund the AI capability itself.

3. External conditions that justified the case have changed. Model costs, competitive dynamics, regulatory requirements, or organisational priorities may have shifted materially since the original investment decision. An AI use case that was defensible at 2024 inference prices and 2024 compliance requirements may not be defensible at current costs with current obligations. The original business case is not a perpetual justification.

4. Value ownership is genuinely contested. If the business unit that was supposed to realise value has stopped actively supporting the initiative, whether by delegating it to IT, reducing allocated personnel, or simply deprioritising the workflow change that was supposed to capture the AI benefit, this is a practical withdrawal of the value commitment. Continuing to fund a capability that the consuming organisation has effectively abandoned is itself a governance failure.

Rational persistence versus escalation of commitment

The hardest governance question in AI portfolio management is not whether to stop a failing initiative. It is whether to continue funding an initiative that has not yet succeeded but might with more time, better data, or improved adoption support.

This is where most AI governance breaks down. The distinction between rational persistence — continuing to invest because evidence suggests the path to value is credible — and escalation of commitment — continuing to invest because stopping would require admitting failure — is analytically clear but institutionally difficult to apply.

CEO investment intentions: BCG's 2026 AI Radar research found that more than 90% of surveyed CEOs plan to continue investing in AI even where next-year returns are uncertain or absent. Around 82% reported being more optimistic about returns from AI than in the prior year. This rising conviction creates a governance challenge: when leadership confidence is high and investment appetite is strong, portfolio discipline becomes harder to maintain. The risk is not that organisations stop investing in AI too early. The risk is that they continue investing in specific initiatives for too long because the broader strategic conviction makes it difficult to challenge individual cases.

Continue criteria: when persistence is rational

An AI initiative should continue to receive funding when:

1. Uncertainty is being reduced. Each funding cycle produces new evidence that narrows the range of possible outcomes. The initiative may not yet be profitable, but the organisation is learning whether the value mechanism works, what adoption barriers exist, and what cost structure is sustainable. This learning has option value even if the current initiative does not scale.

2. Strategic option value remains credible. The initiative may not deliver immediate ROI, but it creates capabilities, data assets, or organisational learning that enable future opportunities. This option value must be explicit and quantified, not merely asserted. "We're building platform capabilities" is not sufficient. "We're building reusable data pipelines that will reduce the cost of the next three planned use cases by an estimated 40%" is.

3. The next evidence milestone is explicit. The initiative has a clear, time-bound milestone that will produce decisive evidence about whether the value case is achievable. This is not "we'll review progress in six months." This is "by Q2, we will have 200 active users and a measured 15% productivity improvement, or we will redesign or stop."

4. Downside is capped. The organisation has defined the maximum additional investment it is willing to make before requiring positive ROI. This cap should be explicit in the funding approval, not negotiated retrospectively when the initiative underperforms.

5. Foundations are reusable. Even if the specific use case fails, the data preparation, integration work, governance infrastructure, or organisational capability can be redeployed to other initiatives. This reuse potential should be documented, not assumed.

Stop or pause criteria: when persistence becomes escalation

An AI initiative should be stopped or paused when:

1. Assumptions remain unchanged despite new evidence. The business case was built on assumptions about adoption rate, productivity improvement, or cost structure. Those assumptions have been tested in production and found to be incorrect. Yet the initiative continues with the same assumptions rather than being redesigned or stopped. This is escalation of commitment, not learning.

2. Adoption has no accountable owner. The business unit that was supposed to adopt the AI capability has deprioritised it, reduced allocated personnel, or delegated it entirely to IT. Without an active adoption owner with authority and incentive to drive usage, the initiative cannot realise value regardless of technical quality.

3. Benefit cannot be converted. The AI system demonstrably improves task speed or quality, but the organisation cannot convert that improvement into realised value because workflow, headcount, or decision-making structures have not changed. Continuing to fund the AI capability without funding the operating model change that would capture its value is waste.

4. Cost or control requirements exceed value. The governance, monitoring, compliance, or quality-assurance overhead required to operate the AI system safely exceeds the economic benefit it produces. This is particularly common in regulated environments where the cost of control grows as the system scales.

5. Better alternatives exist. The organisation has identified a different approach — a different model, vendor, architecture, or workflow design — that is more likely to achieve the same outcome at lower cost or risk. Continuing the current initiative prevents reallocation of capital to the better alternative.

6. The initiative duplicates shared capability. The organisation has built or is building a shared platform, data layer, or governance infrastructure that makes the initiative's custom implementation redundant. Continuing both creates unnecessary cost and complexity.

Portfolio stop metrics

Organisations with mature AI portfolio governance track:

Percentage of initiatives paused or retired: A healthy portfolio should show regular exits, not only additions. A portfolio with zero stops over 12-18 months is almost certainly carrying underperforming investments.
Capital released: The total budget freed by stopping or pausing initiatives, available for reallocation to higher-value opportunities.
Duplicated spend avoided: The cost of redundant implementations prevented by stopping initiatives that duplicate shared capabilities.
Average time from failed threshold to stop decision: The lag between an initiative missing its agreed performance threshold and the formal decision to stop or redesign. Shorter lags indicate stronger governance discipline.
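As a rough illustration, the four metrics above can be computed from a simple portfolio register. The sketch below is in Python and assumes each initiative records a status, an annual budget, the date it missed its agreed threshold, and the date a stop decision was taken. The field names and sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Initiative:
    name: str
    annual_budget: float
    status: str  # "active", "paused", or "retired"
    threshold_missed_on: Optional[date] = None
    stop_decided_on: Optional[date] = None
    duplicates_shared_capability: bool = False


def portfolio_stop_metrics(portfolio: List[Initiative]) -> Dict[str, Optional[float]]:
    """Compute the four stop metrics for a non-empty portfolio."""
    if not portfolio:
        raise ValueError("portfolio must contain at least one initiative")
    stopped = [i for i in portfolio if i.status in ("paused", "retired")]
    # Lag in days, only where both dates were recorded.
    lags = [
        (i.stop_decided_on - i.threshold_missed_on).days
        for i in stopped
        if i.threshold_missed_on is not None and i.stop_decided_on is not None
    ]
    return {
        "pct_paused_or_retired": 100 * len(stopped) / len(portfolio),
        "capital_released": sum(i.annual_budget for i in stopped),
        "duplicated_spend_avoided": sum(
            i.annual_budget for i in stopped if i.duplicates_shared_capability
        ),
        "avg_days_threshold_to_stop": sum(lags) / len(lags) if lags else None,
    }


# Hypothetical register of four initiatives.
portfolio = [
    Initiative("A", 100_000, "active"),
    Initiative("B", 80_000, "retired", date(2025, 1, 15), date(2025, 4, 15)),
    Initiative("C", 60_000, "paused", date(2025, 3, 1), date(2025, 3, 31),
               duplicates_shared_capability=True),
    Initiative("D", 120_000, "active"),
]

metrics = portfolio_stop_metrics(portfolio)
print(metrics)
```

Treating a stopped initiative's full annual budget as capital released is an assumption. A real register would need to net off wind-down costs and any contractual commitments.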

Finance AI retirement guidance: IBM's 2026 AI in Finance research recommends explicit criteria to pause or retire initiatives, including: adoption below threshold after a defined period, cost per outcome exceeding target, benefit realisation below the business case, better alternatives identified, or a change in strategic priorities. The research emphasises that retirement decisions should be treated as portfolio discipline, not delivery failure. Organisations that stop underperforming initiatives quickly can reallocate capital to higher-value opportunities and maintain portfolio credibility.

The political economy of stopping

Understanding why organisations fail to stop AI initiatives requires understanding who loses from the decision.

The delivery team loses programme credibility and professional capital. The business sponsor loses budget they may not recover. The vendor relationship may be affected. The IT platform team may have to justify why shared infrastructure was built for a use case that was abandoned. The CAIO or AI programme office may be seen as having approved a failure.

These are all real costs, distributed across real people. The benefits of stopping — freed capital, clearer portfolio comparisons, honest evidence base — are diffuse and accrue to the organisation, not to any individual who bears the cost of the decision.

This is why stop decisions in AI portfolios are systematically underprovided. The incentive structure produces continuation bias. Fixing this requires either changing the incentives — for example, rewarding early stopping as evidence of governance discipline rather than treating it as a failure — or separating the evaluation function from the investment function so that the people who judge value are not the same as the people who built the case.

What a workable exit discipline looks like

Exit governance does not require a separate bureaucratic process for every initiative. It requires three things that most organisations do not have.

Pre-agreed exit thresholds. At the point of investment approval, define the conditions under which the initiative will be reviewed for stop or redesign. These should be specific: a minimum adoption rate by a given date, a required delta on a named baseline metric, a maximum cost-per-outcome threshold. Generic conditions like "if performance is unsatisfactory" are useless because they leave the stop decision entirely to subjective judgment.
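One way to make such thresholds concrete is to record them as data at approval time and check observed results against them mechanically. The Python sketch below assumes three thresholds of the kind described above (adoption rate, delta on a named baseline metric, cost per outcome). The class names, fields, and numbers are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class ExitThresholds:
    """Conditions agreed at investment approval (illustrative fields)."""
    review_date: date
    min_adoption_rate: float     # share of target users active, 0 to 1
    min_metric_delta: float      # required improvement on the named baseline metric
    max_cost_per_outcome: float  # ceiling on cost per completed outcome


@dataclass(frozen=True)
class Observed:
    as_of: date
    adoption_rate: float
    metric_delta: float
    cost_per_outcome: float


def breached_thresholds(t: ExitThresholds, o: Observed) -> List[str]:
    """Return the names of breached thresholds; empty before the review date."""
    if o.as_of < t.review_date:
        return []
    breaches = []
    if o.adoption_rate < t.min_adoption_rate:
        breaches.append("adoption_rate")
    if o.metric_delta < t.min_metric_delta:
        breaches.append("metric_delta")
    if o.cost_per_outcome > t.max_cost_per_outcome:
        breaches.append("cost_per_outcome")
    return breaches


# Hypothetical thresholds and results for one initiative.
thresholds = ExitThresholds(date(2026, 6, 30), 0.50, 0.15, 4.0)
observed = Observed(date(2026, 6, 30), 0.35, 0.10, 3.2)

breaches = breached_thresholds(thresholds, observed)
print(breaches)
```

A non-empty result does not decide anything by itself. In this sketch it only triggers the stop-or-redesign review that was agreed in advance, which is the point of fixing the thresholds before any results exist.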

Named economic owners with real authority. Value ownership must be separated from delivery ownership. The person who owns the economic outcome must have the authority to say the outcome is not being achieved, and the institutional support to act on that judgment without sacrificing their position. If economic owners are politically unable to stop underperforming investments, the ownership is nominal.

Portfolio-level comparison rather than initiative-level self-assessment. The weakest initiatives look most defensible when evaluated only against their own original narratives. Portfolio-level review compares what a continuation decision costs against what the same capital would produce deployed elsewhere. This comparison is the only one that has real discipline because it makes the opportunity cost explicit.
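A simple way to see the opportunity cost is to rank every claim on the budget, existing and proposed, by expected value per pound of forward cost and fund from the top down. The Python sketch below does this with a greedy pass. It is a heuristic rather than an optimal allocation, and the initiative names and figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    name: str
    forward_cost: float    # cost of funding for the next period
    expected_value: float  # expected value produced at that cost

    @property
    def value_per_pound(self) -> float:
        return self.expected_value / self.forward_cost


def allocate(candidates: List[Candidate], budget: float) -> Tuple[List[str], List[str]]:
    """Fund candidates in descending value-per-pound order until the budget runs out."""
    funded, unfunded = [], []
    remaining = budget
    ranked = sorted(candidates, key=lambda c: c.value_per_pound, reverse=True)
    for c in ranked:
        if c.forward_cost <= remaining:
            funded.append(c.name)
            remaining -= c.forward_cost
        else:
            unfunded.append(c.name)
    return funded, unfunded


# Hypothetical mix of continuing and proposed initiatives.
candidates = [
    Candidate("analyst-reports", 300_000, 240_000),  # existing, under-performing
    Candidate("claims-triage", 200_000, 500_000),
    Candidate("kyc-checks", 250_000, 400_000),
    Candidate("chat-assistant", 150_000, 90_000),    # existing, under-performing
]

funded, unfunded = allocate(candidates, budget=500_000)
print(funded)
print(unfunded)
```

In this sketch the two existing initiatives lose their funding not because they failed their own narratives but because the same capital produces more elsewhere, which is the comparison that initiative-level self-assessment never makes.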

The questions a board or investment committee should be asking

A governance body with serious accountability for AI economics should be able to answer these questions for every material investment in the portfolio:

What was the original return dimension and threshold, and has it been revised since approval?
What is the current evidence on that return dimension, referenced against the original baseline?
Who currently owns the economic outcome, and is that person actively engaged in achieving it?
What would the initiative need to demonstrate by the next review cycle to avoid a redesign or stop recommendation?
What is the total forward cost of continuing this initiative for another 12 months, and what is the expected value at that cost?

If the answer to any of these is "we don't know," the initiative has already left the domain of governance and entered the domain of hope. Hope is not a portfolio strategy.

A practical note on timing

The right time to establish exit criteria is before an initiative is approved. Pre-agreed thresholds are less politically fraught than retrospective ones because no specific investment has yet failed to meet them.

Organisations that try to install exit governance only after a portfolio of underperforming investments has accumulated will find the process much harder. The political cost of stopping several investments at once is higher than the cost of running an ongoing, systematic process. A more viable sequence is to start with a clean threshold-setting exercise at the point of new investment approvals, while separately conducting a structured portfolio review of existing investments.

The review of existing investments will still be difficult. Some of them should stop. That conversation will be uncomfortable. It will also be one of the most economically valuable governance actions the organisation can take.
