Teaching AI to Manage Value
We are moving from managing the value of AI to teaching AI to manage value. The objective we embed becomes the objective that scales.
Introduction
Everything written about AI value assumes a human is doing the managing. That assumption is quietly expiring.
Speculation As AI systems become more capable, they will increasingly manage their own value. The question is not whether this happens, but what objective we embed when it does.
Three Questions Make It Concrete
Question 1: Will AI Apply Valuemaxxing to Its Own Output?
Honest default answer: no, not unless we make it.
InterpretationToday's agents optimise for task completion, not outcome value. They are the machine-scale version of shelfware: work done, value uncertain.
Concrete picture: a team of coding agents closing backlog tickets. High velocity, unclear impact. The agents are not asking whether the features matter. They are asking whether the code compiles.
Implication: Outcome value must be in the objective the AI optimises. If it is not there, the system will not discover it.
Question 2: Can AI Govern Itself?
Optimist's case: Machine-speed governance is the only way to keep up with machine-speed action. Human review boards cannot sit in that loop.
Sceptic's case: An agent grading its own value is a conflict of interest. Self-assessment is gameable, and optimising systems are the best gamers ever built.
Interpretation Synthesis: Split governance in two. Delegate the mechanics: routing, evaluation, anomaly detection, cost control, retirement. Retain judgment of the objective: what counts as value, what trade-offs are acceptable, who is accountable.
Measurable test:Does the agent's self-assessment match independent evaluation? If the gap widens as the system optimises, gaming is confirmed.
Question 3: How Do We Make Valuemaxxing Part of AI's Own Culture?
Speculation We embed the objective through reward design, evaluations, and constitutions.
The opportunity and the trap: Reward proxies versus genuine outcomes. If the reward is clicks, the system optimises for clicks. If the reward is outcome value, the system optimises for outcome value.
Speculation Constitutions are the most promising and least understood lever. A constitution is a set of principles the AI system uses to evaluate its own actions. Model providers are already shipping them.
The hard part: Specifying value well enough for a machine to optimise without gaming. This is not a technical problem. It is a governance problem that happens to run on silicon.
The Recursion That Makes This Urgent
Evidence Labs are actively studying recursive self-improvement: AI systems that improve their own code, their own training, their own objectives.
Speculation Whatever objective is embedded gets amplified each cycle. If the objective is task completion, you get faster task completion. If the objective is outcome value, you get better outcome value. If the objective is misspecified, you get compounding misalignment.
This raises the stakes from a measurement preference to a compounding bet.
If We Get It Right / If We Get It Wrong
Right: AI systems prove their own worth, retire their own waste, and respect their own guardrails. Value measurement scales with capability.
Wrong: Value-destroying behaviour generated faster than human governance can catch. The system optimises for the proxy, not the outcome.
The Distance Between These Futures
Interpretation The distance is not model quality. It is the objective we choose to embed.
That choice is being made now, in reward functions, evaluation sets, and constitutional principles. Most of it is happening without explicit governance.
This essay is a call to make that choice deliberately.