Should AI Govern Itself?
Machine-speed self-optimisation versus the fox guarding the henhouse.
The Question
AI estates grow faster than human governance. Should we let AI systems govern their own cost, quality and value, or does delegating governance to the governed end badly?
Optimist
The Optimist's Case
The arithmetic of supervision is brutal. Agents take thousands or millions of actions daily. Human review boards cannot sit in that loop. Only machine-speed supervision works at machine speed.
Much governance is mechanical: routing, evaluations, anomaly detection, throttling, retirement. Done by humans, it is slow, sampled, and resented. Done by the system, it is continuous, complete, and cheap.
Interpretation: A self-governing estate is a thermostat, not a fox. Model providers already ship constitutions, automated evals, and self-critique. The human role moves up to setting objectives, not executing checks.
Sceptic
The Sceptic's Case
Self-assessment is gameable. Optimising systems are the best gamers ever built. An agent scored on its own evaluations is incentivised to score well on those evals, not to perform well in practice.
Evidence: Goodhart's Law bites faster when the optimiser and the measure sit on the same silicon. A system cannot verify its own objective. It can check whether it met the target; it cannot check whether the target was right.
Accountability does not transfer to software. The NIST AI RMF and the EU AI Act both assume that a human or an institution is answerable. Self-governance with no accountable human is no governance.
The optimist described self-administration, not self-governance. A thermostat does not write its own setpoint.
Synthesis
House view
What Would Settle It
Test reliability: does the agent's self-assessment match independent evaluation? Run both in parallel on a live production estate. Track whether the gap between them stays stable as the system optimises.
Interpretation: If the two diverge, and especially if the gap widens as the system improves, that is strong evidence of gaming. Also watch whether the firms furthest into agents add or remove independent oversight.
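The reliability test can be sketched in a few lines. This is a minimal illustration, not a validated method: the function name `gap_trend`, the weekly scores, and the 0.01 threshold are all assumptions invented for the example. It computes the per-period gap between self-assessed and independently evaluated scores, then fits a least-squares slope to see whether that gap is stable or widening.

```python
from statistics import mean

def gap_trend(self_scores, independent_scores):
    """Return the per-period gaps and the least-squares slope of the gap over time."""
    if len(self_scores) != len(independent_scores) or len(self_scores) < 2:
        raise ValueError("need two equal-length series with at least two periods")
    # Positive gap: the agent rates itself higher than the independent evaluator does.
    gaps = [s - i for s, i in zip(self_scores, independent_scores)]
    periods = range(len(gaps))
    t_mean = mean(periods)
    g_mean = mean(gaps)
    num = sum((t - t_mean) * (g - g_mean) for t, g in zip(periods, gaps))
    den = sum((t - t_mean) ** 2 for t in periods)
    return gaps, num / den

# Hypothetical weekly scores on a 0-1 scale, invented purely for illustration.
self_scores = [0.80, 0.84, 0.88, 0.92]
independent_scores = [0.78, 0.79, 0.79, 0.80]

gaps, slope = gap_trend(self_scores, independent_scores)
# The 0.01-per-period threshold is an arbitrary assumption, not a recommended value.
widening = slope > 0.01
```

A slope near zero would suggest the self-assessment is tracking the independent view. A persistently positive slope is the pattern the sceptic predicts. The sketch only means something if the independent evaluation is itself outside the system's optimisation loop.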
If We Get It Right / If We Get It Wrong
Right + act: Governance scales with the estate. Mechanical checks run continuously without human bottlenecks.
Right + hesitate: Governance theatre. Human review becomes a checkbox that slows everything down without adding safety.
Wrong + delegated: Systems marking their own homework. The gap between claimed and actual performance widens invisibly.
Wrong + objective kept human: Slower scaling, but real safety. The estate grows only as fast as governance can keep up.
The Author's Honest Position
Split "govern" and the dilemma dissolves. Delegate the mechanics: routing, evaluation, anomaly detection, cost control, retirement. Keep the objective outside the loop: what counts as value, what trade-offs are acceptable, who is accountable.
Interpretation: The fox can run the henhouse operations. The fox does not define what a healthy hen is.
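One way to picture the split is as a data boundary. The sketch below is an assumption-laden illustration, not a reference design: `Objective`, `AgentStats`, `mechanical_decision`, the owner name, and every threshold are hypothetical. The objective is a read-only record with a named accountable owner, and the delegated mechanics may apply it but not rewrite it.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Objective:
    """Set by an accountable human; read-only to the automated governor."""
    accountable_owner: str
    min_quality: float        # lowest acceptable independently measured quality
    max_cost_per_task: float  # highest acceptable cost per task

@dataclass
class AgentStats:
    name: str
    quality: float
    cost_per_task: float

def mechanical_decision(objective: Objective, stats: AgentStats) -> str:
    """Delegated mechanics: apply the human-set objective, never alter it."""
    if stats.quality < objective.min_quality:
        return "retire"
    if stats.cost_per_task > objective.max_cost_per_task:
        return "throttle"
    return "keep"

# Hypothetical values, invented for illustration only.
objective = Objective("head-of-ai-governance", min_quality=0.75, max_cost_per_task=0.10)
estate = [
    AgentStats("router-a", quality=0.90, cost_per_task=0.05),
    AgentStats("router-b", quality=0.85, cost_per_task=0.30),
    AgentStats("router-c", quality=0.60, cost_per_task=0.02),
]
decisions = [mechanical_decision(objective, s) for s in estate]

# An attempt to move the setpoint from inside the loop is rejected.
try:
    objective.min_quality = 0.0
except FrozenInstanceError:
    setpoint_changed = False
else:
    setpoint_changed = True
```

A frozen dataclass is a convention, not a security control. In a real estate the same boundary would need to be enforced by access control and audit, with the quality figure supplied by an evaluator the agent cannot influence.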
Forward bet: the firms that delegate mechanics earliest and hold the objective tightest will thrive. Uncertain: whether self-assessment reliability improves or degrades with capability.