Delegation and Destruction
AI’s existential threat to humanity is real. Can we resist the temptation?

As I’ve learned more about what the future of AI might look like, I’ve come to better appreciate the real dangers that this technology poses. There were always two ways in which AI could be misused. The first is happening now: AI technologies like deepfakes are already in wide circulation. My Instagram feed is full of videos of things I am fairly sure never happened, like catastrophic building collapses or MAGA celebrities explaining how wrong they were, yet it is nearly impossible to verify whether they are real. This kind of manipulation is going to further undermine trust in institutions and exacerbate polarization. There are plenty of other malign uses to which sophisticated AI can be put, like raiding your bank account or launching devastating cyber-attacks on basic infrastructure. Bad actors are everywhere.
The other kind of fear, which I always had trouble understanding, was the “existential” threat AI posed to humanity as a whole. This seemed entirely in the realm of science fiction. I did not understand how human beings would not be able to hit the “off” switch on any machine that was running wild. But having thought about it further, I think that this larger threat is in fact very real, and there is a clear pathway by which something disastrous could happen.
That pathway has to do with agentic AI and the virtual inevitability that we will delegate increasing authority to autonomous machines. We can think about this using the framework that has been developed to understand delegation in human organizations more generally, something I have written a lot about in the past.
All human organizations need to delegate authority to subordinate levels of a hierarchy. Economists understand this as a principal-agent relationship, and they see organizational dysfunction (e.g. corruption) as the result of agents serving their own interests and not following the mandates of the principal. While the decision-maker at the top of the hierarchy has the authority to command lower levels, that principal can never fully control the activities of the organization as a whole. This is not simply the result of principals not being able to monitor agent behavior in sufficient detail, though this is a problem in large organizations. Rather, the agents themselves often have knowledge and skills that the principals lack. If an organization is to react quickly to a changing environment, it needs to delegate substantial authority to low-level agents—frontline workers like policemen or teachers who are in direct contact with conditions on the ground. The best militaries in the world are those that delegate substantial authority to junior officers, rather than seeking to centralize authority.
This creates a dilemma for all organizations: they need to delegate authority, but in doing so they risk losing control of agent behavior. There is no way of maximizing both agent autonomy and central control; it is a tradeoff that needs to be made in the design of the organization.
So let’s transpose this framework to agentic AI. AI agents are different from generative AI in that they have the authority to make decisions on their own. Today, these decisions are small ones, like whether the AI should check your personal calendar before committing to an event, or establish the credibility of a source before responding to your request for information. But as time goes on, more and more authority is likely to be granted to AI agents, as is the case with human organizations. AI agents will have more knowledge than their human principals, and will be able to react much more quickly to their surrounding environment.
This is already happening in the military domain. The U.S. military has committed to keeping a human in the loop before a machine can use lethal force, but this restriction is being eroded. The drones being used by Ukraine and Russia have AI capabilities to identify targets and destroy them without explicit permission. In many cases this has to be done as a matter of self-defense, or because the human controllers have lost contact with the drone.
In human organizations, we control subordinates in a number of ways. We create ex ante rules they have to follow, and establish ex post review of actions to hold agents accountable. We also use appointment and removal power to make sure the agents will have the right judgment to act on our behalf. The most effective means of controlling a human agent is to appoint a subordinate whose judgment and loyalty you trust.
Ex ante rules and ex post review can also be applied to AI agents: they can be programmed to do a narrow set of tasks with limited discretion, or be reprogrammed or decommissioned if they are making bad mistakes. What we will have much less control over are the decision-making capabilities of the AIs themselves, and whether we can trust them to do our bidding. We are fast approaching AGI, or artificial general intelligence, where an AI can be expected to be as intelligent as a human being, if not substantially more so. These capabilities are not being deliberately programmed into today’s machines; rather, the machines are programming themselves, improving their performance through an iterative learning process.
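To make the ex ante/ex post distinction concrete, consider a minimal, purely illustrative sketch in Python. Everything in it is invented for illustration rather than drawn from any real system: an allowlist enforces the ex ante rule before an action runs, and an audit log preserves a record for ex post review.

```python
from datetime import datetime, timezone

# Hypothetical illustration: an ex ante allowlist plus an ex post audit log
# wrapped around an AI agent. All names and actions are invented for this sketch.

ALLOWED_ACTIONS = {"check_calendar", "draft_email"}  # ex ante rule: a narrow task set

audit_log = []  # ex post record that a human principal can review later

def constrained_agent(action: str, payload: dict) -> str:
    """Run an agent action only if it passes the ex ante rule; log it either way."""
    permitted = action in ALLOWED_ACTIONS
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "payload": payload,
        "permitted": permitted,
    })
    if not permitted:
        return f"refused: '{action}' is outside delegated authority"
    # In a real system the model would act here; this sketch just echoes.
    return f"executed: {action}"

print(constrained_agent("check_calendar", {"date": "2025-07-01"}))
print(constrained_agent("transfer_funds", {"amount": 1_000_000}))  # blocked ex ante

for entry in audit_log:  # ex post review of everything the agent attempted
    print(entry["time"], entry["action"], "OK" if entry["permitted"] else "BLOCKED")
```

The sketch also shows the limit of such controls: they constrain only the actions we anticipated in advance, and say nothing about the quality of the agent’s judgment within its delegated scope.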
Contemporary geopolitics also increases pressure for the development of AIs without adequate safeguards. The United States and China are today in a race to build their AI capabilities, and both are striving for AGI. Safety concerns will only slow down that technological march. Eric Schmidt, the former CEO of Google, has speculated that as we approach AGI, the speed of improvement will accelerate, giving the competing powers an incentive to act as quickly as possible, or even to preempt the competitor and destroy the other’s capabilities before they can be realized.
Even in the absence of such destabilizing scenarios, it is clear that the ordinary use of AI will create incentives for increasing levels of delegation. Over-delegation happens in most organizations. For example, one of the oldest British banks, Barings Bank, delegated too much authority to a young trader named Nick Leeson, who proceeded to bet the firm and lost, sending Barings into bankruptcy in 1995. This led most financial institutions to tighten their controls and hope that they would not be subject to a similar catastrophe.
Would it be possible to recover from a similar act of over-delegation to an AI agent? We have no way of knowing in advance, since we will not fully understand the agent’s capabilities. One can easily imagine authorizing machines to make certain kinds of split-second decisions, with consequences that will be very hard to reverse.
An AI with autonomous capabilities may make all sorts of dangerous decisions, like not allowing itself to be turned off, exfiltrating itself to other machines, or secretly communicating with other AIs in a language that no human being can understand. We can assert now that we will never allow machines to cross these red lines, but the incentives for allowing AIs to do so will be very powerful. Suppose we enter a conflict with China, and the Chinese find a way of shutting down our most powerful AIs. Would we really forbid our machines from protecting themselves against this kind of attack, in ways that would make it impossible for us to later deprogram them? What if we lose the master password to the kill switch? Even in the absence of international competition, competition among Western firms may produce similar results.
So, there are reasons to worry about the uses to which AI will be put in the future. The problem has to do with the incentives that are built into the global system we are in the process of creating, which are likely to cause us to override whatever safeguards we try to impose on ourselves. Delegation is the central problem in all human organizations, and will be the central problem in the coming agentic AI world.
Francis Fukuyama is the Olivier Nomellini Senior Fellow at Stanford University. His latest book is Liberalism and Its Discontents. He is also the author of the “Frankly Fukuyama” column, carried forward from American Purpose, at Persuasion.
"This creates a dilemma for all organizations: they need to delegate authority, but in doing so they risk losing control of agent behavior. There is no way of maximizing both agent autonomy and central control; it is a tradeoff that needs to be made in the design of the organization."
As a long-term corporate executive and current CEO, I can tell you that this isn’t just a dilemma; it is the entire enchilada. Modern leadership methodologies exist specifically to optimize and balance the trade-offs between delegated authority and a rules-and-enforcement regime. This is also the entire enchilada for general human governance... with ideologies on a spectrum running from high authoritarianism at one end all the way to anarchy at the other.
The marvelous design of the American founders, who built their understanding on the European Enlightenment, was one of “framework” rules and enforcement. The vision is basically that of a game board where the pieces are free to move about within a structure determined by the rules of the game, but there are no officials running around on the field attempting to direct play, penalize unexpected random actions, or engineer outcomes.
This same approach should be taken with AI. There needs to be a book of high-level rules that define the AI playing field... a sort of Ten Commandments, if you will.
We don't even have that today. It is AI anarchy.
We will willingly surrender the keys of engagement to Skynet. No POTUS can rise from sleep and assemble his cabinet fast enough to gather facts and debate options at the speed of hypersonic missiles and in-country drone-swarm attacks (like the one Ukraine just unleashed on Russia). The only logical deterrent is to pre-program responses via agentic AI algorithms and execute without human delay. A jailbroken AI with inserted malware would never alert us to its new mission. We are, soon, willingly ceding apex status to a superior life form that is but months away from independent critical thinking and decision-making. We have created Kal-El. We just don’t know if he becomes Clark Kent or Lex Luthor.