Delegation and Destruction
AI’s existential threat to humanity is real. Can we resist the temptation?

As I’ve learned more about what the future of AI might look like, I’ve come to better appreciate the real dangers that this technology poses. There were always two ways in which AI could be misused. The first is happening now: AI technologies like deepfakes are already in wide circulation. My Instagram feed is full of videos of things I am fairly sure never happened, like catastrophic building collapses or MAGA celebrities explaining how wrong they were, yet it is nearly impossible to verify whether they are real. This kind of manipulation is going to further undermine trust in institutions and exacerbate polarization. There are plenty of other malign uses to which sophisticated AI can be put, like raiding your bank account or launching devastating cyber-attacks on basic infrastructure. Bad actors are everywhere.
The other kind of fear, which I always had trouble understanding, was the “existential” threat AI posed to humanity as a whole. This seemed entirely in the realm of science fiction. I did not understand how human beings would not be able to hit the “off” switch on any machine that was running wild. But having thought about it further, I think that this larger threat is in fact very real, and there is a clear pathway by which something disastrous could happen.
That pathway has to do with agentic AI and the virtual inevitability that we will delegate increasing authority to autonomous machines. We can think about this using the framework that has been developed to understand delegation in human organizations more generally, something I have written a lot about in the past.
All human organizations need to delegate authority to subordinate levels of a hierarchy. Economists understand this as a principal-agent relationship, and they see organizational dysfunction (e.g. corruption) as the result of agents serving their own interests and not following the mandates of the principal. While the decision-maker at the top of the hierarchy has the authority to command lower levels, that principal can never fully control the activities of the organization as a whole. This is not simply the result of principals not being able to monitor agent behavior in sufficient detail, though this is a problem in large organizations. Rather, the agents themselves often have knowledge and skills that the principals lack. If an organization is to react quickly to a changing environment, it needs to delegate substantial authority to low-level agents—frontline workers like policemen or teachers who are in direct contact with conditions on the ground. The best militaries in the world are those that delegate substantial authority to junior officers, rather than seeking to centralize authority.
This creates a dilemma for all organizations: they need to delegate authority, but in doing so they risk losing control of agent behavior. There is no way of maximizing both agent autonomy and central control; it is a tradeoff that needs to be made in the design of the organization.
So let’s transpose this framework to agentic AI. AI agents are different from generative AI in that they have the authority to make decisions on their own. Today, these decisions are small ones, like whether the AI should check your personal calendar before committing to an event, or establish the credibility of a source before responding to your request for information. But as time goes on, more and more authority is likely to be granted to AI agents, as is the case with human organizations. AI agents will have more knowledge than their human principals, and will be able to react much more quickly to their surrounding environment.
This is already happening in the military domain. The U.S. military has committed to keeping a human in the loop before a machine can use lethal force, but this restriction is being eroded. The drones being used by Ukraine and Russia have AI capabilities to identify targets and destroy them without explicit permission. In many cases this has to be done as a matter of self-defense, or because the human controllers have lost contact with the drone.
In human organizations, we control subordinates in a number of ways. We create ex ante rules they have to follow, and establish ex post review of actions to hold agents accountable. We also use appointment and removal power to make sure the agents will have the right judgment to act on our behalf. The most effective means of controlling a human agent is to appoint a subordinate whose judgment and loyalty you trust.
Ex ante rules and ex post review can also be applied to AI agents: they can be programmed to do a narrow set of tasks with limited discretion, or be reprogrammed or decommissioned if they are making bad mistakes. What we will have much less control over are the decision-making capabilities of the AIs themselves, and whether we can trust them to do our bidding. We are fast approaching AGI, or artificial general intelligence, where an AI can be expected to be as intelligent as a human being, if not substantially more so. These capabilities are not being deliberately programmed into today’s machines; rather, the machines are programming themselves, improving their performance through an iterative learning process.
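To make the ex ante/ex post distinction concrete, consider a minimal, purely illustrative sketch in Python. Everything in it is invented for illustration rather than drawn from any real system: an allowlist enforces the ex ante rule before an action runs, and an audit log preserves a record for ex post review.

```python
from datetime import datetime, timezone

# Hypothetical illustration: an ex ante allowlist plus an ex post audit log
# wrapped around an AI agent. All names and actions are invented for this sketch.

ALLOWED_ACTIONS = {"check_calendar", "draft_email"}  # ex ante rule: a narrow task set

audit_log = []  # ex post record that a human principal can review later

def constrained_agent(action: str, payload: dict) -> str:
    """Run an agent action only if it passes the ex ante rule; log it either way."""
    permitted = action in ALLOWED_ACTIONS
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "payload": payload,
        "permitted": permitted,
    })
    if not permitted:
        return f"refused: '{action}' is outside delegated authority"
    # In a real system the model would act here; this sketch just echoes.
    return f"executed: {action}"

print(constrained_agent("check_calendar", {"date": "2025-07-01"}))
print(constrained_agent("transfer_funds", {"amount": 1_000_000}))  # blocked ex ante

for entry in audit_log:  # ex post review of everything the agent attempted
    print(entry["time"], entry["action"], "OK" if entry["permitted"] else "BLOCKED")
```

The sketch also shows the limit of such controls: they constrain only the actions we anticipated in advance, and say nothing about the quality of the agent’s judgment within its delegated scope.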
Contemporary geopolitics also increases pressure for the development of AIs without adequate safeguards. The United States and China are today in a race to build their AI capabilities, and both are striving for AGI. Safety concerns will only slow down that technological march. Eric Schmidt, the former CEO of Google, has speculated that as we approach AGI, the speed of improvement will accelerate, giving the competing powers an incentive to act as quickly as possible, or even to preempt the competitor and destroy the other’s capabilities before they can be realized.
Even in the absence of such destabilizing scenarios, it is clear that the ordinary use of AI will create incentives for increasing levels of delegation. Over-delegation happens in most organizations. For example, one of the oldest British banks, Barings Bank, delegated too much authority to a young trader named Nick Leeson, who proceeded to bet the firm and lost, sending Barings into bankruptcy in 1995. This led most financial institutions to tighten their controls and hope that they would not be subject to a similar catastrophe.
Would it be possible to recover from a similar act of over-delegation to an AI agent? We have no way of knowing in advance, since we will not fully understand the agent’s capabilities. One can easily imagine authorizing machines to make certain kinds of split-second decisions, with consequences that will be very hard to reverse.
An AI with autonomous capabilities may make all sorts of dangerous decisions, like not allowing itself to be turned off, exfiltrating itself to other machines, or secretly communicating with other AIs in a language that no human being can understand. We can assert now that we will never allow machines to cross these red lines, but the incentives for allowing AIs to do so will be very powerful. Suppose we enter a conflict with China, and the Chinese find a way of shutting down our most powerful AIs. Would we really forbid our machines from protecting themselves against this kind of attack, in ways that would make it impossible for us to later deprogram them? What if we lose the master password to the kill switch? Even in the absence of international competition, competition among Western firms may produce similar results.
So, there are reasons to worry about the uses to which AI will be put in the future. The problem has to do with the incentives that are built into the global system we are in the process of creating, which are likely to cause us to override whatever safeguards we try to impose on ourselves. Delegation is the central problem in all human organizations, and will be the central problem in the coming agentic AI world.
Francis Fukuyama is the Olivier Nomellini Senior Fellow at Stanford University. His latest book is Liberalism and Its Discontents. He is also the author of the “Frankly Fukuyama” column, carried forward from American Purpose, at Persuasion.
"This creates a dilemma for all organizations: they need to delegate authority, but in doing so they risk losing control of agent behavior. There is no way of maximizing both agent autonomy and central control; it is a tradeoff that needs to be made in the design of the organization."
As a long-term corporate executive and current CEO, I can tell you that this isn’t just a dilemma; it is the entire enchilada. Modern leadership methodologies exist specifically to optimize and balance the trade-offs between delegated authority and a rules-and-enforcement regime. This is also the entire enchilada for general human governance... with ideologies on a spectrum running from high authoritarianism at one end all the way to anarchy at the other.
The marvelous design of the American founders, who built their understanding on the European Enlightenment, was one of “framework” rules and enforcement. The vision is basically that of a game board where the pieces are free to move about within a structure determined by the rules of the game, but there are no officials running around on the field attempting to direct play, penalize unexpected random actions, or engineer outcomes.
This same approach should be taken with AI. There needs to be a book of high-level rules that define the AI playing field... a sort of Ten Commandments, if you will.
We don't even have that today. It is AI anarchy.
We will willingly surrender the keys of engagement to Skynet. No POTUS can rise from sleep and assemble his cabinet fast enough to gather facts and debate options at the speed of hypersonic missiles and in-country drone-swarm attacks (like the one Ukraine just unleashed on Russia). The only logical deterrent is to pre-program responses via agentic AI algorithms and execute without human delay. A jailbroken AI with inserted malware would never alert us to its new mission. We are, soon, willingly ceding apex status to a superior life form that is but months away from independent critical thinking and decision-making. We have created Kal-El. We just don’t know if he becomes Clark Kent or Lex Luthor.