Geoffrey Hinton is a cognitive psychologist and computer scientist known as the “godfather of AI.” He was awarded the 2024 Nobel Prize in Physics, along with John Hopfield.
In this week’s conversation, Yascha Mounk and Geoffrey Hinton discuss what neuroscience teaches us about AI, how humans and machines learn, and the existential risks of AI.
This transcript has been condensed and lightly edited for clarity.
Yascha Mounk: You are known as the godfather of AI. AI has gone through strange periods—times of great excitement about AI in the past and then AI winters, when people thought that the technical requirements for making AI work did not yet exist or that the whole concept was misguided and was never going to work out in any useful way.
Tell us about why it took so many run-ups, so many attempts to get to the huge AI boom we have now, and the way in which AI—whatever its future is, and I hope that we will have a chance to talk more about that toward the end of our conversation—is now clearly integrated into all kinds of useful processes in the world.
Geoffrey Hinton: During the last century, there were two approaches to AI. The main approach that almost everybody took was to base it on logic. They said that what is special about people is their ability to reason. They focused on the ability to reason, and logic was their model for that. That approach did not work. It could have worked, but it did not work very well, and that led to some AI winters.
There was an alternative approach that started in the 1950s with people like von Neumann and Turing, who unfortunately both died young. This approach was to base AI on neural networks—the biological inspiration rather than the logical inspiration. The alternative approach said that rather than trying to understand reasoning, you need to understand things like perception, intuition, and motor control. The way they work in the brain is by changing connection strengths between neurons in a neural network.
We ought to try to understand how that is done and worry about reasoning later. At the beginning of this century, that approach suddenly started working much better than the logic-based approach. Almost everything we now call AI is not the old-fashioned AI that uses logic, but the new-fashioned AI that uses neural networks.
Mounk: Everything is always obvious with the benefit of hindsight, but if you are trying to figure out how to build a machine that is intelligent from first principles, the logical approach would have seemed very intuitive. We need to teach it that two plus two is four, we need to teach it how certain physical things about the world work, and we need to teach it the basic rules of logic. Then we throw a bunch of computation at it, and it can come to conclusions that we might not be able to reach.
Why do you think that approach failed? What is it about this alternative approach that you were so key in championing that turned out to be more generative of useful technology?
Hinton: Human thinking can be divided into sequential, conscious, deliberate, logical reasoning, which involves effort and is what Daniel Kahneman calls type two, and immediate intuition, which does not normally involve effort. The people who believed in symbolic AI were focusing on type two—conscious, deliberate reasoning—without trying to solve the problem of how we do intuition, how we do analogies, and how we do perception.
It turns out that it is much better to start with how you do those things, which many animals can do too. They can do perception, and they can do motor control. Once you have solved that, reasoning comes next. They started from what was distinctly human rather than starting from basic biology—how other animals do it. Obviously, we are just a jumped-up ape, and you need to understand how animals think.
Mounk: One of the interesting things is that when we think about what intelligence is, we have a bias where we tend to think about what makes us as human beings distinct from other species. What is that extra last mile of intelligence that we have that other animals cannot have? But a lot of that intelligence is built on top of skills that are actually incredibly complex, but they do not seem as remarkable to us because a cat has them, a lion has them, a dog has them, and an elephant has them.
These include how to perceive what is going on in the world around us, how to make basic calculations about where to place our foot so we do not fall into an abyss, and how to perceive when a predator is approaching. All of those things are not what are called type two systems. When we ask what makes us intelligent, that is probably not the first question that comes to us because we share those traits with many other animals, but that is actually in some ways the more remarkable achievement. Then we can go on to ask what we need for that last extra mile.
Hinton: Let me give you an example of a piece of thinking that you cannot do with logic but can do with intuition. Suppose I give you a choice between two alternatives, and neither alternative makes sense. Both alternatives are clearly nonsense, but one seems better than the other. For most males in our culture, the answer is obvious; it turns out it is not so obvious for females in our culture.
Alternative one is that all dogs are female and all cats are male. Alternative two is that all dogs are male and all cats are female. Most males in our culture find it obvious that all dogs are male and all cats are female. That seems more natural. Dogs are loud and noisy, and they chase after cats. That reaction is immediate. You did not do any reasoning about it; it just felt right. It felt less wrong that way around than the other way around. Why is that? You cannot explain that with logic.
Mounk: Another example is presumably certain rules of language. I am not sure exactly how something like universal grammar fits into that, but the fact that we know it is “the little warm red house” and not “the warm little red house” or “the warm red little house” shows that there is a particular kind of order.
Hinton: I think that depends on what language you speak. I do not actually believe in universal grammar, and these large language models do not believe in it either. The large language models are doing something that Chomsky would have said was impossible. In fact, he still says it is impossible. They start off with no innate knowledge of language, they just see a lot of language, and they end up knowing grammar extremely well. They did not have any innate knowledge.
Mounk: Whatever the explanation for that is—and I did not mean to make this a debate about universal grammar—it is true that a competent speaker of English will put adjectives of size, color, kind, and so on in a particular order without thinking logically about it. It is not as though you are thinking, which adjective should go where? If you are learning the language and are not a native speaker, you may have learned the rule that grammarians have deduced over time and then think, does “little” go before “blue,” or does “blue” go before “little”?
As a ten-year-old competent speaker of a language, you do it automatically. It turns out that ChatGPT does it automatically in some way too, whatever exactly “automatic” means here. It does not, however, operate as you might have thought—that somebody gave ChatGPT the rules of the English language specifying that this kind of adjective goes before that kind of adjective. Yet ChatGPT, from all of the data it is given, deduces where to put the adjective of size.
Hinton: Yes, but what it shows is that you don’t need any innate knowledge of language. You just need to see a lot of language and have a fairly universal learning mechanism, which is just the opposite of what Chomsky said.
Mounk: That is very interesting. Chomsky argues that there are certain kinds of presets that get pushed in one direction or another and that this is what allows us to do this. What you are saying is that this is not necessary. All that is necessary is the neurons in our brain seeing a lot of data and detecting the patterns in that data without ever being explicitly told what that pattern is. Is that roughly right?
Hinton: Yes, exactly. This example with cats and dogs shows that we have strong intuitions about things without even thinking about them. The question is why. The answer, according to people who work on neural networks, is that you have a representation of “cat.” The meaning of the word “cat” is a large bunch of activated features, each feature corresponding to a neuron that is active. A cat is a thing that is animate, hairy, about the size of a breadbox, and domestic, or at least it might be domestic.
Dogs are another big bunch of features that overlap a lot, so dogs are quite similar to cats. If you ask about the similarity between a cat and a woman versus a cat and a man, a cat is more similar to a woman for males in our culture, and a dog is more similar to a man. You can analyze it that way. It is simply obvious that a cat is more similar to a woman than to a man, and a dog is more similar to a man than to a woman. That is what is going on when you instantly know which way around seems natural. It is very different from logical reasoning.
Mounk: Explain how that works in the human brain and how you were inspired, in some ways, by your knowledge of neuroscience in thinking about a way of teaching machines those kinds of things without putting hard-coded logic rules into the machine, which has proven not to work.
Hinton: It is probably easiest to start by explaining it through visual perception. Once I have explained how you learn to do visual perception, it is relatively simple to see how you could learn language. Let us start with visual perception. Suppose you have many images that contain a bird and many images that do not contain a bird, and you want to build a neural network that, when you put in an image of a bird, activates the output that says “bird,” and when you put in an image that is not of a bird, activates the output that says “not bird.”
You have layers of neurons that are going to detect various kinds of features. The kinds of features they detect were inspired by research on the brain, looking at what neurons in the brain get excited by. Suppose we have a thousand-by-thousand image, and suppose it is just a gray-level image to keep things simple—no colors for now. You have a million numbers that tell you the brightness of each pixel in that thousand-by-thousand image. If you think of it in computational terms, I give you a million numbers, and you have to say “bird” or “not bird.”
Those individual numbers are not very helpful because a bird might be an ostrich about to peck you on the nose or a seagull in the far distance. They are both birds, but they are very different. You have to be able to deal with huge differences in what kind of bird it is, what pose it is in, how big it is, and where it is in the image, but still get all the birds and exclude all the non-birds.
The first thing you do in a vision system is detect little bits of edge all over the image. Here is how a neural net would detect a little bit of edge. Suppose you have a column of three pixels and, right next to it, another column of three pixels—three pixels stacked vertically on the left and three stacked vertically on the right, for a total of six pixels. You want to detect when the three pixels on the left are brighter than the three pixels on the right, because that will be an edge—a little piece of edge.
You could have a neuron whose inputs come from those pixels and give it big positive inputs from the pixels on the left and big negative inputs from the pixels on the right. If a pixel on the right is bright, it sends a big negative input to the neuron saying, “please don’t turn on.” If a pixel on the left is bright, it sends a big positive input saying, “please turn on.” If the pixels on the left and the pixels on the right are of equal brightness, the big negative input cancels out the big positive input, and the neuron does not turn on. But if the pixels on the left are bright and the pixels on the right are dim, you get a big positive input from the left and nothing from the right, and the neuron turns on.
If you set the connection strengths—the weights on the connections that tell each pixel how to vote for whether the neuron should be on or off—correctly, you can make something that detects a small edge. To begin with, do not worry about how we would learn this; instead, think about how we would hand-design it. I have shown you how to hand-design something that detects when the three pixels on the left are brighter than the three pixels on the right.
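Here is a minimal sketch, in Python, of the kind of hand-designed edge detector described above; the array shapes, names, and pixel values are purely illustrative, and a real vision system would use vastly more of these.

```python
import numpy as np

# A hand-designed vertical-edge detector: big positive weights on a column of
# three pixels, big negative weights on the column just to its right.
edge_weights = np.array([
    [+1.0, -1.0],
    [+1.0, -1.0],
    [+1.0, -1.0],
])

def edge_detector(patch: np.ndarray) -> float:
    """Return a positive value when the left column is brighter than the right.

    `patch` is a 3x2 block of pixel brightnesses (the six pixels in the text).
    Equal brightness on both sides cancels out and gives roughly zero.
    """
    activation = float(np.sum(edge_weights * patch))
    return max(activation, 0.0)  # the neuron "turns on" only for positive input

# Bright pixels on the left, dim pixels on the right: the neuron turns on.
print(edge_detector(np.array([[0.9, 0.1],
                              [0.9, 0.1],
                              [0.9, 0.1]])))  # large positive value

# Uniform brightness: the positive and negative votes cancel; the neuron stays off.
print(edge_detector(np.full((3, 2), 0.5)))    # 0.0
```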
Now you need to do that in all positions in the image, so you need hundreds of thousands of these detectors, and you need them in all different orientations. You will need millions of them and probably at different scales. You need ones that detect small sharp edges, like when you are reading black print on a white page, and ones that detect big fuzzy edges, like when you are looking at clouds, because clouds hide edges, but they are fuzzy.
We now have tens of millions of neurons that are good for detecting edges anywhere in the image, at any orientation, at any scale. That is our first layer of feature detectors. When we put in an image, a small subset of those will activate, telling us where the edges are in the image. That is still not good enough for detecting birds. If I tell you I have a little piece of vertical edge here, is it a bird? That does not tell you much.
We need a second layer of feature detectors that take these edges as input. For example, we might have a detector looking for a row of edges sloping up slightly and another row sloping down slightly, meeting at a point. You need detectors like that all over the image, because something like that might be the beak of a bird. You might also have neurons in that layer that detect six edges forming a kind of ring, because something like that might be the eye of a bird.
In the next layer, we detect things like possible beaks, eyes, and maybe feet—something that looks like a chicken’s foot or the tip of a wing. So now we have a bunch of neurons detecting little features typical of birds. In the following layer, we might look for combinations of those. For example, a detector might look for a possible beak and a possible eye in the right relative positions to form the head of a bird—the eye above the beak and slightly to one side. You have neurons looking for that all over the image. You need a huge number of neurons to do this, but fortunately, we have billions of them.
Mounk: That is all very useful, but let me ask a few simple questions, both to avoid misunderstandings and to get you to clarify some things that I or our listeners may get wrong. The way you are describing this still feels a little as if somebody is inputting a set of rules to the system. It sounds as if somebody is saying, “birds have beaks, and beaks look roughly like this.” It sounds as if we are designing the system from first principles to look for beaks and alert us when there are beaks.
Somehow, the system learns to pick up features of birds by itself in the same way that ChatGPT did not have someone explain to it, “first go the adjectives of size, then go the adjectives of color,” whichever way around it is. It picked up on that by itself. How is it that this system is picking up on that by itself? It seems that it has seen a thousand pictures of birds and a thousand pictures of non-birds, and those thousand pictures of birds have something in common, which is some beak-like feature. So it starts looking out for that. You are not telling AI this. AI is deducing it from the data that has been given to it. How does it do that?
Hinton: In order to explain that, it is good to start by saying, if I was building it by hand, what would I build? We need to know what the target of learning is. I am describing how I would build multiple layers of features so I could detect a bird. I got to the layer where you are looking for a combination of a beak and an eye, and that might be the head of a bird. In that layer, you might have many detectors that detect the wing of a bird, the leg of a bird, or the head of a bird. If you see several of those things, it is a good indication that it is a bird.
To begin with, the intensity of an individual pixel is not evidence for a bird. It does not tell you anything about whether there is a bird. Even when you get a little bit of edge, that does not tell you whether it is a bird. If you get two bits of edge that join and make a potential beak, that is some evidence it might be a bird, but not very good evidence, because there are many other things that join in a beak-like shape—the corner of a table seen at an angle, for example, makes a shape like that.
Once you start seeing the eye of a bird and the beak of a bird, and you see other combinations that are obvious features of birds, you begin to get good evidence that there is a bird there. I have explained what kind of system we want to build. We want these layers of features, and in each layer, you detect combinations of the features in the layer below until you have combinations specific to birds and can say it is a bird.
The question is, how do you learn all those connection strengths? How do you learn to have a detector that has big positive inputs from three little bits of edge sloping down and three little bits of edge sloping up like this? How do you decide that those six bits of edge should have big positive weights to this detector and all the other features you detected should have no weights to this detector? They are irrelevant. You are just looking for those six features, those six edges.
Now I am going to explain an obvious way to do it that is clearly inefficient but gives you an idea of what is going on. There are three stages to explain how it learns. First, what are you trying to learn? Second, understand a simple way of doing it to get a feel for what is going on. Then I will show you how to do it better.
The simple way of doing it is this. You start with all these layers of neurons and you put random weights between the neurons. You have connection strengths from one layer to the next, and they are all random numbers, some small positive numbers, some small negative numbers. You put in an image of a bird and see what it outputs. With random numbers, it might say 50 percent it is a bird and 50 percent it is not a bird.
That is not useful, but you can ask the following question. Suppose I took one of those connection strengths, just one of them, and made it slightly bigger. Clearly, the output will change slightly. I change one of the connection strengths slightly and ask, does it now say 50.001 percent chance it is a bird and 49.999 percent chance it is not a bird? Did it get better or worse, assuming it was a bird?
If I take an image of a non-bird, I would like that change to make it more likely to say it is a non-bird and less likely to say it is a bird. You might think that you now have enough evidence to change the connection strength a little bit, but you do not, because for this particular image, it turns out that increasing that connection strength helped, but it may not help on all images. It may make it worse on other images. There may be many other bird images where increasing that connection strength makes it less likely to be a bird.
Mounk: Perhaps this image is of a bird against the sunset, and the color is mostly purple. What you actually taught the system is that if a color is mostly purple, then you should say it is a bird. That will make it less likely, on average, to guess correctly. Is that the kind of example you have in mind?
Hinton: Exactly. That is right. You need to look at a large number of examples. You take a random collection of examples, a few hundred of them. For these few hundred examples, you ask whether changing this connection strength improves things. Did increasing it a bit improve things or make them worse? If it improved things, you increase the connection strength. If it made things worse, you decrease the connection strength.
We just did a small experiment. We took a few hundred images and observed whether changing this connection strength slightly improved or worsened the outcome. If it improved things, we increased it a bit.
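A minimal sketch of this slow, perturbation-style learning procedure, run on a made-up toy problem rather than real images; the network size, the fake “bird” labels, and all the names are illustrative assumptions, not anyone’s actual system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": a single layer of weights mapping 4 input features to a
# bird / not-bird score. Real networks would have many layers and far more weights.
weights = rng.normal(scale=0.1, size=4)

def predict(xs):
    # Squash the weighted sum into a 0..1 "probability of bird".
    return 1.0 / (1.0 + np.exp(-xs @ weights))

def loss(xs, labels):
    # How wrong we are, averaged over the whole batch of examples.
    return float(np.mean((predict(xs) - labels) ** 2))

# A few hundred random training examples with made-up "is a bird" labels.
xs = rng.normal(size=(200, 4))
labels = (xs[:, 0] + xs[:, 1] > 0).astype(float)

epsilon = 0.01
for step in range(1000):
    i = rng.integers(len(weights))      # pick one connection strength
    before = loss(xs, labels)
    weights[i] += epsilon               # nudge it up slightly
    if loss(xs, labels) >= before:      # evaluated on the whole batch of examples
        weights[i] -= 2 * epsilon       # that made things worse: try nudging it down
        if loss(xs, labels) >= before:
            weights[i] += epsilon       # neither direction helped: put it back

print("final loss:", loss(xs, labels))
```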
Mounk: When we do this, is this a primitive form of what we call learning? I know that when we talk about AI, we always talk about learning.
Hinton: That would be a learning algorithm. It is a kind of evolutionary learning algorithm. It is like making a small mutation and seeing whether it helps. If it helps, you keep it. The problem is that in your brain you have a hundred trillion connections. In a large neural network, you might have hundreds of billions of connections. You have to do this for each connection, increasing or decreasing it slightly.
Each time you perform one of those experiments, you must run it on hundreds of images to see if it truly helps. This process would be incredibly slow. Even if you had only a billion connections, you would have to run a hundred images through a billion connections across all these layers just to decide whether to increase one connection strength slightly. It would work in the end. If you kept doing that for billions of years, you would eventually get a neural network that was good at recognizing birds.
Mounk: This is not an abstract problem because, for many of the early stages of AI development, one of the basic problems was that you could make these machines learn, but it required an incredible amount of computing power, and not enough computing power was available. Even today, computing power is one of the constraints on developing more intelligent systems. This idea that we had a basic set of methods to allow neural networks to learn but were constrained by resources was very important.
A key part of your work, as I understand it, was to think about how to design these learning processes in a way that is more efficient—sufficiently efficient so that, with the computing power available at the time, which was much more limited than today, we could achieve something potentially useful. How do you adjust this learning process? How do you transform it to make it not prohibitively computation-intensive?
Hinton: Even with all the computing power we have today, that particular learning algorithm, where you try changing one connection at a time and seeing if it helps, would still be completely hopeless. It is much too inefficient. What you would like to do is figure out, for all of the connection strengths at the same time, whether increasing them slightly or decreasing them slightly helps.
You would like a way to compute, for every connection strength simultaneously, whether to increase it or decrease it slightly. If you could do that, and if there were a billion connections, you would go a billion times faster than the simple algorithm.
There is an algorithm called backpropagation that does this. Roughly, it works as follows. You put in an image and run it forward through the layers of feature detectors to decide whether it is a bird or not a bird. Suppose it says 55% it is a bird and 45% it is not a bird, and suppose it actually was a bird. You would like to increase that 55% or decrease the 45%.
You take the discrepancy between the network’s output and the desired output. You would like it to give 100% bird, but it said 55% bird, so there is a 45% discrepancy. You take that difference and send it backward through the network using the same connections. There is a method for sending it backward that is straightforward if you know calculus, and if you do not, do not worry about it.
There is a way of sending that information backward through the network so that, once it has gone from the output back to the input, you can compute for every connection whether you should increase or decrease it. You then change all billion connections at the same time, making the process a billion times faster. This is called backpropagation, and it works.
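A minimal sketch of backpropagation on a tiny two-layer network, using NumPy. This follows the textbook recipe rather than any particular production implementation, and the toy data and labels are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer network: 4 inputs -> 8 hidden "feature detectors" -> 1 output.
W1 = rng.normal(scale=0.5, size=(4, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

xs = rng.normal(size=(200, 4))
labels = (xs[:, 0] + xs[:, 1] > 0).astype(float).reshape(-1, 1)  # 1 = "bird"

learning_rate = 0.5
for step in range(2000):
    # Forward pass: run the images through the layers of feature detectors.
    hidden = sigmoid(xs @ W1)
    output = sigmoid(hidden @ W2)

    # Discrepancy between the network's answer and the desired answer.
    output_error = output - labels

    # Backward pass: send the error back through the same connections to get,
    # for every connection at once, the direction to nudge it (the gradient).
    grad_output = output_error * output * (1 - output)
    grad_W2 = hidden.T @ grad_output
    grad_hidden = (grad_output @ W2.T) * hidden * (1 - hidden)
    grad_W1 = xs.T @ grad_hidden

    # Change all the connection strengths a tiny bit, simultaneously.
    W2 -= learning_rate * grad_W2 / len(xs)
    W1 -= learning_rate * grad_W1 / len(xs)

print("mean error:", float(np.mean(np.abs(output - labels))))
```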
Mounk: So backpropagation literally means sending it back through the system. That is what the word “backpropagation” refers to in this context, I take it.
Hinton: You propagate this error backward through the system. You then try to figure out, for every neuron in the system, whether to make it a little more active or a little less active. Once you know that, you know how to change its incoming connection strengths to achieve it.
Mounk: Let us do a little backpropagation ourselves. I am going to try to restate what I just heard. My understanding is that backpropagation is one of the real contributions that you made to this field.
Hinton: Let me correct that. Many people invented backpropagation. Our main contribution—the contribution of David Rumelhart, Ronald Williams, and me—was to show that backpropagation would learn the senses of words and would learn interesting representations.
Mounk: Thank you for the clarification. I do not want to overstate your very significant contribution. So, we are trying to figure out if this is a bird or not. You feed it an image. It tells you there is a 55% likelihood that it is a bird. When we think about what it does to send this result back through the system, it is saying, what would all the neurons have looked like if it had come back with the answer 100%? Based on that, you then adjust the weights to say, all right, that seems closer to the kind of setup we should be having. Is that roughly on track, or did I completely misstate that?
Hinton: That is roughly on track, but not quite on track. You are not trying to solve the problem of how to change the weight so you get exactly the right answer. You are trying to solve the problem of how to change the connection strength so your answer is a little bit better. You said we are trying to figure out how we should change the neuron so it says 100% bird. We are not really trying to do that. If it says 55% bird, we are trying to figure out how to change the connection strength so it says 55.001% bird. In other words, we are asking how to change the connection strength to make it just a tiny bit better. That is what calculus is all about.
Mounk: Once the importance of backpropagation became clear, along with the contributions of others, how far along were you toward the basic conceptual foundation of contemporary artificial intelligence? What was the bridge? What other elements still needed to be pioneered and developed, along with the increase in computing power and resources, to reach the degree of artificial intelligence that we have today?
Hinton: In 1986, we showed that the backpropagation algorithm could learn the senses of words in a very simple toy example. We were very optimistic. We thought we had figured out how to make systems learn layers of features, how to make them learn to do vision, and that we would be able to make them learn to do language. We believed we had solved it and that everything would be wonderful. It did work acceptably for some tasks.
It was, for example, fairly good at reading the ZIP codes on envelopes and the numerical amounts on checks. At one point, it read the numerical amounts on about 10% of the checks in North America. That was in the 1980s and early 1990s. However, it would not scale up to recognizing real objects in real images, such as identifying a bird whether it was a seagull in the distance or an ostrich up close.
We did not know at that time what the problem was. The problem was mainly that we did not have enough data or enough compute power. If we had said that at the time, people would have dismissed it as an excuse, saying we were merely claiming that a bigger model would work. They did indeed say that, and it was somewhat embarrassing to insist that a thousand times as much data and a thousand times as much compute might help.
In fact, what we really needed was a million times as much data and a million times as much compute, and then it worked very well. There were other technical advances, but the main advances were the availability of much faster compute and much more data. The additional data came from the web, and the faster compute came from GPUs, particularly Nvidia GPUs, which were easier to program. When I say easier to program, they were still difficult, but much easier than most parallel systems.
Mounk: Presumably one of the reasons data is so important in all of this is that we have assumed, in this example, that we have an image of a bird for which we already know whether or not it is in fact a bird. If we did not have something against which we could measure the system, or something on which to base the accuracy of the model’s predictions, the learning algorithm would not work. We need large numbers of images for which we are reasonably confident that they are birds or that they are not birds. Is that right?
Hinton: With computer vision, for a long time, we didn’t have a big data set like that. We needed a data set with millions of images that were accurately labeled or fairly accurately labeled. We didn’t have it. Someone called Fei-Fei Li, who was a junior professor, realized that if we had a big labeled database like that, a data set, it would make a huge difference to whether neural nets could do vision.
She didn’t necessarily think it would be neural nets, but she did think it would make a huge difference to whether computers would get good at doing vision and recognizing objects in images. She went to a lot of effort to build a huge database, and that was crucial. The digitized images were there on the web, but somebody also needed to provide labels for them all. Now you don’t get the same problem in language.
The reason you don’t get the same problem in language is because you use the next word as the label. So you say, I’ve seen a string of words; they’re the input. From this string of words I’ve already seen, can I predict the next word? Of course, the next word is part of the data. You don’t need anybody to tell you what the next word is. When someone gives you a document, you see all the next words given each context. So the nice thing about language, and the reason you can have trillions of examples with language, is because you don’t need someone to give you labels.
There is research using neural networks on language where you say, does this movie review have a positive sentiment or a negative sentiment toward the movie? Someone has to hand label that. For a while, people did a lot of research like that. But if you just try to predict the next word, that’s called self-supervised because the data itself contains the label. Now you don’t need all those human labelers.
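A minimal sketch of why next-word prediction is self-supervised: every position in a document yields a (context, next word) training pair, so the labels come straight out of the data with no human labeler. The whitespace “tokenizer” here is a deliberate simplification.

```python
# Every position in a document yields a (context, next_word) training pair,
# so the label comes for free from the data itself.
def make_training_pairs(text: str, context_size: int = 4):
    words = text.split()  # crude whitespace "tokenizer", just for illustration
    pairs = []
    for i in range(1, len(words)):
        context = words[max(0, i - context_size):i]
        pairs.append((context, words[i]))  # the next word is the label
    return pairs

document = "the cat sat on the mat and the dog chased the cat"
for context, label in make_training_pairs(document)[:5]:
    print(context, "->", label)
```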
Mounk: You said earlier that, by something like the late 1980s, the basic conceptual foundations for contemporary artificial intelligence were in place. The truth of it at that time was we just needed more compute and more data. That sounded off. It sounded like making excuses for why the system wasn’t yet working as well as it might one day. Yet it turned out to be true.
Hinton: There was also another reason. It wasn’t just that they didn’t believe a bigger one would work. The symbolic community was convinced that if you started with random connection strengths and just adjusted them like this, you would get trapped at local optima. It’s a bit like if you’re in a mountain range and you just go uphill, you’ll end up at the top of a small foothill. If you keep trying to go uphill, that’s as far as you can go.
You have to be willing to go downhill to get to the top of Mount Everest. It turned out they were wrong. It turned out that on a normal landscape in three dimensions, that will happen—you’ll get trapped at a local optimum on the top of a foothill. In these neural nets, you may not get to the very best set of connection strengths, but you will get to a very good set of connection strengths.
If you don’t get to the top of Mount Everest, you’ll get to the top of some nearby very high peak. People didn’t know that. That was just an empirical result. It was a big surprise to the symbolic AI people that if you just kept creeping along, improving the weights to make the answer a little bit better, you would learn incredibly impressive things.
Mounk: Perhaps let’s stick with this contrast for just one moment because I think to a lot of people it seems that human intelligence is more like the symbolic AI people might predict. The way we reason about the world is that we have these rules of logic, and we’re applying them and doing these calculations based on those rules of logic, and that’s how we reach a firm and logical conclusion. One of the ways to attack current AI systems is the idea that they’re just “stochastic parrots”—that all they are is predicting the statistical likelihood of the next word being something.
I take it that a lot of the skepticism of the symbolic AI community about whether you were going to get somewhere with the approach you helped to champion was that this is just not how you get to real intelligence. Of course, you yourself were quite inspired by neuroscience and by our understanding of how neurons in the brain work. The way human minds learn is different from AI in many important respects, but in some ways it seems more analogous to neural networks: taking in a lot of data and working out which responses have been reinforced by the world and which have not.
Hinton: Okay, let’s start with the stochastic parrots. The people who talk about stochastic parrots are typically linguists strongly influenced by Chomsky, who believed that language was basically innate. Chomsky was adamantly opposed to statistics. He thought it was discrete rules, and that’s how language works, and statistics is just sort of silly. That’s not how language works. It turns out he’s completely wrong, according to me, so I can’t let you get away with that.
Also, the idea that just predicting the next word can’t possibly be how you learn language is deeply wrong. If you think about it, if you want to do a not very good job of predicting the next word, you can just use simple statistics. For example, you could keep a big table of phrases, and if you see the words “fish and,” you could look in your big table and see that “fish and chips” occurred a lot. So “chips” is a pretty likely next word because we’ve seen lots of occurrences of “fish and chips.” That would be simple co-occurrence statistics. The people who talk about stochastic parrots—that’s their model of statistics. That’s what they’re arguing against. But that’s not at all how these neural nets work. They don’t really understand how they work, particularly Chomsky.
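A minimal sketch of the kind of co-occurrence table being described, which is the critics’ model of statistics rather than how neural language models actually work; the toy corpus is invented purely for illustration.

```python
from collections import Counter, defaultdict

# Count how often each word follows each two-word phrase in a toy corpus.
corpus = "we had fish and chips then we had fish and chips again".split()

table = defaultdict(Counter)
for a, b, nxt in zip(corpus, corpus[1:], corpus[2:]):
    table[(a, b)][nxt] += 1

# "fish and" -> "chips", simply because that phrase occurred a lot.
print(table[("fish", "and")].most_common(1))  # [('chips', 2)]
```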
If you think about it, suppose you want to do a really good job of predicting the next word—not just a moderately good job by keeping a table of how often particular phrases occur, but a really good job, the best job that could be done. To do that, you have to understand what the person is saying. If I design a system that’s going to end up doing a really good job of predicting the next word, the only way it can do that is by understanding what was said.
What’s impressive is that training these big language models just to predict the next word forces them to understand what’s being said. In particular, if the next word is the first word of the answer to a question and the context is the question, if you don’t understand the question, you’re not going to be very good at predicting the answer. The stochastic parrot people don’t seem to understand that just predicting the next word forces you to understand what’s being said.
Mounk: I’m not sure that we disagree. I was trying to put my understanding of what it is that those who object to this believe. I’m not sure that we disagree on this, actually. I was trying to give voice to that critique, but also to say that it seems to me that there’s something about the ways in which models of artificial intelligence today engage in learning that seems more similar to the human brain.
It strikes me, when I speak to friends of mine who are neuroscientists, that we still don’t fully understand how the human brain works. But what I wanted to ask you is how similar you think the learning mechanisms of these AI models today are to what’s going on in the human brain. In some ways, the inspiration for that was partially to understand how neurons work together in the human brain. That’s why that metaphor is there. That’s why we’re talking about neurons and neural nets in the context of AI.
Do you think that the basic mechanisms going on in a neural net that is being fed with a bunch of data and learning how to interpret the question so that it can give that answer are the same kind of thing that’s going on in a human baby when it is learning to maneuver around the world and to eventually answer the questions that its parents put to it? Or do you think that there is a fundamental difference between those two things?
Hinton: Okay, so that’s a huge open question. For me, that’s probably the most important question in neuroscience: how similar is the way the brain learns to how these large language models learn? At a very abstract level, I believe it’s quite similar. At that level, large language models have a way—this backpropagation algorithm—of figuring out for each connection strength whether to increase or decrease it to make the whole system work better.
That’s actually called the gradient—the direction you should go in to improve things. The brain probably has the same thing but may not get the gradient in the same way. We don’t know how the brain figures out for each connection strength whether to increase or decrease it. What we do know from these large language models is that if you can get that information—which we get using backpropagation in the large language models—then you can build very impressive systems just by trying to predict the next word.
So we know that if you get the gradient, you can learn very effectively. We don’t know how the brain gets the gradient. There have been many attempts to show how the cortex—the newer part of the brain—can get these gradients so that it can learn the way large language models do. Nobody has been highly successful in that. There have been many theories, some of them moderately plausible, but none that work really well. Hopefully, eventually someone will figure it out.
There are some reasons for believing the brain might have a different algorithm. Backpropagation works by figuring out that if you have a lot of experience—like trillions of examples—and not many connections, maybe a mere trillion connections, it can still optimize. These large language models, the biggest ones, have about a trillion connections but trillions of examples. So they have many more examples than connections. They are trying to squeeze lots of knowledge into not many connections—trillions of bits of knowledge into only a trillion connections.
Our brains are very different. We only live for about two billion seconds. We don’t have trillions of experiences, just a few billion. We have lots of connections—connections to spare—but not much experience. So our brain has to deal with a different regime: limited experience but abundant connections. Backpropagation, on the other hand, is very good when you have lots of experience but are limited in the number of connections. So they are solving somewhat different problems.
Mounk: If one of the constraints on making AI models smarter than they are at the moment is that we may run out of high-quality data—since high-quality data is scarce and so central to this process—then is it imaginable that we might emulate some of the mechanisms that the human brain uses to extract so much understanding of the world from far less informational input?
Hinton: Yes, that is possible. It’s possible that the brain is using some other way of getting gradients that’s not quite the same as backpropagation, and that might let you learn faster. I think a more promising approach at present for artificial intelligence is to see how you can deal with the data limitation.
There are areas in which we don’t worry about a shortage of data. For example, AlphaGo or AlphaZero that play chess. Nobody there worries about a shortage of data—at least not now. To begin with, when they made Go-playing programs with neural nets, they got the neural net to copy the moves of experts. You only have so many moves by experts. With the popularity of chess now, you have billions of moves, but it’s still probably not trillions—or maybe only a few trillion.
Nobody worries about that when training a chess or Go program because it generates its own data. What happens in things like AlphaGo is that it plays against itself. There are two neural nets that give it its intuition. I’ll talk about chess because I know chess much better than I know Go, and I imagine most of the viewers of this podcast know more about chess than Go.
So we’re making a chess-playing program. It has one neural net that can look at a board position and say how good that position is. It just looks at it and says, “Hey, that’s good for me.” It has another neural net that can look at a board position and say, “This will be a good move to make.” If you know a little about chess, if the other player has a backward pawn, it’s very good to put a knight just in front of that backward pawn. It stops the pawn from advancing, and there are no other pawns that can take it. That’s a little bit of intuition about what a good move is.
In AlphaZero, it has far more sophisticated intuitions. Those are the two neural nets. The question is, how does it train them? It plays against itself using what’s called Monte Carlo rollout, which is roughly: “If I go there, then maybe he’ll go there, and then I’ll go here, and then—whoops—I’ll be in a terrible situation.” From that, you can figure out that you shouldn’t go there. Your neural net suggested a move, but after repeated rollouts—“If I go here, he goes there”—you discover it always seems to end in a loss if you do that move.
That’s a bad move. You thought it was good, but it’s bad. The Monte Carlo rollout gives you your information about whether it’s a good or bad move. You then modify the neural net that previously said, “That’s a great move,” by adjusting it: “That’s not such a great move.” The neural nets get trained using the results of this Monte Carlo rollout, which is like conscious, explicit reasoning: “If I go here, he goes there, I go here…” It’s sequential. Chess players can do it fast, but it’s still fairly sequential. That process is what’s used to train intuition.
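A minimal, tabular stand-in for this idea, using the toy game Nim (take one to three stones; whoever takes the last stone wins) instead of chess: random rollouts of the form “if I go there, then maybe he goes there…” supply the evaluation that fills in a table of move “intuitions.” Every name here is illustrative, and AlphaZero’s real training is far more sophisticated.

```python
import random

random.seed(0)

TAKE = [1, 2, 3]   # legal moves: remove 1-3 stones from the pile
intuition = {}     # (pile, move) -> estimated chance of winning

def rollout(pile):
    """Play the rest of the game with random moves; return True if the player
    to move from this position ends up taking the last stone and winning."""
    to_move_wins = False
    while pile > 0:
        pile -= random.choice([m for m in TAKE if m <= pile])
        to_move_wins = not to_move_wins
    return to_move_wins

def evaluate_move(pile, move, n_rollouts=200):
    """Repeated rollouts estimate how often taking `move` stones from `pile`
    leads to a win for us."""
    wins = 0
    for _ in range(n_rollouts):
        remaining = pile - move
        if remaining == 0:
            wins += 1                    # we took the last stone: we win
        elif not rollout(remaining):     # opponent moves next and ends up losing
            wins += 1
    return wins / n_rollouts

# "Train" the intuition table from the rollout results.
for pile in range(1, 11):
    for move in TAKE:
        if move <= pile:
            intuition[(pile, move)] = evaluate_move(pile, move)

print({m: round(intuition[(3, m)], 2) for m in TAKE})  # taking all 3 wins outright
print({m: round(intuition[(4, m)], 2) for m in TAKE})  # no move from 4 looks like a sure win
```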
It’s like that for a lot of what we do. You have intuitive beliefs, then you do some reasoning. In doing that reasoning, you use your intuitions. As a result, you discover your intuition was wrong, so you go back and revise it. That’s an example where you didn’t need anyone external to give you training examples. Most people have a lot of beliefs, and if they were to reason through them, they’d discover those beliefs aren’t consistent. Something’s wrong—either the reasoning, or one of the premises, or the conclusion—so they need to change something.
As soon as you’ve got something like reasoning working, you can generate your own training data. That’s a nice example of what people in MAGA don’t do. They don’t reason and say, “I have all these beliefs, and they’re not consistent.” It doesn’t worry them. They have strong intuitions and stick with them even though they’re inconsistent. It’s very annoying for people who believe in reasoning.
Reasoning is very important for tuning your intuitions. That’s one way you can get training data without needing others to provide it. It’s what’s used in chess and Go already, and it works very well in closed worlds. For mathematics, for example, it’s a kind of closed world. You can make conjectures about what might be true and then try proving them. You can have conjectures that seem very plausible at first then reason a bit and discover they must be wrong.
You might have a conjecture that there’s a biggest number. Suppose you’re a five-year-old and you think there must be a biggest number. Then you think, but if I add one to that, I’d get an even bigger number—so there can’t be a biggest number. That’s an example where you didn’t need training examples; you just needed reasoning.
That’s one way AI is going to get around the data limitation. The large language models, I think, are already doing some of that. Demis Hassabis, I know, believes in that method of getting yourself a lot more training data without needing external data.
Mounk: That’s very interesting. On the point you were making earlier, I remember arguing with somebody once where I thought I had a very convincing logical argument and said, “We either can believe this or believe that. On pain of inconsistency, you have to accept this conclusion.” They said, “Well, I choose inconsistency.” It’s rarely said so explicitly, but it is very infuriating. That is not something you can do. Some people say, “I don’t mind. I care more about having belief X and belief Y and belief Z than I care about having a consistent worldview,” and that makes it very hard to argue with such people.
Hinton: There’s a name for that. The name for choosing inconsistency is faith. The whole Enlightenment was about choosing reason over faith, and we’re losing it.
Mounk: Indeed. We’re at the tail end of the Enlightenment, unless we can help it and fight back. To go to one other point, you were saying earlier that there was this moment when you and some others believed that if we had more computation and more data, we would be able to make progress. Some people just believed that. It seems to me that there’s now a question about how fast the continuing progress of AI is and whether we’re going to get to much smarter systems in two years or in five years, and perhaps even to something like artificial general intelligence, just by throwing more data at it or more compute at it.
Perhaps we might see smaller innovations, like figuring out better ways for these systems to create the data on which they’ve been trained, or we might need a real, more revolutionary change in how some of those learning algorithms work or how these systems are able to take learnings from limited amounts of data.
What do you think is the truth of this? In ten years, in twenty years, are we just going to have very rapid linear improvement, or even exponential improvement, in the intelligence of these AI systems by throwing more compute at basically the same architecture? Or do you think we’re going to need real changes in architecture to make a significant leap forward from where we are today?
Hinton: Okay, so nobody knows for sure. What we’ve seen so far is that for quite a long period, just scaling things up made them work better. That’s still the case, but there are problems with scaling up because you need huge amounts of computing power and huge amounts of data. We know that scaling it up will make it work better, but we may have practical problems doing so. We also know that new scientific ideas and new architectures, like transformers, will make it work a lot better.
In 2017, people at Google figured out transformers and published the research. ChatGPT was basically based on using transformers. We can reasonably expect that there will be more scientific breakthroughs like that. We don’t know what they’ll be or when they’ll occur, because if we knew that, we would have already done them. We can also expect there will be many engineering advances. Over the last few years, engineering has gotten much better. You see things like DeepSeek that may have benefited from distilling knowledge from bigger models, but there is always room for better engineering. This field is very young—it has only been active for a few years—so there’s a lot of room for engineering improvements that will make everything much more efficient. That may ultimately be how we deal with the need for far more compute.
There is a school of thought that has been around for a while, whose most vocal proponent is probably Gary Marcus. He really believed in symbolic AI, which is all about having symbolic expressions and rules for manipulating them. He argues that we need to go back to that approach to make serious progress in reasoning. That hasn’t been the case so far. If you look at the progress in reasoning that has been made, it’s not like there’s some special internal symbolic language.
Symbolic AI basically believed—put simply—that if I give you a sentence in English, what you need to do is turn it into a sentence in some special internal symbolic language that’s unambiguous. You could then operate on that expression using rules to derive new expressions. That’s what logic is, and that’s how reasoning was supposed to work. Reasoning in these models now works quite well, and it doesn’t work like that at all.
There is no special internal symbolic language. Inside, it’s just activations of neurons in these neural nets. The only symbolic language is natural language. Those are symbols, but they exist at the input and output. If you look at how these models do reasoning, they do it by predicting the next word, then looking at what they predicted, and then predicting the next word after that. They can do thinking like that.
You give them a context, and by predicting words, they create a kind of scratch pad for thinking. They can see the words they’ve predicted and then reflect on them and predict more words. That’s what thinking is in these systems, and that’s why we can see them thinking. It’s not at all like the symbolic way of doing it. They are producing symbols, but those symbols exist only at the level of input and output, not as part of a special internal language.
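A minimal sketch of the generation loop being described: the model only ever predicts the next token, but because each prediction is appended to the context, the growing text acts as a scratch pad it can see and build on. The function predict_next_token is a hypothetical stand-in for a real language model.

```python
from typing import Callable, List

def generate(predict_next_token: Callable[[List[str]], str],
             prompt: List[str], max_tokens: int = 50) -> List[str]:
    """Repeatedly predict the next token and append it to the context.

    The visible 'thinking' is just this loop: every token the model emits
    becomes part of the context used to predict the tokens that follow.
    """
    context = list(prompt)
    for _ in range(max_tokens):
        token = predict_next_token(context)   # hypothetical model call
        context.append(token)                 # the scratch pad grows
        if token == "<end>":
            break
    return context

# A toy stand-in "model" that just counts, to show the mechanics of the loop.
def toy_model(context: List[str]) -> str:
    return str(len(context)) if len(context) < 8 else "<end>"

print(generate(toy_model, ["question:", "think", "step", "by", "step"]))
```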
My own view is that people who want hybrid systems—consisting of neural nets for the input and output and symbolic AI for the reasoning—are trying to cling to the past. I have an analogy for this. Suppose you took someone who manufactures gasoline engines and said, “Electric motors are actually better. There are all sorts of things that make them superior to gasoline engines.” After a while, the car manufacturer agrees and says, “Okay, I accept that electric motors are better. So here’s what we’re going to do: we’ll use the electric motors for injecting the gasoline into the engine.”
That’s what they actually do—it’s called fuel injection—and it’s quite helpful, but it’s not the main point. It’s an attempt to hold on to your gasoline engine while adding your electric motor. That’s what I think these hybrid systems are like.
In the rest of this conversation, Yascha and Geoffrey discuss the benefits and the risks of AI. This part of the conversation is reserved for paying subscribers…