A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains
DNA had officially become life’s blueprint, ribosomes its factory, and proteins its product.
Respiratory microbes differed in one crucial way from their photosynthetic cousins: they needed to hunt. And hunting required a whole new degree of smarts.
What separates you from an earthworm is not the unit of intelligence itself—neurons—but how these units are wired together.
Gastrulation, neurons, and muscles are the three inseparable features that bind all animals together and separate animals from all other kingdoms of life.
There is another observation about bilaterians, perhaps the more important one: They are the only animals that have brains. This is not a coincidence. The first brain and the bilaterian body share the same initial evolutionary purpose: They enable animals to navigate by steering. Steering was breakthrough #1.
Dopamine is not a signal for pleasure itself; it is a signal for the anticipation of future pleasure.
Berridge proved that dopamine is less about liking things and more about wanting things.
serotonin is the satiation, things-are-okay-now, satisfaction chemical, designed to turn off valence responses.
The second breakthrough was reinforcement learning: the ability to learn arbitrary sequences of actions through trial and error.
The signal on which the actor learns is not rewards, per se, but the temporal difference in the predicted reward from one moment in time to the next. Hence Sutton’s name for his method: temporal difference learning.
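The update rule this highlight describes can be sketched in a few lines. Below is a minimal TD(0) value-learning loop on a made-up five-state corridor task with a reward only at the end; every name and parameter here (`V`, `alpha`, `gamma`, the task itself) is illustrative, not from the book.

```python
N_STATES = 5
alpha, gamma = 0.1, 0.9          # learning rate, discount factor
V = [0.0] * N_STATES             # predicted future reward for each state

for episode in range(2000):
    s = 0
    while s < N_STATES - 1:
        s_next = s + 1
        reward = 1.0 if s_next == N_STATES - 1 else 0.0
        # TD error: the change in predicted reward from one moment to the next
        td_error = reward + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error
        s = s_next

print([round(v, 2) for v in V])  # → [0.73, 0.81, 0.9, 1.0, 0.0]
```

Note that the learning signal is never the reward itself: it is the moment-to-moment difference between successive predictions, which is exactly why the method is called temporal difference learning.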
And then on the tenth move you pull off some clever maneuver that turns the tide of the game; suddenly you realize you are in a far better position than your opponent. It is that moment where a temporal difference learning signal reinforces your action.
But over time, with enough games, each refines the other until they converge to produce an AI system capable of making remarkably intelligent decisions. At least, that’s what happened in Sutton’s simulations. It wasn’t clear whether this would work in practice.
At the same time that Sutton was working on TD learning, a young physicist by the name of Gerald Tesauro was working on getting AI systems to play backgammon. Tesauro was at IBM Research, the same group that would later build Deep Blue (the program that famously beat Garry Kasparov in chess) and Watson (the program that famously beat Ken Jennings in Jeopardy!). But before Deep Blue or Watson, there was Neurogammon. Neurogammon was a backgammon-playing AI system that was trained on transcripts of hundreds of expertly played backgammon games. It learned not through trial and error but by ...more
Dopamine is not a signal for reward but for reinforcement. As Sutton found, reinforcement and reward must be decoupled for reinforcement learning to work. To solve the temporal credit assignment problem, brains must reinforce behaviors based on changes in predicted future rewards, not actual rewards.
In 1997 Dayan and Montague published a landmark paper, coauthored with Schultz, titled “A Neural Substrate of Prediction and Reward.” To this day, this discovery represents one of the most famous and beautiful partnerships between AI and neuroscience. A strategy inspired by how Sutton thought the brain might work turned out to successfully overcome practical challenges in AI, and this in turn helped us interpret mysterious data about the brain. Neuroscience informing AI, and AI informing neuroscience.
dopamine was transformed from a good-things-are-nearby signal to a there-is-a-35-percent-chance-of-something-awesome-happening-in-exactly-ten-seconds signal. Repurposed from a fuzzy average of recently detected food to an ever-fluctuating, precisely measured, and meticulously computed predicted-future-reward signal.
Both disappointment and relief are emergent properties of a brain designed to learn by predicting future rewards. Indeed, without an accurate prediction of a future reward, there can be no disappointment when it does not occur. And without an accurate prediction of future pain, there can be no relief when it does not occur.
Indeed, you can train vertebrates, even fish, to perform arbitrary actions not only with rewards and punishments but also with the omission of expected rewards or punishments.
A nematode, on the other hand, cannot learn to perform arbitrary behaviors through the omission of rewards. Even crabs and honeybees, who independently evolved many intellectual faculties, are not able to learn from the omission of things.*
Bacteria, animals, and plants all have circadian clocks to track the cycle of the day. But vertebrates are unique in the precision with which they can measure time. A vertebrate can remember that one event occurs precisely five seconds after another event. In contrast, simple bilaterians like slugs and flatworms are entirely unable to learn the precise time intervals between events.
TD learning, disappointment, relief, and the perception of time are all related. The precise perception of time is a necessary ingredient to learn from omission,
The Basal Ganglia
My favorite part of the brain is a structure called the basal ganglia. For most brain structures, the more one learns about them, the less one understands them—simplified
The basal ganglia is wedged between the cortex and the thalamus
The input to the basal ganglia comes from the cortex, thalamus, and midbrain, enabling the basal ganglia to monitor an animal’s actions and external environment.
The basal ganglia is thereby in a perpetual state of gating and ungating specific actions, operating as a global puppeteer of an animal’s behavior. The functioning of the basal ganglia is essential to our lives.
This symptom of Parkinson’s disease primarily emerges due to disruption of the basal ganglia, leaving it in a perpetual state of gating all actions, thereby depriving patients of the ability to initiate even the simplest of movements.
the basal ganglia also receives input from a cluster of dopamine neurons. Whenever these dopamine neurons get excited, the basal ganglia is rapidly flooded with dopamine; whenever these dopamine neurons are inhibited, the basal ganglia is rapidly starved of dopamine.
The basal ganglia learns to repeat actions that maximize dopamine release. Through the basal ganglia, actions that lead to dopamine release become more likely to occur (the basal ganglia ungates those actions), and actions that lead to dopamine inhibition become less likely to occur (the basal ganglia gates those actions).
The basal ganglia is, in part, Sutton’s “actor”—a system designed to repeat behaviors that lead to reinforcement and inhibit behaviors that lead to punishment. Remarkably, the circuitry of the basal ganglia is practically identical between a human brain and a lamprey fish brain, two species whose shared ancestors were the first vertebrates over 500 million years ago.
Reinforcement learning emerged not from the basal ganglia acting alone, but from an ancient interplay between the basal ganglia and another uniquely vertebrate structure called the hypothalamus, which is a small structure at the base of the forebrain.
so, in some ways, the basal ganglia is a student, always trying to satisfy its vague but stern hypothalamic judge.
The hypothalamus is the decider of actual rewards;
One leading theory of basal ganglia function is that these parallel circuits are literally Sutton’s actor-critic system for implementing temporal difference learning. One circuit is the “actor,” learning to repeat the behaviors that trigger dopamine release; the other circuit is the “critic,” learning to predict future rewards and trigger its own dopamine activation.
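The two-circuit scheme in this highlight can be sketched with a toy two-armed bandit: a "critic" learns the predicted reward, and its TD error (the dopamine-like signal) both trains the critic and reinforces the "actor." Everything here, the task, the learning rates, the update rule, is an illustrative simplification, not the book's or the brain's actual implementation.

```python
import math
import random

random.seed(0)
alpha_v, alpha_p = 0.1, 0.1
V = 0.0                      # critic: predicted reward in the single state
prefs = [0.0, 0.0]           # actor: preference for each of two actions

def softmax(p):
    e = [math.exp(x) for x in p]
    return [x / sum(e) for x in e]

for trial in range(5000):
    probs = softmax(prefs)
    a = 0 if random.random() < probs[0] else 1
    # Action 1 pays off far more often than action 0.
    reward = 1.0 if random.random() < (0.8 if a == 1 else 0.2) else 0.0
    td_error = reward - V              # one-step episode, so no next-state term
    V += alpha_v * td_error            # critic learns to predict the reward
    prefs[a] += alpha_p * td_error     # actor reinforced by the critic's error

print(round(V, 2), round(softmax(prefs)[1], 2))
```

A single shared error signal trains both circuits, which is why a single broadcast chemical like dopamine is such a natural fit for the role.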
the basal ganglia student initially learns solely from the hypothalamic judge, but over time learns to judge itself, knowing when it makes a mistake before the hypothalamus gives any feedback.
TD learning, the wiring of vertebrate basal ganglia, the properties of dopamine responses, the ability to learn precise time intervals, and the ability to learn from omissions are all interwoven into the same mechanisms for making trial-and-error learning work.
But this time was different—our vertebrate ancestor would remember the smell of that dangerous arthropod; she would remember the sight of its eyes peeking through the sand. She wouldn’t make the same mistake again. Sometime around five hundred million years ago, our ancestor evolved pattern recognition.
Early vertebrates could recognize things using brain structures that decoded patterns of neurons. This dramatically expanded the scope of what animals could perceive. Within the small mosaic of only fifty types of olfactory neurons lived a universe of different patterns that could be recognized. Fifty cells can represent over one hundred trillion patterns.*
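The footnoted arithmetic checks out: with fifty neuron types, each either active or silent, the number of distinct on/off patterns is 2^50.

```python
# Fifty binary neurons: each pattern is one of 2^50 on/off combinations.
patterns = 2 ** 50
print(f"{patterns:,}")  # → 1,125,899,906,842,624 — over one hundred trillion
```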
The above type of learning, in which a network is trained by providing examples alongside the correct answer, is called supervised learning (a human has supervised the learning process by providing the network with the correct answers). Many supervised learning methods are more complex than this, but the principle is the same: the correct answers are provided, and networks are tweaked using backpropagation to update weights until the categorization of input patterns is sufficiently accurate. This design has proven to work so generally that it is now applied to image recognition, natural ...more
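The principle described here can be shown at its smallest scale: labeled examples, a prediction, and weights nudged to shrink the error against the given answer. This sketch uses a single logistic neuron (for which backpropagation reduces to the one-layer gradient rule); the data and every parameter are made up for illustration.

```python
import math

w, b, lr = [0.0, 0.0], 0.0, 0.5

# Labeled examples: (inputs, correct answer). Class 1 iff x0 + x1 > 1.
data = [([0.0, 0.0], 0), ([1.0, 0.0], 0), ([0.0, 1.0], 0),
        ([1.0, 1.0], 1), ([0.9, 0.8], 1), ([0.2, 0.1], 0)]

def predict(x):
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))        # sigmoid activation

for epoch in range(2000):
    for x, y in data:
        err = predict(x) - y                 # error against the provided label
        w[0] -= lr * err * x[0]              # nudge each weight to shrink it
        w[1] -= lr * err * x[1]
        b -= lr * err

print([round(predict(x)) for x, _ in data])  # → [0, 0, 0, 1, 1, 0]
```

The "supervision" is the second element of each training pair: without those provided answers, there is no error to propagate.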
the brain does not do supervised learning—you are not given labeled data when you learn that one smell is an egg and another is a strawberry. Even before children learn the words egg and strawberry, they can clearly recognize that they are different. Second, backpropagation is biologically implausible. Backpropagation works by magically nudging millions of synapses simultaneously and in exactly the right amount to move the output of the network in the right direction. There is no conceivable way the brain could do this. So then how does the brain recognize patterns?
The next time a pattern shows up, even if it is incomplete, the full pattern can be reactivated in the cortex. This trick is called auto-association; neurons in the cortex automatically learn associations with themselves. This offers a solution to the generalization problem—the cortex can recognize a pattern that is similar but not the same.
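One classic formalization of auto-association (a Hopfield-style network, which the highlight does not name but which captures the same trick) can complete a corrupted pattern. Neurons strengthen connections to whichever neurons fire with them; a partial cue then reactivates the whole. The stored patterns below are arbitrary ±1 vectors chosen for illustration.

```python
N = 16
memories = [
    [1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, -1],
    [-1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, 1, 1, -1, -1],
]

# Hebbian storage: w[i][j] accumulates the co-activity of neurons i and j.
W = [[0.0] * N for _ in range(N)]
for m in memories:
    for i in range(N):
        for j in range(N):
            if i != j:
                W[i][j] += m[i] * m[j]

def recall(cue, steps=5):
    s = list(cue)
    for _ in range(steps):
        s = [1 if sum(W[i][j] * s[j] for j in range(N)) >= 0 else -1
             for i in range(N)]
    return s

cue = list(memories[0])
for i in (0, 5, 9, 14):            # corrupt four of the sixteen "neurons"
    cue[i] *= -1
print(recall(cue) == memories[0])  # → True
```

The incomplete cue is enough: the network settles back into the full stored pattern, which is the generalization trick the passage describes.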
Auto-association reveals an important way in which vertebrate memory differs from computer memory. Auto-association suggests that vertebrate brains use content-addressable memory—memories are recalled by providing subsets of the original experience, which reactivate the original pattern.
However, computers use register-addressable memory—memories that can be recalled only if you have the unique memory address for them. If you lose the address, you lose the memory.
Register-addressable memory enables computers to segregate where information is stored, ensuring that new information does not overwrite old information. In contrast, auto-associative information is stored in a shared population of neurons, which exposes it to the risk of accidentally overwriting old memories. Indeed, as we will see, this is an essential challenge with pattern recognition using networks of neurons.
when you train a neural network to recognize a new pattern or perform a new task, you risk interfering with the network’s previously learned patterns. How do modern AI systems overcome this problem? Well, they don’t yet. Programmers merely avoid the problem by freezing their AI systems after they are trained. We don’t let AI systems learn things sequentially; they learn things all at once and then stop learning.
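The interference described here is easy to reproduce at the smallest possible scale: one neuron with shared weights, trained on an old pattern and then only on a new, overlapping one. The patterns and parameters are made up; the point is only that retraining drags the shared weights and degrades the old answer.

```python
import math

def predict(w, x):
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(w, examples, epochs=500, lr=0.5):
    for _ in range(epochs):
        for x, y in examples:
            err = predict(w, x) - y
            for i in range(len(w)):
                w[i] -= lr * err * x[i]

w = [0.0, 0.0, 0.0]                # one neuron's shared weights
A = ([1.0, 1.0, 0.0], 1)           # old pattern: should map to 1
B = ([1.0, 1.0, 1.0], 0)           # new, overlapping pattern: should map to 0

train(w, [A])                      # learn A alone
after_a = predict(w, A[0])         # A is learned (output near 1)
train(w, [B])                      # now learn only B, with A absent
after_b = predict(w, A[0])         # A's answer has been dragged toward 0
print(round(after_a, 2), round(after_b, 2))
```

Note that a joint solution exists (a strongly negative third weight would satisfy both patterns at once); it is the sequential training, with A absent from the second round, that causes the forgetting, which is why frozen weights or all-at-once retraining are the current workarounds.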
As of this book going to print, even ChatGPT, the famous chatbot released by OpenAI, does not continually learn from the millions of people who speak to it. It too stopped learning the moment it was released into the world. These systems are not allowed to learn new things because of the risk that they will forget old things (or learn the wrong things). So modern AI systems are frozen in time, their parameters locked down; they are allowed to be updated only when retrained from scratch with humans meticulously monitoring their performance on all the relevant tasks.
even early bilaterians learned continually; the connections between neurons were strengthened and weakened with each new experience. But these early bilaterians never faced the problem of catastrophic forgetting because they never learned patterns in the first place. If things are recognized in the world using only individual sensory neurons, then the connection between these sensory neurons and motor neurons can be strengthened and weakened without interfering with each other. It is only when knowledge is represented in a pattern of neurons, like in artificial neural networks or in the cortex ...more
even fish avoid catastrophic forgetting fantastically well. Train a fish to escape from a net through a small escape hatch, leave the fish alone for an entire year, and then test it again. During this long stretch of time, its brain will have received a constant stream of patterns, learning continually to recognize new smells, sights, and sounds. And yet, when you place the fish back in the same net an entire year later, it will remember how to get out with almost the same speed and accuracy as it did the year before.
One theory is that the cortex’s ability to perform pattern separation shields it from the problem of catastrophic forgetting; by separating incoming patterns in the cortex, patterns are inherently unlikely to interfere with each other. Another theory is that learning in the cortex selectively occurs only during moments of surprise; only when the cortex sees a pattern that passes some threshold of novelty are the weights of synapses allowed to change.
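The second theory, learning gated by surprise, can be sketched as a memory that changes only when an incoming pattern fails to match anything stored. The class, threshold, and similarity measure below are all invented for illustration, not a model from the book.

```python
def match(a, b):
    """Fraction of positions where two binary patterns agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

class NoveltyGatedMemory:
    """Allow 'synaptic change' only for sufficiently novel patterns."""
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.stored = []

    def observe(self, pattern):
        if any(match(pattern, s) >= self.threshold for s in self.stored):
            return False             # familiar: no learning, no interference
        self.stored.append(list(pattern))
        return True                  # novel: learning permitted

mem = NoveltyGatedMemory()
p = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
noisy = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]   # one bit off: still "familiar"
novel = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]   # nothing like p
print(mem.observe(p), mem.observe(noisy), mem.observe(novel))  # → True False True
```

Because the noisy repeat never triggers learning, it cannot overwrite what is already stored, which is the shielding effect both theories are trying to explain.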
There is some evidence that the wiring between the cortex and the thalamus—both structures that emerged alongside each other in early vertebrates—are always measuring the level of novelty between incoming sensory data through the thalamus and the patterns represented in the cortex. If there is a match, then no learning is allowed, hence noisy inputs don’t interfere with existing learned patterns. However, if there is a mismatch—if an incoming pattern is sufficiently new—then this triggers a process of neuromodulator release, which triggers changes in synaptic connections in the cortex, ...more
The minuscule half-millimeter-thick membrane in the back of the eye—the retina—contains over one hundred million neurons of five different types. Each region of the retina receives input from a different location of the visual field, and each type of neuron is sensitive to different colors and contrasts. As you view each object, a unique pattern of neurons activates a symphony of spikes.