Kindle Notes & Highlights
by
Jeff Hawkins
Read between
November 23 - December 7, 2022
In my opinion, the most fundamental problem with most neural networks is a trait they share with AI programs. Both are fatally burdened by their focus on behavior. Whether they are calling these behaviors “answers,” “patterns,” or “outputs,” both AI and neural networks assume intelligence lies in the behavior that a program or a neural network produces after processing a given input. The most important attribute of a computer program or a neural network is whether it gives the correct or desired output. As inspired by Alan Turing, intelligence equals behavior. But intelligence is not just a
History shows that the best solutions to scientific problems are simple and elegant. While the details may be forbidding and the road to a final theory may be arduous, the ultimate conceptual framework is generally simple.
So how can a brain perform difficult tasks in one hundred steps that the largest parallel computer imaginable can’t solve in a million or a billion steps? The answer is the brain doesn’t “compute” the answers to problems; it retrieves the answers from memory. In essence, the answers were stored in memory a long time ago. It only takes a few steps to retrieve something from memory. Slow neurons are not only fast enough to do this, but they constitute the memory themselves. The entire cortex is a memory system. It isn’t a computer at all.
Computers have memory too, in the form of hard drives and memory chips; however, there are four attributes of neocortical memory that are fundamentally different from computer memory:
• The neocortex stores sequences of patterns.
• The neocortex recalls patterns auto-associatively.
• The neocortex stores patterns in an invariant form.
• The neocortex stores patterns in a hierarchy.
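As a rough illustration of the second attribute, auto-associative recall means a partial cue brings back the whole stored pattern. The sketch below is only a toy under stated assumptions: patterns are modelled as sets of feature names, the stored patterns and the `recall` helper are hypothetical, and the Jaccard overlap score is an illustrative choice, not a mechanism described in the book.

```python
# Toy auto-associative recall: a partial cue retrieves the full stored pattern.
# All names and patterns here are hypothetical, chosen only for illustration.
STORED_PATTERNS = {
    "face": frozenset({"eye_left", "eye_right", "nose", "mouth"}),
    "car": frozenset({"wheel", "door", "windshield", "headlight"}),
    "house": frozenset({"door", "window", "roof", "chimney"}),
}


def recall(cue, stored=STORED_PATTERNS):
    """Return (name, full_pattern) for the stored pattern that best matches the cue.

    Matching uses Jaccard similarity (shared features / all features).
    Returns None when the cue shares no feature with any stored pattern.
    """
    best_name, best_score = None, 0.0
    for name, pattern in stored.items():
        score = len(cue & pattern) / len(cue | pattern)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        return None
    return best_name, stored[best_name]


# A fragment of a face is enough to bring back the whole face pattern.
name, pattern = recall({"nose", "mouth"})
print(name, sorted(pattern))
```

A lookup keyed by content rather than by address is the point of the sketch; a real cortical model would also need the sequence, invariance, and hierarchy attributes, which this toy leaves out.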
Truly random thoughts don’t exist. Memory recall almost always follows a pathway of association.
Let’s pop the hood and look at what’s going on in your brain to perform this amazing feat. We know from experiments that if we monitor the activity of neurons in the visual input area of your cortex, called V1, the pattern of activity is different for each different view of her face. Every time the face moves or your eyes make a new fixation, the pattern of activity in V1 changes, much like the changing pattern on the retina. However, if we monitor the activity of cells in your face recognition area—a functional region that’s several steps higher than V1 in the cortical hierarchy—we find
I believe a similar abstraction of form is occurring throughout the cortex, in every region. This is a general property of the neocortex. Memories are stored in a form that captures the essence of relationships, not the details of the moment. When you see, feel, or hear something, the cortex takes the detailed, highly specific input and converts it to an invariant form. It is the invariant form that is stored in memory, and it is the invariant form of each new input pattern that it gets compared to. Memory storage, memory recall, and memory recognition occur at the level of invariant forms.
Our brains use stored memories to constantly make predictions about everything we see, feel, and hear. When I look around the room, my brain is using memories to form predictions about what it expects to experience before I experience it. The vast majority of predictions occur outside of awareness.
Correct predictions result in understanding.
Incorrect predictions result in confusion and prompt you to pay attention.
Prediction is not just one of the things your brain does. It is the primary function of the neocortex, and the foundation of intelligence. The cortex is an organ of prediction. If we want to understand what intelligence is, what creativity is, how your brain works, and how to build intelligent machines, we must understand the nature of these predictions and how the cortex makes them. Even behavior is best understood as a by-product of prediction.
When we look at the world, we perceive clean lines and boundaries separating objects, but the raw data entering our eyes are often noisy and ambiguous. Our cortex fills in the missing or messy sections with what it thinks should be there. We perceive an unambiguous image.
Prediction and behavior are not completely separate, but their relationship is subtle. First, the neocortex appeared on the evolutionary scene after animals already evolved sophisticated behaviors. Therefore, the survival value of the cortex must first be understood in terms of the incremental improvements it could bestow upon the animals’ existing behaviors. Behavior came first, then intelligence. Second, most of what we sense is heavily dependent on what we do and how we move in the world. Therefore prediction and behavior are closely related.
The point is that the cortex evolved primarily to provide a memory of the world. An animal with a large cortex could perceive the world much as you and I do. But humans are unique in the dominant, advanced role the cortex plays in our behavior. It is why we have complex language and intricate tools whereas other animals don’t. It is why we can write novels, surf the Internet, send probes to Mars, and build cruise ships.
Now we can see the entire picture. Nature first created animals such as reptiles with sophisticated senses and sophisticated but relatively rigid behaviors. It then discovered that by adding a memory system and feeding the sensory stream into it, the animal could remember past experiences. When the animal found itself in the same or a similar situation, the memory would be recalled, leading to a prediction of what was likely to happen next. Thus, intelligence and understanding started as a memory system that fed predictions into the sensory stream. These predictions are the essence of
If Searle’s Chinese Room contained a similar memory system that could make predictions about what Chinese characters would appear next and what would happen next in the story, we could say with confidence that the room understood Chinese and understood the story. We can now see where Alan Turing went wrong. Prediction, not behavior, is the proof of intelligence.
You can only experience a subset of the world at any moment in time. You can only be in one room of your home, looking in one direction. Because of the hierarchy of the cortex, you are able to know that you are at home, in your living room, looking at a window, even though at that moment your eyes happen to be fixated on a window latch. Higher regions of cortex are maintaining a representation of your home, while lower regions are representing rooms, and still lower regions are looking at a window. Similarly, the hierarchy allows you to know you are listening to both a song and an album of
Since we can only touch, hear, and see a very small part of the world at any moment in time, information flowing into the brain naturally arrives as a sequence of patterns. The cortex wants to learn those sequences that occur over and over again. In some cases, such as melodies, a sequence of patterns comes in a rigid order, the order of the intervals. Most of us are familiar with that kind of sequence. But I am going to use the word sequence in a more general way, closer in meaning to the mathematical term set. A sequence is a set of patterns that generally accompany each other but not always
When I look at your face, the sequence of input patterns I see is not fixed but is determined by my saccades. One time I might fixate in the order “eye eye nose mouth,” and a moment later fixate in the order “mouth eye nose eye.” The components of a face are a sequence. They are statistically related and tend to occur together in time, although the order may vary. If you perceive “face” while fixating on “nose,” the likely next patterns would be “eye” or “mouth” but not “pen” or “car.” Each region of cortex sees a stream of such patterns. If the patterns are related in such a way that the
Predictability is the very definition of reality. If a region of cortex finds it can reliably and predictably move among these input patterns using a series of physical motions (such as saccades of the eyes or fondling with the fingers) and can predict them accurately as they unfold in time (such as the sounds comprising a song or a spoken word), the brain interprets these as having a causal relationship. The odds of numerous input patterns occurring in the same relation over and over again by sheer coincidence are vanishingly small. A predictable sequence of patterns must be part of a larger
In cortical regions, bottom-up classifications and top-down sequences are constantly interacting, changing throughout your life. This is the essence of learning. In fact, all regions of the cortex are plastic, thus they can be modified by experience. Forming new classifications and new sequences is how you remember the world.
For high-level invariant predictions to propagate down the cortex and become specific predictions, we must have a mechanism that allows the flow of patterns to branch at each level.
Every moment in your waking life, each region of your neocortex is comparing a set of expected columns driven from above with the set of observed columns driven from below. Where the two sets intersect is what we perceive. If we had perfect input from below and perfect predictions, then the set of perceived columns would always be contained in the set of predicted columns. We often don’t have such agreement. The method of combining partial prediction with partial input resolves ambiguous input, it fills in missing pieces of information, and it decides between alternative views. It is how we
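The comparison of expected and observed columns described in this highlight can be pictured with plain set operations. This is a minimal sketch, assuming columns can be represented as string labels; the `compare_columns` helper and the column names are hypothetical and are not taken from the book.

```python
def compare_columns(expected, observed):
    """Compare top-down expectations with bottom-up input.

    Returns three sets:
      perceived   - columns both expected and observed (the intersection)
      unexpected  - columns observed but not expected
      unconfirmed - columns expected but not observed
    """
    perceived = expected & observed
    unexpected = observed - expected
    unconfirmed = expected - observed
    return perceived, unexpected, unconfirmed


# Hypothetical column labels: prediction and input only partly agree.
expected = {"c1", "c2", "c3"}
observed = {"c2", "c3", "c7"}
perceived, unexpected, unconfirmed = compare_columns(expected, observed)
print(sorted(perceived), sorted(unexpected), sorted(unconfirmed))
```

In this toy, perfect agreement between prediction and input would leave `unexpected` empty; a non-empty `unexpected` set stands in for the mismatch that the highlight says is common.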
The process of generating the sequence of predictions of what I will see, feel, and hear while walking from the living room to the kitchen also generates the sequence of motor commands that makes me walk from my living room to my kitchen and move my eyes as I do so. Prediction and motor behavior work hand in hand as patterns flow down and up the cortical hierarchy. As strange as it sounds, when your own behavior is involved, your predictions not only precede sensation, they determine sensation. Thinking of going to the next pattern in a sequence causes a cascading prediction of what you should
The higher the unexpected pattern needs to go, the more regions of the cortex get involved in resolving the unexpected input. Finally, when a region somewhere up the hierarchy thinks it can understand the unexpected event, it generates a new prediction. This new prediction propagates down the hierarchy as far as it can go. If the new prediction is not right, an error will be detected, and again it will climb up the hierarchy until some region can interpret it as part of its currently active sequence. Thus we can see that observed patterns flow up the hierarchy and predictions flow down the
In recent years there has been a growing group of scientists who have proposed that synapses on distant, thin dendrites can play an active and highly specific role in cell firing. In these models, these distant synapses behave differently from synapses on thicker dendrites near the cell body.
If you study a particular set of objects over and over, your cortex re-forms memory representations for those objects down the hierarchy. This frees up the top for learning more subtle, more complex relationships. According to the theory, this is what makes an expert.
The world is not random, nor is it homogeneous. Memory, prediction, and behavior would be meaningless if the world was without structure. All behavior, whether it is the behavior of a human, a snail, a single-cell organism, or a tree, is a means of exploiting the structure of the world for the benefit of reproduction.
language fits nicely into the memory-prediction framework without any special language sauce or dedicated language machinery. Spoken and written words are just patterns in the world, as are melodies, cars, and houses. The syntax and semantics of language are not different from the hierarchical structure of other everyday objects. And in the same way that we associate the sound of a train with the visual memory image of a train, we associate spoken words with our memory of their physical and semantic counterparts. Through language one human can invoke memories and create new juxtapositions of
intelligence could be traced over three epochs, each using memory and prediction. The first would be when species used DNA as the medium for memory. Individuals could not learn and adapt within their lifetimes. They could only pass on the DNA-based memory of the world to their offspring through their genes. The second epoch began when nature invented modifiable nervous systems that could quickly form memories. An individual could now learn about the structure of its world and adapt its behavior accordingly within its lifetime. But an individual still could not communicate this knowledge to its
creativity is an inherent property of every cortical region. It is a necessary component of prediction.
Your culture (and family experience) teaches you stereotypes, which are unfortunately an unavoidable part of life. Throughout this book, you could substitute the word stereotype for invariant memory (or invariant representation) without substantially altering the meaning. Prediction by analogy is pretty much the same as judgment by stereotype. Negative stereotyping has terrible social consequences. If my theory of intelligence is right, we cannot rid people of their propensity to think in stereotypes, because stereotypes are how the cortex works. Stereotyping is an inherent feature of the
brains make predictions by analogy to the past. So our natural inclination is to imagine that a new technology will be used to do the same kinds of things as a previous technology. We imagine using a new tool to do something familiar, only faster, more efficiently, or more cheaply. Examples are abundant. People called the railroad the “iron horse” and the automobile the “horseless carriage.” For decades the telephone was viewed in the context of the telegraph, something that should be used only to communicate important news or emergencies; it wasn’t until the 1920s that people started using it
Larger hierarchies learn deeper patterns and see more complex analogies.