Kindle Notes & Highlights by Ray Kurzweil
Started reading February 3, 2023
We are capable of hierarchical thinking, of understanding a structure composed of diverse elements arranged in a pattern, representing that arrangement with a symbol, and then using that symbol as an element in a yet more elaborate configuration. This capability takes place in a brain structure called the neocortex, which in humans has achieved a threshold of sophistication and capacity such that we are able to call these patterns ideas. Through an unending recursive process we are capable of building ideas that are ever more complex. We call this vast array of recursively linked ideas
Keep in mind that greatly amplifying a natural phenomenon is precisely what engineering is capable of doing.
In this book I present a thesis I call the pattern recognition theory of mind (PRTM), which, I argue, describes the basic algorithm of the neocortex (the region of the brain responsible for perception, memory, and critical thinking).
mathematical techniques that have evolved in the field of artificial intelligence (such as those used in Watson and Siri, the iPhone assistant) are mathematically very similar to the methods that biology evolved in the form of the neocortex. If understanding language and other phenomena through statistical analysis does not count as true understanding, then humans have no understanding either.
our memories are sequential and in order. They can be accessed in the order that they are remembered. We are unable to directly reverse the sequence of a memory.
there are no images, videos, or sound recordings stored in the brain. Our memories are stored as sequences of patterns. Memories that are not accessed dim over time.
We can recognize a pattern even if only part of it is perceived (seen, heard, felt) and even if it contains alterations. Our recognition ability is apparently able to detect invariant features of a pattern—characteristics that survive real-world variations.
our conscious experience of our perceptions is actually changed by our interpretations.
we are constantly predicting the future and hypothesizing what we will experience. This expectation influences what we actually perceive. Predicting the future is actually the primary reason that we have a brain.
The neocortex is responsible for sensory perception, recognition of everything from visual objects to abstract concepts, controlling movement, reasoning from spatial orientation to rational thought, and language—basically, what we regard as “thinking.”
This thin structure is basically made up of six layers, numbered I (the outermost layer) to VI. The axons emerging from the neurons in layers II and III project to other parts of the neocortex. The axons (output connections) from layers V and VI are connected primarily outside of the neocortex to the thalamus, brain stem, and spinal cord. The neurons in layer IV receive synaptic (input) connections from neurons that are outside the neocortex, especially in the thalamus. The number of layers varies slightly from region to region. Layer IV is very thin in the motor cortex, because in that area
Human beings have only a weak ability to process logic, but a very deep core capability of recognizing patterns.
How many patterns can the neocortex store? We need to factor in the phenomenon of redundancy. The face of a loved one, for example, is not stored once but on the order of thousands of times.
Successful recognition by a module of its pattern goes beyond just counting the input signals that are activated (even a count weighted by the importance parameter). The size (of each input) matters. There is another parameter (for each input) indicating the expected size of the input, and yet another indicating how variable that size is.
In our work in speech recognition, we found that it is necessary to encode this type of information in order to recognize speech patterns. For example, the words “step” and “steep” are very similar. Although the [e] phoneme in “step” and the [E] in “steep” are somewhat different vowel sounds (in that they have different resonant frequencies), it is not reliable to distinguish these two words based on these often confusable vowel sounds.
In our speech examples, the “size” parameter refers to duration, but time is only one possible dimension. In our work in character recognition, we found that comparable spatial information was important in order to recognize printed letters (for example the dot over the letter “i” is expected to be much smaller than the portion under the dot).
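The size parameters described in these highlights can be illustrated with a minimal sketch. It assumes each input carries an importance weight, an expected size, and a spread for that size, and it scores a match with a Gaussian-style closeness term. The class name `ExpectedInput`, the function `match_score`, the scoring formula, and all durations are hypothetical choices for illustration, not the author's implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class ExpectedInput:
    name: str             # label of the lower-level pattern, e.g. a phoneme
    importance: float     # how much this input matters to recognition
    expected_size: float  # expected magnitude (here: duration in ms)
    size_spread: float    # how variable that size is (std-dev-like, > 0)


def match_score(template, observed):
    """Return a score in [0, 1]; `observed` maps input name -> observed size."""
    total = sum(e.importance for e in template)
    score = 0.0
    for e in template:
        if e.name not in observed:
            continue  # a missing input contributes nothing
        z = (observed[e.name] - e.expected_size) / e.size_spread
        # Closeness is 1.0 at the expected size and decays with distance.
        score += e.importance * math.exp(-0.5 * z * z)
    return score / total


# Hypothetical templates: the vowel is treated as the same confusable
# input in both words, and only its expected duration differs.
step = [
    ExpectedInput("s", 1.0, 100.0, 30.0),
    ExpectedInput("t", 1.0, 60.0, 20.0),
    ExpectedInput("vowel", 1.0, 80.0, 20.0),
    ExpectedInput("p", 1.0, 70.0, 20.0),
]
steep = [
    ExpectedInput("s", 1.0, 100.0, 30.0),
    ExpectedInput("t", 1.0, 60.0, 20.0),
    ExpectedInput("vowel", 1.0, 160.0, 30.0),
    ExpectedInput("p", 1.0, 70.0, 20.0),
]

# Made-up observation with a long vowel.
observed = {"s": 100.0, "t": 60.0, "vowel": 150.0, "p": 70.0}
step_score = match_score(step, observed)
steep_score = match_score(steep, observed)
```

With these made-up numbers the long vowel makes the "steep" template score higher than "step", even though the vowel identity itself is not used to tell them apart.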
If anything, this downward flow is even more significant. If, for example, we are reading from left to right and have already seen and recognized the letters “A,” “P,” “P,” and “L,” the “APPLE” recognizer will predict that it is likely to see an “E” in the next position. It will send a signal down to the “E” recognizer saying, in effect, “Please be aware that there is a high likelihood that you will see your ‘E’ pattern very soon, so be on the lookout for it.”
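The downward signal in the "APPLE" example can be sketched as follows. Modelling "be on the lookout" as a lowered firing threshold is an assumption of this sketch, and the class names, threshold values, and evidence values are hypothetical.

```python
class LetterRecognizer:
    def __init__(self, letter, threshold=0.8):
        self.letter = letter
        self.base_threshold = threshold
        self.threshold = threshold

    def expect(self, boost=0.3):
        # Top-down signal: the pattern is likely, so accept weaker evidence.
        self.threshold = max(0.0, self.base_threshold - boost)

    def reset(self):
        # Drop the expectation and return to the default threshold.
        self.threshold = self.base_threshold

    def fires(self, evidence):
        return evidence >= self.threshold


class WordRecognizer:
    def __init__(self, word, letters):
        self.word = word
        self.letters = letters  # dict: letter -> LetterRecognizer

    def observe(self, seen_so_far):
        """Prime the next letter if `seen_so_far` is a proper prefix of the word."""
        if self.word.startswith(seen_so_far) and len(seen_so_far) < len(self.word):
            next_letter = self.word[len(seen_so_far)]
            self.letters[next_letter].expect()
            return next_letter
        return None


letters = {c: LetterRecognizer(c) for c in "APLE"}
apple = WordRecognizer("APPLE", letters)

weak_evidence = 0.6  # a smudged or partly hidden "E"
fired_before = letters["E"].fires(weak_evidence)
predicted = apple.observe("APPL")
fired_after = letters["E"].fires(weak_evidence)
```

In this sketch the same weak evidence fails to trigger the "E" recognizer on its own but succeeds once the word-level recognizer has signalled its expectation.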
the input to each pattern processor is a one-dimensional list, even though the pattern itself may inherently reflect more than one dimension.
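One simple way to picture a one-dimensional list standing in for a two-dimensional pattern is to flatten a small glyph grid. Row-major ordering, the helper names, and the 3x5 glyph are assumptions made for illustration; the highlight does not say how the ordering is chosen.

```python
def flatten_row_major(grid):
    """Turn a 2-D grid (list of equal-length rows) into a 1-D list."""
    return [cell for row in grid for cell in row]


def unflatten(flat, width):
    """Rebuild the rows, given the known row width."""
    if width <= 0 or len(flat) % width != 0:
        raise ValueError("length of flat list must be a multiple of width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# Hypothetical 3x5 bitmap of a lowercase "i": a small dot above a longer stem.
glyph_i = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

flat = flatten_row_major(glyph_i)
restored = unflatten(flat, width=3)
```

The flat list carries the same information as the grid as long as the width is known, which is why a pattern with more than one dimension can still be presented as a linear sequence.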
especially her mother’s heartbeat, which is one likely reason that the rhythmic qualities of music are universal to human culture. Every human civilization ever discovered has had music as part of its culture, which is not the case with other art forms, such as pictorial art. It is also the case that the beat of music is comparable to our heart rate. Music beats certainly vary—otherwise music would not keep our interest—but heartbeats vary also.
There is a mathematical solution to this optimization problem called linear programming, which solves for the best possible allocation of limited resources (in this case, a limited number of pattern recognizers) that would represent all of the cases on which the system has trained. Linear programming is designed for systems with one-dimensional inputs, which is another reason why it is optimal to represent the input to each pattern recognition module as a linear string of inputs.
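To make the idea of linear programming concrete, here is a toy allocation problem solved by checking the corner points of the feasible region. It is a sketch under stated assumptions: the two pattern classes, the coverage values, and the limits are invented, and `solve_lp_2d` is a hypothetical helper that handles only two variables and bounded problems. Practical solvers use methods such as simplex or interior-point algorithms.

```python
from itertools import combinations


def solve_lp_2d(c, constraints, eps=1e-9):
    """Maximize c[0]*x + c[1]*y subject to a*x + b*y <= r for each
    (a, b, r) in `constraints`, with x >= 0 and y >= 0.

    Returns (value, (x, y)), or None if no feasible corner exists.
    Assumes the feasible region is bounded; unboundedness is not detected.
    """
    # Express x >= 0 and y >= 0 in the same "<=" form as the other rows.
    rows = list(constraints) + [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel boundaries never meet in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + eps for a, b, r in rows):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best


# Hypothetical numbers: x and y are recognizers given to pattern classes A and B.
# Each A recognizer is worth 3 units of coverage, each B recognizer 2 units.
objective = (3.0, 2.0)
limits = [
    (1.0, 1.0, 10.0),  # at most 10 recognizers in total
    (1.0, 0.0, 6.0),   # class A can use at most 6
    (0.0, 1.0, 8.0),   # class B can use at most 8
]
best_value, (best_x, best_y) = solve_lp_2d(objective, limits)
```

For these invented limits the best corner assigns 6 recognizers to class A and 4 to class B, for a coverage value of 26.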

