Kindle Notes & Highlights
Read between
June 17 - August 8, 2019
a special kind of variation called mutations that occur from random changes in the arrangement of nucleotide bases in DNA.
Functionally more complex animals require more cell types to perform their more diverse functions. Arthropods and mollusks, for example, have dozens of specific tissues and organs, each of which requires “functionally dedicated,” or specialized, cell types.
These new cell types, in turn, require many new and specialized proteins. An epithelial cell lining a gut or intestine, for example, secretes a specific digestive enzyme. This enzyme requires structural proteins to modify its shape and regulatory enzymes to control the secretion of the digestive enzyme itself. Thus, building novel cell types typically requires building novel proteins, which requires assembly instructions for building proteins—that is, genetic information. Thus, an increase in the number of cell types implies an increase in the amount of genetic information.
For over 3 billion years, the living world included little more than one-celled organisms such as bacteria and algae.12 Then, beginning in the late Ediacaran period (about 555–570 million years ago), the first complex multicellular organisms appeared in the rock strata, including sponges and the peculiar Ediacaran biota discussed in Chapter 4.13 This represented a large increase in complexity. Studies of modern animals suggest that the sponges that appeared in the late Precambrian, for example, probably required about ten cell types.14 Then 40 million years later, the Cambrian explosion
Molecular biologists have estimated that a minimally complex single-celled organism would require between 318,000 and 562,000 base pairs of DNA to produce the proteins necessary to maintain life.17 More complex single cells might require upwards of a million base pairs of DNA.
By way of comparison, the genome size of a modern arthropod, the fruit fly Drosophila melanogaster, is approximately 140 million base pairs.18 Thus, transitions from a single cell to colonies of cells to complex animals represent significant—and in principle measurable—increases in genetic information.
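As a rough illustration of the comparison above, the sketch below converts the quoted genome sizes into raw carrying capacity. It assumes four equally likely bases, so 2 bits per base pair; the names `capacity_bits` and `fold_increase` are hypothetical, and the figures measure capacity only, not functional information.

```python
# Illustrative arithmetic only; assumes 4 equiprobable bases (2 bits per base pair).
BITS_PER_BASE = 2  # log2(4)

MIN_CELL_BP = (318_000, 562_000)  # range for a minimal cell, as quoted in the passage
FRUIT_FLY_BP = 140_000_000        # Drosophila melanogaster, as quoted in the passage


def capacity_bits(base_pairs: int) -> int:
    """Raw information-carrying capacity of a genome, in bits."""
    return base_pairs * BITS_PER_BASE


def fold_increase(larger_bp: int, smaller_bp: int) -> float:
    """How many times larger one genome is than another."""
    return larger_bp / smaller_bp


if __name__ == "__main__":
    for bp in MIN_CELL_BP:
        ratio = fold_increase(FRUIT_FLY_BP, bp)
        print(f"{bp:,} bp = {capacity_bits(bp):,} bits; fruit fly genome is {ratio:.0f}x larger")
```

On these numbers the fruit fly genome is a few hundred times larger than the minimal single-cell estimates.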
the Cambrian explosion of animal life also generated an explosion of genetic information unparalleled in the previous history of life.
So can the neo-Darwinian mechanism explain the dramatic increase in genetic information that appears in the Cambrian explosion?
During the late 1940s, mathematician Claude Shannon, working at the Bell Laboratories, developed a mathematical theory of information. Shannon equated the amount of information transmitted by a sequence of symbols or characters with the amount of uncertainty reduced or eliminated by the transmission of that sequence.20
Imagine a grab bag of tiles with either zero or one etched onto each. Imagine someone producing a series of zeros and ones by reaching into the bag and placing them one by one on a game board. The probability of choosing a zero on the first pick is just 1 in 2. But the probability of choosing two consecutive zeros after placing the first back in the grab bag (and shaking the tiles) is 1 chance in 2 × 2, or 1 chance in 4. This is because there are four possible combinations of digits that could have been chosen—00, 01, 10, or 11. Similarly, the probability of producing any three-letter sequence
DNA conveys information, in Shannon’s sense, in virtue of its containing long improbable arrangements of four chemicals—the four bases that fascinated Watson and Crick—adenine, thymine, guanine, and cytosine (A, T, G, and C).
Since each of the four bases has an equal 1 in 4 chance of occurring at each site along the spine of the DNA molecule, biologists can calculate the probability, and thus the Shannon information, or what is technically known as the “information-carrying capacity,” of any particular sequence n bases long. For instance, any particular sequence three bases long has a probability of 1 chance in 4 × 4 × 4, or 1 chance in 64, of occurring—which corresponds to 6 bits of Shannon information.
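The calculation described above can be sketched in a few lines. This is a minimal example that takes the passage's own assumption of equally likely symbols at every position; the function names are illustrative, not from the book.

```python
import math
from fractions import Fraction


def sequence_probability(length: int, alphabet_size: int) -> Fraction:
    """Probability of one particular sequence when every symbol is equally likely."""
    return Fraction(1, alphabet_size ** length)


def shannon_bits(length: int, alphabet_size: int) -> float:
    """Carrying capacity in bits: -log2(probability) = length * log2(alphabet_size)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Two binary tiles drawn with replacement: 1 chance in 4.
    print(sequence_probability(2, 2))
    # Three DNA bases: 1 chance in 64, which is 6 bits.
    print(sequence_probability(3, 4), shannon_bits(3, 4))
```

The same two functions cover both the zero-and-one tile example and the four-base DNA case; only `alphabet_size` changes.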
Yet the applicability of Shannon information theory to molecular biology has, to some degree, obscured a key distinction concerning the type of information that DNA possesses. Although Shannon’s theory measures the amount of information in a sequence of symbols or characters (or chemicals functioning as such), it doesn’t distinguish a meaningful or functional sequence from useless gibberish. For example: “we hold these truths to be self-evident” “ntnyhiznslhtgeqkahgdsjnfplknejmsed” These two sequences are equally long and equally improbable if we imagine them being drawn at random. Thus,
Strands of DNA contain information-carrying capacity—something Shannon’s theory can measure.24 But DNA, like natural languages and computer codes, also contains functional information.
As in computer code, the precise arrangement of characters (or chemicals functioning as characters) allows the sequence to “produce a specific effect.” For this reason, I also like to use the term specified information as a synonym for functional information, because the function of a sequence of characters depends upon the specific arrangement of those characters.
So if the origin of the Cambrian animals required vast amounts of new functional or specified information, what produced this information explosion?
questions about the origin of information have moved decidedly to the forefront of discussions about evolutionary theory.
Is it plausible to think that natural selection working on random mutations in DNA could produce the highly specific arrangements of bases necessary to generate the protein building blocks of new cell types and novel forms of life?
Obviously, if DNA contained an improbable sequence of nucleotide bases in which the arrangement of bases does not matter to the function of the molecule, then random mutational changes in the sequence of bases would not have a detrimental effect on the function of the molecule. But, of course, sequence does affect function. Eden knew that in all computer codes or written text in which the specificity of sequence determines function, random changes in sequence consistently degrade function or meaning. As he explained, “No currently existing formal language can tolerate random changes in the
the need for specificity in the arrangement of DNA bases made it extremely improbable that random mutations would generate new functional genes or proteins as opposed to degrading existing ones.
a distinguished group of mathematicians, engineers, and scientists convened a conference at the Wistar Institute in Philadelphia called “Mathematical Challenges to the Neo-Darwinian Interpretation of Evolution.” Prominent among the attendees were Marcel-Paul Schützenberger, a mathematician and physician at the University of Paris; Stanislaw Ulam, the codesigner of the hydrogen bomb; and Eden himself. The conference also included a number of prominent biologists, including Ernst Mayr, an architect of modern neo-Darwinism, and Richard Lewontin, at the time a professor of genetics and
“The immediate cause of this conference is a pretty widespread sense of dissatisfaction about what has come to be thought of as the accepted evolutionary theory in the English-speaking world, the so-called neo-Darwinian theory.”
The discovery that the genetic information in DNA is stored as a linear array of precisely sequenced nucleotide bases
Eden argued at Wistar that such random changes to written texts or sections of digital code would inevitably degrade the function of information-bearing sequences, particularly when allowed to accumulate.3 For example, the simple phrase “One if by land and two if by sea” will be significantly degraded by just a handful of random changes such as those in bold: “Ine if bg lend and two ik bT Nea.”
He noted that if someone makes even a few random changes in the arrangement of the digital characters in a computer program, “we find that we have no chance (i.e., less than 1/10^1000) even to see what the modified program would compute: it just jams.”4 Eden argued that much the same problem applied to DNA—that insofar as specific arrangements of bases in DNA function like digital code, random changes to these arrangements would likely efface their function, while attempts to generate completely new sections of genetic text by random means were likely doomed to failure.5 The explanation for
A straightforward calculation supports his intuition. The simpler lock has only 10 × 10 × 10, or 1000, possible combinations of digits—or what mathematicians refer to as “combinatorial” possibilities. The five-dial lock has 10 × 10 × 10 × 10 × 10, or 100,000, combinatorial possibilities.
With a lot of patience, the thief might elect to systematically work his way through the different combinations of digits on the simpler lock, knowing that at some point he will stumble across the correct combination. He shouldn’t even bother with the five-dial lock, since making his way through all of the possible combinations on it would take 100 times as long. The five-dial lock simply has too many possibilities for the thief to have a reasonable chance of opening it by trial and error in the time available to him.
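The lock arithmetic above is easy to check. The sketch below assumes ten digits per dial, as in the passage; `combinations` is a hypothetical helper name.

```python
def combinations(dials: int, symbols_per_dial: int = 10) -> int:
    """Number of possible settings for a lock with independent dials."""
    return symbols_per_dial ** dials


three_dial = combinations(3)  # 10 * 10 * 10
five_dial = combinations(5)   # 10 * 10 * 10 * 10 * 10

if __name__ == "__main__":
    print(three_dial, five_dial)
    # An exhaustive trial-and-error search takes this many times longer on the five-dial lock.
    print(five_dial // three_dial)
```

Each added dial multiplies the search space by the number of symbols on that dial, which is why two extra dials make the exhaustive search 100 times longer.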
Neo-Darwinism envisions new genetic information arising from random mutations in the DNA.
If at any time from birth to reproduction the right mutation or combination of mutations accumulate in the DNA of cells involved in reproduction (whether sexual or asexual), then information for building a new protein or proteins will pass on to the next generation. When that new protein happens to confer a survival advantage on an organism, the genetic change responsible for the new protein will tend to be passed on to subsequent generations. As favorable mutations accumulate, the features of a population will gradually change over time. Clearly, natural selection plays a crucial role in this
It turns out, however, that many necessary, functional proteins in cells require far, far more than just four amino acids linked in sequence, and necessary genes require far, far more than just a few bases. Most genes—sections of DNA that code for a specific protein—consist of at least one thousand nucleotide bases. That corresponds to 4^1000—an unimaginably large number—possible base sequences of that length.
This means that an average-length protein represents just one possible sequence among an astronomically large number—20^300, or over 10^390—of possible amino-acid sequences of that length. Putting these numbers in perspective, there are only 10^65 atoms in our Milky Way galaxy and 10^80 elementary particles in the known universe. That is what bothered Eden and other mathematically inclined scientists at Wistar. They understood the immensity of the combinatorial spaces associated with even single genes or proteins of average length. They realized that if the mutations themselves were truly
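The sizes of the two sequence spaces quoted above (4^1000 for a 1000-base gene, 20^300 for a 300-residue protein) can be converted to powers of ten as a sanity check. This is only a sketch of the arithmetic; the function and constant names are illustrative.

```python
import math


def log10_sequence_space(length: int, alphabet_size: int) -> float:
    """Return log10 of alphabet_size ** length, the number of possible sequences."""
    return length * math.log10(alphabet_size)


GENE_LOG10 = log10_sequence_space(1000, 4)     # 1000 bases, 4 possible bases each
PROTEIN_LOG10 = log10_sequence_space(300, 20)  # 300 residues, 20 amino acids each

if __name__ == "__main__":
    print(f"4^1000 is about 10^{GENE_LOG10:.1f}")
    print(f"20^300 is about 10^{PROTEIN_LOG10:.1f}")
    # Python integers are exact, so the quoted comparisons can be checked directly.
    print(20 ** 300 > 10 ** 390)  # "over 10^390"
    print(20 ** 300 > 10 ** 80)   # more than the quoted particle count
```

This gives roughly 10^602 possible base sequences and roughly 10^390 possible amino-acid sequences, consistent with the "over 10^390" figure in the passage.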
extremely unlikely that random mutations of whatever sort would produce significant amounts of novel and functionally specified information within the time available to the evolutionary process.
the entities that confer functional advantages on organisms—new genes and their corresponding protein products—constitute long linear arrays of precisely sequenced subunits, nucleotide bases in the case of genes and amino acids in the case of proteins. Yet, according to neo-Darwinian theory, these complex and highly specified entities must first arise and provide some advantage before natural selection can act to preserve them.
For even the smallest unit of functional innovation—a novel protein—to arise, many improbable rearrangements of nucleotide bases would need to occur before natural selection had anything new and advantageous to select.
As physicist Stanislaw Ulam explained at the conference, the evolutionary process “seems to require many thousands, perhaps millions, of successive mutations to produce even the easiest complexities we see in life now. It appears, naïvely at least, that no matter how large the probability of a single mutation is, should it be even as great as one-half, you would get this probability raised to a millionth power, which is so very close to zero that the chances of such a chain seem to be practically non-existent.”
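Ulam's back-of-the-envelope figure can be reproduced in log space, since the probability itself is far too small for a floating-point number. The sketch assumes, as his deliberately naive framing does, a chain of independent steps that must all occur; `log10_chain_probability` is a hypothetical name.

```python
import math


def log10_chain_probability(p_single: float, steps: int) -> float:
    """log10 of p_single ** steps, for a chain of independent steps that must all occur."""
    return steps * math.log10(p_single)


if __name__ == "__main__":
    # Probability one-half per mutation, raised to the millionth power.
    exponent = log10_chain_probability(0.5, 1_000_000)
    print(f"(1/2)^1,000,000 is about 10^{exponent:.0f}")
```

Under these assumptions the result is about 1 chance in 10^301,030.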
He suggested that it was at least possible that “functionally useful proteins are very common in this [combinatorial] space so that almost any polypeptide one is likely to find [as the result of mutation and selection] has a useful function.”10 Many neo-Darwinian biologists subsequently came to favor this possible solution. The solution was this: even though the size of the combinatorial space that mutations needed to search was enormous, the ratio of functional to nonfunctional base or amino-acid sequences in their relevant combinatorial spaces might turn out to be much higher than Eden and
In known codes and language systems, functional sequences do indeed typically represent tiny islands of meaning amid a great sea of gibberish.
Geneticist Michael Denton has shown that in English meaningful words and sentences are extremely rare among the set of possible combinations of letters of a given length, and they become proportionally rarer as sequence length grows.11 The ratio of meaningful 12-letter words to 12-letter sequences is 1/10^14; the ratio of meaningful 100-letter sentences to possible 100-letter strings has been estimated as 1/10^100. Denton used these figures in 1985 to explain why random letter substitutions inevitably degrade meaning in English text after only a few changes and why the same thing might be true
Yet in 1966 none of the scientists on either side of the debates at Wistar knew how rare or common functional gene and amino-acid sequences are among the corresponding space of total possibilities. Do they occur with a frequency of 1 in 10, 1 in a million, or 1 in a million billion trillion? At the time, these questions could not be answered.
How much variability is allowed in the amino-acid sequences in proteins? Are there enough functional proteins within a relevant combinatorial space of possibilities to render a random mutational search for new proteins plausible?
During the late 1980s and early 1990s, Robert Sauer, a molecular biologist at MIT, performed a series of experiments that first attempted to measure the rarity of proteins within amino-acid sequence space.
During the late 1970s and early 1980s, however, molecular biologists developed technologies for making customized synthetic DNA molecules. Robert Sauer used these techniques to make site-directed changes to DNA sequences of specific genes of known function and then to insert those variants into bacterial cells. He could then evaluate the effect of various targeted alterations to a DNA sequence on the function of their protein products within a bacterial cell culture.
Sauer’s technique allowed him to begin to evaluate how many of the variant sequences, as a percentage of the total, still produced a functional form of the relevant protein (see Fig. 9.3). His initial results confirmed that proteins could indeed tolerate a variety of amino-acid substitutions at many of the sites in the protein chain.
Based on one set of mutagenesis experiments, Sauer and his colleagues estimated the ratio of functional to nonfunctional amino-acid sequences at about 1 to 10^63 for a short protein of 92 amino acids in length.14
Using this data about the allowable variability at each site, he estimated the probability of finding one of the allowable sequences among the total number of sequences corresponding to a cytochrome c protein 100 residues in length. He determined the ratio of functional to nonfunctional sequences to be about 1 to 10^90 for amino-acid chains of this length.16
Now imagine a new kind of lock with three crucial differences from an ordinary lock. First, with this new alternative lock, there are four positions on every dial that may—in combination with other positions on other dials—open the lock. My bike thief would like this feature of this kind of lock, since it seems to allow more wiggle room at each dial. But he doesn’t like the two other features of this lock. For one, each dial displays one of 20 letters rather than one of 10 numeric digits. Second, instead of 5 dials, there are 100 dials. On the upside, because 4 of the 20 letters on each of the
The odds are 1 chance in 5^100 or—if we want to convert that to base 10—roughly 1 chance in 10^70
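The base-10 conversion for the 100-dial lock can be checked as follows. The sketch uses the figures from the lock analogy (4 working letters out of 20 on each of 100 dials) and treats the dials as independent; the constant names are illustrative.

```python
import math
from fractions import Fraction

LETTERS_PER_DIAL = 20
WORKING_LETTERS = 4
DIALS = 100

# Chance of landing on a working letter on any one dial: 4/20 = 1/5.
per_dial = Fraction(WORKING_LETTERS, LETTERS_PER_DIAL)

# log10 of 5^100, i.e. the exponent when the odds are written in base 10.
log10_odds = DIALS * math.log10(1 / per_dial)

if __name__ == "__main__":
    print(per_dial)
    print(f"1 chance in 5^{DIALS} is about 1 chance in 10^{log10_odds:.1f}")
```

The exponent comes out just under 70, so "roughly 1 chance in 10^70" is a fair rounding.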
In the same way, Sauer established that though many different combinations of amino acids will produce roughly the same protein structure and function, the sequences capable of producing these functional outcomes are still extremely rare. He showed that for every functional 92-amino-acid sequence there are roughly another 10^63 nonfunctional sequences of the same length. To put that ratio in perspective, the probability of attaining a correct sequence by random search would roughly equal the probability of a blind spaceman finding a single marked atom by chance among all the atoms in the Milky
Lehigh University biochemist Michael Behe cited Sauer’s quantitative estimate of the rarity of proteins as a decisive refutation of the creative power of the mutation and selection mechanism altogether.
Did the mutation and natural selection mechanism have a realistic chance of finding the new genes and proteins necessary to build, for example, a new Cambrian animal?

