The Half-Life of Facts: Why Everything We Know Has an Expiration Date
by Samuel Arbesman

Facts change over time. Some we expect to change. One hundred years ago, the answer to “How many billion people are there on Earth?” was: two. When I was at school that changed from four to five. Recently, it became seven. Others, like “How many fingers on a human hand?”, we expect to remain constant (at least for a very long time). Sometimes, however, even these sorts of facts change unexpectedly. From 1912 to 1956, scientists were certain there were 48 chromosomes in a human cell. Some even had to abandon research when they found they could only account for 46 of the 48 they knew had to be there.
Other facts, like “How many elements in the periodic table?”, change just slowly enough for those of us who aren’t paying attention to be surprised when the answer turns out to be about 10% higher than when we last looked.
Arbesman calls these sorts of facts — those that change over years or decades, rather than days or millennia — mesofacts. And he argues that how they change is actually fairly predictable. For this he uses the analogy of radioactivity: a single atom of uranium is highly unpredictable; you can’t know whether it will decay in the next minute or last for another million years. But a chunk of uranium, made up of trillions of such atoms, becomes much more manageable, with a predictable half-life. Similarly, we may not know when any specific fact might be supplanted, but how a body of knowledge changes in the aggregate over time can be measured and understood scientifically.[1]
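The maths behind the analogy is just exponential decay. As a toy sketch of my own (not from the book), if we assume a field whose knowledge has a 45-year half-life (the figure quoted below for liver-disease research), the fraction of its facts still standing after a given number of years falls off as a power of one-half:

```python
def surviving_fraction(years, half_life=45.0):
    """Fraction of a field's facts still considered valid after `years`,
    under a simple exponential-decay model. The 45-year default is the
    half-life the book quotes for research on liver disease."""
    return 0.5 ** (years / half_life)

print(surviving_fraction(45))   # 0.5: half the facts survive one half-life
print(surviving_fraction(90))   # 0.25: a quarter survive two half-lives
```

The same one-parameter curve fits any field; only the half-life changes.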
At first it seemed like the book was going to expand much more on why this is important. In the first chapter he notes that it’s certainly practically useful, for example, to know that many areas of medical knowledge ‘decay’ in under 50 years, making it worthwhile to check semi-regularly whether the facts you’ve based (say) your exercise and diet regimes around are still true[2]. But he also notes that the subtler point is even more important: simply being aware of how knowledge itself works, at a meta-level, helps in making sense of the world, and in anticipating — and planning for — flaws in our knowledge. I had hoped for more expansion on this idea, but instead the book then takes off on a rather disjointed tour of lots of semi-related knowledge-based themes. Mostly this is anecdote-driven, and while the author appears to really want to be Malcolm Gladwell, he can’t quite pull it off.
I did find a couple of these areas to be quite fascinating, though:
One is to do with how many previous trials researchers tend to cite, as a proxy for how deeply they study what has come before, rather than simply jumping into their Shiny New Research. Unsurprisingly the answer is “not very many” — on average only about 25% of the papers that should be cited are (and with a heavy bias towards the most recent ones). One particularly striking example of why this matters concerns research into treating heart attacks with the drug streptokinase. There were over 30 published trials before this was shown to be effective. However, a follow-up cumulative meta-analysis found that if each of these trials had not only looked at its own results, but combined them with those of each of the previous trials, a statistically significant result could have been found 15 years earlier.
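The mechanics of such a cumulative meta-analysis are easy to sketch: pool the raw event counts of every trial so far and re-run the significance test after each new trial. The following is my own minimal illustration, with invented numbers (not the streptokinase data), of how individually inconclusive trials can become jointly significant long before the series ends:

```python
import math

def pooled_z(trials):
    """Two-proportion z-statistic on the pooled (cumulative) event counts.
    `trials` is a list of (events_treat, n_treat, events_ctrl, n_ctrl)."""
    et = sum(t[0] for t in trials); nt = sum(t[1] for t in trials)
    ec = sum(t[2] for t in trials); nc = sum(t[3] for t in trials)
    p = (et + ec) / (nt + nc)                       # pooled event rate
    se = math.sqrt(p * (1 - p) * (1 / nt + 1 / nc))
    return (et / nt - ec / nc) / se

def first_significant(trials, z_crit=1.96):
    """Index of the first trial at which the cumulative result is significant
    at the 5% level; None if significance is never reached."""
    for i in range(len(trials)):
        if abs(pooled_z(trials[:i + 1])) >= z_crit:
            return i
    return None

# Six identical small trials, each hinting at benefit (8% vs 14% event rate)
# but none conclusive on its own.
trials = [(8, 100, 14, 100)] * 6
print(first_significant(trials))  # 2: significant once the third trial is pooled
```

A real meta-analysis would weight trials and model heterogeneity, but the point stands: the answer was sitting in the accumulated data, unclaimed.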
The other is the concept of “undiscovered public knowledge” — where, for example, someone has shown that A implies B, and someone else has shown that B implies C, but no-one knows both these things, and therefore “A implies C” lies hidden in the literature as an unknown fact. In a classic example of this, Don Swanson combined two previously unrelated sets of scientific articles — one describing poor blood circulation in patients with Raynaud’s Syndrome; the other showing that dietary fish oil could improve blood circulation — to suggest (with no background in medicine or biology, and based on no research other than pulling together previously published information) that fish oil might be useful as a treatment for Raynaud’s: a finding subsequently backed up in trials.
It seems that this area has become significantly more automated in recent years, with massive databases now monitoring the co-occurrence of terms in published papers to look for potential links, having successfully discovered previously unknown links between genes and diseases (e.g. Graves’ disease), as well as other potential drug treatments of the sort Swanson found.
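The core of this Swanson-style discovery is simple enough to sketch: look for bridging terms that co-occur with topic A in one set of papers and with topic C in another, when A and C themselves never appear together. A toy version of my own, with made-up term sets standing in for the two literatures:

```python
# Toy corpus: each "paper" is the set of terms it mentions. The Raynaud's
# papers and the fish-oil papers never co-occur directly, but both mention
# the bridging term "blood viscosity".
papers = [
    {"raynauds", "blood viscosity"},
    {"raynauds", "blood viscosity", "vasoconstriction"},
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
]

def cooccurring(term, papers):
    """All terms appearing in the same paper as `term`."""
    out = set()
    for p in papers:
        if term in p:
            out |= p - {term}
    return out

def abc_links(a, c, papers):
    """Bridging terms B such that A co-occurs with B and B co-occurs with C,
    provided A and C never co-occur directly (else there is nothing hidden)."""
    direct = any(a in p and c in p for p in papers)
    bridges = cooccurring(a, papers) & cooccurring(c, papers)
    return set() if direct else bridges

print(abc_links("raynauds", "fish oil", papers))  # {'blood viscosity'}
```

The automated systems the book describes presumably add statistical weighting and scale to millions of papers, but the A-B-C shape of the inference is the same.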
Ultimately, however, the book disappointed. The high-level concept of framing information as having a half-life is certainly appealing, and some of the sections framed certain concepts with a little extra clarity, but largely it failed to engage with any of its areas sufficiently deeply.
—
[1] One simple but effective method for measuring this is to get a group of experts in a particular field to re-examine a large number of historic papers, and categorise them as still factual; substantially correct but out of date; or now disproven. When this was done, for example, with almost 500 articles about liver disease, a strikingly clear graph of knowledge decay became apparent (with a half-life, in this particular field, of about 45 years). This approach, however, is rather time-consuming, so instead you can make the first-order approximation of measuring how long any given work continues to be cited by others.
[2] I had also hoped (in vain) for a discussion of the flip-side of this: comparing (for example) the rate of “Key New Breakthrough”-type articles to the half-life of information in the field, as a proxy measurement for bad journalism.