Kindle Notes & Highlights
Read between
October 14, 2018 - February 16, 2019
I learned the important distinction between precision and accuracy in a less malicious context. For Christmas one year my wife bought me a golf range finder to calculate distances on the course from my golf ball to the hole. The device works with some kind of laser; I stand next to my ball in the fairway (or rough) and point the range finder at the flag on the green, at which point the device calculates the exact distance that I’m supposed to hit the ball. This is an improvement upon the standard yardage markers, which give distances only to the center of the green (and are therefore accurate
The concept of “value at risk” allowed firms to quantify with precision the amount of the firm’s capital that could be lost under different scenarios.
Even the most precise and accurate descriptive statistics can suffer from a more fundamental problem: a lack of clarity over what exactly we are trying to define, describe, or explain.
In terms of output—the total value of goods produced and sold—the U.S. manufacturing sector grew steadily in the 2000s, took a big hit during the Great Recession, and has since bounced back robustly. This is consistent with data from the CIA’s World Factbook showing that the United States is the third-largest manufacturing exporter in the world, behind China and Germany. The United States remains a manufacturing powerhouse. But the graph in the Economist has a second line, which is manufacturing employment. The number of manufacturing jobs in the United States has fallen steadily; roughly six
In this case (and many others), the most complete story comes from including both figures, as the Economist wisely chose to do in its graph.
The unit of analysis is the entity being compared or described by the statistics—school performance by one of them and student performance by the other. It’s
Politician A (a populist): “Our economy is in the crapper! Thirty states had falling incomes last year.” Politician B (more of an elitist): “Our economy is showing appreciable gains: Seventy percent of Americans had rising incomes last year.”
But hold on a moment. The same data can (and should) be interpreted entirely differently if one changes the unit of analysis. We don’t care about poor countries; we care about poor people. And a high proportion of the world’s poor people happen to live in China and India. Both countries are huge (with a population over a billion); each was relatively poor in 1980. Not only have China and India grown rapidly over the past several decades, but they have done so in large part because of their increased economic integration with the rest of the world. They are “rapid globalizers,” as the Economist
“If you consider people, not countries, global inequality is falling rapidly.”
Our old friends the mean and the median can also be used for nefarious ends. As you should recall from the last chapter, both the median and the mean are measures of the “middle” of a distribution, or its “central tendency.” The mean is a simple average: the sum of the observations divided by the number of observations. (The mean of 3, 4, 5, 6, and 102 is 24.)
The median is the midpoint of the distribution; half of the observations lie above the median and half lie below. (The median of 3, 4, 5, 6, and 102 is 5.) Now, the clever reader will see that there is a sizable difference between 24 and 5. If, for some reason, I would like to describe this group of numbers in a way that makes it look big, I will focus on the mean. If I want to make it look smaller, I will cite the median.
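The gap between the two measures is easy to check directly. The sketch below uses Python's standard `statistics` module on the five numbers quoted above; the second list, with the outlier removed, is an added illustration and not part of the original text.

```python
from statistics import mean, median

# The five observations used in the text.
observations = [3, 4, 5, 6, 102]

print(mean(observations))    # 24
print(median(observations))  # 5

# Illustration (not from the text): drop the outlier and the mean
# collapses toward the median, while the median barely moves.
without_outlier = [3, 4, 5, 6]
print(mean(without_outlier))    # 4.5
print(median(without_outlier))  # 4.5
```

The single large value, 102, accounts for almost all of the difference between 24 and 5.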
Would most of those people be getting a tax cut of around $1,000? No. The median tax cut was less than $100. A relatively small number
Of course, the median can also do its share of dissembling because it is not sensitive to outliers.
Yet the median may be a horribly misleading statistic in this case. Suppose that many patients do not respond to the new treatment but that some large number of patients, say 30 or 40 percent, are cured entirely. This success would not show up in the median (though the mean life expectancy of those taking the drug would look very impressive). In this case, the outliers—those who take the drug and live for a long time—would be highly relevant to your decision.
Evolutionary biologist Stephen Jay Gould was diagnosed with a form of cancer that had a median survival time of eight months; he died of a different and unrelated kind of cancer twenty years later.3 Gould subsequently wrote a famous article called “The Median Isn’t the Message,” in which he argued that his scientific knowledge of statistics saved him from the erroneous conclusion that he would necessarily be dead in eight months.
Because of inflation, something that cost $1 in 1950 would cost $9.37 in 2011. As a result, any monetary comparison between 1950 and 2011 without adjusting for changes in the value of the dollar would be less accurate than comparing figures in euros and pounds—since the euro and the pound are closer to each other in value than a 1950 dollar is to a 2011 dollar.
Nominal figures are not adjusted for inflation. A comparison of the nominal cost of a government program in 1970 to the nominal cost of the same program in 2011 merely compares the size of the checks that the Treasury wrote in those two years—
Real figures, on the other hand, are adjusted for inflation.
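A minimal sketch of the nominal-to-real conversion, using the one price ratio quoted above ($1 in 1950 is equivalent to $9.37 in 2011). The function names and the two salary figures are hypothetical, chosen only to show how a nominal comparison and a real comparison can point in opposite directions.

```python
# Price-level ratio quoted in the text: $1 in 1950 costs $9.37 in 2011.
PRICE_RATIO_1950_TO_2011 = 9.37


def to_2011_dollars(amount_1950: float) -> float:
    """Express a nominal 1950 dollar amount in 2011 dollars."""
    return amount_1950 * PRICE_RATIO_1950_TO_2011


# Hypothetical salaries, for illustration only.
salary_1950 = 3_000.0   # nominal 1950 dollars
salary_2011 = 20_000.0  # nominal 2011 dollars

real_salary_1950 = to_2011_dollars(salary_1950)

# Nominal comparison: the 2011 check is much bigger.
print(salary_2011 > salary_1950)       # True
# Real comparison: the 1950 salary bought more.
print(salary_2011 > real_salary_1950)  # False
print(round(real_salary_1950, 2))
```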
Percentages don’t lie—but they can exaggerate. One way to make growth look explosive
I live in Cook County, Illinois. I was shocked one day to learn that the portion of my taxes supporting the Suburban Cook County Tuberculosis Sanitarium District was slated to rise by 527 percent!
Researchers will sometimes qualify a growth figure by pointing out that it is “from a low base,” meaning that any increase is going to look large by comparison.
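The "low base" effect can be sketched with a small percent-change helper. The dollar amounts below are hypothetical; they are chosen so that the first case reproduces a 527 percent increase, and the second applies the same dollar increase to a larger base.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new, relative to old."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero base")
    return (new - old) / old * 100


# Hypothetical amounts: the same $5.27 increase on two different bases.
small_base = percent_change(1.00, 6.27)
large_base = percent_change(1000.00, 1005.27)

print(round(small_base, 1))  # 527.0
print(round(large_base, 3))  # 0.527
```

The dollar increase is identical in both cases; only the base differs, which is why a growth percentage is hard to interpret without the starting level.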
In a similar vein, your kindhearted boss might point out that as a matter of fairness, every employee will be getting the same raise this year, 10 percent. What a magnanimous gesture—except that if your boss makes $1 million and you make $50,000, his raise will be $100,000 and yours will be $5,000. The statement “everyone will get the same 10 percent raise this year” just sounds so much better than
There is a common business aphorism: “You can’t manage what you can’t measure.” True. But you had better be darn sure that what you are measuring is really what you are trying to manage.
What we need is some measure of “value-added” at the school level, or even at the classroom level. We
because of the public mortality statistics, some patients who might benefit from angioplasty might not receive the procedure; 79 percent of the doctors said that some of their personal medical decisions had been influenced by the knowledge that mortality data are collected and made public. The sad paradox of this seemingly helpful descriptive statistic is that cardiologists responded rationally by withholding care from the patients who needed it most.
And for the Human Development Index, how should a country’s literacy rate be weighted in the index relative to per capita income? In the end, the important question is whether the simplicity and ease of use introduced by collapsing many indicators into a single number outweighs the inherent inaccuracy of the process.
The USNWR rankings use sixteen indicators to score and rank America’s colleges, universities, and professional schools. In 2010, for example, the ranking of national universities and liberal arts colleges used “student selectivity” as 15 percent of the index;
“One concern is simply about its being a list that claims to rank institutions in numerical order, which is a level of precision that those data just don’t support,” says Michael McPherson, the former president of Macalester College in Minnesota.10
rankings, Malcolm Gladwell offers a scathing (though humorous) indictment of the peer assessment methodology. He cites a questionnaire sent out by a former chief justice of the Michigan Supreme Court to roughly one hundred lawyers asking them to rank ten law schools in order of quality. Penn State’s was one of the law schools on the list; the lawyers ranked it near the middle. At the time, Penn State did not have a law school.12
Correlation measures the degree to which two phenomena are related to one another.
For example, there is a correlation between summer temperatures and ice cream sales. When one goes up, so does the
grades. In fact, the best predictor of all is a combination of SAT scores and high school GPA, which has a correlation of .64 with first-year
correlation does not imply causation; a positive or negative association between two variables does not necessarily mean that a change in one of the variables is causing the change in the other.
To calculate the correlation coefficient between two sets of numbers, you would perform the following steps, each of which is illustrated by use of the data on heights and weights for 15 hypothetical students in the table below.
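coefficient, r, for two variables x and y is the following: $r = \frac{1}{n} \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{\sigma_x} \cdot \frac{(y_i - \bar{y})}{\sigma_y}$, where n = the number of observations; $\bar{x}$ is the mean for variable x; $\bar{y}$ is the mean for variable y; σx is the standard deviation for variable x; σy is the standard deviation for variable y.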
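As a rough sketch, the calculation can be written as: convert every observation to standard units (distance from the mean divided by the standard deviation), multiply the paired standard units, and average the products. The code assumes the population standard deviation (dividing by n), and the heights and weights are hypothetical stand-ins for the table of 15 students, which is not reproduced here.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation coefficient r as the average product of standard units."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n): an assumption of this sketch.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    products = (
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    )
    return sum(products) / n


# Hypothetical heights (inches) and weights (pounds) for five students.
heights = [62, 65, 68, 70, 74]
weights = [120, 140, 150, 165, 190]
r = correlation(heights, weights)
print(round(r, 2))
```

A value near 1 indicates a strong positive association, a value near -1 a strong negative one, and a value near 0 little linear association.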
Schlitz needed only a mediocre beer and a solid grasp of statistics to know that this ploy—a term I do not use lightly, even when it comes to beer advertising—would almost certainly work out in its favor.
You can’t tell the difference, so you might as well drink Schlitz.”)
call a binomial experiment (also called a Bernoulli trial).
If the taste test is really like a flip of the coin, then basic probability tells us that there was a 98 percent chance that at least 40 of the tasters would pick Schlitz, and an 86 percent chance that at least 45 of the tasters would.† In theory, this wasn’t a very risky gambit at all.
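These figures can be checked with the binomial formula. The sketch below assumes 100 tasters (the number is not stated in this passage) and treats each taster as a fair coin flip, so the probability of picking Schlitz is 0.5. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: 100 tasters, each equally likely to pick either beer.
print(round(prob_at_least(40, 100), 2))  # 0.98
print(round(prob_at_least(45, 100), 2))  # 0.86
```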
There are two important lessons here: probability is a remarkably powerful tool, and many leading beers in the 1980s were indistinguishable from one another.
Probability is the study of events and outcomes involving an element of uncertainty. Investing in the stock market involves uncertainty.
Let’s start with the easy part: Many events have known probabilities. The probability of flipping heads with a fair coin is ½. The probability of rolling a one with a single die is ⅙. Other events have probabilities that can be inferred on the basis of past data. The probability of successfully kicking the extra point after touchdown in professional football is .94, meaning that kickers make, on average, 94 out of every 100 extra-point attempts.
Probabilities do not tell us what will happen for sure; they tell us what is likely to happen and what is less likely to happen.
When it comes to risk, our fears do not always track with what the numbers tell us we should be afraid of. One of the striking findings from Freakonomics, by Steve Levitt and Stephen Dubner, was that swimming pools in the backyard are far more dangerous than guns in the closet.
Humans share similarities in their DNA, just as we share other similarities: shoe size, height, eye color. (More than 99 percent of all DNA is identical among all humans.)
probabilities. In other words, the probability of Event A happening and Event B happening is the probability of Event A multiplied by the probability of Event B. An example makes it much more intuitive. If the probability of flipping heads with a fair coin is ½, then the probability of flipping heads twice in a row is ½ × ½, or ¼. The probability of flipping three heads in a row is ⅛, the probability of four heads in a row is 1/16, and so on. (You should see that the probability of throwing four tails in a row is also 1/16.)
There is one crucial distinction here. This formula is applicable only if the events are independent, meaning that the outcome of one has no effect on the outcome of another.
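For independent events, the multiplication rule is a one-line calculation. The sketch below uses exact fractions so the results match the coin-flip figures; the helper name is hypothetical.

```python
from fractions import Fraction


def prob_all(*probabilities: Fraction) -> Fraction:
    """Probability that every one of several independent events occurs."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


heads = Fraction(1, 2)
tails = Fraction(1, 2)

print(prob_all(heads, heads))                # 1/4
print(prob_all(heads, heads, heads))         # 1/8
print(prob_all(heads, heads, heads, heads))  # 1/16
print(prob_all(tails, tails, tails, tails))  # 1/16
```

The helper is only valid under the independence condition; for dependent events, each factor would have to be a conditional probability instead.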
(This is why your auto insurance rates go up after an accident; it is not simply that the company wants to recover the money that it has paid out for the claim; rather, it now has new information about your probability of crashing in the future, which—after you’ve driven the car through your garage door—has gone up.)
If the events are not mutually exclusive, such as drawing a five or a heart from a deck of cards, the probability of getting A or B consists of the sum of their individual probabilities minus the probability of both events happening. Again, this should make intuitive sense. There are 52 cards in a deck.
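The five-or-heart case can be worked out by building a standard 52-card deck and counting. This sketch compares the addition rule (sum of the two probabilities minus the overlap) with a direct count of the cards that qualify; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

fives = {card for card in deck if card[0] == "5"}
hearts = {card for card in deck if card[1] == "hearts"}

p_five = Fraction(len(fives), len(deck))            # 4/52
p_heart = Fraction(len(hearts), len(deck))          # 13/52
p_both = Fraction(len(fives & hearts), len(deck))   # 1/52, the five of hearts

# Addition rule: subtract the overlap so the five of hearts is not counted twice.
p_five_or_heart = p_five + p_heart - p_both
print(p_five_or_heart)  # 4/13, i.e. 16/52

# Cross-check by counting the qualifying cards directly.
direct_count = Fraction(len(fives | hearts), len(deck))
print(direct_count == p_five_or_heart)  # True
```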
The expected value or payoff from some event, say purchasing a lottery ticket, is the sum of all the different outcomes, each weighted by its probability and payoff.
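A minimal sketch of an expected-value calculation follows. The lottery below is entirely hypothetical (the prizes, odds, and $1 ticket price are made up for illustration); the point is only the mechanics of multiplying each payoff by its probability and summing.

```python
from fractions import Fraction

# Hypothetical lottery: (payoff in dollars, probability of that payoff).
jackpot = (1_000_000, Fraction(1, 10_000_000))
small_prize = (100, Fraction(1, 10_000))
nothing = (0, 1 - jackpot[1] - small_prize[1])
outcomes = [jackpot, small_prize, nothing]

# The probabilities of all possible outcomes must sum to 1.
assert sum(p for _, p in outcomes) == 1

expected_payoff = sum(payoff * p for payoff, p in outcomes)
print(expected_payoff)         # 11/100
print(float(expected_payoff))  # 0.11

# Hypothetical ticket price of $1: the expected net result is a loss.
ticket_price = 1
print(float(expected_payoff - ticket_price))  # -0.89
```

Under these made-up odds, a $1 ticket returns 11 cents on average, so a buyer expects to lose 89 cents per ticket over many purchases.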