Naked Statistics: Stripping the Dread from the Data
Read between October 14, 2018 - February 16, 2019
15%
I learned the important distinction between precision and accuracy in a less malicious context. For Christmas one year my wife bought me a golf range finder to calculate distances on the course from my golf ball to the hole. The device works with some kind of laser; I stand next to my ball in the fairway (or rough) and point the range finder at the flag on the green, at which point the device calculates the exact distance that I’m supposed to hit the ball. This is an improvement upon the standard yardage markers, which give distances only to the center of the green (and are therefore accurate ...
15%
The concept of “value at risk” allowed firms to quantify with precision the amount of the firm’s capital that could be lost under different scenarios.
15%
Even the most precise and accurate descriptive statistics can suffer from a more fundamental problem: a lack of clarity over what exactly we are trying to define, describe, or explain.
15%
In terms of output—the total value of goods produced and sold—the U.S. manufacturing sector grew steadily in the 2000s, took a big hit during the Great Recession, and has since bounced back robustly. This is consistent with data from the CIA’s World Factbook showing that the United States is the third-largest manufacturing exporter in the world, behind China and Germany. The United States remains a manufacturing powerhouse. But the graph in the Economist has a second line, which is manufacturing employment. The number of manufacturing jobs in the United States has fallen steadily; roughly six ...
15%
In this case (and many others), the most complete story comes from including both figures, as the Economist wisely chose to do in its graph.
16%
The unit of analysis is the entity being compared or described by the statistics—school performance by one of them and student performance by the other. It’s
16%
Politician A (a populist): “Our economy is in the crapper! Thirty states had falling incomes last year.” Politician B (more of an elitist): “Our economy is showing appreciable gains: Seventy percent of Americans had rising incomes last year.”
16%
But hold on a moment. The same data can (and should) be interpreted entirely differently if one changes the unit of analysis. We don’t care about poor countries; we care about poor people. And a high proportion of the world’s poor people happen to live in China and India. Both countries are huge (with a population over a billion); each was relatively poor in 1980. Not only have China and India grown rapidly over the past several decades, but they have done so in large part because of their increased economic integration with the rest of the world. They are “rapid globalizers,” as the Economist ...
16%
“If you consider people, not countries, global inequality is falling rapidly.”
16%
Our old friends the mean and the median can also be used for nefarious ends. As you should recall from the last chapter, both the median and the mean are measures of the “middle” of a distribution, or its “central tendency.” The mean is a simple average: the sum of the observations divided by the number of observations. (The mean of 3, 4, 5, 6, and 102 is 24.)
16%
The median is the midpoint of the distribution; half of the observations lie above the median and half lie below. (The median of 3, 4, 5, 6, and 102 is 5.) Now, the clever reader will see that there is a sizable difference between 24 and 5. If, for some reason, I would like to describe this group of numbers in a way that makes it look big, I will focus on the mean. If I want to make it look smaller, I will cite the median.
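The gap between the two measures can be checked directly. The sketch below uses Python's standard `statistics` module on the five numbers from the passage; the variable names are illustrative only.

```python
from statistics import mean, median

# The five observations used in the passage.
values = [3, 4, 5, 6, 102]

# Mean: sum of the observations divided by how many there are.
average = mean(values)

# Median: the middle observation once the values are sorted.
midpoint = median(values)

print("mean:", average)
print("median:", midpoint)
```

The single outlier (102) pulls the mean up to 24, while the median stays at 5.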
16%
Would most of those people be getting a tax cut of around $1,000? No. The median tax cut was less than $100. A relatively small number
17%
Of course, the median can also do its share of dissembling because it is not sensitive to outliers.
17%
Yet the median may be a horribly misleading statistic in this case. Suppose that many patients do not respond to the new treatment but that some large number of patients, say 30 or 40 percent, are cured entirely. This success would not show up in the median (though the mean life expectancy of those taking the drug would look very impressive). In this case, the outliers—those who take the drug and live for a long time—would be highly relevant to your decision.
17%
Evolutionary biologist Stephen Jay Gould was diagnosed with a form of cancer that had a median survival time of eight months; he died of a different and unrelated kind of cancer twenty years later.3 Gould subsequently wrote a famous article called “The Median Isn’t the Message,” in which he argued that his scientific knowledge of statistics saved him from the erroneous conclusion that he would necessarily be dead in eight months.
17%
Because of inflation, something that cost $1 in 1950 would cost $9.37 in 2011. As a result, any monetary comparison between 1950 and 2011 without adjusting for changes in the value of the dollar would be less accurate than comparing figures in euros and pounds—since the euro and the pound are closer to each other in value than a 1950 dollar is to a 2011 dollar.
17%
Nominal figures are not adjusted for inflation. A comparison of the nominal cost of a government program in 1970 to the nominal cost of the same program in 2011 merely compares the size of the checks that the Treasury wrote in those two years—
17%
Real figures, on the other hand, are adjusted for inflation.
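As a rough sketch of what adjusting a nominal figure involves, the snippet below converts 1950 dollars into 2011 dollars using the one price ratio the highlights give ($1 in 1950 is about $9.37 in 2011). The function name and the $5,000 salary are hypothetical; a real adjustment would use a price index for the specific years involved.

```python
# Ratio from the text: something costing $1 in 1950 cost about $9.37 in 2011.
PRICE_RATIO_1950_TO_2011 = 9.37

def to_2011_dollars(nominal_1950_dollars):
    """Express a 1950 dollar amount in 2011 dollars (a 'real' figure)."""
    return nominal_1950_dollars * PRICE_RATIO_1950_TO_2011

# Hypothetical example: a $5,000 salary paid in 1950.
salary_1950 = 5000
salary_in_2011_dollars = to_2011_dollars(salary_1950)
print(round(salary_in_2011_dollars, 2))
```

Comparing the converted figure, not the raw $5,000, with a 2011 salary is what makes the comparison one of purchasing power and not check size.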
18%
Percentages don’t lie—but they can exaggerate. One way to make growth look explosive
18%
I live in Cook County, Illinois. I was shocked one day to learn that the portion of my taxes supporting the Suburban Cook County Tuberculosis Sanitarium District was slated to rise by 527 percent!
18%
Researchers will sometimes qualify a growth figure by pointing out that it is “from a low base,” meaning that any increase is going to look large by comparison.
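A small sketch of the "low base" effect: the same absolute increase produces very different percentage changes depending on the starting point. The function name and all four numbers below are hypothetical and are not taken from the book.

```python
def percent_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return (new - old) / old * 100

# The same absolute increase of 10 units, from two different bases.
from_low_base = percent_change(2, 12)        # small starting value
from_high_base = percent_change(1000, 1010)  # large starting value

print(from_low_base, from_high_base)
```

Starting from 2, an increase of 10 is a 500 percent rise; starting from 1,000, it is a 1 percent rise.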
18%
In a similar vein, your kindhearted boss might point out that as a matter of fairness, every employee will be getting the same raise this year, 10 percent. What a magnanimous gesture—except that if your boss makes $1 million and you make $50,000, his raise will be $100,000 and yours will be $5,000. The statement “everyone will get the same 10 percent raise this year” just sounds so much better than
19%
There is a common business aphorism: “You can’t manage what you can’t measure.” True. But you had better be darn sure that what you are measuring is really what you are trying to manage.
19%
What we need is some measure of “value-added” at the school level, or even at the classroom level. We
20%
because of the public mortality statistics, some patients who might benefit from angioplasty might not receive the procedure; 79 percent of the doctors said that some of their personal medical decisions had been influenced by the knowledge that mortality data are collected and made public. The sad paradox of this seemingly helpful descriptive statistic is that cardiologists responded rationally by withholding care from the patients who needed it most.
20%
And for the Human Development Index, how should a country’s literacy rate be weighted in the index relative to per capita income? In the end, the important question is whether the simplicity and ease of use introduced by collapsing many indicators into a single number outweighs the inherent inaccuracy of the process.
20%
The USNWR rankings use sixteen indicators to score and rank America’s colleges, universities, and professional schools. In 2010, for example, the ranking of national universities and liberal arts colleges used “student selectivity” as 15 percent of the index;
21%
“One concern is simply about its being a list that claims to rank institutions in numerical order, which is a level of precision that those data just don’t support,” says Michael McPherson, the former president of Macalester College in Minnesota.10
21%
rankings, Malcolm Gladwell offers a scathing (though humorous) indictment of the peer assessment methodology. He cites a questionnaire sent out by a former chief justice of the Michigan Supreme Court to roughly one hundred lawyers asking them to rank ten law schools in order of quality. Penn State’s was one of the law schools on the list; the lawyers ranked it near the middle. At the time, Penn State did not have a law school.12
22%
Correlation measures the degree to which two phenomena are related to one another.
22%
For example, there is a correlation between summer temperatures and ice cream sales. When one goes up, so does the
23%
grades. In fact, the best predictor of all is a combination of SAT scores and high school GPA, which has a correlation of .64 with first-year
23%
correlation does not imply causation; a positive or negative association between two variables does not necessarily mean that a change in one of the variables is causing the change in the other.
24%
To calculate the correlation coefficient between two sets of numbers, you would perform the following steps, each of which is illustrated by use of the data on heights and weights for 15 hypothetical students in the table below.
24%
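coefficient, r, for two variables x and y is the following: $$r = \frac{1}{n}\sum_{i=1}^{n}\frac{(x_i - \bar{x})}{\sigma_x}\cdot\frac{(y_i - \bar{y})}{\sigma_y}$$ where n = the number of observations; $\bar{x}$ is the mean for variable x; $\bar{y}$ is the mean for variable y; $\sigma_x$ is the standard deviation for variable x; $\sigma_y$ is the standard deviation for variable y.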
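A minimal Python sketch of this calculation, assuming the usual Pearson form: convert each observation to standard units, multiply the paired values, and average the products over n. The function name is illustrative, and the height and weight figures are made up for this example; they are not the 15-student table from the book.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r, using population standard deviations (divide by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Average of the products of the standardized values.
    products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys))
    return sum(products) / n

# Hypothetical data: heights in inches, weights in pounds.
heights = [62, 65, 68, 70, 74]
weights = [120, 140, 155, 170, 200]
r = correlation(heights, weights)
print(round(r, 3))
```

A perfectly linear relationship gives r = 1 (or r = -1 if one variable falls as the other rises); the made-up data above gives a value close to, but below, 1.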
24%
Schlitz needed only a mediocre beer and a solid grasp of statistics to know that this ploy—a term I do not use lightly, even when it comes to beer advertising—would almost certainly work out in its favor.
24%
You can’t tell the difference, so you might as well drink Schlitz.”)
25%
call a binomial experiment (also called a Bernoulli trial).
25%
If the taste test is really like a flip of the coin, then basic probability tells us that there was a 98 percent chance that at least 40 of the tasters would pick Schlitz, and an 86 percent chance that at least 45 of the tasters would.† In theory, this wasn’t a very risky gambit at all.
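These figures can be reproduced with a short binomial calculation. The highlight does not state how many tasters took part; the sketch below assumes 100, each choosing Schlitz with probability 0.5, and the function and constant names are illustrative.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

N_TASTERS = 100  # assumption: the number of tasters is not given in the highlight

p_at_least_40 = prob_at_least(40, N_TASTERS)
p_at_least_45 = prob_at_least(45, N_TASTERS)
print(round(p_at_least_40, 2), round(p_at_least_45, 2))
```

Under that assumption the results come out at roughly 0.98 and 0.86, in line with the percentages quoted.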
25%
There are two important lessons here: probability is a remarkably powerful tool, and many leading beers in the 1980s were indistinguishable from one another.
25%
Probability is the study of events and outcomes involving an element of uncertainty. Investing in the stock market involves uncertainty.
25%
Let’s start with the easy part: Many events have known probabilities. The probability of flipping heads with a fair coin is ½. The probability of rolling a one with a single die is ⅙. Other events have probabilities that can be inferred on the basis of past data. The probability of successfully kicking the extra point after touchdown in professional football is .94, meaning that kickers make, on average, 94 out of every 100 extra-point attempts.
26%
Probabilities do not tell us what will happen for sure; they tell us what is likely to happen and what is less likely to happen.
26%
When it comes to risk, our fears do not always track with what the numbers tell us we should be afraid of. One of the striking findings from Freakonomics, by Steve Levitt and Stephen Dubner, was that swimming pools in the backyard are far more dangerous than guns in the closet.
26%
Humans share similarities in their DNA, just as we share other similarities: shoe size, height, eye color. (More than 99 percent of all DNA is identical among all humans.)
26%
probabilities. In other words, the probability of Event A happening and Event B happening is the probability of Event A multiplied by the probability of Event B. An example makes it much more intuitive. If the probability of flipping heads with a fair coin is ½, then the probability of flipping heads twice in a row is ½ × ½, or ¼. The probability of flipping three heads in a row is ⅛, the probability of four heads in a row is 1/16, and so on. (You should see that the probability of throwing four tails in a row is also 1/16.)
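The coin-flip arithmetic can be sketched with exact fractions. The helper name is illustrative, and the calculation assumes the flips are independent, which holds for a fair coin.

```python
from fractions import Fraction

def prob_all(*probabilities):
    """Probability that every one of several independent events happens."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result

heads = Fraction(1, 2)

two_heads = prob_all(heads, heads)
three_heads = prob_all(heads, heads, heads)
four_heads = prob_all(heads, heads, heads, heads)
print(two_heads, three_heads, four_heads)
```

Because tails also has probability ½ on each flip, the same calculation gives 1/16 for four tails in a row.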
27%
There is one crucial distinction here. This formula is applicable only if the events are independent, meaning that the outcome of one has no effect on the outcome of another.
27%
(This is why your auto insurance rates go up after an accident; it is not simply that the company wants to recover the money that it has paid out for the claim; rather, it now has new information about your probability of crashing in the future, which—after you’ve driven the car through your garage door—has gone up.)
27%
If the events are not mutually exclusive, such as drawing a five or a heart from a deck of cards, the probability of getting A or B consists of the sum of their individual probabilities minus the probability of both events happening. Again, this should make intuitive sense. There are 52 cards in a deck.
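One way to check the rule is to count cards directly. The sketch below builds a standard 52-card deck and compares the formula (sum of the two probabilities minus the overlap) with a straight count of cards that are a five or a heart; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def prob(event):
    """Share of cards in the deck for which event(card) is true."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))

def is_five(card):
    return card[0] == "5"

def is_heart(card):
    return card[1] == "hearts"

# Addition rule: P(A) + P(B) - P(A and B).
by_formula = prob(is_five) + prob(is_heart) - prob(lambda c: is_five(c) and is_heart(c))

# Direct count of cards that are a five, a heart, or both.
by_counting = prob(lambda c: is_five(c) or is_heart(c))
print(by_formula, by_counting)
```

Both routes give 16/52 (4/13): four fives plus thirteen hearts, minus the five of hearts, which would otherwise be counted twice.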
27%
The expected value or payoff from some event, say purchasing a lottery ticket, is the sum of all the different outcomes, each weighted by its probability and payoff.
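In code, the expected value is each payoff multiplied by its probability, then summed. The lottery below is entirely hypothetical (its prizes and odds are invented for illustration), and the function name is an assumption.

```python
def expected_value(outcomes):
    """Sum of payoff times probability over (probability, payoff) pairs."""
    return sum(probability * payoff for probability, payoff in outcomes)

# Hypothetical lottery ticket: (probability, payoff in dollars).
lottery = [
    (0.001, 500.0),  # rare large prize
    (0.010, 10.0),   # occasional small prize
    (0.989, 0.0),    # most tickets win nothing
]

ticket_value = expected_value(lottery)
print(round(ticket_value, 2))
```

Here the expected payoff is about $0.60 per ticket, so paying $1 for this hypothetical ticket would lose 40 cents on average.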