Naked Statistics: Stripping the Dread from the Data
Kindle Notes & Highlights
11%
The mean is the middle line which is often represented by the Greek letter μ.
11%
The standard deviation is often represented by the Greek letter σ.
11%
Descriptive statistics are often used to compare two figures or quantities. I’m one inch taller than my brother; today’s temperature is nine degrees above the historical average for this date; and so on. Those comparisons make sense...
12%
In both the sodium and the income examples, we’re missing context. The easiest way to give meaning to these relative comparisons is by using percentages. It would mean something if I told you that Granola Bar A has 50 percent more sodium than Granola Bar B, or that Uncle Al’s income fell 47 percent last year. Measuring change as a percentage gives us some sense of scale.
12%
Percentages are useful—but also potentially confusing or even deceptive. The formula for calculating a percentage difference (or change) is the following: (new figure – original figure)/original figure. The numerator (the part on the top of the fraction) gives us the size of the change in absolute terms; the denominator (the bottom of the fraction) is what puts this change in context by comparing it with our starting point.
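As a minimal sketch of the formula in this highlight, the Python below computes a percentage change as (new figure - original figure) / original figure. The function name `pct_change` and the sodium numbers are illustrative assumptions, not figures from the book.

```python
def pct_change(original: float, new: float) -> float:
    """Return the change from `original` to `new` as a fraction of `original`."""
    if original == 0:
        # The denominator is the starting point; with no starting point
        # there is nothing to put the change in context.
        raise ValueError("percentage change is undefined when the original figure is 0")
    return (new - original) / original


# Hypothetical sodium content in milligrams (assumed values, not from the book).
bar_b_mg = 100
bar_a_mg = 150

print(f"{pct_change(bar_b_mg, bar_a_mg):.0%}")  # prints 50%
```

Swapping the arguments changes the answer (150 down to 100 is a drop of about 33 percent, not 50), which is why the choice of starting point matters.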
12%
The point is that a percentage change always gives the value of some figure relative to something else. Therefore, we had better understand what that something else is.
12%
Percentage change must not be confused with a change in percentage points. Rates are often expressed in percentages. The sales tax rate in Illinois is 6.75 percent. I pay my agent 15 percent of my book royalties. These rates are levied against some quantity, such as income in the case of the income tax rate. Obviously the rates can go up or down; less intuitively, the changes in the rates can be described in vastly dissimilar ways. The best example of this was a recent change in the Illinois personal income tax, which was raised from 3 percent to 5 percent. There are two ways to express this ...
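The highlight is cut off before it spells out the two descriptions. The sketch below assumes they are the change in percentage points and the percentage change, and applies both to the 3 percent and 5 percent rates quoted above. The function names are hypothetical.

```python
def percentage_point_change(old_rate: float, new_rate: float) -> float:
    """Simple difference between two rates that are already in percent."""
    return new_rate - old_rate


def percent_change(old_rate: float, new_rate: float) -> float:
    """Change in the rate relative to the old rate, as a fraction."""
    return (new_rate - old_rate) / old_rate


old_rate = 3.0  # percent, from the highlight
new_rate = 5.0  # percent, from the highlight

print(f"{percentage_point_change(old_rate, new_rate):.0f} percentage points")  # prints 2 percentage points
print(f"{percent_change(old_rate, new_rate):.0%} increase")                    # prints 67% increase
```

Both statements describe the same tax change; which one gets quoted depends on how large the speaker wants the change to sound.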
12%
The advantage of any index is that it consolidates lots of complex information into a single number.
12%
Alas, the disadvantage of any index is that it consolidates lots of complex information into a single number. There are countless ways to do that; each has the potential to produce a different outcome. Malcolm Gladwell makes this point brilliantly in a New Yorker piece critiquing our compelling need to rank things. (He comes down particularly hard on the college rankings.) Gladwell offers the example of Car and Driver’s ranking of three sports cars: the Porsche Cayman, the Chevrolet Corvette, and the Lotus Evora. Using a formula that includes twenty-one different variables, Car and Driver ...
13%
To assess the economic health of America’s “middle class,” we should examine changes in the median wage (adjusted for inflation) over the last several decades. They also recommended examining changes to wages at the 25th and 75th percentiles (which can reasonably be interpreted as the upper and lower bounds for the middle class).
13%
They do tell us that the typical worker, an American worker earning the median wage, has been “running in place” for nearly thirty years. Workers at the 90th percentile have done much, much better. Descriptive statistics help to frame the issue. What we do about it, if anything, is an ideological and political question.
13%
Variance and standard deviation are the most common statistical mechanisms for measuring and describing the dispersion of a distribution.
14%
The variance, which is often represented by the symbol σ², is calculated by determining how far the observations within a distribution lie from the mean.
14%
Because the difference between each term and the mean is squared, the formula for calculating variance puts particular weight on observations that lie far from the mean, or outliers,
14%
Absolute value is the distance between two figures, regardless of direction, so that it is always positive.
14%
Variance is rarely used as a descriptive statistic on its own. Instead, the variance is most useful as a step toward calculating the standard deviation of a distribution, which is a more intuitive tool as a descriptive statistic. The standard deviation for a set of observations is the square root of the variance:
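A minimal sketch of the two calculations, assuming the population form of the variance (dividing by the number of observations rather than by one less). The function names and the data are illustrative assumptions, not taken from the book.

```python
import math


def variance(observations):
    """Population variance: the mean of the squared distances from the mean."""
    n = len(observations)
    mean = sum(observations) / n
    return sum((x - mean) ** 2 for x in observations) / n


def standard_deviation(observations):
    """Square root of the variance, in the same units as the observations."""
    return math.sqrt(variance(observations))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean is 5

print(variance(data))            # prints 4.0
print(standard_deviation(data))  # prints 2.0
```

In this made-up data set the single observation at 9 contributes 16 of the 32 units of total squared distance, which illustrates how squaring puts extra weight on observations far from the mean.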
14%
Mark Twain famously remarked that there are three kinds of lies: lies, damned lies, and statistics. As the last chapter explained, most phenomena that we care about can be described in multiple ways. Once there are multiple ways of describing the same thing (e.g., “he’s got a great personality” or “he was convicted of securities fraud”), the descriptive statistics that we choose to use (or not to use) will have a profound impact on the impression that we leave. Someone with nefarious motives can use perfectly good facts and figures to support entirely disputable or illegitimate conclusions.
14%
Precision reflects the exactitude with which we can express something.
14%
Accuracy is a measure of whether a figure is broadly consistent with the truth—hence the danger of confusing precision with accuracy.
15%
precision can mask inaccuracy by giving us a false sense of certainty, either inadvertently or quite deliberately.
15%
even the most precise measurements or calculations should be checked against common sense.
15%
Even the most precise and accurate descriptive statistics can suffer from a more fundamental problem: a lack of clarity over what exactly we are trying to define, describe, or explain.
15%
Together, these two stories—rising manufacturing output and falling employment—tell the complete story. Manufacturing in the United States has grown steadily more productive, meaning that factories are producing more output with fewer workers. This is good from a global competitiveness standpoint, for it makes American products more competitive with manufactured goods from low-wage countries. (One way to compete with a firm that can pay workers $2 an hour is to create a manufacturing process so efficient that one worker earning $40 can do twenty times as much.) But there are a lot fewer ...
16%
Is globalization making income inequality around the planet better or worse? By one interpretation, globalization has merely exacerbated existing income inequalities; richer countries in 1980 (as measured by GDP per capita) tended to grow faster between 1980 and 2000 than poorer countries. The rich countries just got richer, suggesting that trade, outsourcing, foreign investment, and the other components of “globalization” are merely tools for the developed world to extend its economic hegemony. Down with globalization! Down with globalization! But hold on a moment. The same data can (and ...
16%
both the median and the mean are measures of the “middle” of a distribution, or its “central tendency.”
16%
mean is a simple average: the sum of the observations divided by the number of observations.
16%
The median is the midpoint of the distribution; half of the observations lie above the ...
16%
If, for some reason, I would like to describe this group of numbers in a way that makes it look big, I will focus on the mean. If I want to make it look smaller, I will cite the median.
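A small sketch of how far the two measures can drift apart when one observation is far larger than the rest. The numbers are hypothetical and chosen only to make the gap obvious; the calls are from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical group of ten values: nine small ones and one very large one.
values = [50] * 9 + [10_000]

print(f"mean:   {mean(values):,.0f}")    # prints mean:   1,045
print(f"median: {median(values):,.0f}")  # prints median: 50
```

Citing the mean makes this group look roughly twenty times bigger than citing the median does, even though both figures are computed correctly from the same data.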
16%
Consider the George W. Bush tax cuts, which were touted by the Bush administration as something good for most American families. While pushing the plan, the administration pointed out that 92 million Americans would receive an average tax reduction of over $1,000 ($1,083 to be precise). But was that summary of the tax cut accurate? According to the New York Times, “The data don’t lie, but some of them are mum.” Would 92 million Americans be getting a tax cut? Yes. Would most of those people be getting a tax cut of around $1,000? No. The median tax cut was less than $100. A relatively small ...
17%
Suppose that you have a potentially fatal illness. The good news is that a new drug has been developed that might be effective. The drawback is that it’s extremely expensive and has many unpleasant side effects. “But does it work?” you ask. The doctor informs you that the new drug increases the median life expectancy among patients with your disease by two weeks. That is hardly encouraging news; the drug may not be worth the cost and unpleasantness. Your insurance company refuses to pay for the treatment; it has a pretty good case on the basis of the median life expectancy figures. Yet the ...
17%
Evolutionary biologist Stephen Jay Gould was diagnosed with a form of cancer that had a median survival time of eight months; he died of a different and unrelated kind of cancer twenty years later. Gould subsequently wrote a famous article called “The Median Isn’t the Message,” in which he argued that his scientific knowledge of statistics saved him from the erroneous conclusion that he would necessarily be dead in eight months. The definition of the median tells us that half the patients will live at least eight months—and possibly much, much longer than that. The mortality distribution is ...
17%
In contrast, the mean is affected by dispersion.
17%
From the standpoint of accuracy, the median versus mean question revolves around whether the outliers in a distribution distort what is being described or are instead an important part of the message.
17%
instead, they overlook a more subtle example of apples and oranges: inflation. A dollar today is not the same as a dollar sixty years ago; it buys much less. Because of inflation, something that cost $1 in 1950 would cost $9.37 in 2011. As a result, any monetary comparison between 1950 and 2011 without adjusting for changes in the value of the dollar would be less accurate than comparing figures in euros and pounds—since the euro and the pound are closer to each other in value than a 1950 dollar is to a 2011 dollar. This is such an important phenomenon that economists have terms to denote ...
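A rough sketch of the adjustment this highlight describes, using only the ratio it quotes ($1 in 1950 buys what $9.37 buys in 2011). The constant name, the function names, and the example price are assumptions made for illustration.

```python
# From the highlight: something that cost $1 in 1950 would cost $9.37 in 2011.
PRICE_RATIO_1950_TO_2011 = 9.37


def to_2011_dollars(amount_in_1950_dollars: float) -> float:
    """Express a 1950 dollar amount in 2011 dollars."""
    return amount_in_1950_dollars * PRICE_RATIO_1950_TO_2011


def to_1950_dollars(amount_in_2011_dollars: float) -> float:
    """Express a 2011 dollar amount in 1950 dollars."""
    return amount_in_2011_dollars / PRICE_RATIO_1950_TO_2011


price_in_1950 = 2.00  # hypothetical 1950 price, not from the book

print(f"${to_2011_dollars(price_in_1950):.2f} in 2011 dollars")  # prints $18.74 in 2011 dollars
```

Once both figures are in the same year's dollars, a comparison across the two dates is between like quantities.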
18%
If prices rise faster than Congress raises the minimum wage, the real value of that minimum hourly payment will fall. Supporters of a minimum wage should care about the real value of that wage, since the whole point of the law is to guarantee low-wage workers some minimum level of consumption for an hour of work, not to give them a check with a big number on it that buys less than it used to.
18%
Hollywood studios may be the most egregiously oblivious to the distortions caused by inflation when comparing figures at different points in time—and deliberately so. What were the top five highest-grossing films (domestic) of all time as of 2011?
1. Avatar (2009)
2. Titanic (1997)
3. The Dark Knight (2008)
4. Star Wars Episode IV (1977)
5. Shrek 2 (2004)
Now you may feel that list looks a little suspect. These were successful films—but Shrek 2? Was that really a greater commercial success than Gone with the Wind? The Godfather? Jaws? No, no, and no. Hollywood likes to make each blockbuster ...
18%
One way to make growth look explosive is to use percentage change to describe some change relative to a very low starting point.
18%
Researchers will sometimes qualify a growth figure by pointing out that it is “from a low base,” meaning that any increase is going to look large by comparison.
18%
Obviously the flip side is true. A small percentage of an enormous sum can be a big number.
18%
In a similar vein, your kindhearted boss might point out that as a matter of fairness, every employee will be getting the same raise this year, 10 percent. What a magnanimous gesture—except that if your boss makes $1 million and you make $50,000, his raise will be $100,000 and yours will be $5,000. The statement “everyone will get the same 10 percent raise this year” just sounds so much better than “my raise will be twenty times bigger than yours.” Both are true in this case.
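The arithmetic in this highlight can be checked with a few lines of Python. The salaries and the 10 percent rate come from the passage; the function name `raise_amount` is an assumption.

```python
def raise_amount(salary: float, percent: float) -> float:
    """Dollar value of a raise given as a percentage of salary."""
    return salary * percent / 100


boss_raise = raise_amount(1_000_000, 10)
your_raise = raise_amount(50_000, 10)

print(boss_raise)               # prints 100000.0
print(your_raise)               # prints 5000.0
print(boss_raise / your_raise)  # prints 20.0
```

The percentage is identical for both people; the absolute amounts differ by the same factor of twenty that separates the two salaries.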
19%
There is a common business aphorism: “You can’t manage what you can’t measure.” True. But you had better be darn sure that what you are measuring is really what you are trying to manage.
19%
Any evaluation of teachers or schools that is based solely on test scores will present a dangerously inaccurate picture. Students who walk through the front door of different schools have vastly different backgrounds and abilities. We know, for example, that the education and income of a student’s parents have a significant impact on achievement, regardless of what school he or she attends. The statistic that we’re missing in this case happens to be the only one that matters for our purposes: How much of a student’s performance, good or bad, can be attributed to what happens inside the school ...
19%
What we need is some measure of “value-added” at the school level, or even at the classroom level. We don’t want to know the absolute level of student achievement; we want to know how much that student achievement has been affected by the educational factors we are trying to evaluate.
20%
Statistics measure the outcomes that matter; incentives give us a reason to improve those outcomes. Or, in some cases, just to make the statistics look better. That’s the bad news.
20%
you had better be darn certain that the folks being evaluated can’t make themselves look better (statistically) in ways that are not consistent with the goal at hand.
20%
Cardiologists obviously care about their “scorecard.” However, the easiest way for a surgeon to improve his mortality rate is not by killing fewer people; presumably most doctors are already trying very hard to keep their patients alive. The easiest way for a doctor to improve his mortality rate is by refusing to operate on the sickest patients. According to a survey conducted by the School of Medicine and Dentistry at the University of Rochester, the scorecard, which ostensibly serves patients, can also work to their detriment: 83 percent of the cardiologists surveyed said that, because of ...
20%
The sad paradox of this seemingly helpful descriptive statistic is that cardiologists responded rationally by withholding care from the patients who needed it most.
21%
An institution that spends less money to better effect (and therefore can charge lower tuition) is punished in the ranking process. Colleges and universities also have an incentive to encourage large numbers of students to apply, including those with no realistic hope of getting in, because it makes the school appear more selective. This is a waste of resources for the schools soliciting bogus applications and for students who end up applying with no meaningful chance of being accepted.
21%
“People love easy answers. What is the best place? Number 1.”