Naked Statistics: Stripping the Dread from the Data
Read between October 14, 2018 - February 16, 2019
51%
If the .05 significance level seems somewhat arbitrary, that’s because it is. There is no single standardized statistical threshold for rejecting a null hypothesis. Both .01 and .1 are also reasonably common thresholds for doing the kind of analysis described above.
51%
When you read in the newspaper that people who eat twenty bran muffins a day have lower rates of colon cancer than people who don’t eat prodigious amounts of bran, the underlying academic research probably looked something like this: (1) In some large data set, researchers determined that individuals who ate at least twenty bran muffins a day had a lower incidence of colon cancer than individuals who did not report eating much bran. (2) The researchers’ null hypothesis was that eating bran muffins has no impact on colon cancer. (3) The disparity in colon cancer outcomes between those who ate ...more
51%
(This is the “healthy user bias” from Chapter 7.)
52%
This distinction between correlation and causation is crucial to the proper interpretation of statistical results. We will revisit the idea that “correlation does not equal causation” later in the book. I should also
52%
One of the first questions you want to ask is, How big is this effect? It could easily be .9 points; on a test with a mean score of 500, that is not a life-changing figure. In Chapter 11, we will return to this crucial distinction between size and significance when it comes to interpreting statistical results.
52%
First, you should recognize that each group of children, the 59 with autism and the 38 without autism, constitutes a reasonably large sample drawn from their respective populations: all children with and without autism spectrum disorder. The samples are large enough that the central limit theorem will apply.
53%
We can say with 95 percent confidence that the range 1284.4 to 1336.4 cubic centimeters (the sample mean of 1310.4 ± two standard errors) contains the average total brain volume for children in the general population with autism spectrum disorder.
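The arithmetic behind this interval can be sketched in a few lines of Python. The standard error of 13.0 is not quoted in the highlight; it is inferred here from the stated endpoints (half the interval width, 26, divided by two), and the function name is purely illustrative.

```python
def confidence_interval(sample_mean, standard_error, n_standard_errors=2.0):
    """Return (low, high) for sample_mean plus or minus n standard errors."""
    half_width = n_standard_errors * standard_error
    return sample_mean - half_width, sample_mean + half_width


# 1310.4 comes from the highlight; 13.0 is inferred from the interval endpoints.
low, high = confidence_interval(1310.4, 13.0)
print(f"{low:.1f} to {high:.1f}")  # prints 1284.4 to 1336.4
```

Two standard errors is the rule of thumb used in the passage for roughly 95 percent confidence.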
54%
The powerful process of statistical inference is based on probability, not on some kind of cosmic certainty. We
54%
“Claims that defy almost every law of science are by definition extraordinary and thus require extraordinary evidence. Neglecting to take this into account—as conventional social science analyses do—makes many findings look far more significant than they really are.”
54%
In statistical parlance, this is known as a Type I error. Consider the example of an American
54%
threshold to something like “a strong hunch that the guy did it.” This is going to ensure that more criminals go to jail—and also more innocent people. In a statistical context, this is the equivalent of having a relatively low significance level, such as .1.
54%
This is known as a Type II error, or false negative.
55%
nor a Type II error is acceptable in this situation, which is why society continues to debate about the appropriate balance between fighting terrorism and protecting civil liberties.
55%
where
$\bar{x}$ = mean for sample x
$\bar{y}$ = mean for sample y
$s_x$ = standard deviation for sample x
$s_y$ = standard deviation for sample y
$n_x$ = number of observations in sample x
$n_y$ = number of observations in sample y
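The formula these symbols belong to is not reproduced in the highlight. A minimal sketch, assuming it is the usual test statistic for a difference of two sample means (the difference in means divided by the standard error of that difference), might look like this. The function name and the input numbers are hypothetical.

```python
import math


def difference_of_means_statistic(mean_x, mean_y, s_x, s_y, n_x, n_y):
    """(mean_x - mean_y) divided by the standard error of the difference.

    Assumed form: sqrt(s_x**2 / n_x + s_y**2 / n_y) in the denominator.
    """
    standard_error = math.sqrt(s_x ** 2 / n_x + s_y ** 2 / n_y)
    return (mean_x - mean_y) / standard_error


# Hypothetical inputs, chosen only to make the arithmetic easy to follow:
# the standard error is sqrt(9/36 + 16/64) = sqrt(0.5).
stat = difference_of_means_statistic(10.0, 8.0, 3.0, 4.0, 36, 64)
print(round(stat, 3))  # prints 2.828
```

The larger the statistic in absolute value, the less plausible it is that the two samples come from populations with the same mean.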
55%
One- and Two-Tailed Hypothesis Testing
56%
We will therefore reject our null hypothesis if our sample of male basketball players has a mean height that is significantly higher or lower than the mean height for our sample of normal men. This requires a two-tailed hypothesis test. The cutoff points for rejecting our null hypothesis will be different because we must now account for the possibility of a large difference in sample means in both directions: positive or negative.
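To see why the cutoff points differ, here is a small sketch that assumes a .05 significance level and a normal approximation for the test statistic (neither is stated in this highlight). It uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level
standard_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed test: the whole 5 percent rejection region sits in one tail.
one_tailed_cutoff = standard_normal.inv_cdf(1 - alpha)

# Two-tailed test: the 5 percent is split, 2.5 percent in each tail.
two_tailed_cutoff = standard_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed: reject if the statistic exceeds {one_tailed_cutoff:.3f}")
print(f"two-tailed: reject if its absolute value exceeds {two_tailed_cutoff:.3f}")
```

Under these assumptions the two-tailed cutoff (about 1.96 standard errors) is farther out than the one-tailed cutoff (about 1.64), because the same total probability has to cover both directions.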
57%
that a poll has a “margin of error” of ± 3 percent, this is really just the same kind of 95 percent confidence interval that we calculated in the last chapter. Our “95 percent confidence” means that if we conducted 100 different polls on samples drawn from the same population, we would expect the answers we get from our sample in 95 of those polls to be within 3 percentage points in one direction or the other of the population’s true sentiment.
57%
One fundamental difference between a poll and other forms of sampling is that the sample statistic we care about will be not a mean (e.g., 187 pounds) but rather a percentage or proportion (e.g., 47 percent of voters, or .47).
57%
The standard error is what tells us how much dispersion we can expect in our results from sample to sample,
58%
You explain that the answer depends on how confident the network people would like to be in the announcement—or, more specifically, what risk they are willing to take that they will get it wrong. Remember, the standard error gives us a sense of how often we can expect our sample proportion (the exit poll) to lie reasonably close to the true population proportion (the election outcome).
58%
candidate has earned 53 percent of the vote ± 4 percent, or between 49 and 57 percent of the votes cast. Meanwhile, the Democratic candidate has earned 45 percent ± 4 percent, or between 41 and 49 percent of the votes cast. And, yes, now you have a new problem. At the 95 percent confidence level, you cannot reject the possibility that the two candidates may be tied with 49 percent of the vote each. This is an inevitable trade-off; the only way to become more certain that your polling results will be consistent with the election outcome without new data is to become more timid in your ...more
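The trade-off in this passage can be checked with a short sketch. The vote shares and the 4-point margin come from the highlight; the narrower 2-point margin is a hypothetical stand-in for a lower confidence level, and the helper names are illustrative.

```python
def interval(share, margin):
    """Return (low, high) in percentage points for share plus or minus margin."""
    return share - margin, share + margin


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


republican = interval(53, 4)  # (49, 57)
democrat = interval(45, 4)    # (41, 49)
print(intervals_overlap(republican, democrat))  # prints True: both could be at 49

# Hypothetical narrower margin: a bolder claim made with less confidence.
print(intervals_overlap(interval(53, 2), interval(45, 2)))  # prints False
```

A wider margin makes the claim safer but lets the two ranges touch; a narrower margin separates the candidates at the cost of a greater chance of being wrong.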
58%
By being less specific. You are “absolutely positive” that Thomas Jefferson was one of the first five presidents.
58%
As a result, the standard error will shrink significantly. The new standard error for the Republican candidate is $\sqrt{p(1-p)/n}$, which is .01.
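A minimal sketch of how the standard error of a proportion responds to sample size, using the standard formula sqrt(p(1 - p)/n). The sample sizes are not given in the highlight; 500 and 2,000 are assumptions chosen so that the rounded results match the figures the highlights imply (.02, consistent with the ± 4 percent margin at two standard errors, and .01).

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion p from n observations."""
    return math.sqrt(p * (1 - p) / n)


p_republican = 0.53  # share quoted in the highlights

se_small = proportion_standard_error(p_republican, 500)   # assumed sample size
se_large = proportion_standard_error(p_republican, 2000)  # assumed sample size

print(round(se_small, 2))  # prints 0.02
print(round(se_large, 2))  # prints 0.01
```

Because n sits under a square root, quadrupling the sample only halves the standard error.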
59%
For that reason, I have adopted a common convention, which is to take the higher standard error of the two and use that for all of the candidates.
59%
Is this an accurate sample of the population whose opinions we are trying to measure? Many common data-related challenges
60%
The fact that American attitudes toward capital punishment change dramatically when life without parole is offered as an option tells us something important. The key point, says Newport, is to view any polling result in context. No single question or poll can capture the full depth of public opinion on a complex issue.
61%
their true incidence in the population by 20 percent [(60 – 50)/50]. And in so doing, you have also undercounted the Democrats by 20 percent [(40 – 50)/50]. That could happen, even with a decent polling methodology. Your second
62%
meaning they have minimal say over what tasks are performed or how those tasks are carried out—have a significantly higher mortality rate than other workers in the civil service with more decision-making authority. According to this research, it is not the stress associated with major responsibilities that will kill you; it is the stress associated with being told what to do while having little say in how or when it gets done.
62%
regression analysis.
62%
Different families make different child care decisions because they are different.
63%
Now, there are two key phrases in that last sentence. The first is “when done properly.” Given adequate data and access to a personal computer, a six-year-old could use a basic
63%
The second important phrase above is “help us estimate.” Our child care study does not give us a “right” answer for the relationship between day care and subsequent school performance. Instead, it quantifies the relationship observed for a particular group of children over a particular stretch of time.
63%
disease. We are instead rejecting the null hypothesis that exercise has no association with heart disease, on the basis of some statistical threshold that was chosen before the study was conducted.
63%
would be less than 5 in 100, or below some other threshold for statistical significance.
63%
Or perhaps causality goes the other direction. Could having a healthy heart “cause” exercise? Yes. Individuals who are infirm, particularly those who have some incipient form of heart disease, will find it much harder to exercise.
63%
This is not a terribly insightful or specific statement. Regression analysis enables us to go one step further and “fit a line” that best describes a linear relationship between the two variables. Many possible lines
63%
uses a methodology called ordinary least squares, or OLS. The technical details, including why OLS produces the best
63%
OLS fits the line that minimizes the sum of the squared residuals.
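As a rough illustration of what "minimizes the sum of the squared residuals" means, the sketch below fits a line with the textbook closed-form solution for one explanatory variable and compares its squared residuals with those of a slightly different line. The height and weight values are hypothetical, not the book's data.

```python
def fit_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample: heights in inches, weights in pounds.
heights = [62, 65, 68, 70, 73]
weights = [125, 140, 160, 165, 190]

a, b = fit_ols(heights, weights)
best = sum_squared_residuals(heights, weights, a, b)
nudged = sum_squared_residuals(heights, weights, a, b + 0.1)
print(best < nudged)  # prints True: any other line does worse
```

Squaring the residuals penalizes large misses more heavily than small ones, which is why a single distant observation can pull the fitted line noticeably.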
64%
which is that ordinary least squares gives us the best description of a linear relationship between two variables.
64%
This is known as the regression equation, and it takes the following form: y = a + bx, where y is weight in pounds; a is the y-intercept of the line (the value for y when x = 0); b is the slope of the line; and x is height in inches. The slope of the line we’ve fitted, b, describes the “best” linear relationship between height and weight for this sample, as defined by ordinary least squares.
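The regression equation can be read as a small prediction function. In the sketch below the slope of 4.5 pounds per inch is the figure quoted in a later highlight; the intercept of -135 is an assumption used only for illustration, since this highlight does not state it.

```python
def predict_weight(height_inches, intercept=-135.0, slope=4.5):
    """Evaluate y = a + b * x for a height in inches.

    The default intercept is assumed for illustration; the slope matches
    the 4.5 pounds per inch mentioned elsewhere in these highlights.
    """
    return intercept + slope * height_inches


print(predict_weight(70))                       # prints 180.0
print(predict_weight(71) - predict_weight(70))  # prints 4.5
```

The second line shows what the slope means in practice: each additional inch of height is associated with 4.5 more pounds, whatever the starting height.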
64%
weight in this case—is known as the dependent variable (because it depends on other factors). The variables that we are using to explain our dependent variable are known as explanatory variables since they explain the outcome that we care
64%
This figure is also known as the constant, because it is the starting point for calculating the weight of all observations in the study.
64%
questions. For any regression coefficient, you will generally be interested in three things: sign, size, and significance.
64%
Sign.
64%
data on something like “miles run per month,” I am fairly certain that the coefficient on “miles run” would be negative. Running more is associated with weighing less.
64%
Size. How big is the observed effect between the independent variable and the dependent variable? Is it of a magnitude that matters? In this case, every one inch in height is associated with 4.5
64%
socially insignificant.
64%
example, suppose that we are examining determinants of income. Why do some people make more money than others? The explanatory variables are likely to be things like education, years of work experience, and so on. In a large data set, researchers might also find that people with whiter teeth earn $86 more per year than other workers, ceteris paribus. (“Ceteris paribus” comes from the Latin meaning “other things being equal.”) The positive and statistically significant coefficient on the “white teeth” variable assumes that the individuals being compared are similar in other respects:
64%
This means (1) we’ve rejected the null hypothesis that really white teeth have no association with income with a high degree of confidence; and (2) if we analyze other data samples, we are likely to find a similar relationship between good-looking teeth and higher income.
65%
Significance.