Statistics for People Who (Think They) Hate Statistics
In general, using various inferential tools, you may find differences between samples and populations, two samples or more, and so on, but the $64,000 question is not only whether that difference is (statistically) significant but also whether it is meaningful. That is, does enough of a separation exist between
the distributions that represent each sample or group you test that the difference is really a difference worth discussing?
Effect size is the strength of a relationship between variables. It can be a correlation coefficient, as we talked about in Chapter 5, but the relationship between variables can also be apparent in the size of a difference between groups. It could be an indication of how effective a pill or intervention is, right? A measure of the magnitude of the treatment. So the difference between a group that got a treatment and the group that did not shows the relationship between the independent variable (the treatment) and the dependent variable. So effect sizes can be correlational values or values …
Here’s the formula for computing Cohen’s d for the effect size for a one-sample z test:
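The formula itself doesn't survive this export. Assuming the standard definition, Cohen's d for a one-sample z test divides the difference between the sample mean and the population mean by the population standard deviation: d = (x̄ − μ) / σ. A minimal sketch (function name is mine):

```python
def cohens_d_one_sample(sample_mean, pop_mean, pop_sd):
    """Cohen's d for a one-sample z test: (x-bar - mu) / sigma."""
    return (sample_mean - pop_mean) / pop_sd

# Example: sample mean 105, population mean 100, population SD 10
print(cohens_d_one_sample(105, 100, 10))  # 0.5, a medium effect
```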
What does this mean? One of the very cool things that Cohen (and others) figured out was just what a small, medium, and large effect size is.
A small effect size ranges from 0 to .2. A medium effect size ranges from .2 to .8. A large effect size is any value above .8.
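Those cutoffs can be expressed directly in code. The boundary values below follow the book's ranges; exactly where a borderline value like .2 or .8 falls is a judgment call, and the sign of d is ignored because only magnitude matters here:

```python
def effect_size_label(d):
    """Label an effect size using the book's ranges (sign ignored)."""
    d = abs(d)
    if d < 0.2:
        return "small"
    elif d <= 0.8:
        return "medium"
    else:
        return "large"

print(effect_size_label(0.1), effect_size_label(0.5), effect_size_label(1.2))
```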
With group comparisons, effect size gives us an idea about the relative positions of one group to another. For example, if the effect size is zero, that means that both groups tend to be very similar and overlap entirely—there is no difference between the two distributions of scores. On the other hand, an effect size of 1 means that the two groups overlap about 45% (having that much in common). And, as you might expect, as the effect size gets larger, it reflects an increasing distance, or lack of overlap, between the two groups.
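The 45% figure can be recovered if we assume the book is using Cohen's U1 measure of (non)overlap for two normal distributions with equal spread. A sketch using only the standard library:

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def overlap(d):
    """Proportion of overlap between two normal groups separated by
    effect size d, via Cohen's U1 non-overlap measure."""
    d = abs(d)
    if d == 0:
        return 1.0  # identical distributions overlap completely
    u1 = (2 * normal_cdf(d / 2) - 1) / normal_cdf(d / 2)
    return 1 - u1

print(round(overlap(0.0), 2))  # 1.0: the groups overlap entirely
print(round(overlap(1.0), 2))  # 0.45: the book's "about 45%" figure
```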
The t test is called independent because the two groups were not related in any way.
The differences between the groups of Australian and Indian students are being explored. Participants are being tested only once. There are two groups. The appropriate test statistic is the t test for independent samples.
Almost every statistical test has certain assumptions that underlie the use of the test. For example, the t test makes the major assumption that the amount of variability in each of the two groups is equal. This is the homogeneity of variance assumption. Homogeneity means sameness. Although it’s no big deal if this assumption is violated when the sample size is big enough, with smaller samples, one can’t be too sure of the results and conclusions.
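An informal way to eyeball the homogeneity of variance assumption is to compare the two sample variances directly. Formal tests exist (Levene's test, for example), but the ratio of the larger variance to the smaller gives a quick feel; values near 1 support the assumption. This check is my illustration, not a procedure from the book:

```python
from statistics import variance

def variance_ratio(group1, group2):
    """Ratio of the larger sample variance to the smaller; values near 1
    support the homogeneity of variance assumption."""
    v1, v2 = variance(group1), variance(group2)
    return max(v1, v2) / min(v1, v2)

a = [4, 5, 6, 7, 8]
b = [3, 5, 6, 7, 9]
print(variance_ratio(a, b))  # 2.0: group b is twice as variable
```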
As we mentioned earlier, there are tons of statistical tests. The only inferential one that uses one sample that we cover in this book is the one-sample z test (see Chapter 10). But there is also the one-sample t test, which compares the mean score of a sample with another score, and sometimes that score is, indeed, the population mean, just as with the one-sample z test. In any case, you can use the one-sample z test or one-sample t test to test the same hypothesis, and you will reach the same conclusions (although you will be using different values and tables to do so).
The difference between the means makes up the numerator, the top of the equation; the amount of variation within and between each of the two groups makes up the denominator.
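That description translates into the pooled-variance form of the independent-samples t statistic. A sketch under the equal-variance assumption (variable names are mine, not the book's):

```python
from statistics import mean, variance

def independent_t(group1, group2):
    """Pooled-variance t test for independent samples.
    Returns the t statistic and the degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(group1), len(group2)
    # Numerator: the difference between the means.
    diff = mean(group1) - mean(group2)
    # Denominator: pooled variability within the two groups.
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return diff / se, n1 + n2 - 2

t, df = independent_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)  # t = -1.0 with 8 degrees of freedom
```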
Remember, though, that because the test is nondirectional (the research hypothesis posits only that some difference exists), the sign of the difference is meaningless.
When a nondirectional test is discussed, you may find the t value represented as an absolute value, like this: |t| or t = |0.137|, which ignores the sign of the value altogether. Your teacher may even express the t value this way to emphasize that the sign is relevant for a directional (one-tailed) test but not for a nondirectional one.
5. Determine the value needed for rejection of the null hypothesis using the appropriate table of critical values for the particular statistic. Here’s where we go to Table B.2 in Appendix B, which lists the critical values for the t test.
Our first task is to determine the degrees of freedom (df), which approximates the sample size (but, for fancy technical reasons, adjusts it slightly to make for a more accurate outcome). For this particular test statistic, the degrees of freedom is n1 − 1 + n2 − 1, or n1 + n2 − 2 (putting the terms in either order results in the same value). So add the sizes of the two samples and subtract 2. In this example, 30 + 30 − 2 = 58. This is the degrees of freedom for this particular application of the t test but not necessarily for any other.
So How Do I Interpret t(58) = −0.14, p > .05? t represents the test statistic that was used. 58 is the number of degrees of freedom. −0.14 is the obtained value, calculated using the formula we showed you earlier in the chapter. p > .05 (the really important part of this little phrase) indicates that the probability is greater than 5% that a difference this large could have occurred by chance alone rather than because of the way the groups were taught; in other words, we cannot reject the null hypothesis, and the two groups do not differ significantly. Note that p > .05 can also appear as p = ns, for nonsignificant.
You learned in Chapters 5 and 10 that effect size is a measure of how strongly variables relate to one another—with group comparisons, it’s a measure of the magnitude of the difference. Kind of like, How big is big?
As we showed you in Chapter 10, the most direct and simple way to compute effect size is to simply divide the difference between the means by any one of the standard deviations. Danger, Will Robinson—this does assume that the standard deviations (and the amount of variance) between groups are equal to one another.
You saw from our guidelines in Chapter 10 (p. 196) that an effect size of .37 is categorized as medium. In addition to the difference between the two means being statistically significant, one might conclude that the difference also is meaningful in that the effect
size is not negligible. Now, how meaningful you wish to make it in your interpretation of the results depends on many factors, including the context within which the research question is being asked.
The grown-up formula for the effect size uses the “pooled standard deviation” in the denominator of the ES equation that you saw previously. The pooled standard deviation combines the standard deviations from Group 1 and Group 2; for equal-sized groups, it is the square root of the average of the two variances (which is close to, but not exactly the same as, a simple average of the two standard deviations).
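Under the equal-group-size assumption, the pooled standard deviation and the resulting effect size can be sketched like this (I've pooled the variances and taken the square root, the usual definition; function names are mine):

```python
from statistics import mean, stdev

def pooled_sd(group1, group2):
    """Pooled standard deviation for two equal-sized groups:
    the square root of the average of the two variances."""
    return ((stdev(group1) ** 2 + stdev(group2) ** 2) / 2) ** 0.5

def effect_size(group1, group2):
    """Effect size: difference between means over the pooled SD."""
    return (mean(group1) - mean(group2)) / pooled_sd(group1, group2)

a = [6, 7, 8, 9, 10]
b = [4, 5, 6, 7, 8]
print(round(effect_size(a, b), 2))  # 1.26, a large effect
```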
Why not take the A train and just go right to http://www.uccs.edu/~lbecker/, where statistician Lee Becker from the University of Colorado Colorado Springs developed an effect size calculator? Or ditto for the one located at http://www.psychometrica.de/effect_size.html created by Drs. Wolfgang and Alexandra Lenhard? With these calculators, you just plug in the values, click Compute, and the program does the rest, as you see in Figure 11.2.
A t test for dependent means indicates that a single group of the same subjects is being studied under two conditions.
Primarily,
the t test for dependent means was used because the same children were tested at two times, before the start of the 1-year program and at the end of the 1-year program, and the second set of score...
The difference between the students’ scores on the pretest and on the posttest is the focus. Participants are being tested more than once. There are two groups of scores. The appropriate test statistic is the t test for dependent means.
There’s another way that statisticians sometimes talk about dependent tests: as repeated measures. Dependent tests are often called “repeated measures” both because the measures are repeated across time, conditions, or some other factor and because they are repeated across the same cases, where each case is a person or thing.
The t test for dependent means involves a comparison of means from each group of scores and focuses on the differences between the scores.
The research hypothesis is one-tailed and directional because it posits that the posttest score will be higher than the pretest score.
Using the flowchart shown in Figure 12.1, we determined that the appropriate test is a t test for dependent means. It is not a t test for independent means because the groups are not independent of each other. In fact, they’re not groups of participants but groups of scores for the same participants. The groups are dependent on one another. Other names for the t test for dependent means are the t test for paired samples and the t test for correlated samples. You’ll see in Chapter 15 that there is a very close relationship between a test of the significance of the correlation between these two …
Our first task is to determine the degrees of freedom (df), which approximates the sample size. For this particular test statistic, the degrees of freedom is n − 1, where n equals the number of
pairs of observations, or 25 − 1 = 24. This is the degrees of freedom for this test statistic only and not necessarily for any other.
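The dependent (paired) t test works on the difference scores: the mean of the pair differences divided by the standard error of those differences, with df = n − 1 where n is the number of pairs. A sketch (names are mine):

```python
from statistics import mean, stdev

def dependent_t(pre, post):
    """t test for dependent means: mean of the pair differences divided
    by the standard error of those differences. df = n - 1 pairs."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1

t, df = dependent_t([1, 2, 3], [2, 3, 5])
print(t, df)  # t is about 4.0 with 2 degrees of freedom
```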
The computation of the effect size for a test between dependent means follows the same formula and procedure as the computation of the effect size for a test of the difference between independent means.
You’ve just learned how to compare data from independent (Chapter 11) and dependent (Chapter 12) groups, and now it’s time to move on to another class of significance tests that deals with more than two groups (be they independent or dependent). This class of techniques, called analysis of variance, is very powerful and popular and will be a valuable tool in your war chest!
As part of their research, they used a simple analysis of variance (or ANOVA) to test the hypothesis that number of years of experience in sports is related to coping skill (or an athlete’s score on the Athletic Coping Skills Inventory). ANOVA was used because more than two levels of the same variable were being tested, and these groups were compared on their average performance. (When you compare means for more than two groups, analysis of variance is the procedure to use.) In particular, Group 1 included athletes with 6 years of experience or fewer, Group 2 included athletes with 7 to 10 …
We are testing for a difference between scores of different groups, in this case, the difference between the pressure felt by athletes. The athletes are being tested just once, not being tested more than once. There are three groups (fewer than 6 years, 7–10 years, and more than 10 years of experience). The appropriate test statistic is simple analysis of variance. (By the way, we call one-way analysis of variance “simple” because there is only one way in which the participants are grouped and compared.)
ANOVA comes in many flavors. The simplest kind, and the focus of this chapter, is the simple analysis of variance, used when
one factor or one independent variable (such as group membership) is being explored and this factor has more than two levels. Simple ANOVA is also called one-way analysis of variance because there is only one grouping factor. The technique is called analysis of variance because the variance due to differences in performance is separated into (a) variance due to differences between groups and (b) variance due to differences within groups. The between-groups variance is assumed to be due to treatment differences, while the within-group variance is due to differences …
or between-groups factor. Language development is the dependent variable (or outcome variable). The experimental design will look something like this, with three levels...
The more complex type of ANOVA, factorial design, is used to explore more than one independent variable. Here’s an example where the effect of number of hours of preschool participation is being examined, but the effects of gender differences are b...
Simple ANOVA is used in any analysis where there is only one dimension or treatment, there are more than two levels of the grouping factor, and one is looking at mean differences across groups.
The ANOVA formula (which is a ratio) compares the amount of variability between groups (which is due to the grouping factor) with the amount of variability within groups (which is due to chance). If that ratio is 1, then the amount of variability due to within-group differences is equal to the amount of variability due to between-groups differences, and any difference between groups is not significant. As the average difference between groups gets larger (and the numerator of
the ratio increases in value), the F value increases as well. As the F value increases, it becomes more extreme in relation to the expected distribution of all F values and is more likely due to something other than chance.
ANOVA, also called the F test (because it produces an F statistic or an F ratio), looks for an overall difference among groups.
Note that this test does not look at specific pairs of means (pairwise differences), such as the difference between Group 1 and Group 2.
The F ratio is a ratio of the variability between groups to the variability within groups. To compute these values, we first have to compute what is called the sum of squares for each source of variability: between groups, within groups, and the total. The between-groups sum of squares is the sum of the squared differences between each group’s mean and the mean of all scores. This gives us an idea of how different each group’s mean is from the overall mean. The within-group sum of squares is the sum of the squared differences between each individual score …
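Those sums of squares turn directly into the F ratio: each is divided by its degrees of freedom to give a mean square, and F is the between-groups mean square over the within-groups mean square. A sketch of a simple one-way ANOVA (names are mine):

```python
from statistics import mean

def one_way_anova(groups):
    """Simple (one-way) ANOVA: F as the between-groups mean square
    over the within-groups mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: squared distance of each group's
    # mean from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared distance of each score from
    # its own group's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

f = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f)  # F = 3.0
```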
Up to now, we’ve talked about one- and two-tailed tests. There’s no such thing when talking about ANOVA! Because more than two groups are being tested, and because the F test is an omnibus test (how’s that for a cool word?), meaning that ANOVA of any flavor tests for an overall difference between means, talking about the specific direction of differences does not make any sense.