
Evan M. Berman > Quotes

 

“The T-Test   CHAPTER OBJECTIVES After reading this chapter, you should be able to Test whether two or more groups have different means of a continuous variable Assess whether the mean is consistent with a specified value Evaluate whether variables meet test”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“Note: The median survival time is 5.19. Survival analysis can also examine survival rates for different “treatments” or conditions. Assume that data are available about the number of dependents that each client has. Table 18.3 is readily produced for each subset of this condition. For example, by comparing the survival rates of those with and those without dependents, the probability density figure, which shows the likelihood of an event occurring, can be obtained (Figure 18.5). This figure suggests that having dependents is associated with clients’ finding employment somewhat faster. Beyond Life Tables Life tables require that the interval (time) variable be measured on a discrete scale. When the time variable is continuous, Kaplan-Meier survival analysis is used. This procedure is quite analogous to life tables analysis. Cox regression is similar to Kaplan-Meier but allows for consideration of a larger number of independent variables (called covariates). In all instances, the purpose is to examine the effect of treatment on the survival of observations, that is, the occurrence of a dichotomous event. Figure 18.5 Probability Density FACTOR ANALYSIS A variety of statistical techniques help analysts to explore relationships in their data. These exploratory techniques typically aim to create groups of variables (or observations) that are related to each”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
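As an illustration of the Kaplan-Meier procedure this passage mentions (a sketch, not the book's own SPSS output), the following Python fragment uses the third-party lifelines package; the columns weeks, found_employment, and has_dependents are hypothetical stand-ins for the welfare example.
# Kaplan-Meier survival curves, compared for clients with and without dependents.
import pandas as pd
from lifelines import KaplanMeierFitter

df = pd.DataFrame({
    "weeks": [1, 2, 2, 3, 5, 5, 6, 7, 8, 9],             # time on welfare (hypothetical)
    "found_employment": [0, 1, 0, 1, 1, 0, 1, 1, 0, 1],  # 1 = event occurred, 0 = censored
    "has_dependents": [0, 0, 1, 1, 0, 1, 0, 1, 1, 0],
})

kmf = KaplanMeierFitter()
for label, group in df.groupby("has_dependents"):
    kmf.fit(group["weeks"], event_observed=group["found_employment"], label=f"dependents={label}")
    print(kmf.median_survival_time_)       # median time to employment for this group
    print(kmf.survival_function_.tail())   # cumulative proportion not yet employed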
“observation is simply an observation for which a specified outcome has not yet occurred. Assume that data exist from a random sample of 100 clients who are seeking, or have found, employment. Survival analysis is the statistical procedure for analyzing these data. The name of this procedure stems from its use in medical research. In clinical trials, researchers want to know the survival (or disease) rate of patients as a function of the duration of their treatment. For patients in the middle of their trial, the specified outcome may not have occurred yet. We obtain the following results (also called a life table) from analyzing hypothetical data from welfare records (see Table 18.3). In the context shown in the table, the word terminal signifies that the event has occurred. That is, the client has found employment. At start time zero, 100 cases enter the interval. During the first period, there are no terminal cases and nine censored cases. Thus, 91 cases enter the next period. In this second period, 2 clients find employment and 14 do not, resulting in 75 cases that enter the following period. The column labeled “Cumulative proportion surviving until end of interval” is an estimate of probability of surviving (not finding employment) until the end of the stated interval.5 The column labeled “Probability density” is an estimate of the probability of the terminal event occurring (that is, finding employment) during the time interval. The results also report that “the median survival time is 5.19.” That is, half of the clients find employment in 5.19 weeks. Table 18.2 Censored Observations Note: Obs = observations (clients); Emp = employment; 0 = has not yet found employment; 1 = has found employment. Table 18.3 Life Table Results”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
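The life-table arithmetic described here can be checked by hand; below is a minimal sketch using the standard actuarial formula, with the first two intervals echoing the hypothetical welfare example and the third made up for illustration.
# Life table: entering cases, terminal events (found employment), and censored cases per interval.
intervals = [
    (100, 0, 9),    # interval 1: 100 enter, 0 terminal, 9 censored -> 91 enter the next interval
    (91, 2, 14),    # interval 2: 91 enter, 2 terminal, 14 censored -> 75 enter the next interval
    (75, 6, 8),     # interval 3: hypothetical counts
]

cumulative = 1.0
for entering, terminal, censored in intervals:
    exposed = entering - censored / 2.0        # effective number at risk in the interval
    p_survive = 1.0 - terminal / exposed       # conditional probability of not finding employment
    cumulative *= p_survive                    # cumulative proportion surviving to end of interval
    print(f"exposed={exposed:.1f}  cumulative proportion surviving={cumulative:.3f}")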
“others seek to create and predict classifications through independent variables. Table 18.4 Factor Analysis Note: Factor analysis with Varimax rotation. Source: E. Berman and J. West. (2003). “What Is Managerial Mediocrity? Definition, Prevalence and Negative Impact (Part 1).” Public Performance & Management Review, 27 (December): 7–27. Multidimensional scaling and cluster analysis aim to identify key dimensions along which observations (rather than variables) differ. These techniques differ from factor analysis in that they allow for a hierarchy of classification dimensions. Some also use graphics to aid in visualizing the extent of differences and to help in identifying the similarity or dissimilarity of observations. Network analysis is a descriptive technique used to portray relationships among actors. A graphic representation can be made of the frequency with which actors interact with each other, distinguishing frequent interactions from those that are infrequent. Discriminant analysis is used when the dependent variable is nominal with two or more categories. For example, we might want to know how parents choose among three types of school vouchers. Discriminant analysis calculates regression lines that distinguish (discriminate) among the nominal groups (the categories of the dependent variable), as well as other”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“For comparison, we use the Mann-Whitney test to compare the two samples of 10th graders discussed earlier in this chapter. The sum of ranks for the “before” group is 69.55, and for the “one year later group,” 86.57. The test statistic is significant at p = .019, yielding the same conclusion as the independent-samples t-test, p = .011. This comparison also shows that nonparametric tests do have higher levels of significance. As mentioned earlier, the Mann-Whitney test (as a nonparametric test) does not calculate the group means; separate, descriptive analysis needs to be undertaken for that information. A nonparametric alternative to the paired-samples t-test is the Wilcoxon signed rank test. This test assigns ranks based on the absolute values of these differences (Table 12.5). The signs of the differences are retained (thus, some values are positive and others are negative). For the data in Table 12.5, there are seven positive ranks (with mean rank = 6.57) and three negative ranks (with mean rank = 3.00). The Wilcoxon signed rank test statistic is normally distributed. The Wilcoxon signed rank test statistic, Z, for a difference between these values is 1.89 (p = .059 > .05). Hence, according to this test, the differences between the before and after scores are not significant. Getting Started Calculate a t-test and a Mann-Whitney test on data of your choice. Again, nonparametric tests result in larger p-values. The paired-samples t-test finds that p = .038 < .05, providing sufficient statistical evidence to conclude that the differences are significant. It might also be noted that a doubling of the data in Table 12.5 results in finding a significant difference between the before and after scores with the Wilcoxon signed rank test, Z = 2.694, p = .007. Table 12.5 Wilcoxon Signed Rank Test The Wilcoxon signed rank test can also be adapted as a nonparametric alternative to the one-sample t-test. In that case, analysts create a second variable that, for each observation, is the test value. For example, if in Table 12.5 we wish to test whether the mean of variable “before” is different from, say, 4.0, we create a second variable with 10 observations for which each value is, say, 4.0. Then using the Wilcoxon signed rank test for the “before” variable and this new,”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
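For readers who want to reproduce this kind of analysis in Python, a sketch with scipy follows; the before and after scores are hypothetical, not the Table 12.5 data.
# Mann-Whitney (independent samples) and Wilcoxon signed rank (paired samples) tests.
from scipy import stats

before = [3, 4, 5, 4, 6, 5, 4, 3, 5, 4]    # hypothetical scores at the start
after  = [4, 5, 5, 5, 6, 6, 5, 4, 6, 4]    # hypothetical scores one year later

u_stat, p_mw = stats.mannwhitneyu(before, after, alternative="two-sided")
w_stat, p_wx = stats.wilcoxon(before, after)            # paired alternative
print(f"Mann-Whitney p = {p_mw:.3f}, Wilcoxon p = {p_wx:.3f}")

# One-sample adaptation: test whether "before" differs from a fixed value (here 4.0)
# by ranking the differences from that value, as the passage describes.
print(stats.wilcoxon([b - 4.0 for b in before]))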
“safety at the beginning of the program was 4.40 (standard deviation, SD = 1.00), and one year later, 4.80 (SD = 0.94). The mean safety score increased among 10th graders, but is the increase statistically significant? Among other concerns is that the standard deviations are considerable for both samples. As part of the analysis, we conduct a t-test to answer the question of whether the means of these two distributions are significantly different. First, we examine whether test assumptions are met. The samples are independent, and the variables meet the requirement that one is continuous (the index variable) and the other dichotomous. The assumption of equality of variances is answered as part of conducting the t-test, and so the remaining question is whether the variables are normally distributed. The distributions are shown in the histograms in Figure 12.3.12 Are these normal distributions? Visually, they are not the textbook ideal—real-life data seldom are. The Kolmogorov-Smirnov tests for both distributions are insignificant (both p > .05). Hence, we conclude that the two distributions can be considered normal. Having satisfied these t-test assumptions, we next conduct the t-test for two independent samples. Table 12.1 shows the t-test results. The top part of Table 12.1 shows the descriptive statistics, and the bottom part reports the test statistics. Recall that the t-test is a two-step test. We first test whether variances are equal. This is shown as the “Levene’s test for equality of variances.” The null hypothesis of the Levene’s test is that variances are equal; this is rejected when the p-value of this Levene’s test statistic is less than .05. The Levene’s test uses an F-test statistic (discussed in Chapters 13 and 15), which, other than its p-value, need not concern us here. In Table 12.1, the level of significance is .675, which exceeds .05. Hence, we accept the null hypothesis—the variances of the two distributions shown in Figure 12.3 are equal. Figure 12.3 Perception of High School Safety among 10th Graders Table 12.1 Independent-Samples T-Test: Output Note: SD = standard deviation. Now we go to the second step, the main purpose. Are the two means (4.40 and 4.80)”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
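The two-step procedure described here (Levene's test first, then the appropriate t-test line) might be sketched in Python as follows; group1 and group2 are hypothetical safety scores, not the book's survey data.
# Step 1: Levene's test for equality of variances. Step 2: independent-samples t-test.
from scipy import stats

group1 = [4.1, 4.4, 4.0, 4.9, 4.6, 3.8, 4.5, 4.7]   # hypothetical "at start of program" scores
group2 = [4.9, 5.1, 4.6, 5.3, 4.8, 4.4, 5.0, 5.2]   # hypothetical "one year later" scores

lev_stat, lev_p = stats.levene(group1, group2)
equal_var = lev_p >= 0.05                  # fail to reject the null hypothesis of equal variances
t_stat, t_p = stats.ttest_ind(group1, group2, equal_var=equal_var)
print(f"Levene p = {lev_p:.3f}; t = {t_stat:.3f}, p = {t_p:.3f} "
      f"({'equal' if equal_var else 'unequal'} variances assumed)")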
“second variable, we find that Z = 2.103, p = .035. This value is larger than that obtained by the parametric test, p = .019.21 SUMMARY When analysts need to determine whether two groups have different means of a continuous variable, the t-test is the tool of choice. This situation arises, for example, when analysts compare measurements at two points in time or the responses of two different groups. There are three common t-tests, involving independent samples, dependent (paired) samples, and the one-sample t-test. T-tests are parametric tests, which means that variables in these tests must meet certain assumptions, notably that they are normally distributed. The requirement of normally distributed variables follows from how parametric tests make inferences. Specifically, t-tests have four assumptions: One variable is continuous, and the other variable is dichotomous. The two distributions have equal variances. The observations are independent. The two distributions are normally distributed. The assumption of homogeneous variances does not apply to dependent-samples and one-sample t-tests because both are based on only a single variable for testing significance. When assumptions of normality are not met, variable transformation may be used. The search for alternative ways for dealing with normality problems may lead analysts to consider nonparametric alternatives. The chief advantage of nonparametric tests is that they do not require continuous variables to be normally distributed. The chief disadvantage is that they yield higher levels of statistical significance, making it less likely that the null hypothesis may be rejected. A nonparametric alternative for the independent-samples t-test is the Mann-Whitney test, and the nonparametric alternative for the dependent-samples t-test is the Wilcoxon”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“The test statistics of a t-test can be positive or negative, although this depends merely on which group has the larger mean; the sign of the test statistic has no substantive interpretation. Critical values (see Chapter 10) of the t-test are shown in Appendix C as (Student’s) t-distribution.4 For this test, the degrees of freedom are defined as n – 1, where n is the total number of observations for both groups. The table is easy to use. As mentioned below, most tests are two-tailed tests, and analysts find critical values in the columns for the .05 (5 percent) and .01 (1 percent) levels of significance. For example, the critical value at the 1 percent level of significance for a test based on 25”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
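Critical values like those in Appendix C can be looked up with scipy's t distribution; a short sketch (two-tailed, so the 1 percent level uses the .995 quantile):
# Two-tailed critical values of the t-distribution.
from scipy import stats

df = 25
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: critical t = {stats.t.ppf(1 - alpha / 2, df):.3f}")
# With df = 25 this prints roughly 2.060 (5 percent level) and 2.787 (1 percent level).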
“suffered greater wetland loss than watersheds with smaller surrounding populations. Most watersheds have suffered no or only very modest losses (less than 3 percent during the decade in question), and few watersheds have suffered more than a 4 percent loss. The distribution is thus heavily skewed toward watersheds with little wetland loss (that is, to the left) and is clearly not normally distributed.6 To increase normality, the variable is transformed by twice taking the square root, x^0.25. The transformed variable is then normally distributed: the Kolmogorov-Smirnov statistic is 0.82 (p = .51 > .05). The variable also appears visually normal for each of the population subgroups. There are four population groups, designed to ensure an adequate number of observations in each. Boxplot analysis of the transformed variable indicates four large and three small outliers (not shown). Examination suggests that these are plausible and representative values, which are therefore retained. Later, however, we will examine the effect of these seven observations on the robustness of statistical results. Descriptive analysis of the variables is shown in Table 13.1. Generally, large populations tend to have larger average wetland losses, but the standard deviations are large relative to (the difference between) these means, raising considerable question as to whether these differences are indeed statistically significant. Also, the untransformed variable shows that the mean wetland loss is less among watersheds with “Medium I” populations than in those with “Small” populations (1.77 versus 2.52). The transformed variable shows the opposite order (1.06 versus 0.97). Further investigation shows this to be the effect of the three small outliers and two large outliers on the calculation of the mean of the untransformed variable in the “Small” group. Variable transformation minimizes this effect. These outliers also increase the standard deviation of the “Small” group. Using ANOVA, we find that the transformed variable has unequal variances across the four groups (Levene’s statistic = 2.83, p = .041 < .05). Visual inspection, shown in Figure 13.2, indicates that differences are not substantial for observations within the group interquartile ranges, the areas indicated by the boxes. The differences seem mostly caused by observations located in the whiskers of the “Small” group, which include the five outliers mentioned earlier. (The other two outliers remain outliers and are shown.) For now, we conclude that no substantial differences in variances exist, but we later test the robustness of this conclusion with consideration of these observations (see Figure 13.2). Table 13.1 Variable Transformation We now proceed with the ANOVA analysis. First, Table 13.2 shows that the global F-test statistic is 2.91, p = .038 < .05. Thus, at least one pair of means is significantly different. (The term sum of squares is explained in note 1.) Getting Started Try ANOVA on some data of your choice. Second, which pairs are significantly different? We use the Bonferroni post-hoc test because relatively few comparisons are made (there are only four groups). The computer-generated results (not shown in Table 13.2) indicate that the only significant difference concerns the means of the “Small” and “Large” groups. This difference (1.26 - 0.97 = 0.29 [of transformed values]) is significant at the 5 percent level (p = .028). The Tukey and Scheffe tests lead to the same conclusion (respectively, p = .024 and .044). 
(It should be noted that post-hoc tests also exist for when equal variances are not assumed. In our example, these tests lead to the same result.7) This result is consistent with a visual reexamination of Figure 13.2, which shows that differences between group means are indeed small. The Tukey and”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
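The sequence used in this example (transform the variable, check the variances, then run the global F-test) might look like the following sketch in Python; the wetland-loss values and group labels are hypothetical placeholders, not the actual data.
# Fourth-root transformation, Levene's test, and one-way ANOVA (global F-test).
from scipy import stats

groups = {
    "Small":     [0.5, 1.2, 2.5, 0.8, 3.0],
    "Medium I":  [1.0, 1.8, 2.2, 1.5, 2.7],
    "Medium II": [1.9, 2.4, 3.1, 2.0, 2.8],
    "Large":     [2.6, 3.3, 4.1, 2.9, 3.8],
}
transformed = {k: [x ** 0.25 for x in v] for k, v in groups.items()}  # x^0.25: square root taken twice

print(stats.levene(*transformed.values()))    # homogeneity of variances across the four groups
print(stats.f_oneway(*transformed.values()))  # global F-test for any difference among group means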
“Assume that a welfare manager in our earlier example (see discussion of path analysis) takes a snapshot of the status of the welfare clients. Some clients may have obtained employment and others not yet. Clients will also vary as to the amount of time that they have been receiving welfare. Examine the data in Table 18.2. It shows that neither of the two clients, who have yet to complete their first week on welfare, has found employment; one of the three clients who have completed one week of welfare has found employment. Censored observations are observations for which the specified outcome has yet to occur. It is assumed that all clients who have not yet found employment are still waiting for this event to occur. Thus, the sample should not include clients who are not seeking employment. Note, however, that a censored observation is very different from one that has missing data, which might occur because the manager does not know whether the client has found employment. As with regression, records with missing data are excluded from analysis. A censored”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“A NONPARAMETRIC ALTERNATIVE A nonparametric alternative to one-way ANOVA is Kruskal-Wallis’ H test of one-way ANOVA. Instead of using the actual values of the variables, Kruskal-Wallis’ H test assigns ranks to the variables, as shown in Chapter 11. As a nonparametric method, Kruskal-Wallis’ H test does not assume normal populations, but the test does assume similarly shaped distributions for each group. This test is applied readily to our one-way ANOVA example, and the results are shown in Table 13.5. Table 13.5 Kruskal-Wallis’ H-Test of One-Way ANOVA Kruskal-Wallis’ H one-way ANOVA test shows that population is significantly associated with watershed loss (p = .013). This is one instance in which the general rule that nonparametric tests have higher levels of significance is not seen. Although Kruskal-Wallis’ H test does not report mean values of the dependent variable, the pattern of mean ranks is consistent with Figure 13.2. A limitation of this nonparametric test is that it does not provide post-hoc tests or analysis of homogeneous groups, nor are there nonparametric n-way ANOVA tests such as for the two-way ANOVA test described earlier. SUMMARY One-way ANOVA extends the t-test by allowing analysts to test whether two or more groups have different means of a continuous variable. The t-test is limited to only two groups. One-way ANOVA can be used, for example, when analysts want to know if the mean of a variable varies across regions, racial or ethnic groups, population or employee categories, or another grouping with multiple categories. ANOVA is a family of statistical techniques, and one-way ANOVA is the most basic of these methods. ANOVA is a parametric test that makes the following assumptions: The dependent variable is continuous. The independent variable is ordinal or nominal. The groups have equal variances. The variable is normally distributed in each of the groups. Relative to the t-test, ANOVA requires more attention to the assumptions of normality and homogeneity. ANOVA is not robust for the presence of outliers, and it appears to be less robust than the t-test for deviations from normality. Variable transformations and the removal of outliers are to be expected when using ANOVA. ANOVA also includes three other types of tests of interest: post-hoc tests of mean differences among categories, tests of homogeneous subsets, and tests for the linearity of mean differences across categories. Two-way ANOVA addresses the effect of two independent variables on a continuous dependent variable. When using two-way ANOVA, the analyst is able to distinguish main effects from interaction effects. Kruskal-Wallis’ H test is a nonparametric alternative to one-way ANOVA. KEY TERMS   Analysis of variance (ANOVA) ANOVA assumptions Covariates Factors Global F-test Homogeneous subsets Interaction effect Kruskal-Wallis’ H test of one-way ANOVA Main effect One-way ANOVA Post-hoc test Two-way ANOVA Notes   1. The between-group variance is”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
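A minimal sketch of the Kruskal-Wallis H test with scipy, using hypothetical group data:
# Kruskal-Wallis H test: a rank-based, nonparametric alternative to one-way ANOVA.
from scipy import stats

small  = [0.5, 1.2, 2.5, 0.8, 3.0]
medium = [1.0, 1.8, 2.2, 1.5, 2.7]
large  = [2.6, 3.3, 4.1, 2.9, 3.8]

h_stat, p_value = stats.kruskal(small, medium, large)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")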
“Scheffe tests also produce “homogeneous subsets,” that is, groups that have statistically identical means. Both the three largest and the three smallest populations have identical means. The Tukey levels of statistical significance are, respectively, .725 and .165 (both > .05). This is shown in Table 13.3. Figure 13.2 Group Boxplots Table 13.2 ANOVA Table Third, is the increase in means linear? This test is an option on many statistical software packages that produces an additional line of output in the ANOVA table, called the “linear term for unweighted sum of squares,” with the appropriate F-test. Here, that F-test statistic is 7.85, p = .006 < .01, and so we conclude that the apparent linear increase is indeed significant: wetland loss is linearly associated with the increased surrounding population of watersheds.8 Figure 13.2 does not clearly show this, but the enlarged Y-axis in Figure 13.3 does. Fourth, are our findings robust? One concern is that the statistical validity is affected by observations that statistically (although not substantively) are outliers. Removing the seven outliers identified earlier does not affect our conclusions. The resulting variable remains normally distributed, and there are no (new) outliers for any group. The resulting variable has equal variances across the groups (Levene’s test = 1.03, p = .38 > .05). The global F-test is 3.44 (p = .019 < .05), and the Bonferroni post-hoc test similarly finds that only the differences between the “Small” and “Large” group means are significant (p = .031). The increase remains linear (F = 6.74, p = .011 < .05). Thus, we conclude that the presence of observations with large values does not alter our conclusions. Table 13.3 Homogeneous Subsets Figure 13.3 Watershed Loss, by Population We also test the robustness of conclusions for different variable transformations. The extreme skewness of the untransformed variable allows for only a limited range of root transformations that produce normality. Within this range (power 0.222 through 0.275), the preceding conclusions are replicated fully. Natural log and base-10 log transformations also result in normality and replicate these results, except that the post-hoc tests fail to identify that the means of the “Large” and “Small” groups are significantly different. However, the global F-test is (marginally) significant (F = 2.80, p = .043 < .05), which suggests that this difference is too small to detect with this transformation. A single, independent-samples t-test for this difference is significant (t = 2.47, p = .017 < .05), suggesting that this problem may have been exacerbated by the limited number of observations. In sum, we find converging evidence for our conclusions. As this example also shows, when using statistics, analysts frequently must exercise judgment and justify their decisions.9 Finally, what is the practical significance of this analysis? The wetland loss among watersheds with large surrounding populations is [(3.21 – 2.52)/2.52 =] 27.4 percent greater than among those surrounded by small populations. It is up to managers and elected officials to determine whether a difference of this magnitude warrants intervention in watersheds with large surrounding populations.10”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
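Post-hoc comparisons such as the Tukey test reported in this passage can be produced with statsmodels; this sketch uses hypothetical data and prints pairwise mean differences rather than the homogeneous-subsets table.
# Tukey HSD post-hoc test after a significant global F-test.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

loss  = np.array([0.9, 1.1, 1.0, 1.2, 1.3, 1.1, 1.2, 1.4, 1.3, 1.5, 1.6, 1.4])
group = np.array(["Small"] * 3 + ["Medium I"] * 3 + ["Medium II"] * 3 + ["Large"] * 3)

result = pairwise_tukeyhsd(endog=loss, groups=group, alpha=0.05)
print(result.summary())   # each pair of groups, the mean difference, and whether it is significant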
“significantly different? Because the variances are equal, we read the t-test statistics from the top line, which states “equal variances assumed.” (If variances had been unequal, then we would read the test statistics from the second line, “equal variances not assumed.”) The t-test statistic for equal variances for this test is 2.576, which is significant at p = .011.13 Thus, we conclude that the means are significantly different; the 10th graders report feeling safer one year after the anger management program was implemented. Working Example 2 In the preceding example, the variables were both normally distributed, but this is not always the case. Many variables are highly skewed and not normally distributed. Consider another example. The U.S. Environmental Protection Agency (EPA) collects information about the water quality of watersheds, including information about the sources and nature of pollution. One such measure is the percentage of samples that exceed pollution limits for ammonia, dissolved oxygen, phosphorus, and pH.14 A manager wants to know whether watersheds in the East have higher levels of pollution than those in the Midwest. Figure 12.4 Untransformed Variable: Watershed Pollution An index variable of such pollution is constructed. The index variable is called “pollution,” and the first step is to examine it for test assumptions. Analysis indicates that the range of this variable has a low value of 0.00 percent and a high value of 59.17 percent. These are plausible values (any value above 100.00 percent is implausible). A boxplot (not shown) demonstrates that the variable has two values greater than 50.00 percent that are indicated as outliers for the Midwest region. However, the histograms shown in Figure 12.4 do not suggest that these values are unusually large; rather, the peak in both histograms is located off to the left. The distributions are heavily skewed.15 Because the samples each have fewer than 50 observations, the Shapiro-Wilk test for normality is used. The respective test statistics for East and Midwest are .969 (p = .355) and .931 (p = .007). Visual inspection confirms that the Midwest distribution is indeed nonnormal. The Shapiro-Wilk test statistics are given only for completeness; they have no substantive interpretation. We must now either transform the variable so that it becomes normal for purposes of testing, or use a nonparametric alternative. The second option is discussed later in this chapter. We also show the consequences of ignoring the problem. To transform the variable, we try the recommended transformations and then examine the transformed variable for normality. If none of these transformations work, we might modify them, such as using x⅓ instead of x½ (recall that the latter is √x).16 Thus, some experimentation is required. In our case, we find that the x½ transformation works. The new Shapiro-Wilk test statistics for East and Midwest are, respectively, .969 (p = .361) and .987 (p = .883). Visual inspection of Figure 12.5 shows these two distributions to be quite normal, indeed. Figure 12.5 Transformed Variable: Watershed Pollution The results of the t-test for the transformed variable are shown in Table”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
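A sketch of the normality check and square-root transformation described here, using scipy; the pollution percentages are hypothetical.
# Shapiro-Wilk normality test before and after a square-root transformation (samples under 50).
import numpy as np
from scipy import stats

pollution = np.array([0.0, 1.2, 2.5, 3.1, 4.8, 6.0, 8.5, 12.0, 20.3, 35.7, 59.2])  # heavily skewed

for label, values in [("untransformed", pollution), ("square root", np.sqrt(pollution))]:
    w, p = stats.shapiro(values)
    print(f"{label}: W = {w:.3f}, p = {p:.3f}")   # p > .05 means normality cannot be rejected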
“Table 14.1 also shows R-square (R2), which is called the coefficient of determination. R-square is of great interest: its value is interpreted as the percentage of variation in the dependent variable that is explained by the independent variable. R-square varies from zero to one, and is called a goodness-of-fit measure.5 In our example, teamwork explains only 7.4 percent of the variation in productivity. Although teamwork is significantly associated with productivity, it is quite likely that other factors also affect it. It is conceivable that other factors might be more strongly associated with productivity and that, when controlled for other factors, teamwork is no longer significant. Typically, values of R2 below 0.20 are considered to indicate weak relationships, those between 0.20 and 0.40 indicate moderate relationships, and those above 0.40 indicate strong relationships. Values of R2 above 0.65 are considered to indicate very strong relationships. R is called the multiple correlation coefficient and is always 0 ≤ R ≤ 1. To summarize up to this point, simple regression provides three critically important pieces of information about bivariate relationships involving two continuous variables: (1) the level of significance at which two variables are associated, if at all (t-statistic), (2) whether the relationship between the two variables is positive or negative (b), and (3) the strength of the relationship (R2). Key Point R-square is a measure of the strength of the relationship. Its value goes from 0 to 1. The primary purpose of regression analysis is hypothesis testing, not prediction. In our example, the regression model is used to test the hypothesis that teamwork is related to productivity. However, if the analyst wants to predict the variable “productivity,” the regression output also shows the SEE, or the standard error of the estimate (see Table 14.1). This is a measure of the spread of y values around the regression line as calculated for the mean value of the independent variable, only, and assuming a large sample. The standard error of the estimate has an interpretation in terms of the normal curve, that is, 68 percent of y values lie within one standard error from the calculated value of y, as calculated for the mean value of x using the preceding regression model. Thus, if the mean index value of the variable “teamwork” is 5.0, then the calculated (or predicted) value of “productivity” is [4.026 + 0.223*5 =] 5.141. Because SEE = 0.825, it follows that 68 percent of productivity values will lie ±0.825 from 5.141 when “teamwork” = 5. Predictions of y for other values of x have larger standard errors.6 Assumptions and Notation There are three simple regression assumptions. First, simple regression assumes that the relationship between two variables is linear. The linearity of bivariate relationships is easily determined through visual inspection, as shown in Figure 14.2. In fact, all analysis of relationships involving continuous variables should begin with a scatterplot. When variable”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
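The quantities discussed here (b, R-square, and the standard error of the estimate) can be reproduced with a short sketch; the teamwork and productivity numbers below are hypothetical, not the book's dataset.
# Simple regression: slope b, R-square, SEE, and a prediction at the mean of x.
import numpy as np
from scipy import stats

teamwork     = np.array([3.0, 4.2, 5.0, 5.5, 6.1, 4.8, 5.9, 6.5, 4.1, 5.2])
productivity = np.array([4.5, 4.9, 5.2, 5.0, 5.6, 4.8, 5.4, 5.9, 4.7, 5.1])

res = stats.linregress(teamwork, productivity)
predicted = res.intercept + res.slope * teamwork
see = np.sqrt(np.sum((productivity - predicted) ** 2) / (len(teamwork) - 2))  # standard error of the estimate

print(f"b = {res.slope:.3f}, R-square = {res.rvalue ** 2:.3f}, SEE = {see:.3f}")
print("predicted productivity at mean teamwork:", res.intercept + res.slope * teamwork.mean())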
“understanding of this formula, it is useful to relate it to the discussion of hypothesis testing in Chapter 10. First, note that the difference of means appears in the numerator: the larger the difference of means, the larger the t-test test statistic, and the more likely we might reject the null hypothesis. Second, sp is the pooled variance of the two groups, that is, the weighted average of the variances of each group.3 Increases in the standard deviation decrease the test statistic. Thus, it is easier to reject the null hypothesis when two populations are clustered narrowly around their means than when they are spread widely around them. Finally, more observations (that is, increased information or larger n1 and n2) increase the size of the test statistic, making it easier to reject the null hypothesis. Figure 12.1 The T-Test: Mean Incomes by Gender”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
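To make the formula's behavior concrete, here is a small sketch that computes the independent-samples t statistic from the pieces the passage names (the difference of means, the pooled term sp, and the group sizes); the data are hypothetical.
# Independent-samples t statistic: t = (mean1 - mean2) / (sp * sqrt(1/n1 + 1/n2)).
import math
import statistics as st

group1 = [42, 45, 39, 50, 47, 44]
group2 = [36, 40, 38, 41, 37, 39]

n1, n2 = len(group1), len(group2)
m1, m2 = st.mean(group1), st.mean(group2)
v1, v2 = st.variance(group1), st.variance(group2)                 # sample variances

sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))   # pooled estimate sp
t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
print(f"difference of means = {m1 - m2:.2f}, sp = {sp:.2f}, t = {t:.2f}")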
“Chapter 3), the resulting index variables typically are continuous as well. When variables are continuous, we should not recode them as categorical variables just to use the techniques of the previous chapters. Continuous variables provide valuable information about distances between categories and often have a broader range of values than ordinal variables. Recoding continuous variables as categorical variables is discouraged because it results in a loss of information; we should use tests such as the t-test. Statistics involving continuous variables usually require more test assumptions. Many of these tests are referred to as parametric statistics; this term refers to the fact that they make assumptions about the distribution of data and also that they are used to make inferences about population parameters. Formally, the term parametric means that a test makes assumptions about the distribution of the underlying population. Parametric tests have more test assumptions than nonparametric tests, most typically that the variable is”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“categorical and the dependent variable is continuous. The logic of this approach is shown graphically in Figure 13.1. The overall group mean is the grand mean (the mean of means). The boxplots represent the scores of observations within each group. (As before, the horizontal lines indicate means, rather than medians.) Recall that variance is a measure of dispersion. In both parts of the figure, w is the within-group variance, and b is the between-group variance. Each graph has three within-group variances and three between-group variances, although only one of each is shown. Note in part A that the between-group variances are larger than the within-group variances, which results in a large F-test statistic using the above formula, making it easier to reject the null hypothesis. Conversely, in part B the within-group variances are larger than the between-group variances, causing a smaller F-test statistic and making it more difficult to reject the null hypothesis. The hypotheses are written as follows: H0: No differences between any of the group means exist in the population. HA: At least one difference between group means exists in the population. Note how the alternate hypothesis is phrased, because the logical opposite of “no differences between any of the group means” is that at least one pair of means differs. H0 is also called the global F-test because it tests for differences among any means. The formulas for calculating the between-group variances and within-group variances are quite cumbersome for all but the simplest of designs.1 In any event, statistical software calculates the F-test statistic and reports the level at which it is significant.2 When the preceding null hypothesis is rejected, analysts will also want to know which differences are significant. For example, analysts will want to know which pairs of differences in watershed pollution are significant across regions. Although one approach might be to use the t-test to sequentially test each pair of differences, this should not be done. It would not only be a most tedious undertaking but would also inadvertently and adversely affect the level of significance: the chance of finding a significant pair by chance alone increases as more pairs are examined. Specifically, the probability of rejecting the null hypothesis in one of two tests is [1 – 0.95² =] .098, the probability of rejecting it in one of three tests is [1 – 0.95³ =] .143, and so forth. Thus, sequential testing of differences does not reflect the true level of significance for such tests and should not be used. Post-hoc tests test all possible group differences and yet maintain the true level of significance. Post-hoc tests vary in their methods of calculating test statistics and holding experiment-wide error rates constant. Three popular post-hoc tests are the Tukey, Bonferroni, and Scheffe tests.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
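The inflation of the error rate from sequential t-tests can be verified with one line of arithmetic per number of comparisons (a sketch of the calculation in the passage):
# Probability of at least one false rejection when k independent tests are each run at alpha = .05.
for k in (1, 2, 3, 6):
    print(k, round(1 - 0.95 ** k, 3))   # 1 -> 0.05, 2 -> 0.098, 3 -> 0.143, 6 -> 0.265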
“T-TESTS FOR INDEPENDENT SAMPLES T-tests are used to test whether the means of a continuous variable differ across two different groups. For example, do men and women differ in their levels of income, when measured as a continuous variable? Does crime vary between two parts of town? Do rich people live longer than poor people? Do high-performing students commit fewer acts of violence than do low-performing students? The t-test approach is shown graphically in Figure 12.1, which illustrates the incomes of men and women as boxplots (the lines in the middle of the boxes indicate the means rather than the medians).2 When the two groups are independent samples, the t-test is called the independent-samples t-test. Sometimes the continuous variable is called a “test variable” and the dichotomous variable is called a “grouping variable.” The t-test tests whether the difference of the means is significantly different from zero, that is, whether men and women have different incomes. The following”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“SUMMARY A vast array of additional statistical methods exists. In this concluding chapter, we summarized some of these methods (path analysis, survival analysis, and factor analysis) and briefly mentioned other related techniques. This chapter can help managers and analysts become familiar with these additional techniques and increase their access to research literature in which these techniques are used. Managers and analysts who would like more information about these techniques will likely consult other texts or on-line sources. In many instances, managers will need only simple approaches to calculate the means of their variables, produce a few good graphs that tell the story, make simple forecasts, and test for significant differences among a few groups. Why, then, bother with these more advanced techniques? They are part of the analytical world in which managers operate. Through research and consulting, managers cannot help but come in contact with them. It is hoped that this chapter whets the appetite and provides a useful reference for managers and students alike. KEY TERMS   Endogenous variables Exogenous variables Factor analysis Indirect effects Loading Path analysis Recursive models Survival analysis Notes 1. Two types of feedback loops are illustrated as follows: 2. When feedback loops are present, error terms for the different models will be correlated with exogenous variables, violating an error term assumption for such models. Then, alternative estimation methodologies are necessary, such as two-stage least squares and others discussed later in this chapter. 3. Some models may show double-headed arrows among error terms. These show the correlation between error terms, which is of no importance in estimating the beta coefficients. 4. In SPSS, survival analysis is available through the add-on module in SPSS Advanced Models. 5. The functions used to estimate probabilities are rather complex. They are so-called Weibull distributions, which are defined as h(t) = aλ(λt)^(a–1), where a and λ are chosen to best fit the data. 6. Hence, the SSL is greater than the squared loadings reported. For example, because the loadings of variables in groups B and C are not shown for factor 1, the SSL of shown loadings is 3.27 rather than the reported 4.084. If one assumes the other loadings are each .25, then the SSL of the not reported loadings is [12*.25² =] .75, bringing the SSL of factor 1 to [3.27 + .75 =] 4.02, which is very close to the 4.084 value reported in the table. 7. Readers who are interested in multinomial logistic regression can consult on-line sources or the SPSS manual, Regression Models 10.0 or higher. The statistics of discriminant analysis are very dissimilar from those of logistic regression, and readers are advised to consult a separate text on that topic. Discriminant analysis is not often used in public”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
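The Weibull hazard in note 5 can be written directly as a small function; this is an illustrative sketch, with the shape a and rate lam chosen arbitrarily.
# Weibull hazard function h(t) = a * lam * (lam * t)**(a - 1), as used in survival analysis.
def weibull_hazard(t, a, lam):
    return a * lam * (lam * t) ** (a - 1)

# A shape parameter a > 1 means the chance of the event (for example, finding employment)
# rises the longer a case has been at risk.
for week in (1, 5, 10):
    print(week, round(weibull_hazard(week, a=1.5, lam=0.2), 4))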
“violations of regression assumptions, and strategies for examining and remedying such assumptions. Then we extend the preceding discussion and will be able to conclude whether the above results are valid. Again, this model is not the only model that can be constructed but rather is one among a family of plausible models. Indeed, from a theoretical perspective, other variables might have been included, too. From an empirical perspective, perhaps other variables might explain more variance. Model specification is a judicious effort, requiring a balance between theoretical and statistical integrity. Statistical software programs can also automatically select independent variables based on their statistical significance, hence, adding to R-square.2 However, models with high R-square values are not necessarily better; theoretical reasons must exist for selecting independent variables, explaining why and how they might be related to the dependent variable. Knowing which variables are related empirically to the dependent variable can help narrow the selection, but such knowledge should not wholly determine it. We now turn to a discussion of the other statistics shown in Table 15.1. Getting Started Find examples of multiple regression in the research literature. Figure 15.1 Dependent Variable: Productivity FURTHER STATISTICS Goodness of Fit for Multiple Regression The model R-square in Table 15.1 is greatly increased over that shown in Table 14.1: R-square has gone from 0.074 in the simple regression model to 0.274. However, R-square has the undesirable mathematical property of increasing with the number of independent variables in the model. R-square increases regardless of whether an additional independent variable adds further explanation of the dependent variable. The adjusted R-square controls for the number of independent variables and is always equal to or less than R2. The above increase in explanation of the dependent variable is due to variables identified as statistically significant in Table 15.1. Key Point R-square is the variation in the dependent variable that is explained by all the independent variables. Adjusted R-square is often used to evaluate model explanation (or fit). Analogous to simple regression, values of adjusted R-square below 0.20 are considered to suggest weak model fit, those between 0.20 and 0.40 indicate moderate fit, those above 0.40 indicate strong fit, and those above 0.65 indicate very strong model fit. Analysts should remember that choices of model specification are driven foremost by theory, not statistical model fit; strong model fit is desirable only when the variables, and their relationships, are meaningful in some real-life sense. Adjusted R-square can assist in the variable selection process. Low values of adjusted R-square prompt analysts to ask whether they inadvertently excluded important variables from their models; if included, these variables might affect the statistical significance of those already in a model.3 Adjusted R-square also helps analysts to choose among alternative variable specifications (for example, different measures of student isolation), when such choices are no longer meaningfully informed by theory. Empirical issues of model fit then usefully guide the selection process further. Researchers typically report adjusted R-square with their”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
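The adjustment described here follows a standard formula; a sketch with hypothetical values for n, k, and R-square:
# Adjusted R-square penalizes R-square for the number of independent variables k.
def adjusted_r_square(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(adjusted_r_square(r2=0.274, n=300, k=8), 3))   # always <= the unadjusted R-square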
“Beyond One-Way ANOVA The approach described in the preceding section is called one-way ANOVA. This scenario is easily generalized to accommodate more than one independent variable. These independent variables are either discrete (called factors) or continuous (called covariates). These approaches are called n-way ANOVA or ANCOVA (the “C” indicates the presence of covariates). Two-way ANOVA, for example, allows for testing of the effect of two different independent variables on the dependent variable, as well as the interaction of these two independent variables. An interaction effect between two variables describes the way that variables “work together” to have an effect on the dependent variable. This is perhaps best illustrated by an example. Suppose that an analyst wants to know whether the number of health care information workshops attended, as well as a person’s education, are associated with healthy lifestyle behaviors. Although we can surely theorize how attending health care information workshops and a person’s education can each affect an individual’s healthy lifestyle behaviors, it is also easy to see that the level of education can affect a person’s propensity for attending health care information workshops, as well. Hence, an interaction effect could also exist between these two independent variables (factors). The effects of each independent variable on the dependent variable are called main effects (as distinct from interaction effects). To continue the earlier example, suppose that in addition to population, an analyst also wants to consider a measure of the watershed’s preexisting condition, such as the number of plant and animal species at risk in the watershed. Two-way ANOVA produces the results shown in Table 13.4, using the transformed variable mentioned earlier. The first row, labeled “model,” refers to the combined effects of all main and interaction effects in the model on the dependent variable. This is the global F-test. The “model” row shows that the two main effects and the single interaction effect, when considered together, are significantly associated with changes in the dependent variable (p < .000). However, the results also show a reduced significance level of “population” (now, p = .064), which seems related to the interaction effect (p = .076). Although neither effect is significant at conventional levels, the results do suggest that an interaction effect is present between population and watershed condition (of which the number of at-risk species is an indicator) on watershed wetland loss. Post-hoc tests are only provided separately for each of the independent variables (factors), and the results show the same homogeneous grouping for both of the independent variables. Table 13.4 Two-Way ANOVA Results As we noted earlier, ANOVA is a family of statistical techniques that allow for a broad range of rather complex experimental designs. Complete coverage of these techniques is well beyond the scope of this book, but in general, many of these techniques aim to discern the effect of variables in the presence of other (control) variables. ANOVA is but one approach for addressing control variables. A far more common approach in public policy, economics, political science, and public administration (as well as in many other fields) is multiple regression (see Chapter 15). Many analysts feel that ANOVA and regression are largely equivalent. 
Historically, the preference for ANOVA stems from its uses in medical and agricultural research, with applications in education and psychology. Finally, the ANOVA approach can be generalized to allow for testing on two or more dependent variables. This approach is called multivariate analysis of variance, or MANOVA. Regression-based analysis can also be used for dealing with multiple dependent variables, as mentioned in Chapter 17.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
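A two-way ANOVA with main and interaction effects can be specified in a single statsmodels formula; this sketch uses hypothetical column names (loss, population, condition) and made-up values.
# Two-way ANOVA: main effects of two factors plus their interaction on a continuous outcome.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    "loss":       [0.9, 1.1, 1.4, 1.2, 1.6, 1.8, 1.0, 1.3, 1.7, 1.5, 2.0, 2.2],
    "population": ["Small", "Small", "Small", "Large", "Large", "Large"] * 2,
    "condition":  ["Good"] * 6 + ["Poor"] * 6,
})

model = smf.ols("loss ~ C(population) * C(condition)", data=df).fit()
print(anova_lm(model, typ=2))   # one row for each main effect and one for the interaction term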
“COEFFICIENT The nonparametric alternative, Spearman’s rank correlation coefficient (r, or “rho”), looks at correlation among the ranks of the data rather than among the values. The ranks of data are determined as shown in Table 14.2 (adapted from Table 11.8): Table 14.2 Ranks of Two Variables In Greater Depth … Box 14.1 Crime and Poverty An analyst wants to examine empirically the relationship between crime and income in cities across the United States. The CD that accompanies the workbook Exercising Essential Statistics includes a Community Indicators dataset with assorted indicators of conditions in 98 cities such as Akron, Ohio; Phoenix, Arizona; New Orleans, Louisiana; and Seattle, Washington. The measures include median household income, total population (both from the 2000 U.S. Census), and total violent crimes (FBI, Uniform Crime Reporting, 2004). In the sample, household income ranges from $26,309 (Newark, New Jersey) to $71,765 (San Jose, California), and the median household income is $42,316. Per-capita violent crime ranges from 0.15 percent (Glendale, California) to 2.04 percent (Las Vegas, Nevada), and the median violent crime rate per capita is 0.78 percent. There are four types of violent crimes: murder and nonnegligent manslaughter, forcible rape, robbery, and aggravated assault. A measure of total violent crime per capita is calculated because larger cities are apt to have more crime. The analyst wants to examine whether income is associated with per-capita violent crime. The scatterplot of these two continuous variables shows that a negative relationship appears to be present: The Pearson’s correlation coefficient is –.532 (p < .01), and the Spearman’s correlation coefficient is –.552 (p < .01). The simple regression model shows R2 = .283. The regression model is as follows (t-test statistic in parentheses): The regression line is shown on the scatterplot. Interpreting these results, we see that the R-square value of .283 indicates a moderate relationship between these two variables. Clearly, some cities with modest median household incomes have a high crime rate. However, removing these cities does not greatly alter the findings. Also, an assumption of regression is that the error term is normally distributed, and further examination of the error shows that it is somewhat skewed. The techniques for examining the distribution of the error term are discussed in Chapter 15, but again, addressing this problem does not significantly alter the finding that the two variables are significantly related to each other, and that the relationship is of moderate strength. With this result in hand, further analysis shows, for example, by how much violent crime decreases for each increase in household income. For each increase of $10,000 in average household income, the violent crime rate drops 0.25 percent. For a city experiencing the median 0.78 percent crime rate, this would be a considerable improvement, indeed. Note also that the scatterplot shows considerable variation in the crime rate for cities at or below the median household income, in contrast to those well above it. Policy analysts may well wish to examine conditions that give rise to variation in crime rates among cities with lower incomes. 
Because Spearman’s rank correlation coefficient examines correlation among the ranks of variables, it can also be used with ordinal-level data.9 For the data in Table 14.2, Spearman’s rank correlation coefficient is .900 (p = .035).10 Spearman’s rho-squared coefficient has a “percent variation explained” interpretation, similar”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
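Both coefficients mentioned here can be obtained from scipy; a sketch on hypothetical income and crime-rate values (not the Community Indicators data):
# Pearson's r uses the values themselves; Spearman's rho uses their ranks.
from scipy import stats

income     = [26.3, 31.0, 35.4, 42.3, 48.8, 55.1, 63.7, 71.8]   # median household income ($000s)
crime_rate = [1.90, 1.40, 1.20, 0.80, 0.70, 0.55, 0.30, 0.20]   # violent crime per capita (percent)

r, p_r = stats.pearsonr(income, crime_rate)
rho, p_rho = stats.spearmanr(income, crime_rate)
print(f"Pearson r = {r:.3f} (p = {p_r:.3f}); Spearman rho = {rho:.3f} (p = {p_rho:.3f})")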
“eigenvalue of a factor is the sum of correlations (r) of each variable with that factor. This correlation is also called loading in factor analysis. Analysts can define (or “extract”) how many factors they wish to use, or they can define a statistical criterion (typically requiring each factor to have an eigenvalue of at least 1.0). The method of identifying factors is called principal component analysis (PCA). The results of PCA often make it difficult to interpret the factors, in which case the analyst will use rotation (a statistical technique that distributes the explained variance across factors). Rotation causes variables to load higher on one factor, and less on others, bringing the pattern of groups better into focus for interpretation. Several different methods of rotation are commonly used (for example, Varimax, Promax), but the purpose of this procedure is always to understand which variables belong together. Typically, for purposes of interpretation, factor loadings are considered only if their values are at least .50, and only these values might be shown in tables. Table 18.4 shows the result of a factor analysis. The table shows various items related to managerial professionalism, and the factor analysis identifies three distinct groups for these items. Such tables are commonly seen in research articles. The labels for each group (for example, “A. Commitment to performance”) are provided by the authors; note that the three groupings are conceptually distinct. The table also shows that, combined, these three factors account for 61.97 percent of the total variance. The table shows only loadings greater than .50; those below this value are not shown.6 Based on these results, the authors then create index variables for the three groups. Each group has high internal reliability (see Chapter 3); the Cronbach alpha scores are, respectively, 0.87, 0.83, and 0.88. This table shows a fairly typical use of factor analysis, providing statistical support for a grouping scheme. Beyond Factor Analysis A variety of exploratory techniques exist. Some seek purely to classify, whereas”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
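The extract-then-rotate workflow described here can be sketched with scikit-learn's FactorAnalysis, which supports a varimax rotation; the nine survey items in X are random stand-ins for the managerial-professionalism items, so the loadings themselves are not meaningful.
# Factor analysis: extract three factors with varimax rotation and inspect the loadings.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))            # 200 respondents by 9 hypothetical survey items

fa = FactorAnalysis(n_components=3, rotation="varimax")
fa.fit(X)
loadings = fa.components_.T              # rows = items, columns = factors
print(np.round(loadings, 2))             # in practice, only loadings of at least .50 are interpreted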
“usually does not present much of a problem. Some analysts use t-tests with ordinal rather than continuous data for the testing variable. This approach is theoretically controversial because the distances among ordinal categories are undefined. This situation is avoided easily by using nonparametric alternatives (discussed later in this chapter). Also, when the grouping variable is not dichotomous, analysts need to make it so in order to perform a t-test. Many statistical software packages allow dichotomous variables to be created from other types of variables, such as by grouping or recoding ordinal or continuous variables. The second assumption is that the variances of the two distributions are equal. This is called homogeneity of variances. The use of pooled variances in the earlier formula is justified only when the variances of the two groups are equal. When variances are unequal (called heterogeneity of variances), revised formulas are used to calculate the t-test statistic and degrees of freedom.7 The difference between homogeneity and heterogeneity is shown graphically in Figure 12.2. Although we needn’t be concerned with the precise differences in these calculation methods, every t-test first tests whether the variances are equal in order to know which t-test statistic should be used for subsequent hypothesis testing. Thus, every t-test involves a (somewhat tricky) two-step procedure. A common test for the equality of variances is Levene’s test, whose null hypothesis is that the variances are equal. Many statistical software programs report Levene’s test along with the t-test, so that users know which t-test to use: the t-test for equal variances or the one for unequal variances. Figure 12.2 Equal and Unequal Variances The term robust is used, generally, to describe the extent to which test conclusions are unaffected by departures from test assumptions. T-tests are relatively robust to departures from the assumptions of homogeneity and normality (see below) when the groups are of approximately equal size; in that case, conclusions about any difference between the group means are unaffected by heterogeneity. The third assumption is that observations are independent. (Quasi-) experimental research designs violate this assumption, as discussed in Chapter 11. The formula for the t-test statistic is then modified to test whether the difference between before and after measurements is zero. This is called a paired t-test, which is discussed later in this chapter. The fourth assumption is that the distributions are normally distributed. Although normality is an important test assumption, a key reason for the popularity of the t-test is that its conclusions are often robust against considerable violations of normality, as long as the distributions are not highly skewed. We provide some detail about tests for normality and how to address departures from it. Remember, when nonnormality cannot be resolved adequately, analysts consider nonparametric alternatives to the t-test, discussed at the end of this chapter. Box 12.1 provides a bit more discussion about the reason for this assumption. A combination of visual inspection and statistical tests is always used to determine the normality of variables. Two tests of normality are the Kolmogorov-Smirnov test (also known as the K-S test) for samples with more than 50 observations and the Shapiro-Wilk test for samples with up to 50 observations. The null hypothesis of both tests is that the variable is normally distributed.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
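The two-step procedure and the normality checks described in this passage can be sketched in a few lines of Python with SciPy. The income figures, group sizes, and significance cutoff below are made up for illustration.

```python
# Sketch: assumption checks and the two-step t-test procedure, using SciPy.
# `men` and `women` are hypothetical income samples (the grouping variable is dichotomous).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
men = rng.normal(50_000, 12_000, size=45)    # up to 50 observations -> Shapiro-Wilk
women = rng.normal(46_000, 9_000, size=80)   # more than 50 observations -> K-S test

# Normality tests; the null hypothesis is that the variable is normally distributed
print(stats.shapiro(men))
print(stats.kstest(women, "norm", args=(women.mean(), women.std(ddof=1))))

# Step 1: Levene's test for equality of variances (null hypothesis: variances are equal)
levene_stat, levene_p = stats.levene(men, women)

# Step 2: choose the t-test statistic that matches the Levene result
equal_var = levene_p > 0.05
t_stat, p_value = stats.ttest_ind(men, women, equal_var=equal_var)
print(f"t = {t_stat:.2f}, p = {p_value:.3f} (equal variances assumed: {equal_var})")
```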
“assumptions; understand the role of variable transformations; and identify t-test alternatives. When analysts need to compare the means of a continuous variable across different groups, they have a valuable tool at their disposal: the t-test. T-tests are used for testing whether two groups have different means of a continuous variable, such as when we want to know whether mean incomes vary between men and women. They can also be used to compare program performance between two periods, when performance in each period is measured as a continuous variable. The examples in this chapter differ from those in Chapters 10 and 11 in that one of the variables is continuous and the other is categorical. Many variables are continuous, such as income, age, height, caseloads, service calls, and counts of fish in a pond. Moreover, when ordinal-level variables are used for constructing index variables”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
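As a minimal illustration of the setup this passage describes, one continuous variable and one dichotomous grouping variable, here is a short Python sketch; the column names and values are invented.

```python
# Sketch: a continuous variable (income) compared across a dichotomous grouping variable (gender).
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "income": [52, 61, 47, 58, 55, 44, 49, 41, 46, 50],            # continuous, in $1,000s (hypothetical)
    "gender": ["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"],  # dichotomous grouping variable
})

print(df.groupby("gender")["income"].mean())  # compare the two group means

men = df.loc[df["gender"] == "m", "income"]
women = df.loc[df["gender"] == "f", "income"]
print(stats.ttest_ind(men, women))  # in practice, equal_var should follow Levene's test
```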
“relationships are nonlinear (parabolic or otherwise heavily curved), it is not appropriate to use linear regression. Then, one or both variables must be transformed, as discussed in Chapter 12. Second, simple regression assumes that the linear relationship is constant over the range of observations. This assumption is violated when the relationship is “broken,” for example, by having an upward slope for the first half of independent variable values and a downward slope over the remaining values. Then, analysts should consider using two separate regression models, one for each of these distinct linear relationships. The linearity assumption is also violated when no relationship is present in part of the independent variable values. This is particularly problematic because regression analysis will calculate a regression slope based on all observations. In this case, analysts may be misled into believing that the linear pattern holds for all observations. Hence, regression results should always be verified through visual inspection. Third, simple regression assumes that the variables are continuous. In Chapter 15, we will see that regression can also be used for nominal and dichotomous independent variables. The dependent variable, however, must be continuous. When the dependent variable is dichotomous, logistic regression should be used (Chapter 16). Figure 14.2 Three Examples of r The following notations are commonly used in regression analysis. The predicted value of y, defined by the regression model as a + bx, is typically different from the observed value of y. The predicted value of the dependent variable y is sometimes indicated as ŷ (pronounced “y-hat”). Only when R² = 1 are the observed and predicted values identical for each observation. The difference between y and ŷ is called the regression error or error term.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
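A short sketch of this notation with made-up data, using statsmodels: the predicted values ŷ, the error term, and R² correspond to the fittedvalues, resid, and rsquared attributes of the fitted model.

```python
# Sketch: fitted values (y-hat), regression errors, and R-squared in simple regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.8 * x + rng.normal(0, 1.5, size=50)  # roughly linear, with noise (hypothetical data)

X = sm.add_constant(x)        # adds the intercept term a
model = sm.OLS(y, X).fit()    # estimates y = a + bx

y_hat = model.fittedvalues    # predicted values, y-hat
errors = y - y_hat            # regression errors; identical to model.resid
print(model.params)           # a (const) and b (slope)
print(model.rsquared)         # R-squared; equals 1 only if every y-hat matches y exactly

# Always verify linearity visually, e.g., by plotting y against x and the errors against y-hat.
```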
“Analysis of Variance (ANOVA)   CHAPTER OBJECTIVES After reading this chapter, you should be able to (1) use one-way ANOVA when the dependent variable is continuous and the independent variable is nominal or ordinal with two or more categories; (2) understand the assumptions of ANOVA and how to test for them; (3) use post-hoc tests; and (4) understand some extensions of one-way ANOVA. This chapter provides an essential introduction to analysis of variance (ANOVA). ANOVA is a family of statistical techniques, the most basic of which is the one-way ANOVA, which provides an essential expansion of the t-test discussed in Chapter 12. One-way ANOVA allows analysts to test the effect of an ordinal or nominal variable with two or more categories on a continuous variable, rather than the only two categories handled by the t-test. Thus, one-way ANOVA enables analysts to deal with problems such as whether the variable “region” (north, south, east, west) or “race” (Caucasian, African American, Hispanic, Asian, etc.) affects policy outcomes or any other matter that is measured on a continuous scale. One-way ANOVA also allows analysts to quickly determine subsets of categories with similar levels of the dependent variable. This chapter also addresses some extensions of one-way ANOVA and a nonparametric alternative. ANALYSIS OF VARIANCE Whereas the t-test is used for testing differences between two groups on a continuous variable (Chapter 12), one-way ANOVA is used for testing the means of a continuous variable across more than two groups. For example, we may wish to test whether income levels differ among three or more ethnic groups, or whether the counts of fish vary across three or more lakes. Applications of ANOVA often arise in medical and agricultural research, in which treatments are given to different groups of patients, animals, or crops. The F-test statistic compares the variances within each group against those that exist between each group and the overall mean. Key Point: ANOVA extends the t-test; it is used when the independent variable is nominal or ordinal with two or more categories.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
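The formula that this passage introduces did not survive the excerpt; the standard one-way ANOVA F statistic it refers to is the ratio of the between-group to the within-group mean squares, where k is the number of groups, n_j and ȳ_j are the size and mean of group j, ȳ is the overall mean, and N is the total number of observations:

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{j=1}^{k} n_j\,(\bar{y}_j - \bar{y})^2 \,/\, (k-1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 \,/\, (N-k)}
```

A minimal sketch in Python with SciPy, using hypothetical fish counts from three lakes:

```python
# Sketch: one-way ANOVA comparing a continuous variable (fish counts) across three lakes.
from scipy import stats

lake_a = [34, 28, 41, 37, 30, 33]
lake_b = [22, 25, 19, 27, 24, 21]
lake_c = [31, 29, 35, 26, 32, 30]

f_stat, p_value = stats.f_oneway(lake_a, lake_b, lake_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests at least one lake's mean differs; a post-hoc test
# (e.g., Tukey's HSD) then identifies which specific groups differ.
```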
“regression as dummy variables; explain the importance of the error term plot; and identify assumptions of regression, and know how to test and correct assumption violations. Multiple regression is one of the most widely used multivariate statistical techniques for analyzing three or more variables. This chapter uses multiple regression to examine such relationships, and thereby extends the discussion in Chapter 14. The popularity of multiple regression is due largely to the ease with which it takes control variables (or rival hypotheses) into account. In Chapter 10, we discussed briefly how contingency tables can be used for this purpose, but doing so is often a cumbersome and sometimes inconclusive effort. By contrast, multiple regression easily incorporates multiple independent variables. Another reason for its popularity is that it also takes into account nominal independent variables. However, multiple regression is no substitute for bivariate analysis. Indeed, managers or analysts with an interest in a specific bivariate relationship will conduct a bivariate analysis first, before examining whether the relationship is robust in the presence of numerous control variables. And before conducting bivariate analysis, analysts need to conduct univariate analysis to better understand their variables. Thus, multiple regression is usually one of the last steps of analysis. Indeed, multiple regression is often used to test the robustness of bivariate relationships when control variables are taken into account. The flexibility with which multiple regression takes control variables into account comes at a price, though. Regression, like the t-test, is based on numerous assumptions. Regression results cannot be assumed to be robust in the face of assumption violations. Testing of assumptions is always part of multiple regression analysis. Multiple regression is carried out in the following sequence: (1) model specification (that is, identification of dependent and independent variables), (2) testing of regression assumptions, (3) correction of assumption violations, if any, and (4) reporting of the results of the final regression model. This chapter examines these four steps and discusses essential concepts related to simple and multiple regression. Chapters 16 and 17 extend this discussion by examining the use of logistic regression and time series analysis. MODEL SPECIFICATION Multiple regression is an extension of simple regression, but an important difference exists between the two methods: multiple regression aims for full model specification. This means that analysts seek to account for all of the variables that affect the dependent variable; by contrast, simple regression examines the effect of only one independent variable. Philosophically, the phrase that identifies the key difference, “all of the variables that affect the dependent variable,” is divided into two parts. The first part involves identifying the variables that are of most (theoretical and practical) relevance in explaining the dependent variable.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
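A brief sketch of the sequence described here: a specified model with a control variable and a dummy-coded nominal variable, followed by a look at the residuals for assumption testing. It uses statsmodels' formula interface; all variable names and data are hypothetical.

```python
# Sketch: multiple regression with a control variable and a nominal (dummy-coded)
# independent variable, plus the residuals needed for an error term plot.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 200
df = pd.DataFrame({
    "outcome": rng.normal(50, 10, n),                              # dependent variable
    "effort": rng.normal(20, 5, n),                                # independent variable of interest
    "budget": rng.normal(100, 25, n),                              # control variable
    "region": rng.choice(["north", "south", "east", "west"], n),   # nominal variable
})

# C(region) expands the nominal variable into dummy variables automatically
model = smf.ols("outcome ~ effort + budget + C(region)", data=df).fit()
print(model.summary())

# Part of assumption testing: inspect the error term plot (residuals versus fitted values)
residuals, fitted = model.resid, model.fittedvalues
```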
“other and distinct from other groups. These techniques usually precede regression and other analyses. Factor analysis is a well-established technique that often aids in creating index variables. Earlier, Chapter 3 discussed the use of Cronbach alpha to empirically justify the selection of variables that make up an index. However, in that approach analysts must still justify that variables used in different index variables are indeed distinct. By contrast, factor analysis analyzes a large number of variables (often 20 to 30) and classifies them into groups based on empirical similarities and dissimilarities. This empirical assessment can aid analysts’ judgments regarding variables that might be grouped together. Factor analysis uses correlations among variables to identify subgroups. These subgroups (called factors) are characterized by relatively high correlations among the variables within a group and low correlations between variables in different groups. Most factor analysis consists of roughly four steps: (1) determining that the group of variables has enough correlation to allow for factor analysis, (2) determining how many factors should be used for classifying (or grouping) the variables, (3) improving the interpretation of correlations and factors (through a process called rotation), and (4) naming the factors and, possibly, creating index variables for subsequent analysis. Most factor analysis is used for grouping of variables (R-type factor analysis) rather than observations (Q-type). Discriminant analysis, mentioned later in this chapter, is often used for grouping observations. The terminology of factor analysis differs greatly from that used elsewhere in this book, and the discussion that follows is offered as an aid in understanding tables that might be encountered in research that uses this technique. An important task in factor analysis is determining how many common factors should be identified. Theoretically, there are as many factors as variables, but only a few factors account for most of the variance in the data. The percentage of variation explained by each factor is defined as the eigenvalue divided by the number of variables, whereby the”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
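Step 2 of this procedure, deciding how many factors to keep, can be sketched directly from the correlation matrix. The eigenvalue-of-at-least-1.0 criterion and the eigenvalue-divided-by-number-of-variables calculation appear as comments; the data below are hypothetical.

```python
# Sketch: how many factors? Eigenvalues of the correlation matrix, with each factor's
# share of explained variance computed as eigenvalue / number of variables.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
items = pd.DataFrame(rng.normal(size=(150, 8)),
                     columns=[f"v{i}" for i in range(1, 9)])  # hypothetical variables

corr = items.corr()                           # correlations among the variables
eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted from largest to smallest
share = eigenvalues / corr.shape[0]           # proportion of variance per factor

for i, (ev, s) in enumerate(zip(eigenvalues, share), start=1):
    keep = "keep" if ev >= 1.0 else "drop"    # the common eigenvalue-of-1.0 criterion
    print(f"Factor {i}: eigenvalue {ev:.2f}, {s:.1%} of variance ({keep})")
```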
