Essential Statistics for Public Managers and Policy Analysts Quotes

Essential Statistics for Public Managers and Policy Analysts by Evan M. Berman
“SUMMARY A vast array of additional statistical methods exists. In this concluding chapter, we summarized some of these methods (path analysis, survival analysis, and factor analysis) and briefly mentioned other related techniques. This chapter can help managers and analysts become familiar with these additional techniques and increase their access to research literature in which these techniques are used. Managers and analysts who would like more information about these techniques will likely consult other texts or on-line sources. In many instances, managers will need only simple approaches to calculate the means of their variables, produce a few good graphs that tell the story, make simple forecasts, and test for significant differences among a few groups. Why, then, bother with these more advanced techniques? They are part of the analytical world in which managers operate. Through research and consulting, managers cannot help but come in contact with them. It is hoped that this chapter whets the appetite and provides a useful reference for managers and students alike. KEY TERMS   Endogenous variables Exogenous variables Factor analysis Indirect effects Loading Path analysis Recursive models Survival analysis Notes 1. Two types of feedback loops are illustrated as follows: 2. When feedback loops are present, error terms for the different models will be correlated with exogenous variables, violating an error term assumption for such models. Then, alternative estimation methodologies are necessary, such as two-stage least squares and others discussed later in this chapter. 3. Some models may show double-headed arrows among error terms. These show the correlation between error terms, which is of no importance in estimating the beta coefficients. 4. In SPSS, survival analysis is available through the add-on module in SPSS Advanced Models. 5. The functions used to estimate probabilities are rather complex. They are so-called Weibull distributions, which are defined as h(t) = αλ(λt)^(α–1), where α and λ are chosen to best fit the data. 6. Hence, the SSL is greater than the squared loadings reported. For example, because the loadings of variables in groups B and C are not shown for factor 1, the SSL of shown loadings is 3.27 rather than the reported 4.084. If one assumes the other loadings are each .25, then the SSL of the not reported loadings is [12*.25² =] .75, bringing the SSL of factor 1 to [3.27 + .75 =] 4.02, which is very close to the 4.084 value reported in the table. 7. Readers who are interested in multinomial logistic regression can consult on-line sources or the SPSS manual, Regression Models 10.0 or higher. The statistics of discriminant analysis are very dissimilar from those of logistic regression, and readers are advised to consult a separate text on that topic. Discriminant analysis is not often used in public”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
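The Weibull hazard in note 5 is straightforward to evaluate once the exponent is restored. A minimal Python sketch (NumPy assumed available; the parameter values are hypothetical, not taken from the book):

```python
import numpy as np

def weibull_hazard(t, alpha, lam):
    """Weibull hazard h(t) = alpha * lam * (lam * t)**(alpha - 1)."""
    t = np.asarray(t, dtype=float)
    return alpha * lam * (lam * t) ** (alpha - 1)

# Hypothetical shape and scale parameters; alpha > 1 means the hazard rises over time.
print(weibull_hazard([1, 2, 5, 10], alpha=1.5, lam=0.2))
```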
“regression lines that describe the relationship of the independent variables for each group (called classification functions). The emphasis in discriminant analysis is the ability of the independent variables to correctly predict values of the nominal variable (for example, group membership). Discriminant analysis is one strategy for dealing with dependent variables that are nominal with three or more categories. Multinomial logistic regression and ordinal regression have been developed in recent years to address nominal and ordinal dependent variables in logistic regression. Multinomial logistic regression calculates functions that compare the probability of a nominal value occurring relative to a base reference group. The calculation of such probabilities makes this technique an interesting alternative to discriminant analysis. When the nominal dependent variable has three values (say, 1, 2, and 3), one logistic regression predicts the likelihood of 2 versus 1 occurring, and the other logistic regression predicts the likelihood of 3 versus 1 occurring, assuming that “1” is the base reference group.7 When the dependent variable is ordinal, ordinal regression can be used. Like multinomial logistic regression, ordinal regression often is used to predict event probability or group membership. Ordinal regression assumes that the slope coefficients are identical for each value of the dependent variable; when this assumption is not met, multinomial logistic regression should be considered. Both multinomial logistic regression and ordinal regression are relatively recent developments and are not yet widely used. Statistics, like other fields of science, continues to push its frontiers forward and thereby develop new techniques for managers and analysts. Key Point Advanced statistical tools are available. Understanding the proper circumstances under which these tools apply is a prerequisite for using them.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
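The base-reference-group logic described above can be illustrated with statsmodels' MNLogit. A sketch on fabricated data (the variable names and values are hypothetical; with three outcome categories, the output contains one block of coefficients for each non-reference category):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "income": rng.normal(50, 10, 300),     # hypothetical predictors
    "distance": rng.normal(5, 2, 300),
    "choice": rng.integers(0, 3, 300),     # nominal outcome with 3 categories; 0 is the base
})

X = sm.add_constant(df[["income", "distance"]])
model = sm.MNLogit(df["choice"], X).fit(disp=0)
print(model.summary())   # one coefficient block per non-reference category (1 vs 0, 2 vs 0)
```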
“others seek to create and predict classifications through independent variables. Table 18.4 Factor Analysis Note: Factor analysis with Varimax rotation. Source: E. Berman and J. West. (2003). “What Is Managerial Mediocrity? Definition, Prevalence and Negative Impact (Part 1).” Public Performance & Management Review, 27 (December): 7–27. Multidimensional scaling and cluster analysis aim to identify key dimensions along which observations (rather than variables) differ. These techniques differ from factor analysis in that they allow for a hierarchy of classification dimensions. Some also use graphics to aid in visualizing the extent of differences and to help in identifying the similarity or dissimilarity of observations. Network analysis is a descriptive technique used to portray relationships among actors. A graphic representation can be made of the frequency with which actors interact with each other, distinguishing frequent interactions from those that are infrequent. Discriminant analysis is used when the dependent variable is nominal with two or more categories. For example, we might want to know how parents choose among three types of school vouchers. Discriminant analysis calculates regression lines that distinguish (discriminate) among the nominal groups (the categories of the dependent variable), as well as other”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
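A sketch of discriminant analysis in Python with scikit-learn, assumed available; the three-category outcome stands in for something like the voucher-choice example, and the data are randomly generated:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))        # hypothetical independent variables
y = rng.integers(0, 3, 150)          # nominal dependent variable: three hypothetical groups

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.coef_)                         # linear coefficients of the discriminant model
print((lda.predict(X) == y).mean())      # share of observations classified correctly
```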
“eigenvalue of a factor is the sum of correlations (r) of each variable with that factor. This correlation is also called loading in factor analysis. Analysts can define (or “extract”) how many factors they wish to use, or they can define a statistical criterion (typically requiring each factor to have an eigenvalue of at least 1.0). The method of identifying factors is called principal component analysis (PCA). The results of PCA often make it difficult to interpret the factors, in which case the analyst will use rotation (a statistical technique that distributes the explained variance across factors). Rotation causes variables to load higher on one factor, and less on others, bringing the pattern of groups better into focus for interpretation. Several different methods of rotation are commonly used (for example, Varimax, Promax), but the purpose of this procedure is always to understand which variables belong together. Typically, for purposes of interpretation, factor loadings are considered only if their values are at least .50, and only these values might be shown in tables. Table 18.4 shows the result of a factor analysis. The table shows various items related to managerial professionalism, and the factor analysis identifies three distinct groups for these items. Such tables are commonly seen in research articles. The labels for each group (for example, “A. Commitment to performance”) are provided by the authors; note that the three groupings are conceptually distinct. The table also shows that, combined, these three factors account for 61.97 percent of the total variance. The table shows only loadings greater than .50; those below this value are not shown.6 Based on these results, the authors then create index variables for the three groups. Each group has high internal reliability (see Chapter 3); the Cronbach alpha scores are, respectively, 0.87, 0.83, and 0.88. This table shows a fairly typical use of factor analysis, providing statistical support for a grouping scheme. Beyond Factor Analysis A variety of exploratory techniques exist. Some seek purely to classify, whereas”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
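The extract-rotate-inspect sequence described above can be sketched with scikit-learn's FactorAnalysis (the rotation argument assumes scikit-learn 0.24 or later), together with a simple Cronbach's alpha helper for a resulting index; the items below are random numbers, not the managerial professionalism items of Table 18.4:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
items = rng.normal(size=(200, 9))          # hypothetical survey items (9 variables)

fa = FactorAnalysis(n_components=3, rotation="varimax")
fa.fit(items)
loadings = fa.components_.T                # rows = items, columns = factors
print(loadings.round(2))                   # with real data, loadings of at least .50 identify the groups

def cronbach_alpha(data):
    """Cronbach's alpha: (k/(k-1)) * (1 - sum of item variances / variance of the total score)."""
    k = data.shape[1]
    item_var = data.var(axis=0, ddof=1).sum()
    total_var = data.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

print(cronbach_alpha(items[:, :3]))        # alpha for a hypothetical 3-item index
```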
“other and distinct from other groups. These techniques usually precede regression and other analyses. Factor analysis is a well-established technique that often aids in creating index variables. Earlier, Chapter 3 discussed the use of Cronbach alpha to empirically justify the selection of variables that make up an index. However, in that approach analysts must still justify that variables used in different index variables are indeed distinct. By contrast, factor analysis analyzes a large number of variables (often 20 to 30) and classifies them into groups based on empirical similarities and dissimilarities. This empirical assessment can aid analysts’ judgments regarding variables that might be grouped together. Factor analysis uses correlations among variables to identify subgroups. These subgroups (called factors) are characterized by relatively high within-group correlation among variables and low between-group correlation among variables. Most factor analysis consists of roughly four steps: (1) determining that the group of variables has enough correlation to allow for factor analysis, (2) determining how many factors should be used for classifying (or grouping) the variables, (3) improving the interpretation of correlations and factors (through a process called rotation), and (4) naming the factors and, possibly, creating index variables for subsequent analysis. Most factor analysis is used for grouping of variables (R-type factor analysis) rather than observations (Q-type). Often, discriminant analysis is used for grouping of observations, mentioned later in this chapter. The terminology of factor analysis differs greatly from that used elsewhere in this book, and the discussion that follows is offered as an aid in understanding tables that might be encountered in research that uses this technique. An important task in factor analysis is determining how many common factors should be identified. Theoretically, there are as many factors as variables, but only a few factors account for most of the variance in the data. The percentage of variation explained by each factor is defined as the eigenvalue divided by the number of variables, whereby the”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
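The eigenvalue arithmetic mentioned above (the percentage of variance explained equals the eigenvalue divided by the number of variables, and factors are retained when the eigenvalue is at least 1.0) can be checked directly from the correlation matrix. A minimal sketch on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(3)
items = rng.normal(size=(200, 6))                 # hypothetical variables

corr = np.corrcoef(items, rowvar=False)           # correlations among the variables
eigenvalues = np.linalg.eigvalsh(corr)[::-1]      # sorted, largest first

print(eigenvalues)
print(eigenvalues / items.shape[1])               # share of variance explained by each factor
print(eigenvalues >= 1.0)                         # the usual "eigenvalue of at least 1.0" criterion
```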
“Note: The median survival time is 5.19. Survival analysis can also examine survival rates for different “treatments” or conditions. Assume that data are available about the number of dependents that each client has. Table 18.3 is readily produced for each subset of this condition. For example, by comparing the survival rates of those with and those without dependents, the probability density figure, which shows the likelihood of an event occurring, can be obtained (Figure 18.5). This figure suggests that having dependents is associated with clients’ finding employment somewhat faster. Beyond Life Tables Life tables require that the interval (time) variable be measured on a discrete scale. When the time variable is continuous, Kaplan-Meier survival analysis is used. This procedure is quite analogous to life tables analysis. Cox regression is similar to Kaplan-Meier but allows for consideration of a larger number of independent variables (called covariates). In all instances, the purpose is to examine the effect of treatment on the survival of observations, that is, the occurrence of a dichotomous event. Figure 18.5 Probability Density FACTOR ANALYSIS A variety of statistical techniques help analysts to explore relationships in their data. These exploratory techniques typically aim to create groups of variables (or observations) that are related to each”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
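A sketch of the Kaplan-Meier and Cox procedures mentioned above, assuming the third-party lifelines package is installed; the welfare-to-work records are fabricated for illustration, with weeks as the duration, found_job as the dichotomous event, and dependents as a covariate:

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

# Hypothetical records: weeks observed, whether employment occurred, number of dependents.
df = pd.DataFrame({
    "weeks":      [2, 5, 7, 3, 9, 4, 6, 8, 1, 5],
    "found_job":  [1, 1, 0, 1, 0, 1, 1, 0, 0, 1],   # 0 = censored (employment not yet observed)
    "dependents": [0, 2, 1, 0, 3, 1, 2, 0, 1, 2],
})

kmf = KaplanMeierFitter()
kmf.fit(durations=df["weeks"], event_observed=df["found_job"])
print(kmf.median_survival_time_)          # analogous to the median survival time in a life table

cph = CoxPHFitter()
cph.fit(df, duration_col="weeks", event_col="found_job")
cph.print_summary()                       # effect of the covariate on the hazard of finding a job
```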
“observation is simply an observation for which a specified outcome has not yet occurred. Assume that data exist from a random sample of 100 clients who are seeking, or have found, employment. Survival analysis is the statistical procedure for analyzing these data. The name of this procedure stems from its use in medical research. In clinical trials, researchers want to know the survival (or disease) rate of patients as a function of the duration of their treatment. For patients in the middle of their trial, the specified outcome may not have occurred yet. We obtain the following results (also called a life table) from analyzing hypothetical data from welfare records (see Table 18.3). In the context shown in the table, the word terminal signifies that the event has occurred. That is, the client has found employment. At start time zero, 100 cases enter the interval. During the first period, there are no terminal cases and nine censored cases. Thus, 91 cases enter the next period. In this second period, 2 clients find employment and 14 are censored, resulting in 75 cases that enter the following period. The column labeled “Cumulative proportion surviving until end of interval” is an estimate of the probability of surviving (not finding employment) until the end of the stated interval.5 The column labeled “Probability density” is an estimate of the probability of the terminal event occurring (that is, finding employment) during the time interval. The results also report that “the median survival time is 5.19.” That is, half of the clients find employment in 5.19 weeks. Table 18.2 Censored Observations Note: Obs = observations (clients); Emp = employment; 0 = has not yet found employment; 1 = has found employment. Table 18.3 Life Table Results”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
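The life-table quantities described above can be retraced in a few lines using the common actuarial convention that cases censored during an interval count as exposed for half of it. The first two intervals below follow the counts quoted in the excerpt; the third interval and Table 18.3 itself are not reproduced there, so those values are hypothetical:

```python
import numpy as np

entering = np.array([100, 91, 75])   # cases entering each interval (first two from the excerpt)
terminal = np.array([0, 2, 3])       # events (employment found); third value is hypothetical
censored = np.array([9, 14, 5])      # censored during the interval; third value is hypothetical

exposed = entering - censored / 2.0            # actuarial adjustment for censoring
p_survive = 1 - terminal / exposed             # conditional probability of surviving the interval
cum_surviving = np.cumprod(p_survive)          # "cumulative proportion surviving until end of interval"
print(cum_surviving)
```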
“Assume that a welfare manager in our earlier example (see discussion of path analysis) takes a snapshot of the status of the welfare clients. Some clients may have obtained employment and others not yet. Clients will also vary as to the amount of time that they have been receiving welfare. Examine the data in Table 18.2. It shows that neither of the two clients, who have yet to complete their first week on welfare, has found employment; one of the three clients who have completed one week of welfare has found employment. Censored observations are observations for which the specified outcome has yet to occur. It is assumed that all clients who have not yet found employment are still waiting for this event to occur. Thus, the sample should not include clients who are not seeking employment. Note, however, that a censored observation is very different from one that has missing data, which might occur because the manager does not know whether the client has found employment. As with regression, records with missing data are excluded from analysis. A censored”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“regression results. Standardized Coefficients The question arises as to which independent variable has the greatest impact on explaining the dependent variable. The slope of the coefficients (b) does not answer this question because each slope is measured in different units (recall from Chapter 14 that b = ∆y/∆x). Comparing different slope coefficients is tantamount to comparing apples and oranges. However, based on the regression coefficient (or slope), it is possible to calculate the standardized coefficient, β (beta). Beta is defined as the change produced in the dependent variable by a unit of change in the independent variable when both variables are measured in terms of standard deviation units. Beta is unit-less and thus allows for comparison of the impact of different independent variables on explaining the dependent variable. Analysts compare the relative values of beta coefficients; beta has no inherent meaning. It is appropriate to compare betas across independent variables in the same regression, not across different regressions. Based on Table 15.1, we conclude that the impact of having adequate authority on explaining productivity is [(0.288 – 0.202)/0.202 =] 42.6 percent greater than that of teamwork, and about equal to that of knowledge. The impact of having adequate authority is two-and-a-half times greater than that of perceptions of fair rewards and recognition.4 F-Test Table 15.1 also features an analysis of variance (ANOVA) table. The global F-test examines the overall effect of all independent variables jointly on the dependent variable. The null hypothesis is that the overall effect of all independent variables jointly on the dependent variable is statistically insignificant. The alternate hypothesis is that this overall effect is statistically significant. The null hypothesis implies that none of the regression coefficients is statistically significant; the alternate hypothesis implies that at least one of the regression coefficients is statistically significant. The”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
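Beta is simply the slope rescaled by the ratio of standard deviations, and the comparison quoted above is plain arithmetic on the reported betas. A minimal sketch (the slope and standard deviations in the first call are invented; 0.288 and 0.202 are the betas quoted from Table 15.1):

```python
# Standardized coefficient: beta = b * (standard deviation of x / standard deviation of y)
def standardized_beta(b, sd_x, sd_y):
    return b * sd_x / sd_y

print(standardized_beta(b=0.35, sd_x=1.2, sd_y=1.46))    # hypothetical slope converted to a beta

# Comparing betas within one regression, as in the excerpt:
beta_authority, beta_teamwork = 0.288, 0.202
print((beta_authority - beta_teamwork) / beta_teamwork)  # about 0.426, i.e., 42.6 percent larger
```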
“violations of regression assumptions, and strategies for examining and remedying such assumptions. Then we extend the preceding discussion and will be able to conclude whether the above results are valid. Again, this model is not the only model that can be constructed but rather is one among a family of plausible models. Indeed, from a theoretical perspective, other variables might have been included, too. From an empirical perspective, perhaps other variables might explain more variance. Model specification is a judicious effort, requiring a balance between theoretical and statistical integrity. Statistical software programs can also automatically select independent variables based on their statistical significance, hence, adding to R-square.2 However, models with high R-square values are not necessarily better; theoretical reasons must exist for selecting independent variables, explaining why and how they might be related to the dependent variable. Knowing which variables are related empirically to the dependent variable can help narrow the selection, but such knowledge should not wholly determine it. We now turn to a discussion of the other statistics shown in Table 15.1. Getting Started Find examples of multiple regression in the research literature. Figure 15.1 Dependent Variable: Productivity FURTHER STATISTICS Goodness of Fit for Multiple Regression The model R-square in Table 15.1 is greatly increased over that shown in Table 14.1: R-square has gone from 0.074 in the simple regression model to 0.274. However, R-square has the undesirable mathematical property of increasing with the number of independent variables in the model. R-square increases regardless of whether an additional independent variable adds further explanation of the dependent variable. The adjusted R-square controls for the number of independent variables. Adjusted R-square is always equal to or less than R2. The above increase in explanation of the dependent variable is due to variables identified as statistically significant in Table 15.1. Key Point R-square is the variation in the dependent variable that is explained by all the independent variables. Adjusted R-square is often used to evaluate model explanation (or fit). Analogous to simple regression, values of adjusted R-square below 0.20 are considered to suggest weak model fit, those between 0.20 and 0.40 indicate moderate fit, those above 0.40 indicate strong fit, and those above 0.65 indicate very strong model fit. Analysts should remember that choices of model specification are driven foremost by theory, not statistical model fit; strong model fit is desirable only when the variables, and their relationships, are meaningful in some real-life sense. Adjusted R-square can assist in the variable selection process. Low values of adjusted R-square prompt analysts to ask whether they inadvertently excluded important variables from their models; if included, these variables might affect the statistical significance of those already in a model.3 Adjusted R-square also helps analysts to choose among alternative variable specifications (for example, different measures of student isolation), when such choices are no longer meaningfully informed by theory. Empirical issues of model fit then usefully guide the selection process further. Researchers typically report adjusted R-square with their”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
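Adjusted R-square has a closed form that makes the penalty for additional predictors explicit. A quick check using the R-square of 0.274 quoted above; the sample size is hypothetical because the excerpt does not report it:

```python
def adjusted_r_square(r2, n, k):
    """Adjusted R-square = 1 - (1 - R^2) * (n - 1) / (n - k - 1), for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# R-square 0.274 with 5 independent variables; n = 320 is hypothetical.
print(adjusted_r_square(r2=0.274, n=320, k=5))
```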
“Thus, multiple regression requires two important tasks: (1) specification of independent variables and (2) testing of the error term. An important difference between simple regression and multiple regression is the interpretation of the regression coefficients in multiple regression (b1, b2, b3, …) in the preceding multiple regression model. Although multiple regression produces the same basic statistics discussed in Chapter 14 (see Table 14.1), each of the regression coefficients is interpreted as its effect on the dependent variable, controlled for the effects of all of the other independent variables included in the regression. This phrase is used frequently when explaining multiple regression results. In our example, the regression coefficient b1 shows the effect of x1 on y, controlled for all other variables included in the model. Regression coefficient b2 shows the effect of x2 on y, also controlled for all other variables in the model, including x1. Multiple regression is indeed an important and relatively simple way of taking control variables into account (and much easier than the approach shown in Appendix 10.1). Key Point The regression coefficient is the effect on the dependent variable, controlled for all other independent variables in the model. Note also that the model given here is very different from estimating separate simple regression models for each of the independent variables. The regression coefficients in simple regression do not control for other independent variables, because they are not in the model. The word independent also means that each independent variable should be relatively unaffected by other independent variables in the model. To ensure that independent variables are indeed independent, it is useful to think of the distinctively different types (or categories) of factors that affect a dependent variable. This was the approach taken in the preceding example. There is also a statistical reason for ensuring that independent variables are as independent as possible. When two independent variables are highly correlated with each other (r2 > .60), it sometimes becomes statistically impossible to distinguish the effect of each independent variable on the dependent variable, controlled for the other. The variables are statistically too similar to discern disparate effects. This problem is called multicollinearity and is discussed later in this chapter. This problem is avoided by choosing independent variables that are not highly correlated with each other. A WORKING EXAMPLE Previously (see Chapter 14), the management analyst with the Department of Defense found a statistically significant relationship between teamwork and perceived facility productivity (p <.01). The analyst now wishes to examine whether the impact of teamwork on productivity is robust when controlled for other factors that also affect productivity. This interest is heightened by the low R-square (R2 = 0.074) in Table 14.1, suggesting a weak relationship between teamwork and perceived productivity. A multiple regression model is specified to include the effects of other factors that affect perceived productivity. 
Thinking about other categories of variables that could affect productivity, the analyst hypothesizes the following: (1) the extent to which employees have adequate technical knowledge to do their jobs, (2) perceptions of having adequate authority to do one’s job well (for example, decision-making flexibility), (3) perceptions that rewards and recognition are distributed fairly (always important for motivation), and (4) the number of sick days. Various items from the employee survey are used to measure these concepts (as discussed in the workbook documentation for the Productivity dataset). After including these factors as additional independent variables, the result shown in Table 15.1 is”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
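A sketch of the working example's specification in statsmodels; the productivity data are not reproduced in the excerpt, so the DataFrame below is simulated, with variable names mirroring the hypothesized factors:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 300
df = pd.DataFrame({
    "teamwork":  rng.normal(5, 1, n),
    "knowledge": rng.normal(5, 1, n),
    "authority": rng.normal(5, 1, n),
    "fairness":  rng.normal(5, 1, n),
    "sick_days": rng.poisson(4, n),
})
df["productivity"] = (4 + 0.2 * df["teamwork"] + 0.3 * df["authority"]
                      + 0.3 * df["knowledge"] + rng.normal(0, 0.8, n))

# Each coefficient is the effect on productivity controlled for the other independent variables.
model = smf.ols("productivity ~ teamwork + knowledge + authority + fairness + sick_days",
                data=df).fit()
print(model.summary())     # includes R-square, adjusted R-square, and the global F-test
```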
“single or index variables. As an example, consider the dependent variable “high school violence,” discussed in Chapter 2. We ask: “What are the most important, distinct factors affecting or causing high school violence?” Some plausible factors are (1) student access to weapons, (2) student isolation from others, (3) peer groups that are prone to violence, (4) lack of enforcement of school nonviolence policies, (5) participation in anger management programs, and (6) familiarity with warning signals (among teachers and staff). Perhaps you can think of other factors. Then, following the strategies discussed in Chapter 3—conceptualization, operationalization, and index variable construction—we use either single variables or index measures as independent variables to measure each of these factors. This approach provides for the inclusion of programs or policies as independent variables, as well as variables that measure salient rival hypotheses. The strategy of full model specification requires that analysts not overlook important factors. Thus, analysts do well to carefully justify their model and to consult past studies and interview those who have direct experience with, or other opinions about, the research subject. Doing so might lead analysts to include additional variables, such as the socioeconomic status of students’ parents. Then, after a fully specified model has been identified, analysts often include additional variables of interest. These may be variables of lesser relevance, speculative consequences, or variables that analysts want to test for their lack of impact, such as rival hypotheses. Demographic variables, such as the age of students, might be added. When additional variables are included, analysts should identify which independent variables constitute the nomothetic explanation, and which serve some other purpose. Remember, all variables included in models must be theoretically justified. Analysts must argue how each variable could plausibly affect their dependent variable. The second part of “all of the variables that affect the dependent variable” acknowledges all of the other variables that are not identified (or included) in the model. They are omitted; these variables are not among “the most important factors” that affect the dependent variable. The cumulative effect of these other variables is, by definition, contained in the error term, described later in this chapter. The assumption of full model specification is that these other variables are justifiably omitted only when their cumulative effect on the dependent variable is zero. This approach is plausible because each of these many unknown variables may have a different magnitude, thus making it possible that their effects cancel each other out. The argument, quite clearly, is not that each of these other factors has no impact on the dependent variable—but only that their cumulative effect is zero. The validity of multiple regression models centers on examining the behavior of the error term in this regard. If the cumulative effect of all the other variables is not zero, then additional independent variables may have to be considered. The specification of the multiple regression model is as follows:”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
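The excerpt breaks off at the model specification itself. In standard notation (a reconstruction, not the book's own display), a multiple regression model with k independent variables is written:

```latex
y = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k + e
```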
“variable. In social science, this is called a nomothetic mode of explanation—the isolation of the most important factors. This approach is consistent with the philosophy of seeking complete but parsimonious explanations in science.1 The second part involves addressing those variables that were not considered as being of most relevance. Regarding the first part, the specification of the “most important” independent variables is a judicious undertaking. The use of a nomothetic strategy implies that a range of plausible models exists—different analysts may identify different sets of “most important” independent variables. Analysts should ask which different factors are most likely to affect or cause their dependent variable, and they are likely to justify, identify, and operationalize their choices differently. Thus, the term full model specification does not imply that only one model or even a best model exists, but rather it refers to a family of plausible models. Most researchers agree that specification should (1) be driven by theory, that is, by persuasive arguments and perspectives that identify and justify which factors are most important, and (2) inform why the set of such variables is regarded as complete and parsimonious. In practice, the search for complete, parsimonious, and theory-driven explanations usually results in multiple regression models with about 5–12 independent variables; theory seldom results in less than 5 variables, and parsimony and problems of statistical estimation, discussed further, seldom result in models with more than 12. Key Point We cannot examine the effect of all possible variables. Rather, we focus on the most relevant ones. The search for parsimonious explanations often leads analysts to first identify different categories of factors that most affect their dependent variable. Then, after these categories of factors have been identified, analysts turn to the task of trying to measure each, through either single or index variables. As an example,”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“regression as dummy variables Explain the importance of the error term plot Identify assumptions of regression, and know how to test and correct assumption violations Multiple regression is one of the most widely used multivariate statistical techniques for analyzing three or more variables. This chapter uses multiple regression to examine such relationships, and thereby extends the discussion in Chapter 14. The popularity of multiple regression is due largely to the ease with which it takes control variables (or rival hypotheses) into account. In Chapter 10, we discussed briefly how contingency tables can be used for this purpose, but doing so is often a cumbersome and sometimes inconclusive effort. By contrast, multiple regression easily incorporates multiple independent variables. Another reason for its popularity is that it also takes into account nominal independent variables. However, multiple regression is no substitute for bivariate analysis. Indeed, managers or analysts with an interest in a specific bivariate relationship will conduct a bivariate analysis first, before examining whether the relationship is robust in the presence of numerous control variables. And before conducting bivariate analysis, analysts need to conduct univariate analysis to better understand their variables. Thus, multiple regression is usually one of the last steps of analysis. Indeed, multiple regression is often used to test the robustness of bivariate relationships when control variables are taken into account. The flexibility with which multiple regression takes control variables into account comes at a price, though. Regression, like the t-test, is based on numerous assumptions. Regression results cannot be assumed to be robust in the face of assumption violations. Testing of assumptions is always part of multiple regression analysis. Multiple regression is carried out in the following sequence: (1) model specification (that is, identification of dependent and independent variables), (2) testing of regression assumptions, (3) correction of assumption violations, if any, and (4) reporting of the results of the final regression model. This chapter examines these four steps and discusses essential concepts related to simple and multiple regression. Chapters 16 and 17 extend this discussion by examining the use of logistic regression and time series analysis. MODEL SPECIFICATION Multiple regression is an extension of simple regression, but an important difference exists between the two methods: multiple regression aims for full model specification. This means that analysts seek to account for all of the variables that affect the dependent variable; by contrast, simple regression examines the effect of only one independent variable. Philosophically, the phrase identifying the key difference—“all of the variables that affect the dependent variable”—is divided into two parts. The first part involves identifying the variables that are of most (theoretical and practical) relevance in explaining the dependent”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“Multiple Regression   CHAPTER OBJECTIVES After reading this chapter, you should be able to Understand multiple regression as a full model specification technique Interpret standardized and unstandardized regression coefficients of multiple regression Know how to use nominal variables in”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“to the measures described earlier. Hence, 90 percent of the variation in one variable can be explained by the other. For the variables given earlier, the Spearman’s rank correlation coefficient is .274 (p < .01), which is comparable to r reported in preceding sections. Box 14.1 illustrates another use of the statistics described in this chapter, in a study of the relationship between crime and poverty. SUMMARY When analysts examine relationships between two continuous variables, they can use simple regression or the Pearson’s correlation coefficient. Both measures show (1) the statistical significance of the relationship, (2) the direction of the relationship (that is, whether it is positive or negative), and (3) the strength of the relationship. Simple regression assumes a causal and linear relationship between the continuous variables. The statistical significance and direction of the slope coefficient is used to assess the statistical significance and direction of the relationship. The coefficient of determination, R2, is used to assess the strength of relationships; R2 is interpreted as the percent variation explained. Regression is a foundation for studying relationships involving three or more variables, such as control variables. The Pearson’s correlation coefficient does not assume causality between two continuous variables. A nonparametric alternative to testing the relationship between two continuous variables is the Spearman’s rank correlation coefficient, which examines correlation among the ranks of the data rather than among the values themselves. As such, this measure can also be used to study relationships in which one or both variables are ordinal. KEY TERMS   Coefficient of determination, R2 Error term Observed value of y Pearson’s correlation coefficient, r Predicted value of the dependent variable y, ŷ Regression coefficient Regression line Scatterplot Simple regression assumptions Spearman’s rank correlation coefficient Standard error of the estimate Test of significance of the regression coefficient Notes   1. See Chapter 3 for a definition of continuous variables. Although the distinction between ordinal and continuous is theoretical (namely, whether or not the distance between categories can be measured), in practice ordinal-level variables with seven or more categories (including Likert variables) are sometimes analyzed using statistics appropriate for interval-level variables. This practice has many critics because it violates an assumption of regression (interval data), but it is often”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“COEFFICIENT The nonparametric alternative, Spearman’s rank correlation coefficient (r, or “rho”), looks at correlation among the ranks of the data rather than among the values. The ranks of data are determined as shown in Table 14.2 (adapted from Table 11.8): Table 14.2 Ranks of Two Variables In Greater Depth … Box 14.1 Crime and Poverty An analyst wants to examine empirically the relationship between crime and income in cities across the United States. The CD that accompanies the workbook Exercising Essential Statistics includes a Community Indicators dataset with assorted indicators of conditions in 98 cities such as Akron, Ohio; Phoenix, Arizona; New Orleans, Louisiana; and Seattle, Washington. The measures include median household income, total population (both from the 2000 U.S. Census), and total violent crimes (FBI, Uniform Crime Reporting, 2004). In the sample, household income ranges from $26,309 (Newark, New Jersey) to $71,765 (San Jose, California), and the median household income is $42,316. Per-capita violent crime ranges from 0.15 percent (Glendale, California) to 2.04 percent (Las Vegas, Nevada), and the median violent crime rate per capita is 0.78 percent. There are four types of violent crimes: murder and nonnegligent manslaughter, forcible rape, robbery, and aggravated assault. A measure of total violent crime per capita is calculated because larger cities are apt to have more crime. The analyst wants to examine whether income is associated with per-capita violent crime. The scatterplot of these two continuous variables shows that a negative relationship appears to be present: The Pearson’s correlation coefficient is –.532 (p < .01), and the Spearman’s correlation coefficient is –.552 (p < .01). The simple regression model shows R2 = .283. The regression model is as follows (t-test statistic in parentheses): The regression line is shown on the scatterplot. Interpreting these results, we see that the R-square value of .283 indicates a moderate relationship between these two variables. Clearly, some cities with modest median household incomes have a high crime rate. However, removing these cities does not greatly alter the findings. Also, an assumption of regression is that the error term is normally distributed, and further examination of the error shows that it is somewhat skewed. The techniques for examining the distribution of the error term are discussed in Chapter 15, but again, addressing this problem does not significantly alter the finding that the two variables are significantly related to each other, and that the relationship is of moderate strength. With this result in hand, further analysis shows, for example, by how much violent crime decreases for each increase in household income. For each increase of $10,000 in average household income, the violent crime rate drops 0.25 percent. For a city experiencing the median 0.78 percent crime rate, this would be a considerable improvement, indeed. Note also that the scatterplot shows considerable variation in the crime rate for cities at or below the median household income, in contrast to those well above it. Policy analysts may well wish to examine conditions that give rise to variation in crime rates among cities with lower incomes. 
Because Spearman’s rank correlation coefficient examines correlation among the ranks of variables, it can also be used with ordinal-level data.9 For the data in Table 14.2, Spearman’s rank correlation coefficient is .900 (p = .035).10 Spearman’s rho-squared coefficient has a “percent variation explained” interpretation, similar
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
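Spearman's coefficient is available directly in SciPy. A minimal sketch; the paired values below are invented, since Table 14.2 is not reproduced in the excerpt, and with real data the call is identical:

```python
from scipy.stats import spearmanr

x = [2, 7, 4, 9, 12]          # hypothetical values of the first variable
y = [5, 6, 11, 13, 18]        # hypothetical values of the second variable

rho, p_value = spearmanr(x, y)   # correlation computed on the ranks of the data
print(rho, p_value)
print(rho ** 2)                  # rho-squared: the "percent variation explained" interpretation
```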
“(e). Hence the expressions are equivalent, as is y = ŷ + e. Certain assumptions about e are important, such as that it is normally distributed. When error term assumptions are violated, incorrect conclusions may be made about the statistical significance of relationships. This important issue is discussed in greater detail in Chapter 15 and, for time series data, in Chapter 17. Hence, the above is a pertinent but incomplete list of assumptions. Getting Started Conduct a simple regression, and practice writing up your results. PEARSON’S CORRELATION COEFFICIENT Pearson’s correlation coefficient, r, measures the association (significance, direction, and strength) between two continuous variables. Also called the Pearson’s product-moment correlation coefficient, it does not assume a causal relationship, as does simple regression. The correlation coefficient indicates the extent to which the observations lie closely or loosely clustered around the regression line. The coefficient r ranges from –1 to +1. The sign indicates the direction of the relationship, which, in simple regression, is always the same as the slope coefficient. A “–1” indicates a perfect negative relationship, that is, that all observations lie exactly on a downward-sloping regression line; a “+1” indicates a perfect positive relationship, whereby all observations lie exactly on an upward-sloping regression line. Of course, such values are rarely obtained in practice because observations seldom lie exactly on a line. An r value of zero indicates that observations are so widely scattered that it is impossible to draw any well-fitting line. Figure 14.2 illustrates some values of r. Key Point Pearson’s correlation coefficient, r, ranges from –1 to +1. It is important to avoid confusion between Pearson’s correlation coefficient and the coefficient of determination. For the two-variable, simple regression model, r2 = R2, but whereas 0 ≤ R ≤ 1, r ranges from –1 to +1. Hence, the sign of r tells us whether a relationship is positive or negative, but the sign of R, in regression output tables such as Table 14.1, is always positive and cannot inform us about the direction of the relationship. In simple regression, the regression coefficient, b, informs us about the direction of the relationship. Statistical software programs usually show r rather than r2. Note also that the Pearson’s correlation coefficient can be used only to assess the association between two continuous variables, whereas regression can be extended to deal with more than two variables, as discussed in Chapter 15. Pearson’s correlation coefficient assumes that both variables are normally distributed. When Pearson’s correlation coefficients are calculated, a standard error of r can be determined, which then allows us to test the statistical significance of the bivariate correlation. For bivariate relationships, this is the same level of significance as shown for the slope of the regression coefficient. For the variables given earlier in this chapter, the value of r is .272 and the statistical significance of r is p ≤ .01. Use of the Pearson’s correlation coefficient assumes that the variables are normally distributed and that there are no significant departures from linearity.7 It is important not to confuse the correlation coefficient, r, with the regression coefficient, b. Comparing the measures r and b (the slope) sometimes causes confusion. 
The key point is that r does not indicate the regression slope but rather the extent to which observations lie close to it. A steep regression line (large b) can have observations scattered loosely or closely around it, as can a shallow (more horizontal) regression line. The purposes of these two statistics are very different.8 SPEARMAN’S RANK CORRELATION”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
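The identity r² = R² for the two-variable model, and the matching signs of r and the slope b, are easy to verify numerically. A minimal sketch on hypothetical data:

```python
import numpy as np
from scipy.stats import pearsonr, linregress

rng = np.random.default_rng(5)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(scale=1.0, size=50)   # hypothetical linear relationship with noise

r, p = pearsonr(x, y)           # direction and strength of the association
fit = linregress(x, y)          # simple regression of y on x

print(r, p)
print(r ** 2, fit.rvalue ** 2)  # r-squared equals the simple-regression R-square
print(fit.slope)                # the sign of b matches the sign of r
```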
“relationships are nonlinear (parabolic or otherwise heavily curved), it is not appropriate to use linear regression. Then, one or both variables must be transformed, as discussed in Chapter 12. Second, simple regression assumes that the linear relationship is constant over the range of observations. This assumption is violated when the relationship is “broken,” for example, by having an upward slope for the first half of independent variable values and a downward slope over the remaining values. Then, analysts should consider using two regression models each for these different, linear relationships. The linearity assumption is also violated when no relationship is present in part of the independent variable values. This is particularly problematic because regression analysis will calculate a regression slope based on all observations. In this case, analysts may be misled into believing that the linear pattern holds for all observations. Hence, regression results always should be verified through visual inspection. Third, simple regression assumes that the variables are continuous. In Chapter 15, we will see that regression can also be used for nominal and dichotomous independent variables. The dependent variable, however, must be continuous. When the dependent variable is dichotomous, logistic regression should be used (Chapter 16). Figure 14.2 Three Examples of r The following notations are commonly used in regression analysis. The predicted value of y (defined, based on the regression model, as y = a + bx) is typically different from the observed value of y. The predicted value of the dependent variable y is sometimes indicated as ŷ (pronounced “y-hat”). Only when R2 = 1 are the observed and predicted values identical for each observation. The difference between y and ŷ is called the regression error or error term”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
“Table 14.1 also shows R-square (R2), which is called the coefficient of determination. R-square is of great interest: its value is interpreted as the percentage of variation in the dependent variable that is explained by the independent variable. R-square varies from zero to one, and is called a goodness-of-fit measure.5 In our example, teamwork explains only 7.4 percent of the variation in productivity. Although teamwork is significantly associated with productivity, it is quite likely that other factors also affect it. It is conceivable that other factors might be more strongly associated with productivity and that, when controlled for other factors, teamwork is no longer significant. Typically, values of R2 below 0.20 are considered to indicate weak relationships, those between 0.20 and 0.40 indicate moderate relationships, and those above 0.40 indicate strong relationships. Values of R2 above 0.65 are considered to indicate very strong relationships. R is called the multiple correlation coefficient and is always 0 ≤ R ≤ 1. To summarize up to this point, simple regression provides three critically important pieces of information about bivariate relationships involving two continuous variables: (1) the level of significance at which two variables are associated, if at all (t-statistic), (2) whether the relationship between the two variables is positive or negative (b), and (3) the strength of the relationship (R2). Key Point R-square is a measure of the strength of the relationship. Its value goes from 0 to 1. The primary purpose of regression analysis is hypothesis testing, not prediction. In our example, the regression model is used to test the hypothesis that teamwork is related to productivity. However, if the analyst wants to predict the variable “productivity,” the regression output also shows the SEE, or the standard error of the estimate (see Table 14.1). This is a measure of the spread of y values around the regression line as calculated for the mean value of the independent variable, only, and assuming a large sample. The standard error of the estimate has an interpretation in terms of the normal curve, that is, 68 percent of y values lie within one standard error from the calculated value of y, as calculated for the mean value of x using the preceding regression model. Thus, if the mean index value of the variable “teamwork” is 5.0, then the calculated (or predicted) value of “productivity” is [4.026 + 0.223*5 =] 5.141. Because SEE = 0.825, it follows that 68 percent of productivity values will lie ±0.825 from 5.141 when “teamwork” = 5. Predictions of y for other values of x have larger standard errors.6 Assumptions and Notation There are three simple regression assumptions. First, simple regression assumes that the relationship between two variables is linear. The linearity of bivariate relationships is easily determined through visual inspection, as shown in Figure 14.2. In fact, all analysis of relationships involving continuous variables should begin with a scatterplot. When variable”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
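The prediction arithmetic in the excerpt can be retraced from the reported coefficients (intercept 4.026, slope 0.223, SEE 0.825):

```python
intercept, slope, see = 4.026, 0.223, 0.825   # values quoted from Table 14.1

teamwork = 5.0
predicted = intercept + slope * teamwork      # 4.026 + 0.223 * 5 = 5.141
print(predicted)

# Roughly 68 percent of productivity values are expected within one SEE of the prediction
# (at the mean of x, for a large sample, as the excerpt notes).
print(predicted - see, predicted + see)
```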
“regression line will have larger standard deviations and, hence, larger standard errors. The computer calculates the slope, intercept, standard error of the slope, and the level at which the slope is statistically significant. Key Point The significance of the slope tests the relationship. Consider the following example. A management analyst with the Department of Defense wishes to evaluate the impact of teamwork on the productivity of naval shipyard repair facilities. Although all shipyards are required to use teamwork management strategies, these strategies are assumed to vary in practice. Coincidentally, a recently implemented employee survey asked about the perceived use and effectiveness of teamwork. These items have been aggregated into a single index variable that measures teamwork. Employees were also asked questions about perceived performance, as measured by productivity, customer orientation, planning and scheduling, and employee motivation. These items were combined into an index measure of work productivity. Both index measures are continuous variables. The analyst wants to know whether a relationship exists between perceived productivity and teamwork. Table 14.1 shows the computer output obtained from a simple regression. The slope, b, is 0.223; the slope coefficient of teamwork is positive; and the slope is significant at the 1 percent level. Thus, perceptions of teamwork are positively associated with productivity. The t-test statistic, 5.053, is calculated as 0.223/0.044 (rounding errors explain the difference from the printed value of t). Other statistics shown in Table 14.1 are discussed below. The appropriate notation for this relationship is shown below. Either the t-test statistic or the standard error should be shown in parentheses, directly below the regression coefficient; analysts should state which statistic is shown. Here, we show the t-test statistic:3 The level of significance of the regression coefficient is indicated with asterisks, which conforms to the p-value legend that should also be shown. Typically, two asterisks are used to indicate a 1 percent level of significance, one asterisk for a 5 percent level of significance, and no asterisk for coefficients that are insignificant.4 Table 14.1 Simple Regression Output Note: SEE = standard error of the estimate; SE = standard error; Sig. = significance.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
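The displayed regression equation is not reproduced in the excerpt. Based on the coefficients quoted here and in the surrounding passages (intercept 4.026, slope 0.223, t = 5.053, significant at the 1 percent level, marked by two asterisks per the legend convention described above), it presumably reads along these lines (a reconstruction, not the book's own display):

```latex
\text{PRODUCTIVITY} = 4.026 + 0.223\,\text{TEAMWORK}^{**} \qquad (t = 5.053)
```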
“knowledge of cardiovascular disease and whether such knowledge reduces behaviors that put people at risk for cardiovascular disease. Simple regression is used to analyze the relationship between two continuous variables. Continuous variables assume that the distances between ordered categories are determinable.1 In simple regression, one variable is defined as the dependent variable and the other as the independent variable (see Chapter 2 for the definitions). In the current example, the level of knowledge obtained from workshops and other sources might be measured on a continuous scale and treated as an independent variable, and behaviors that put people at risk for cardiovascular disease might also be measured on a continuous scale and treated as a dependent variable. Scatterplot The relationship between two continuous variables can be portrayed in a scatterplot. A scatterplot is merely a plot of the data points for two continuous variables, as shown in Figure 14.1 (without the straight line). By convention, the dependent variable is shown on the vertical (or Y-) axis, and the independent variable on the horizontal (or X-) axis. The relationship between the two variables is estimated as a straight line relationship. The line is defined by the equation y = a + bx, where a is the intercept (or constant), and b is the slope (Figure 14.1 Scatterplot). The slope, b, is defined as ∆y/∆x, or (y2 – y1)/(x2 – x1). The line is calculated mathematically such that the sum of distances from each observation to the line is minimized.2 By definition, the slope indicates the change in y as a result of a unit change in x. The straight line, defined by y = a + bx, is also called the regression line, and the slope (b) is called the regression coefficient. A positive regression coefficient indicates a positive relationship between the variables, shown by the upward slope in Figure 14.1. A negative regression coefficient indicates a negative relationship between the variables and is indicated by a downward-sloping line. Test of Significance The test of significance of the regression coefficient is a key test that tells us whether the slope (b) is statistically different from zero. The slope is calculated from a sample, and we wish to know whether it is significant. When the regression line is horizontal (b = 0), no relationship exists between the two variables. Then, changes in the independent variable have no effect on the dependent variable. The following hypotheses are thus stated: H0: b = 0, or the two variables are unrelated. HA: b ≠ 0, or the two variables are (positively or negatively) related. To determine whether the slope equals zero, a t-test is performed. The test statistic is defined as the slope, b, divided by the standard error of the slope, se(b). The standard error of the slope is a measure of the distribution of the observations around the regression slope, which is based on the standard deviation of those observations to the regression line: Thus, a regression line with a small slope is more likely to be statistically significant when observations lie closely around it (that is, the standard error of the observations around the line is also small, resulting in a larger test statistic). By contrast, the same regression line might be statistically insignificant when observations are scattered widely around it. Observations that lie farther from the”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
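The slope, intercept, and the t-test on the slope described above are all returned by a single SciPy call. A minimal sketch on hypothetical data:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(6)
x = rng.normal(5, 1, 100)                        # hypothetical independent variable
y = 4 + 0.25 * x + rng.normal(0, 0.8, 100)       # hypothetical dependent variable

fit = linregress(x, y)
print(fit.intercept, fit.slope)     # a and b in y = a + bx
print(fit.slope / fit.stderr)       # t statistic: slope divided by its standard error
print(fit.pvalue)                   # significance test of H0: b = 0
```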
“Simple Regression   CHAPTER OBJECTIVES After reading this chapter, you should be able to Use simple regression to test the statistical significance of a bivariate relationship involving one dependent and one independent variable Use Pearson’s correlation coefficient as a measure of association between two continuous variables Interpret statistics associated with regression analysis Write up the model of simple regression Assess assumptions of simple regression This chapter completes our discussion of statistical techniques for studying relationships between two variables by focusing on those that are continuous. Several approaches are examined: simple regression; the Pearson’s correlation coefficient; and a nonparametric alternative, Spearman’s rank correlation coefficient. Although all three techniques can be used, we focus particularly on simple regression. Regression allows us to predict outcomes based on knowledge of an independent variable. It is also the foundation for studying relationships among three or more variables, including control variables mentioned in Chapter 2 on research design (and also in Appendix 10.1). Regression can also be used in time series analysis, discussed in Chapter 17. We begin with simple regression. SIMPLE REGRESSION Let’s first look at an example. Say that you are a manager or analyst involved with a regional consortium of 15 local public agencies (in cities and counties) that provide low-income adults with health education about cardiovascular diseases, in an effort to reduce such diseases. The funding for this health education comes from a federal grant that requires annual analysis and performance outcome reporting. In Chapter 4, we used a logic model to specify that a performance outcome is the result of inputs, activities, and outputs. Following the development of such a model, you decide to conduct a survey among participants who attend such training events to collect data about the number of events they attended, their knowledge of cardiovascular disease, and a variety of habits such as smoking that are linked to cardiovascular disease. Some things that you might want to know are whether attending workshops increases”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
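The two correlation measures named in this passage, Pearson’s r and its rank-based alternative, Spearman’s rho, can be sketched quickly outside SPSS. The snippet below is an illustration on hypothetical data, not the book’s own example; the variable names are invented.

```python
# Minimal sketch of Pearson's r and Spearman's rank correlation (hypothetical data).
from scipy import stats

events_attended = [1, 2, 2, 3, 4, 5, 6, 8]     # hypothetical independent variable
knowledge_score = [3, 4, 6, 5, 7, 8, 8, 9]     # hypothetical dependent variable

r, p_r = stats.pearsonr(events_attended, knowledge_score)        # parametric measure
rho, p_rho = stats.spearmanr(events_attended, knowledge_score)   # nonparametric, rank-based

print(f"Pearson r = {r:.2f} (p = {p_r:.3f}); Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
```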
“A NONPARAMETRIC ALTERNATIVE A nonparametric alternative to one-way ANOVA is Kruskal-Wallis’ H test of one-way ANOVA. Instead of using the actual values of the variables, Kruskal-Wallis’ H test assigns ranks to the variables, as shown in Chapter 11. As a nonparametric method, Kruskal-Wallis’ H test does not assume normal populations, but the test does assume similarly shaped distributions for each group. This test is applied readily to our one-way ANOVA example, and the results are shown in Table 13.5. Table 13.5 Kruskal-Wallis’ H-Test of One-Way ANOVA Kruskal-Wallis’ H one-way ANOVA test shows that population is significantly associated with watershed loss (p = .013). This is one instance in which the general rule that nonparametric tests have higher levels of significance is not seen. Although Kruskal-Wallis’ H test does not report mean values of the dependent variable, the pattern of mean ranks is consistent with Figure 13.2. A limitation of this nonparametric test is that it does not provide post-hoc tests or analysis of homogeneous groups, nor are there nonparametric counterparts to n-way ANOVA tests such as the two-way ANOVA test described earlier. SUMMARY One-way ANOVA extends the t-test by allowing analysts to test whether two or more groups have different means of a continuous variable. The t-test is limited to only two groups. One-way ANOVA can be used, for example, when analysts want to know if the mean of a variable varies across regions, racial or ethnic groups, population or employee categories, or another grouping with multiple categories. ANOVA is a family of statistical techniques, and one-way ANOVA is the most basic of these methods. ANOVA is a parametric test that makes the following assumptions: The dependent variable is continuous. The independent variable is ordinal or nominal. The groups have equal variances. The variable is normally distributed in each of the groups. Relative to the t-test, ANOVA requires more attention to the assumptions of normality and homogeneity. ANOVA is not robust to the presence of outliers, and it appears to be less robust than the t-test for deviations from normality. Variable transformations and the removal of outliers are to be expected when using ANOVA. ANOVA also includes three other types of tests of interest: post-hoc tests of mean differences among categories, tests of homogeneous subsets, and tests for the linearity of mean differences across categories. Two-way ANOVA addresses the effect of two independent variables on a continuous dependent variable. When using two-way ANOVA, the analyst is able to distinguish main effects from interaction effects. Kruskal-Wallis’ H test is a nonparametric alternative to one-way ANOVA. KEY TERMS   Analysis of variance (ANOVA) ANOVA assumptions Covariates Factors Global F-test Homogeneous subsets Interaction effect Kruskal-Wallis’ H test of one-way ANOVA Main effect One-way ANOVA Post-hoc test Two-way ANOVA Notes   1. The between-group variance is”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
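The Kruskal-Wallis’ H test described above is available outside SPSS as well. The following sketch runs it on hypothetical group data; the group names and values are invented for illustration only.

```python
# Minimal sketch of Kruskal-Wallis' H test across several groups (hypothetical data).
from scipy import stats

small = [0.4, 0.9, 1.1, 1.6, 2.0]
medium_1 = [0.8, 1.2, 1.5, 2.1, 2.6]
medium_2 = [1.0, 1.7, 2.2, 2.8, 3.1]
large = [1.9, 2.4, 3.0, 3.6, 4.2]

h_stat, p_value = stats.kruskal(small, medium_1, medium_2, large)
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
# The test works on ranks, so it produces no post-hoc or homogeneous-subset output.
```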
“Beyond One-Way ANOVA The approach described in the preceding section is called one-way ANOVA. This scenario is easily generalized to accommodate more than one independent variable. These independent variables are either discrete (called factors) or continuous (called covariates). These approaches are called n-way ANOVA or ANCOVA (the “C” indicates the presence of covariates). Two-way ANOVA, for example, allows for testing of the effect of two different independent variables on the dependent variable, as well as the interaction of these two independent variables. An interaction effect between two variables describes the way that variables “work together” to have an effect on the dependent variable. This is perhaps best illustrated by an example. Suppose that an analyst wants to know whether the number of health care information workshops attended, as well as a person’s education, are associated with healthy lifestyle behaviors. Although we can surely theorize how attending health care information workshops and a person’s education can each affect an individual’s healthy lifestyle behaviors, it is also easy to see that the level of education can affect a person’s propensity for attending health care information workshops, as well. Hence, an interaction effect could also exist between these two independent variables (factors). The effects of each independent variable on the dependent variable are called main effects (as distinct from interaction effects). To continue the earlier example, suppose that in addition to population, an analyst also wants to consider a measure of the watershed’s preexisting condition, such as the number of plant and animal species at risk in the watershed. Two-way ANOVA produces the results shown in Table 13.4, using the transformed variable mentioned earlier. The first row, labeled “model,” refers to the combined effects of all main and interaction effects in the model on the dependent variable. This is the global F-test. The “model” row shows that the two main effects and the single interaction effect, when considered together, are significantly associated with changes in the dependent variable (p < .001). However, the results also show a reduced significance level of “population” (now, p = .064), which seems related to the interaction effect (p = .076). Although neither effect is significant at conventional levels, the results do suggest that an interaction effect is present between population and watershed condition (of which the number of at-risk species is an indicator) on watershed wetland loss. Post-hoc tests are provided only separately for each of the independent variables (factors), and the results show the same homogeneous grouping for both of the independent variables. Table 13.4 Two-Way ANOVA Results As we noted earlier, ANOVA is a family of statistical techniques that allow for a broad range of rather complex experimental designs. Complete coverage of these techniques is well beyond the scope of this book, but in general, many of these techniques aim to discern the effect of variables in the presence of other (control) variables. ANOVA is but one approach for addressing control variables. A far more common approach in public policy, economics, political science, and public administration (as well as in many other fields) is multiple regression (see Chapter 15). Many analysts feel that ANOVA and regression are largely equivalent. Historically, the preference for ANOVA stems from its uses in medical and agricultural research, with applications in education and psychology. Finally, the ANOVA approach can be generalized to allow for testing on two or more dependent variables. This approach is called multivariate analysis of variance, or MANOVA. Regression-based analysis can also be used for dealing with multiple dependent variables, as mentioned in Chapter 17.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
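A two-way ANOVA with main effects and an interaction, as discussed in this passage, can be sketched as follows. The column names (loss, population, condition) and the data are hypothetical, and the sketch uses statsmodels rather than the SPSS workflow the book describes.

```python
# Minimal sketch of a two-way ANOVA with an interaction term (hypothetical data).
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "loss": [0.9, 1.2, 1.0, 2.1, 2.4, 1.8, 1.1, 1.5, 1.3, 2.9, 3.2, 2.6],
    "population": ["small"] * 6 + ["large"] * 6,
    "condition": ["good", "good", "poor", "good", "poor", "poor"] * 2,
})

# "*" expands to both main effects plus the population-by-condition interaction.
model = ols("loss ~ C(population) * C(condition)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```

Type II sums of squares (typ=2) are a common choice for balanced designs such as this hypothetical one; the output lists each main effect, the interaction effect, and their F statistics and p-values.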
“Scheffe tests also produce “homogeneous subsets,” that is, groups that have statistically identical means. Both the three largest and the three smallest populations have identical means. The Tukey levels of statistical significance are, respectively, .725 and .165 (both > .05). This is shown in Table 13.3. Figure 13.2 Group Boxplots Table 13.2 ANOVA Table Third, is the increase in means linear? This test is an option on many statistical software packages that produces an additional line of output in the ANOVA table, called the “linear term for unweighted sum of squares,” with the appropriate F-test. Here, that F-test statistic is 7.85, p = .006 < .01, and so we conclude that the apparent linear increase is indeed significant: wetland loss is linearly associated with the increased surrounding population of watersheds.8 Figure 13.2 does not clearly show this, but the enlarged Y-axis in Figure 13.3 does. Fourth, are our findings robust? One concern is that the statistical validity is affected by observations that statistically (although not substantively) are outliers. Removing the seven outliers identified earlier does not affect our conclusions. The resulting variable remains normally distributed, and there are no (new) outliers for any group. The resulting variable has equal variances across the groups (Levene’s test = 1.03, p = .38 > .05). The global F-test is 3.44 (p = .019 < .05), and the Bonferroni post-hoc test similarly finds that only the differences between the “Small” and “Large” group means are significant (p = .031). The increase remains linear (F = 6.74, p = .011 < .05). Thus, we conclude that the presence of observations with large values does not alter our conclusions. Table 13.3 Homogeneous Subsets Figure 13.3 Watershed Loss, by Population We also test the robustness of conclusions for different variable transformations. The extreme skewness of the untransformed variable allows for only a limited range of root transformations that produce normality. Within this range (power 0.222 through 0.275), the preceding conclusions are replicated fully. Natural log and base-10 log transformations also result in normality and replicate these results, except that the post-hoc tests fail to identify that the means of the “Large” and “Small” groups are significantly different. However, the global F-test is (marginally) significant (F = 2.80, p = .043 < .05), which suggests that this difference is too small to detect with this transformation. A single, independent-samples t-test for this difference is significant (t = 2.47, p = .017 < .05), suggesting that this problem may have been exacerbated by the limited number of observations. In sum, we find converging evidence for our conclusions. As this example also shows, when using statistics, analysts frequently must exercise judgment and justify their decisions.9 Finally, what is the practical significance of this analysis? The wetland loss among watersheds with large surrounding populations is [(3.21 – 2.52)/2.52 =] 27.4 percent greater than among those surrounded by small populations. It is up to managers and elected officials to determine whether a difference of this magnitude warrants intervention in watersheds with large surrounding populations.10”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
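The post-hoc comparisons referred to in this passage (Tukey, Bonferroni) can be sketched outside SPSS as shown below. The data and group labels are hypothetical; the Bonferroni step here is a generic pairwise-t-test adjustment, not a reproduction of the book’s output.

```python
# Minimal sketch of post-hoc comparisons after a significant global F-test.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multitest import multipletests

groups = {
    "small": [0.8, 1.0, 1.1, 0.9, 1.2],
    "medium": [1.0, 1.3, 1.1, 1.4, 1.2],
    "large": [1.3, 1.5, 1.6, 1.2, 1.7],
}

values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])

print(pairwise_tukeyhsd(values, labels, alpha=0.05))   # Tukey HSD pairwise comparisons

# Bonferroni: run each pairwise t-test, then adjust the p-values jointly.
names = list(groups)
pvals = [stats.ttest_ind(groups[a], groups[b]).pvalue
         for i, a in enumerate(names) for b in names[i + 1:]]
print(multipletests(pvals, alpha=0.05, method="bonferroni")[1])  # corrected p-values
```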
“suffered greater wetland loss than watersheds with smaller surrounding populations. Most watersheds have suffered no or only very modest losses (less than 3 percent during the decade in question), and few watersheds have suffered more than a 4 percent loss. The distribution is thus heavily skewed, with most watersheds clustered at small wetland losses (that is, toward the left of the scale), and is clearly not normally distributed.6 To increase normality, the variable is transformed by twice taking the square root (that is, raising it to the power 0.25). The transformed variable is then normally distributed: the Kolmogorov-Smirnov statistic is 0.82 (p = .51 > .05). The variable also appears visually normal for each of the population subgroups. There are four population groups, designed to ensure an adequate number of observations in each. Boxplot analysis of the transformed variable indicates four large and three small outliers (not shown). Examination suggests that these are plausible and representative values, which are therefore retained. Later, however, we will examine the effect of these seven observations on the robustness of statistical results. Descriptive analysis of the variables is shown in Table 13.1. Generally, large populations tend to have larger average wetland losses, but the standard deviations are large relative to (the difference between) these means, raising considerable question as to whether these differences are indeed statistically significant. Also, the untransformed variable shows that the mean wetland loss is less among watersheds with “Medium I” populations than in those with “Small” populations (1.77 versus 2.52). The transformed variable shows the opposite order (1.06 versus 0.97). Further investigation shows this to be the effect of the three small outliers and two large outliers on the calculation of the mean of the untransformed variable in the “Small” group. Variable transformation minimizes this effect. These outliers also increase the standard deviation of the “Small” group. Using ANOVA, we find that the transformed variable has unequal variances across the four groups (Levene’s statistic = 2.83, p = .041 < .05). Visual inspection, shown in Figure 13.2, indicates that differences are not substantial for observations within the group interquartile ranges, the areas indicated by the boxes. The differences seem mostly caused by observations located in the whiskers of the “Small” group, which include the five outliers mentioned earlier. (The other two outliers remain outliers and are shown.) For now, we conclude that no substantial differences in variances exist, but we later test the robustness of this conclusion with consideration of these observations (see Figure 13.2). Table 13.1 Variable Transformation We now proceed with the ANOVA analysis. First, Table 13.2 shows that the global F-test statistic is 2.91, p = .038 < .05. Thus, at least one pair of means is significantly different. (The term sum of squares is explained in note 1.) Getting Started Try ANOVA on some data of your choice. Second, which pairs are significantly different? We use the Bonferroni post-hoc test because relatively few comparisons are made (there are only four groups). The computer-generated results (not shown in Table 13.2) indicate that the only significant difference concerns the means of the “Small” and “Large” groups. This difference (1.26 - 0.97 = 0.29 [of transformed values]) is significant at the 5 percent level (p = .028). The Tukey and Scheffe tests lead to the same conclusion (respectively, p = .024 and .044). (It should be noted that post-hoc tests also exist for cases in which equal variances are not assumed. In our example, these tests lead to the same result.7) This result is consistent with a visual reexamination of Figure 13.2, which shows that differences between group means are indeed small. The Tukey and”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
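The screening steps described in this passage, a power transformation to improve normality, a Kolmogorov-Smirnov check, and Levene’s test of equal variances, can be sketched as follows. The data are randomly generated stand-ins for the EPA watershed data and the group names are hypothetical.

```python
# Minimal sketch of the screening steps described above on hypothetical, skewed data:
# a fourth-root transformation, a K-S normality check, and Levene's equal-variance test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
raw = {name: rng.gamma(shape=1.5, scale=s, size=30)        # skewed "wetland loss"
       for name, s in [("small", 1.0), ("medium1", 1.2), ("medium2", 1.4), ("large", 1.8)]}

transformed = {name: np.power(vals, 0.25) for name, vals in raw.items()}  # x ** .25

pooled = np.concatenate(list(transformed.values()))
z = (pooled - pooled.mean()) / pooled.std(ddof=1)
print("K-S test:", stats.kstest(z, "norm"))       # approximate normality check after transform

print("Levene:", stats.levene(*transformed.values()))   # test of equal variances across groups
```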
“The Scheffe test is the most conservative, the Tukey test is best when many comparisons are made (when there are many groups), and the Bonferroni test is preferred when few comparisons are made. However, these post-hoc tests often support the same conclusions.3 To illustrate, let’s say the independent variable has three categories. Then, a post-hoc test will examine hypotheses for whether μ1 = μ2, μ1 = μ3, and μ2 = μ3. In addition, these tests will also examine which categories have means that are not significantly different from each other, hence, providing homogeneous subsets. An example of this approach is given later in this chapter. Knowing such subsets can be useful when the independent variable has many categories (for example, classes of employees). Figure 13.1 ANOVA: Significant and Insignificant Differences Eta-squared (η2) is a measure of association for mixed nominal-interval variables and is appropriate for ANOVA. Its values range from zero to one, and it is interpreted as the percentage of variation explained. It is a directional measure, and computer programs produce two statistics, alternating specification of the dependent variable. Finally, ANOVA can be used for testing interval-ordinal relationships. We can ask whether the change in means follows a linear pattern that is either increasing or decreasing. For example, assume we want to know whether incomes increase according to the political orientation of respondents, when measured on a seven-point Likert scale that ranges from very liberal to very conservative. If a linear pattern of increase exists, then a linear relationship is said to exist between these variables. Most statistical software packages can test for a variety of progressive relationships. ANOVA Assumptions ANOVA assumptions are essentially the same as those of the t-test: (1) the dependent variable is continuous, and the independent variable is ordinal or nominal, (2) the groups have equal variances, (3) observations are independent, and (4) the variable is normally distributed in each of the groups. The assumptions are tested in a similar manner. Relative to the t-test, ANOVA requires a little more concern regarding the assumptions of normality and homogeneity. First, like the t-test, ANOVA is not robust to the presence of outliers, and analysts examine the presence of outliers for each group. Also, ANOVA appears to be less robust than the t-test for deviations from normality. Second, regarding groups having equal variances, our main concern with homogeneity is that there are no substantial differences in the amount of variance across the groups; the test of homogeneity is a strict test, testing for any departure from equal variances, and in practice, groups may have neither equal variances nor substantial differences in the amount of variance. In these instances, a visual finding of no substantial differences suffices. Other strategies for dealing with heterogeneity are variable transformations and the removal of outliers, because outliers increase variance, especially in small groups. Such outliers are detected by examining boxplots for each group separately. Also, some statistical software packages (such as SPSS) now offer post-hoc tests when equal variances are not assumed.4 A Working Example The U.S. Environmental Protection Agency (EPA) measured the percentage of wetland loss in watersheds between 1982 and 1992, the most recent period for which data are available (government statistics are sometimes a little old).5 An analyst wants to know whether watersheds with large surrounding populations have”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
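Eta-squared, mentioned in this passage as the share of variation explained, can be computed directly from the one-way ANOVA sums of squares. The sketch below uses hypothetical data and is an illustration, not the book’s example.

```python
# Minimal sketch of eta-squared = between-group SS / total SS (hypothetical data).
import numpy as np

groups = [np.array([2.1, 2.4, 2.0, 2.6]),
          np.array([3.0, 3.3, 2.9, 3.4]),
          np.array([4.1, 3.8, 4.3, 4.0])]

grand_mean = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = sum(((g - grand_mean) ** 2).sum() for g in groups)

eta_squared = ss_between / ss_total
print(f"eta-squared = {eta_squared:.2f}")   # proportion of variation explained, 0 to 1
```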
“categorical and the dependent variable is continuous. The logic of this approach is shown graphically in Figure 13.1. The overall group mean is the grand mean (the mean of the group means). The boxplots represent the scores of observations within each group. (As before, the horizontal lines indicate means, rather than medians.) Recall that variance is a measure of dispersion. In both parts of the figure, w is the within-group variance, and b is the between-group variance. Each graph has three within-group variances and three between-group variances, although only one of each is shown. Note in part A that the between-group variances are larger than the within-group variances, which results in a large F-test statistic using the above formula, making it easier to reject the null hypothesis. Conversely, in part B the within-group variances are larger than the between-group variances, causing a smaller F-test statistic and making it more difficult to reject the null hypothesis. The hypotheses are written as follows: H0: No differences between any of the group means exist in the population. HA: At least one difference between group means exists in the population. Note how the alternate hypothesis is phrased, because the logical opposite of “no differences between any of the group means” is that at least one pair of means differs. The test of H0 is also called the global F-test because it tests for differences among any of the means. The formulas for calculating the between-group variances and within-group variances are quite cumbersome for all but the simplest of designs.1 In any event, statistical software calculates the F-test statistic and reports the level at which it is significant.2 When the preceding null hypothesis is rejected, analysts will also want to know which differences are significant. For example, analysts will want to know which pairs of differences in watershed pollution are significant across regions. Although one approach might be to use the t-test to sequentially test each pair of differences, this should not be done. It would not only be a most tedious undertaking but would also inadvertently and adversely affect the level of significance: the chance of finding a significant pair by chance alone increases as more pairs are examined. Specifically, the probability of rejecting the null hypothesis in at least one of two tests is [1 – 0.95² =] .098, the probability of rejecting it in at least one of three tests is [1 – 0.95³ =] .143, and so forth. Thus, sequential testing of differences does not reflect the true level of significance for such tests and should not be used. Post-hoc tests test all possible group differences and yet maintain the true level of significance. Post-hoc tests vary in their methods of calculating test statistics and holding experiment-wide error rates constant. Three popular post-hoc tests are the Tukey, Bonferroni, and Scheffe tests.”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
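The familywise error figures cited in this passage follow from 1 – (1 – α)^k for k independent tests at α = .05. A quick check, purely illustrative:

```python
# Why sequential t-tests inflate the error rate: the chance of at least one false
# rejection across k independent tests at alpha = .05 is 1 - 0.95 ** k.
alpha = 0.05
for k in (1, 2, 3, 6):
    print(f"{k} tests: familywise error = {1 - (1 - alpha) ** k:.3f}")
# Two tests give about .098 and three tests about .143, matching the figures above.
```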
“Analysis of Variance (ANOVA)   CHAPTER OBJECTIVES After reading this chapter, you should be able to Use one-way ANOVA when the dependent variable is continuous and the independent variable is nominal or ordinal with two or more categories Understand the assumptions of ANOVA and how to test for them Use post-hoc tests Understand some extensions of one-way ANOVA This chapter provides an essential introduction to analysis of variance (ANOVA). ANOVA is a family of statistical techniques, the most basic of which is the one-way ANOVA, which provides an essential expansion of the t-test discussed in Chapter 12. One-way ANOVA allows analysts to test the effect of an ordinal or nominal variable with two or more categories on a continuous variable, rather than only two categories, as is the case with the t-test. Thus, one-way ANOVA enables analysts to deal with problems such as whether the variable “region” (north, south, east, west) or “race” (Caucasian, African American, Hispanic, Asian, etc.) affects policy outcomes or any other matter that is measured on a continuous scale. One-way ANOVA also allows analysts to quickly determine subsets of categories with similar levels of the dependent variable. This chapter also addresses some extensions of one-way ANOVA and a nonparametric alternative. ANALYSIS OF VARIANCE Whereas the t-test is used for testing differences between two groups on a continuous variable (Chapter 12), one-way ANOVA is used for testing the means of a continuous variable across more than two groups. For example, we may wish to test whether income levels differ among three or more ethnic groups, or whether the counts of fish vary across three or more lakes. Applications of ANOVA often arise in medical and agricultural research, in which treatments are given to different groups of patients, animals, or crops. The F-test statistic compares the variances within each group against those that exist between each group and the overall mean: F = between-group variance / within-group variance. Key Point ANOVA extends the t-test; it is used when the independent variable is”
Evan M. Berman, Essential Statistics for Public Managers and Policy Analysts
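The one-way ANOVA F-test introduced in this passage can be sketched in a few lines. The income-like data for three groups below are hypothetical, and the sketch stands in for the SPSS output the book describes.

```python
# Minimal sketch of the one-way ANOVA F-test (hypothetical data for three groups).
from scipy import stats

group_a = [41, 38, 45, 50, 44]
group_b = [52, 48, 55, 51, 58]
group_c = [60, 57, 63, 59, 66]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# F is the ratio of between-group to within-group variance; a large F with a small p
# leads to rejecting the null hypothesis that all group means are equal.
```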
