Discovering Statistics Using IBM SPSS Statistics: North American Edition
Each of these trends has a set of codes for the dummy variables in the model, so we are doing the same thing that we did for planned contrasts except that the codings have already been devised to represent the type of trend of interest.
Post hoc tests consist of pairwise comparisons that are designed to compare all different combinations of the treatment groups.
pairwise comparisons control the familywise error by correcting the level of significance for each test such that the overall Type I error rate (α) across all comparisons remains at 0.05.
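The highlight doesn’t name a particular correction, but a quick numeric sketch (assuming the simple Bonferroni approach, where each of k comparisons is tested at α/k) shows why the per-test significance level has to shrink to keep the familywise error rate near 0.05:

```python
# Sketch: familywise error inflation and a Bonferroni-style correction.
# Assumes k independent comparisons; k = 6 is an illustrative value
# (e.g., all pairwise comparisons among four groups).

alpha = 0.05
k = 6

# Probability of at least one Type I error if every test uses alpha = .05
familywise_uncorrected = 1 - (1 - alpha) ** k

# Bonferroni-style correction: test each comparison at alpha / k
alpha_per_test = alpha / k
familywise_corrected = 1 - (1 - alpha_per_test) ** k

print(f"Uncorrected familywise error rate: {familywise_uncorrected:.3f}")  # ~0.265
print(f"Per-test alpha after correction:   {alpha_per_test:.4f}")          # ~0.0083
print(f"Corrected familywise error rate:   {familywise_corrected:.3f}")    # ~0.049
```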
it is important that you know which post hoc tests perform best according to three important criteria.
does the test control the Type I error rate?
does the test control the Type I...
is the test ...
If a test is too conservative then we are likely to miss differences between means that are, in reality, meaningful.
The choice of comparison procedure will depend on the exact situation you have and whether it is more important for you to keep strict control over the familywise error rate or to have greater statistical power.
Because one-way independent ANOVA is a linear model with a different label attached,
SPSS Statistics will test for the trend requested and all lower-order trends, so with Quadratic selected we’ll get tests both for a linear and a quadratic trend.
If the Coefficient Total is anything other than zero you should go back and check that the contrasts you have planned make sense and that you have followed the appropriate rules for assigning weights.
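As a small illustration (not from the book’s examples), here is a quick check that contrast or trend weights sum to zero, using the standard polynomial codes for four equally spaced groups as assumed values:

```python
# Sketch: checking that planned-contrast weights give a coefficient total of zero.
# The linear and quadratic codes below are the standard polynomial contrast
# weights for four equally spaced groups (assumed example values).

contrasts = {
    "linear":    [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
}

for name, weights in contrasts.items():
    total = sum(weights)
    verdict = "OK" if total == 0 else "check the weighting rules!"
    print(f"{name:9s} weights={weights} coefficient total={total} -> {verdict}")
```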
You can ask for descriptive statistics, which will produce a table of the means, standard deviations, standard errors, ranges and confidence intervals within each group.
Some people use Levene’s test as a decision rule: if it is significant, read the part of the table labeled Does not assume equal variances, and if it is not, use the part of the table labeled Assume equal variances.
The HSD stands for ‘honestly significant difference’,
Tukey’s HSD uses the harmonic mean sample size, which is a weighted version of the mean that takes account of the relationship between variance and sample size.
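The calculation isn’t shown in this highlight, but the harmonic mean of the group sample sizes is simple to compute; a minimal sketch with made-up group sizes:

```python
# Sketch: harmonic mean of group sample sizes (made-up values).
# The harmonic mean down-weights larger groups relative to the arithmetic mean.

group_sizes = [10, 12, 20]

k = len(group_sizes)
harmonic_mean_n = k / sum(1 / n for n in group_sizes)
arithmetic_mean_n = sum(group_sizes) / k

print(f"Harmonic mean n:   {harmonic_mean_n:.2f}")    # ~12.86
print(f"Arithmetic mean n: {arithmetic_mean_n:.2f}")   # 14.00
```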
we often use a slightly more complex measure called omega squared (ω²). This effect size estimate is still based on the sums of squares that we’ve met in this chapter: it uses the variance explained by the model, and the average error variance:

ω² = (SS_M − df_M × MS_R) / (SS_T + MS_R)   (12.32)
The dfM in the equation is the degrees of freedom for the effect, which you can get from the output.
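Assuming equation (12.32) takes the form shown above, a minimal sketch of the calculation from ANOVA summary values (the numbers are illustrative, not from the book):

```python
# Sketch: omega squared from ANOVA summary statistics (illustrative numbers).
# Assumes equation (12.32) in the form (SS_M - df_M * MS_R) / (SS_T + MS_R).

ss_model = 20.13    # SS_M: variance explained by the model
ss_total = 43.46    # SS_T: total variability in the outcome
df_model = 2        # dfM: degrees of freedom for the effect
df_residual = 12    # residual degrees of freedom

ms_residual = (ss_total - ss_model) / df_residual   # average error variance (MS_R)

omega_squared = (ss_model - df_model * ms_residual) / (ss_total + ms_residual)
print(f"omega squared = {omega_squared:.3f}")
```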
logistic regression – a model for predicting categorical outcomes from categorical and continuous predictors.
When we are trying to predict membership of only two categories the model is known as binary logistic regression,
when we want to predict membership of more than two categories we use multinomial (or polychotomous) logistic regression.
One of the assumptions of the linear model is that the relationship between the predictors and outcome is linear
transform the data using the logarithmic transformation (see Berry & Feldman, 1985), which is a way of expressing a non-linear relationship in a linear way.
In logistic regression, instead of predicting the value of a variable Y from a predictor variable X1 or several predictor variables (Xs), we predict the probability of Y occurring, P(Y), from known (log-transformed) values of X1 (or Xs).
logistic regression model with one predictor:

P(Y) = 1 / (1 + e^−(b0 + b1X1))   (20.3)

P(Y) is the probability of Y occurring, e is the base of natural logarithms, and the linear model that you’ve seen countless times already (equation (20.1)) is cosily tucked up inside the parentheses.
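A minimal sketch of equation (20.3) in Python; the b-values and predictor value are made up for illustration:

```python
import math

# Sketch: predicted probability from a one-predictor logistic regression model,
# P(Y) = 1 / (1 + e^-(b0 + b1*X1)).  b0, b1 and x1 are made-up values.

def predict_probability(b0: float, b1: float, x1: float) -> float:
    linear_model = b0 + b1 * x1          # the familiar linear model inside the parentheses
    return 1 / (1 + math.exp(-linear_model))

print(predict_probability(b0=-2.0, b1=0.5, x1=1.0))   # ~0.18: Y unlikely to occur
print(predict_probability(b0=-2.0, b1=0.5, x1=8.0))   # ~0.88: Y likely to occur
```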
A value close to 0 means that Y is very unlikely to have occurred, and a value close to 1 means that Y is very likely to have occurred.
in logistic regression maximum-likelihood estimation is used, which selects coefficients that make the observed values most likely to have occurred.
The logistic regression model predicts the probability of an event occurring for a given person (we denote this as P(Yi), the probability that Y occurs for the ith person), based on observations of whether the event did occur for that person (we could denote this as Yi, the observed outcome for the ith person).
the observed Y will be either 0 (the outcome didn’t occur) or 1 (the outcome did occur), but the predicted Y, P(Y), will be a value between 0 (there is no chance that the outcome will occur) and 1 (the outcome will certainly occur).
log-likelihood:

log-likelihood = Σ [Yi ln(P(Yi)) + (1 − Yi) ln(1 − P(Yi))]   (20.5)

The log-likelihood is based on summing the probabilities associated with the predicted, P(Yi), and actual, Yi, outcomes.
that large values of the log-likelihood statistic indicate poorly fitting statistical models, because the larger the value of the log-likelihood, the more unexplained observations there are.
The deviance is closely related to the log-likelihood: it’s given by

deviance = −2 × log-likelihood   (20.6)

The deviance is often referred to as −2LL because of the way it is calculated.
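A minimal sketch of equations (20.5) and (20.6), using made-up observed outcomes and predicted probabilities:

```python
import math

# Sketch: log-likelihood (20.5) and deviance (20.6) for a logistic regression model.
# The observed outcomes (Yi) and predicted probabilities P(Yi) are made-up values.

observed = [1, 1, 0, 0, 1]
predicted = [0.90, 0.70, 0.20, 0.40, 0.60]

log_likelihood = sum(
    y * math.log(p) + (1 - y) * math.log(1 - p)
    for y, p in zip(observed, predicted)
)
deviance = -2 * log_likelihood   # the "-2LL" reported in the output

print(f"log-likelihood  = {log_likelihood:.3f}")
print(f"deviance (-2LL) = {deviance:.3f}")
```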
One important use of the log-likelihood and deviance is to compare models.
it’s useful to compare a logistic regression model against a baseline state – usually the model when only the intercept is included (i.e., no predictors).
our baseline model is the model that gives us the best prediction when we know nothing other than the values of the outcome: in logistic regression this will be the outcome category that occurs most often, which is the same as predicting the outcome from the intercept.
If we add one or more predictors to the model, we can compute the improvement of the model as:

χ² = (−2LL_baseline) − (−2LL_new)   (20.7)
The number of parameters in the baseline model will always be 1 (the constant is the only parameter); any subsequent model will have degrees of freedom equal to the number of predictors plus 1
If we build up models hierarchically (i.e., adding one predictor at a time) we can also use equation (20.7) to compare these models.
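A minimal sketch of that comparison using equation (20.7) and the chi-square distribution from scipy; the deviances and parameter counts are made-up values, not results from the book:

```python
from scipy.stats import chi2

# Sketch: comparing a logistic regression model against the baseline
# (intercept-only) model via the improvement chi-square of equation (20.7).
# Deviances and parameter counts are made-up illustrative values.

deviance_baseline = 154.08   # -2LL with only the intercept
deviance_new = 144.16        # -2LL after adding predictors
k_baseline = 1               # the constant is the only parameter
k_new = 3                    # constant plus two predictors

chi_square = deviance_baseline - deviance_new
df = k_new - k_baseline
p_value = chi2.sf(chi_square, df)

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
```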
The likelihood ratio is similar in that it is based on the level of correspondence between predicted and observed values of the outcome.
It is the partial correlation between the outcome variable and each of the predictor variables and it varies between −1 and +1.
A positive value indicates that as the predictor variable increases, so does the likelihood of the event occurring.
A negative value implies that as the predictor value increases, the likelihood of the outcome occurring decreases.
a predictor variable has a small value of R then it contributes only a small amount to the model.
To compute R use the following equation: (20.8) in which the −2LL is the deviance for the original model, the Wald statistic (z) is calculated as described in Section 20.3.4, and the degrees of freedom can be read from the SPSS output for the variables in the equation.
Hosmer and Lemeshow’s (1989) measure, R²_L, is calculated by dividing the model chi-square (which represents the change from the baseline, based on the log-likelihood) by the baseline −2LL (the deviance of the model before any predictors were entered):

R²_L = [(−2LL_baseline) − (−2LL_new)] / (−2LL_baseline)   (20.9)
It is a measure of how much the badness of fit improves as a result of the inclusion of the predictor variables. It can vary between 0 (indicating that the predictors are useless at predicting the outcome variable) and 1 (indicating that the model predicts the outcome variable perfectly).
IBM SPSS Statistics uses Cox and Snell’s (1989) measure, R²_CS, which is based on the deviance of the model (−2LL_new), the deviance of the original model (−2LL_baseline), and the sample size, n:

R²_CS = 1 − exp[((−2LL_new) − (−2LL_baseline)) / n]   (20.11)
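A minimal sketch of equations (20.9) and (20.11), reusing the same made-up deviances and an assumed sample size:

```python
import math

# Sketch: Hosmer & Lemeshow's R^2_L (20.9) and Cox & Snell's R^2_CS (20.11).
# The deviances and sample size are made-up illustrative values.

deviance_baseline = 154.08   # -2LL before any predictors were entered
deviance_new = 144.16        # -2LL after the predictors are added
n = 113                      # sample size

model_chi_square = deviance_baseline - deviance_new

r2_hosmer_lemeshow = model_chi_square / deviance_baseline
r2_cox_snell = 1 - math.exp((deviance_new - deviance_baseline) / n)

print(f"R2_L  (Hosmer & Lemeshow) = {r2_hosmer_lemeshow:.3f}")
print(f"R2_CS (Cox & Snell)       = {r2_cox_snell:.3f}")
```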
In logistic regression there is an analogous statistic—the z-statistic—which follows the normal distribution. Like the t-statistic in the linear model, the z-statistic tells us whether the b-value for that predictor is significantly different from zero.
The z-statistic was developed by Abraham Wald (Figure 20.2), and is known as the Wald statistic. SPSS Statistics reports the Wald statistic as z², which transforms it so that it has a chi-square distribution.
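Section 20.3.4 isn’t included in these highlights, but the Wald z is conventionally the b-value divided by its standard error; a minimal sketch under that assumption, with made-up numbers:

```python
from scipy.stats import norm

# Sketch: Wald z-statistic for a logistic regression coefficient.
# Assumes z = b / SE(b); the b-value and its standard error are made-up values.

b = 1.23       # estimated b-value for the predictor
se_b = 0.40    # standard error of the b-value

z = b / se_b
wald = z ** 2                     # squared form; follows a chi-square distribution (df = 1)
p_value = 2 * norm.sf(abs(z))     # two-tailed p-value from the normal distribution

print(f"z = {z:.2f}, z^2 (Wald) = {wald:.2f}, p = {p_value:.4f}")
```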
If you use a stepwise method then the backward method is preferable because forward methods are more likely to exclude predictors involved in suppressor effects.