Fundamentals of Predictive Analytics with JMP
Read between January 2 - March 20, 2018
26%
Bayesian information criterion (BIC)
26%
in order to use a categorical variable in a regression model, you must transform the categorical variables into continuous variables or integer (binary) variables. The resulting variables from this transformation are called indicator or dummy variables.
26%
in JMP this is not the case. If a categorical variable has two categories (or levels) such as gender, then a single dummy variable is used with values of +1 and -1 (with +1 assigned to the alphabetically first category). If a categorical variable has more than two categories (or levels), the dummy variables are assigned values +1, 0, and −1.
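As a rough illustration of the +1/0/−1 scheme described in this highlight, here is a minimal Python sketch. The function name `effect_code` is hypothetical, and the choice of the alphabetically last level as the one that receives −1 in every column is an assumption not stated in the highlight.

```python
def effect_code(values):
    """Return (columns, rows): one +1/0/-1 column per level except the last."""
    levels = sorted(set(values))   # alphabetical order of the levels
    reference = levels[-1]         # assumption: the last level is coded -1 everywhere
    columns = levels[:-1]          # one dummy column for each remaining level
    rows = []
    for v in values:
        if v == reference:
            rows.append([-1] * len(columns))
        else:
            rows.append([1 if v == c else 0 for c in columns])
    return columns, rows


# Two levels: a single column, +1 for the alphabetically first level.
two_level = effect_code(["male", "female", "female"])

# Three levels: two columns holding +1, 0, and -1.
three_level = effect_code(["red", "blue", "green", "blue"])
```

With two levels this collapses to a single +1/−1 column, which matches the two-category case in the highlight.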
27%
Analysis of variance (more commonly called ANOVA),
27%
is a dependence multivariate technique. There are several variations of ANOVA, such as one-factor (or one-way) ANOVA, two-factor (or two-way) ANOVA, and so on, and also repeated measures ANOVA.
27%
The factors are the independent variables, each of which must be a categorical variable. The dependent variable is one continuous variable.
28%
If H0 is true, you would expect all the sample means to be close to each other and relatively close to the grand mean. If H1 is true, then at least one of the sample means would be significantly different.
28%
This within-sample variability is measured by the sum of squares within groups (or error) (SSE).
28%
(In JMP: TSS, SSBG, and SSE are identified as C.Total, Model SS, and Error SS, respectively.)
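The three sums of squares named here can be computed by hand. The sketch below uses made-up data and a hypothetical helper, `one_way_anova`, to show the decomposition TSS = SSBG + SSE and the F ratio built from it. It computes the statistic only, not a p-value.

```python
def one_way_anova(groups):
    """Sums of squares and F ratio for a one-way layout (list of samples)."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Total variability around the grand mean (C. Total in JMP).
    tss = sum((x - grand_mean) ** 2 for x in all_values)
    # Between-group variability (Model SS in JMP).
    ssbg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variability (Error SS in JMP).
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # F ratio: mean square between groups over mean square error.
    f_stat = (ssbg / (k - 1)) / (sse / (n_total - k))
    return {"TSS": tss, "SSBG": ssbg, "SSE": sse, "F": f_stat}


# Illustrative data only: three factor levels, three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F ratio means the between-group variability is large relative to the within-group variability, which is the evidence against H0.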
28%
If H0 of the F test is rejected, which implies that one or more of the population means are significantly different, you then proceed to the second part of an ANOVA study and identify which factor level means are significantly different.
28%
An additional plus of ANOVA is that, when you examine two or more factors, it is good at uncovering any significant interactions or relationships among those factors.
28%
One-way ANOVA has one dependent variable and one X factor.
28%
The horizontal line across the entire plot represents the overall mean. Each factor level has its own mean diamond. The horizontal line in the center of the diamond is the mean for that level. The upper and lower vertices of the diamond represent the upper and lower 95% confidence limit on the mean, respectively.
28%
the horizontal width of the diamond is relative to that level’s (group’s) sample size. That is, the wider the diamond, the larger the sample size for that level relative to the other levels.
28%
The overall steps to evaluate an ANOVA model are as follows:
1. Conduct an F test.
   a. If you do not reject H0 (the p-value is not small), then stop because there is no difference in means.
   b. If you do reject H0, then go to Step 2.
2. Consider unequal variances; look at the Levene test.
   a. If you reject H0, then go to Step 3 because the variances are unequal.
   b. If you do not reject H0, then go to Step 4.
3. Conduct Welch’s test, which tests differences in means, assuming unequal variances.
   a. If you reject H0, because the means are significantly different, then go to Step 4.
...
28%
ANOVA has three statistical assumptions to check and address. The first assumption is that the residuals should be independent.
28%
unless there is strong concern about the dependence of the residuals, this assumption does not have to be checked.
28%
The second statistical assumption is that the variances for each level are equal. Violation of this assumption is of more concern because it could lead to erroneous p-values and hence incorrect statistical conclusions.
28%
However, if, as it is in this case, there are only two groups tested, then an F test for unequal variance is also performed.
28%
If you fail to reject H0 (that is, you have a large p-value), you have insufficient evidence to say tha...
This highlight has been truncated due to consecutive passage length restrictions.
28%
On the other hand, if you reject H0, the variances can be assumed to be unequal, and the ANOV...
28%
The third statistical assumption is that the residuals should be normally distributed.
29%
if slight departures from normality are detected, they will have no real effect on the F statistic. A normal quantile plot can confirm whether the residuals are normally distributed or not.
29%
If all the residuals fall on or near the straight line or within the confidence bounds, the residuals should be considered normally distributed.
29%
The p-value for the F test is <.0001, so one or more of the Process means differ from each other.
29%
In general, the Levene test is more widely used and more comprehensive, so you focus only on the Levene test.
29%
Because the p-value for the Welch’s Test is small, you can reject the null hypothesis; the pairs of means are different from one another.
29%
If you had not rejected the null hypothesis of Welch’s test, it would not be recommended that you perform these second-stage tests.
29%
Since the p-value for the Welch’s Test is small, you can reject the null hypothesis; the means are significantly different from one another.
29%
the likelihood of a Type I error increases with the number of pairwise comparisons.
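To get a feel for how quickly this grows, the sketch below counts the pairwise comparisons among k factor levels and computes the chance of at least one false positive. It assumes each comparison is run at the same alpha and, as a simplification, that the comparisons are independent. Real pairwise comparisons share data, so treat the numbers as a rough guide. The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(k_levels, alpha=0.05):
    """Return (number of pairwise comparisons, approximate familywise error)."""
    m = comb(k_levels, 2)          # k * (k - 1) / 2 pairs of levels
    # Chance of at least one Type I error across m independent tests.
    return m, 1 - (1 - alpha) ** m


pairs_3, fwer_3 = familywise_error_rate(3)   # 3 comparisons
pairs_5, fwer_5 = familywise_error_rate(5)   # 10 comparisons
```

Under these assumptions, five levels already push the chance of at least one false positive to roughly 40%.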
29%
unless the number of pairwise comparisons is small, this test is not recommended.
29%
If the main objective is to check for any possible pairwise difference in the mean values, and there are several factor levels, the Tukey HSD (honest significant difference) test, also called the Tukey-Kramer HSD test, is the most desired test.
29%
To identify mean differences, examine the Connecting Letters Report. Groups that do not share the same letter are significantly different from one another.
29%
The Hsu’s MCB (multiple comparison with best) is used to determine whether each factor level mean can be rejected as the “best” of all the other means, where “best” means either a maximum or minimum value.
29%
The p-value report and the maximum and minimum LSD (least significant difference) matrices can be used to identify significant differences. The p-value report identifies whether a factor level mean is significantly different from the maximum and from the minimum of all the other means.
30%
Differences with the Hsu’s MCB test are less conservative than those found with the Tukey-Kramer test.
30%
Hsu’s MCB test should be used if there is a need to make specific inferences about the maximum or minimum values.
30%
the With Control, Dunnett’s test is applicable when you do not wish to make all pairwise comparisons, but rather only to compare one of the levels (the “control”) with each other level.
30%
Two-way ANOVA is an extension of the one-way ANOVA in which there is one continuous dependent variable, but now you have two categorical independent variables.
30%
There are three basic two-way ANOVA designs: without replication, with equal replication, and with unequal replication.
30%
Only the two-way ANOVA with equal replication is discussed here.
30%
If there were no significant interaction, the lines in the LSMeans Plot would not cross and would be mostly parallel.
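The "parallel lines" idea can be expressed numerically for the simplest case, a 2x2 design. The sketch below computes a difference of differences from the four cell means. The function name and the cell means are made up for illustration. A nonzero value in a sample does not by itself show a significant interaction, which is still decided by the interaction test in the ANOVA.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2x2 table of cell means.

    cell_means[a][b] is the mean at level a of factor A and level b of factor B.
    Zero means the two lines in the plot are exactly parallel.
    """
    (m00, m01), (m10, m11) = cell_means
    return (m11 - m10) - (m01 - m00)


# Parallel lines: factor B shifts the mean by +10 at both levels of A.
parallel = interaction_contrast([[10, 20], [15, 25]])

# Crossing lines: the effect of factor B reverses between levels of A.
crossing = interaction_contrast([[10, 20], [20, 10]])
```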
31%
linear regression cannot be used for a binary dependent variable.
31%
Consequently, statisticians have developed a specialized form of regression called logistic regressio...
31%
Logistic regression,