Regression Analysis Quotes

Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models by Jim Frost
61 ratings, 4.44 average rating, 11 reviews
Regression Analysis Quotes Showing 1-30 of 33
“The p-value for each independent variable tests the null hypothesis that the variable has no relationship with the dependent variable.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“This graph shows all the observations together with a line that represents the fitted relationship. As is traditional, the Y-axis displays the dependent variable, which is weight. The X-axis shows the independent variable, which is height. The line is the fitted line. If you enter the full range of height values that are on the X-axis into the regression equation that the chart displays, you will obtain the line shown on the graph. This line produces a smaller SSE than any other line you can draw through these observations. Visually, we see that the fitted line has a positive slope that corresponds to the positive correlation we obtained earlier. The line follows the data points, which indicates that the model fits the data. The slope of the line equals the coefficient that I circled. This coefficient indicates how much mean weight tends to increase as we increase height. We can also enter a height value into the equation and obtain a prediction for the mean weight. Each point on the fitted line represents the mean weight for a given height. However, like any mean, there is variability around the mean. Notice how there is a spread of data points around the line. You can assess this variability by picking a spot on the line and observing the range of data points above and below that point. Finally, the vertical distance between each data point and the line is the residual for that observation.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Based on the mathematical relationship shown above, you know that R-squared can range from 0 – 100%. Zero indicates that the model accounts for none of the variability in the dependent variable around its mean. 100% signifies that the model explains all of that variability.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Additionally, if you take RSS / TSS, you’ll obtain the percentage of the variability of the dependent variable around its mean that your model explains. This statistic is R-squared!”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Understanding this relationship is fairly straightforward. RSS represents the variability that your model explains. Higher is usually good. SSE represents the variability that your model does not explain. Smaller is usually good. TSS represents the variability inherent in your dependent variable. Or, Explained Variability + Unexplained Variability = Total Variability”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“These three sums of squares have the following mathematical relationship: RSS + SSE = TSS”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
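The sum-of-squares identity and the R-squared ratio in the quotes above can be checked numerically. The following is an illustrative sketch, not code from the book: it fits a one-predictor least squares line in plain Python to made-up height and weight values, and the helper names `fit_line` and `sums_of_squares` are hypothetical. RSS follows the quotes' usage here, meaning the regression (explained) sum of squares.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sums_of_squares(xs, ys):
    """Return (RSS, SSE, TSS) for the OLS line fitted to the data."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    fitted = [intercept + slope * x for x in xs]
    rss = sum((f - mean_y) ** 2 for f in fitted)           # explained variability
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variability
    tss = sum((y - mean_y) ** 2 for y in ys)               # total variability
    return rss, sse, tss


# Made-up data for illustration only (heights in meters, weights in kilograms).
heights = [1.30, 1.40, 1.45, 1.55, 1.60, 1.70]
weights = [32.0, 38.5, 40.0, 47.0, 46.5, 55.0]

rss, sse, tss = sums_of_squares(heights, weights)
r_squared = rss / tss  # share of the variability around the mean that the line explains

print(f"RSS + SSE = {rss + sse:.4f}, TSS = {tss:.4f}")
print(f"R-squared = {r_squared:.3f}")
```

For an OLS fit that includes an intercept, the two printed totals agree up to floating-point rounding, and the ratio falls between 0 and 1.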
“The result is that an individual outlier can exert a strong influence over the entire model and, by itself, dramatically change the results.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“SSE is a measure of variability. As the points spread out further from the fitted line, SSE increases. Because the calculations use squared differences, the variance is in squared units rather than the original units of the data. While higher values indicate greater variability, there is no intuitive interpretation of specific values. However, for a given data set, smaller SSE values signal that the observations fall closer to the fitted values. OLS minimizes this value, which means you’re getting the best possible line.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“OLS draws the line that minimizes the sum of squared errors (SSE). Hopefully, you’re gaining an appreciation for why the procedure is named ordinary least squares!”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
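The claim that the OLS line has the smallest SSE can be illustrated by comparing it with nearby lines. The sketch below uses made-up data and hypothetical helper names (`sse`, `ols_line`). It nudges the fitted intercept and slope by arbitrary amounts and checks that every alternative line has a larger SSE. It is a spot check on a few alternatives, not a proof.

```python
def sse(xs, ys, intercept, slope):
    """Sum of squared errors for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def ols_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_line(xs, ys)
best = sse(xs, ys, b0, b1)

# SSE of lines whose intercept and/or slope is shifted away from the OLS fit.
alternatives = [
    sse(xs, ys, b0 + d0, b1 + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]

print(f"OLS SSE: {best:.4f}")
print(f"Smallest SSE among shifted lines: {min(alternatives):.4f}")
```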
“This process produces squared residuals, which statisticians call squared errors.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“OLS regression squares those residuals so they’re always positive. In this manner, the process can add them up without canceling each other out.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
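A small numeric sketch shows why the residuals are squared before they are summed. It assumes made-up data, takes a residual as the observed value minus the fitted value, and uses a hypothetical `ols_line` helper. For an OLS line with an intercept, the raw residuals sum to essentially zero because positive and negative values cancel, while the squared residuals do not.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [3.2, 4.1, 6.3, 6.9, 9.4, 10.2]

intercept, slope = ols_line(xs, observed)
fitted = [intercept + slope * x for x in xs]

# Residual = observed value minus fitted value (signed vertical distance).
residuals = [y - f for y, f in zip(observed, fitted)]

sum_of_residuals = sum(residuals)                       # cancels out to about zero
sum_of_squared_errors = sum(r ** 2 for r in residuals)  # always non-negative

print(f"Sum of residuals: {sum_of_residuals:.2e}")
print(f"Sum of squared errors: {sum_of_squared_errors:.4f}")
```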
“For a good model, the residuals should be relatively small and unbiased. In statistics, bias indicates that estimates are systematically too high or too low. Unbiased estimates are correct on average.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Graphically, residuals are the vertical distances between the observed values and the fitted values. On the graph, the line represents the fitted values from the regression model. We call this line . . . the fitted line! The lines that connect the data points to the fitted line represent the residuals. The length of the line is the value of the residual.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“A residual is the distance between an observed value and the corresponding fitted value.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Observed values of the dependent variable are the values of the dependent variable that you record during your study or experiment along with the values of the independent variables. These values are denoted using Y. Fitted values are the values that the model predicts for the dependent variable using the independent variables. If you input values for the independent variables into the regression equation, you obtain the fitted value. Predicted values and fitted values are synonyms.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Continuous variables can take on almost any numeric value and can be meaningfully divided into smaller increments, including fractional and decimal values. You often measure a continuous variable on a scale. For example, when you measure height, weight, and temperature, you have continuous data. Categorical variables have values that you can put into a countable number of distinct groups based on a characteristic. Categorical variables are also called qualitative variables or attribute variables. For example, college major is a categorical variable that can have values such as psychology, political science, engineering, biology, etc.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“The low p-values indicate that both education and IQ are statistically significant. The coefficient for IQ (4.796) indicates that each additional IQ point increases your income by an average of approximately $4.80 while controlling everything else in the model. Furthermore, the education coefficient (24.215) indicates that an additional year of education increases average earnings by $24.22 while holding the other variables constant.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
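The "holding the other variables constant" reading of a coefficient can be sketched with the two coefficients quoted above. The intercept is not given in the quote, so the value below is a placeholder assumption, as are the function name `predict_income` and the example inputs. The intercept cancels when two predictions are compared, so it does not affect the difference.

```python
IQ_COEF = 4.796          # coefficient for IQ, from the quote
EDUCATION_COEF = 24.215  # coefficient for years of education, from the quote
INTERCEPT = 0.0          # hypothetical placeholder: the quote does not state the intercept


def predict_income(iq, education_years):
    """Predicted mean income from the linear equation (illustrative only)."""
    return INTERCEPT + IQ_COEF * iq + EDUCATION_COEF * education_years


# Same IQ, one more year of education: the difference equals the education coefficient.
baseline = predict_income(iq=100, education_years=12)
one_more_year = predict_income(iq=100, education_years=13)
education_effect = one_more_year - baseline

# Same education, one more IQ point: the difference equals the IQ coefficient.
one_more_iq_point = predict_income(iq=101, education_years=12)
iq_effect = one_more_iq_point - baseline

print(f"Effect of one more year of education: {education_effect:.3f}")
print(f"Effect of one more IQ point: {iq_effect:.3f}")
```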
“P-values and coefficients are the key regression output. Collectively, these statistics indicate whether the variables are statistically significant and describe the relationships between the independent variables and the dependent variable. Low p-values (typically < 0.05) indicate that the independent variable is statistically significant. Regression analysis is a form of inferential statistics. Consequently, the p-values help determine whether the relationships that you observe in your sample also exist in the larger population. The coefficients for the independent variables represent the average change in the dependent variable given a one-unit change in the independent variable (IV) while controlling the other IVs.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“A beautiful aspect of regression analysis is that you hold the other independent variables constant by merely including them in your model!”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Regression analysis mathematically describes the relationships between independent variables and a dependent variable. Use regression for two primary goals: To understand the relationships between these variables. How do changes in the independent variables relate to changes in the dependent variable? To predict the dependent variable by entering values for the independent variables into the regression equation.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Independent variables are the variables that you include in the model to explain or predict changes in the dependent variable.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“The dependent variable is a variable that you want to explain or predict using the model. The values of this variable depend on other variables. It’s also known as the response variable or outcome variable, and it is commonly denoted using a Y. Traditionally, analysts graph dependent variables on the vertical, or Y, axis.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“R-squared is a primary measure of how well a regression model fits the data. This statistic represents the percentage of variation in one variable that other variables explain. For a pair of variables, R-squared is simply the square of the Pearson’s correlation coefficient. For example, squaring the height-weight correlation coefficient of 0.705 produces an R-squared of 0.497, or 49.7%. In other words, height explains about half the variability of weight in preteen girls.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
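The arithmetic in the quote, and the link between Pearson's r and R-squared for a single predictor, can be sketched as follows. The correlation 0.705 comes from the quote. The data set is made up, and the variable names are illustrative. For a one-predictor OLS fit, the explained sum of squares equals the squared slope times the sum of squared x deviations, which is why the two R-squared values match.

```python
import math

# Squaring the correlation quoted above.
quoted_r = 0.705
quoted_r_squared = quoted_r ** 2
print(f"R-squared from r = 0.705: {quoted_r_squared:.3f}")

# Made-up data for illustration only.
xs = [1.30, 1.38, 1.45, 1.52, 1.60, 1.68]
ys = [33.0, 35.0, 42.0, 41.0, 50.0, 49.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Pearson's correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# R-squared from the regression: explained sum of squares over total sum of squares.
slope = sxy / sxx
explained = slope ** 2 * sxx
r_squared_from_fit = explained / syy

print(f"r squared: {r ** 2:.6f}")
print(f"R-squared from the fit: {r_squared_from_fit:.6f}")
```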
“What is a good correlation? How high should it be? These are commonly asked questions. I have seen several schemes that attempt to classify correlations as strong, medium, and weak. However, there is only one correct answer. The correlation coefficient should accurately reflect the strength of the relationship. Take a look at the correlation between the height and weight data, 0.705. It’s not a very strong relationship, but it accurately represents our data. An accurate representation is the best-case scenario for using a statistic to describe an entire dataset.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“correlation does not mean that the changes in one variable actually cause the changes in the other variable.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“Correlations have a hypothesis test. As with any hypothesis test, this test takes sample data and evaluates two mutually exclusive statements about the population from which the sample was drawn. For Pearson correlations, the two hypotheses are the following: Null hypothesis: There is no linear relationship between the two variables. ρ = 0. Alternative hypothesis: There is a linear relationship between the two variables. ρ ≠ 0. A correlation of zero indicates that no linear relationship exists. If your p-value is less than your significance level, the sample contains sufficient evidence to reject the null hypothesis and conclude that the correlation does not equal zero. In other words, the sample data support the notion that the relationship exists in the population.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
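The decision rule in the quote (reject the null hypothesis when the p-value is below the significance level) can be sketched in plain Python. The p-value here comes from a permutation test rather than the usual t-based test for Pearson correlations. This substitution is deliberate so the example needs no statistics library. The data, the 0.05 significance level, and the helper names `pearson_r` and `permutation_p_value` are assumptions for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Two-sided permutation p-value for the null hypothesis of no relationship."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any pairing between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Made-up data for illustration only.
heights = [1.30, 1.35, 1.42, 1.48, 1.53, 1.60, 1.66, 1.71]
weights = [31.0, 35.5, 36.0, 42.5, 44.0, 49.5, 50.0, 56.0]

alpha = 0.05  # assumed significance level
p_value = permutation_p_value(heights, weights)
reject_null = p_value < alpha

print(f"r = {pearson_r(heights, weights):.3f}")
print(f"Reject the null hypothesis at alpha = {alpha}: {reject_null}")
```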
“Pearson’s correlation measures only linear relationships. Consequently, if your data contain a curvilinear relationship, the correlation coefficient will not detect it.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
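A short sketch illustrates the point with made-up values. When y is the square of x and the x values are symmetric around zero, y is completely determined by x, yet Pearson's r is zero. A straight-line relationship over the same x values is included for contrast. The helper name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]

# Curvilinear: a perfect U-shaped relationship that Pearson's r does not detect.
ys_curved = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys_curved)

# Linear: every point falls on a straight line.
ys_linear = [2 * x + 1 for x in xs]
r_linear = pearson_r(xs, ys_linear)

print(f"r for y = x^2: {r_curved:.3f}")
print(f"r for y = 2x + 1: {r_linear:.3f}")
```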
“Pearson’s correlation coefficient is unaffected by scaling issues. Consequently, a statistical assessment is better for determining the precise strength of the relationship.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
“When the value is in-between 0 and +1/-1, there is a relationship, but the points don’t all fall on a line. As r approaches -1 or 1, the strength of the relationship increases and the data points tend to fall closer to a line.”
Jim Frost, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models
