10. Simple Linear Regression Flashcards
Analysis of variance (ANOVA)
A table that presents the sums of squares, degrees of freedom, mean squares, and F-statistic for a regression model.
Coefficient of determination
(R^2) The proportion of the variation in the dependent variable that is explained by the independent variable. It is a measure of the goodness of fit of a regression model.
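The definition above can be sketched numerically. This is an illustration with hypothetical data, using the decomposition R^2 = 1 - SSE/SST; the fitted values assume the OLS line y_hat = 1.3 + 0.9x for these points.

```python
# Hypothetical data; y_hat are fitted values from the OLS line y_hat = 1.3 + 0.9x
x = [1, 2, 3, 4, 5]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
y_hat = [1.3 + 0.9 * xi for xi in x]

y_bar = sum(y) / len(y)
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
r_squared = 1 - sse / sst
print(round(r_squared, 2))  # 0.81
```

Here 81% of the variation in y is explained by x.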
Error term
Represents the difference between the observed value of the dependent variable and the value expected from the true underlying population relation between the dependent and independent variables.
Estimated parameters
In a simple linear regression: the intercept and slope of the fitted line.
Heteroskedasticity
Variance of the residuals that is not constant across observations.
Homoskedasticity
Variance of the residuals that is constant across observations.
Indicator variable
A variable that takes on only one of two values, 0 or 1, based on a condition. In simple linear regression, the slope coefficient is the difference in the mean of the dependent variable between the two conditions. Also referred to as a dummy variable.
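A small sketch with hypothetical data shows why: when the regressor is a 0/1 indicator, the OLS intercept equals the mean of the dependent variable when the indicator is 0, and the slope equals the difference between the two group means.

```python
# Hypothetical data: d is a 0/1 indicator (dummy) regressor
d = [0, 0, 0, 1, 1, 1]
y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

d_bar = sum(d) / len(d)
y_bar = sum(y) / len(y)
slope = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y)) \
        / sum((di - d_bar) ** 2 for di in d)
intercept = y_bar - slope * d_bar

mean0 = sum(yi for di, yi in zip(d, y) if di == 0) / 3  # group mean when d = 0
mean1 = sum(yi for di, yi in zip(d, y) if di == 1) / 3  # group mean when d = 1
# intercept equals mean0; slope equals mean1 - mean0
print(round(intercept, 4), round(slope, 4))  # 3.0 3.0
```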
Lin-log model
A functional form for transforming regression model data in which the dependent variable is linear but the independent variable is logarithmic.
Log-lin model
A functional form for transforming regression model data in which the dependent variable is logarithmic but the independent variable is linear.
Log-log model
A functional form for transforming regression model data in which both the dependent and independent variables are in logarithmic form.
Mean square error (MSE)
Calculated as the sum of squares error (SSE) divided by the degrees of freedom, which are the number of observations minus the number of independent variables minus one. Since simple linear regression has just one independent variable, the degrees of freedom calculation is the number of observations minus 2.
Mean square regression (MSR)
Calculated as the sum of squares regression (SSR) divided by the number of independent variables in the regression model. In simple linear regression, there is only one independent variable, so MSR equals SSR.
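The MSR and MSE definitions above combine into the ANOVA F-statistic, F = MSR/MSE. A sketch with hypothetical data, assuming the fitted OLS line y_hat = 1.3 + 0.9x:

```python
# Hypothetical data and fitted values
x = [1, 2, 3, 4, 5]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
y_hat = [1.3 + 0.9 * xi for xi in x]
y_bar = sum(y) / len(y)
n, k = len(y), 1  # k = number of independent variables (1 in SLR)

ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
msr = ssr / k            # in SLR, MSR = SSR
mse = sse / (n - k - 1)  # degrees of freedom = n - 2 in SLR
f_stat = msr / mse
print(round(msr, 2), round(mse, 3), round(f_stat, 2))
```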
Regression analysis
Allows us to test hypotheses about the relationship between two variables by quantifying the strength of that relationship, and to use one variable to make predictions about the other.
Regression coefficients
The collective term for the intercept and slope coefficients in the regression model.
Residual
The amount of deviation of an observed value of the dependent variable from its estimated value based on the fitted regression line.
Simple linear regression (SLR)
An approach for estimating the linear relationship between a dependent variable and a single independent variable by minimizing the sum of the squared deviations between the fitted line and the observed values.
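In SLR the least-squares estimates have closed forms: the slope is the covariation of x and y divided by the variation of x, and the intercept follows from the fact that the fitted line passes through the point of means. A sketch with hypothetical data:

```python
# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# slope = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
print(round(intercept, 4), round(slope, 4))  # 1.3 0.9
```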
Slope coefficient
The change in the estimated value of the dependent variable for a one-unit change in the value of the independent variable.
Standard error of the estimate
A measure of the distance between the observed values of the dependent variable and those predicted from the estimated regression. The smaller this value, the better the fit of the model. Also known as the standard error of the regression and the root mean square error.
Standard error of the forecast
Used to provide an interval estimate around the estimated regression line. It is necessary because the regression line does not describe the relationship between the dependent and independent variables perfectly.
Standard error of the slope coefficient
Calculated for simple linear regression by dividing the standard error of the estimate by the square root of the variation of the independent variable.
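A sketch of that calculation with hypothetical data, assuming the fitted line y_hat = 1.3 + 0.9x: first compute the standard error of the estimate, s_e = sqrt(SSE/(n - 2)), then divide by the square root of the variation of x, sum((x - x_bar)^2).

```python
import math

# Hypothetical data and fitted values
x = [1, 2, 3, 4, 5]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
y_hat = [1.3 + 0.9 * xi for xi in x]
n = len(y)

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
s_e = math.sqrt(sse / (n - 2))                    # standard error of the estimate
x_bar = sum(x) / len(x)
x_variation = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared deviations of x
se_slope = s_e / math.sqrt(x_variation)
print(round(se_slope, 4))  # 0.2517
```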
Sum of squares error (SSE)
A measure of the total deviation between observed and estimated values of the dependent variable. It is calculated by subtracting each estimated value Ŷᵢ from its corresponding observed value Yᵢ, squaring each of these differences, and then summing all of these squared differences.
Sum of squares regression (SSR)
A measure of the explained variation in the dependent variable, calculated as the sum of the squared differences between the predicted value of the dependent variable, Ŷᵢ, based on the estimated regression line, and the mean of the dependent variable, Ȳ.
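For an OLS fit, the total variation decomposes exactly into the explained and unexplained pieces: SST = SSR + SSE. A numerical check with hypothetical data, assuming the fitted line y_hat = 1.3 + 0.9x:

```python
# Hypothetical data and OLS fitted values
x = [1, 2, 3, 4, 5]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
y_hat = [1.3 + 0.9 * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
print(round(sst, 4), round(ssr + sse, 4))  # 10.0 10.0
```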