Chapter 13 Flashcards
Y hat = Ŷ
Define a Simple Linear Regression
AKA: The Prediction Line
Ŷi = b0 + b1Xi
- Ŷi = predicted value of Y for observation i
- Xi = value of X for observation i
- b0 = sample Y intercept
- b1 = sample slope
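The prediction line above can be sketched in code. This is a minimal illustration, not the book's method; the data and names (X, Y, b0, b1, predict) are made up for the example.

```python
# Hypothetical data, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [3, 5, 7, 9, 11]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Slope b1 = sum of cross-products of deviations / sum of squared X deviations
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar  # intercept: the line passes through (x_bar, y_bar)

def predict(x):
    """Prediction line: Yhat_i = b0 + b1 * X_i"""
    return b0 + b1 * x
```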
Review the Computational Formula for the Slope, b1
See page 433 and 434 in book
When using a regression model, only use what?
The relevant range of the independent variable in making predictions
i.e. you can interpolate within the relevant range of X, but you cannot extrapolate beyond it
Symbols for Y Intercept and Slope in Regression Testing
Y intercept = b0
Slope = b1
Define the Least-squares method
A method that minimizes the sum of the squared differences between the actual values (Yi) and the predicted values (Ŷi)
Define the three measures of Variation
Needed when using the least-squares method to determine the regression coefficients
- SST - Total Sum of Squares
- SSR - Regression Sum of Squares (This is the explained variation)
- SSE - Error Sum of Squares (This is the unexplained variation)
SST = SSR + SSE
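The decomposition SST = SSR + SSE can be checked numerically. A minimal sketch on made-up data (the values and variable names are hypothetical, for illustration only):

```python
# Hypothetical data used only to illustrate the decomposition.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares coefficients
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar
Yhat = [b0 + b1 * x for x in X]

SST = sum((y - y_bar) ** 2 for y in Y)               # total variation
SSR = sum((yh - y_bar) ** 2 for yh in Yhat)          # explained variation
SSE = sum((y - yh) ** 2 for y, yh in zip(Y, Yhat))   # unexplained variation

# The three measures satisfy SST = SSR + SSE (up to rounding)
assert abs(SST - (SSR + SSE)) < 1e-9
```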
Define the Total Sum of Squares (SST)
The total variation of the Yi values around their mean Ȳ: SST = Σ(Yi − Ȳ)²
Define the Regression Sum of Squares (SSR)
The variation explained by the relationship between X and Y: SSR = Σ(Ŷi − Ȳ)²
Define the Error Sum of Squares (SSE)
The variation due to factors other than the relationship between X and Y: SSE = Σ(Yi − Ŷi)²
Define the Coefficient of Determination
Measures the proportion of variation of Y that is explained by the variation in the independent variable X
The range of r2 is from 0 to 1
The greater the value, the more the variation in Y in the regression model can be explained by the variation in X
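A small sketch of computing r2 = SSR / SST, on made-up data (names and values are hypothetical, for illustration only):

```python
# Hypothetical data; r2 measures the proportion of variation in Y
# explained by the variation in X.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar
Yhat = [b0 + b1 * x for x in X]

SST = sum((y - y_bar) ** 2 for y in Y)
SSR = sum((yh - y_bar) ** 2 for yh in Yhat)

r2 = SSR / SST  # coefficient of determination, between 0 and 1
```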
Review Computational Formulas p.440
Do I need these equations?
Define the Standard Error of the Estimate
The standard deviation around the prediction line (i.e. Measures the variability of the observed Y values from the predicted Y values)
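The standard error of the estimate is computed from SSE as S_YX = sqrt(SSE / (n − 2)). A minimal sketch on made-up data (names and values are hypothetical):

```python
import math

# Hypothetical data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar
Yhat = [b0 + b1 * x for x in X]

SSE = sum((y - yh) ** 2 for y, yh in zip(Y, Yhat))

# Standard error of the estimate: variability of observed Y around the
# prediction line; n - 2 degrees of freedom (two estimated coefficients).
S_YX = math.sqrt(SSE / (n - 2))
```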
What are the 4 Assumptions of Regression
L.I.N.E.
- Linearity (Rel’t b/t the variables is linear)
- Independence of errors (requires that the errors, εi, are independent of one another)
- Normality of error (requires that the errors, εi, be normally distributed at each value of X)
- Equal variance (requires that the variance of the errors, εi, be constant for all values of X)
Define Residual Analysis
Visually evaluates the assumptions of regression for a set of data to determine whether the regression model selected is appropriate
The residual, or estimated error value, ei, is the difference b/t the observed (Yi) and predicted (Ŷi) values of the dependent variable for a given value of Xi
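A sketch of computing the residuals ei = Yi − Ŷi, on made-up data (names and values are hypothetical). One handy property of a least-squares fit is that the residuals sum to (essentially) zero:

```python
# Hypothetical data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar
Yhat = [b0 + b1 * x for x in X]

# Residual e_i = observed Y_i minus predicted Yhat_i
residuals = [y - yh for y, yh in zip(Y, Yhat)]

# For a least-squares fit, the residuals sum to zero (up to rounding),
# so any visible pattern in a residual plot signals a violated assumption.
assert abs(sum(residuals)) < 1e-9
```

Plotting these residuals against Xi (or against collection order) is what the following cards use to check each assumption.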
Evaluate the Assumption of Linearity
- Plot the residuals on the vertical axis against the corresponding Xi values of the independent variable on the horizontal axis
- If it’s appropriate, there will be no apparent pattern in the plot
- If there is, you will need another model (e.g. quadratic or curvilinear)
(See attached for examples of plots that are not appropriate)
Evaluate the Assumption of Independence
- Evaluate the assumption of independence of the errors by plotting the residuals in the order or sequence in which the data were collected.
- If the residuals show a cyclical pattern, then the assumption does not hold
Evaluate the Assumption of Normality
Evaluate the assumption of normality in the errors by constructing a histogram or normal probability plot.
If the errors are normally distributed, the histogram will be roughly symmetric and the normal probability plot will be approximately a straight line
Evaluate the Assumption of Equal Variance
- You can evaluate the assumption of equal variance from a plot of the residuals with Xi.
- There should appear to be no difference in the variability
For an example where there is a difference in variability, see image