2. Linear Models Flashcards
Difference between prediction and confidence interval in MLR.
Confidence: range for the mean response
Prediction: range for a single new response value
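A minimal statsmodels sketch of the two intervals for a new observation (all data and names below are made up for illustration):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 50)
    y = 2 + 3 * x + rng.normal(0, 1, 50)

    res = sm.OLS(y, sm.add_constant(x)).fit()
    X_new = np.array([[1.0, 5.0]])  # intercept column plus x = 5
    # mean_ci_* columns: confidence interval for the mean response
    # obs_ci_* columns: prediction interval for a single new response
    print(res.get_prediction(X_new).summary_frame(alpha=0.05))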
What is the hierarchical principle with interaction terms in MLR?
A significant interaction term implies that its individual terms should also be in the model, regardless of the t tests associated with the individual terms
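In the statsmodels formula API, the "x1 * x2" shorthand enforces this automatically (data here is made up):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
    df["y"] = 1 + 2 * df.x1 - df.x2 + 0.5 * df.x1 * df.x2 + rng.normal(size=100)

    # "x1 * x2" expands to x1 + x2 + x1:x2, so the interaction always
    # enters together with both main effects (hierarchical principle)
    fit = smf.ols("y ~ x1 * x2", data=df).fit()
    print(fit.params)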
Model diagnostics: what is a misspecified model equation? Give an example.
Incorrectly assuming that the true form of f follows your model.
Example: fitting a straight line when there is evidence of a higher-order polynomial relationship.
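A sketch of the fix using a patsy formula (made-up data, truly quadratic, so the straight line is misspecified):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    df = pd.DataFrame({"x": np.linspace(-3, 3, 80)})
    df["y"] = 1 + df.x**2 + rng.normal(0, 0.5, 80)

    line = smf.ols("y ~ x", data=df).fit()            # misspecified
    quad = smf.ols("y ~ x + I(x**2)", data=df).fit()  # matches the true form
    print(line.rsquared, quad.rsquared)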
Model diagnostics: residuals with non-zero averages. What does this mean?
Residuals are realizations of the true error terms, which are assumed to come from a normal distribution with mean zero; a non-zero average means some aspect of the linear regression model is incorrect.
Model diagnostics: heteroscedasticity. This leads to an unreliable _____
The variance of the error term is not constant; there is evidence of more than one variance parameter.
This leads to an unreliable MSE, so all outputs that rely on the MSE are also unreliable.
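One way to check for it is the Breusch-Pagan test in statsmodels (a sketch on made-up data where the spread grows with x):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_breuschpagan

    rng = np.random.default_rng(3)
    x = rng.uniform(1, 10, 200)
    y = 2 + 3 * x + rng.normal(0, x)   # error spread grows with x
    X = sm.add_constant(x)

    res = sm.OLS(y, X).fit()
    # Small p-value -> evidence against constant error variance
    lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, X)
    print(lm_pval)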
Model diagnostics: dependent errors, what does this mean in terms of the Y’s? The ___ will also be underestimated, leading to ____ CI and PI intervals.
This means that the Y’s have non-zero covariances.
The standard errors will be underestimated, making the intervals narrower and the p-values smaller than they should be.
Model diagnostics: why is it bad if error terms are non-normal?
Then we are unable to make inferences based on the F and t distributions.
Model diagnostics: multicollinearity. What is this, and what does it lead to?
When one predictor is strongly correlated with another predictor (or with a combination of the others). This makes the estimates of the regression coefficients unstable.
Does multicollinearity affect the predictive power of y-hat, MSE or F test results?
No
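Variance inflation factors are the usual way to quantify it (a sketch; made-up data where x2 is nearly a copy of x1):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(4)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(0, 0.1, 100)   # nearly a copy of x1
    X = sm.add_constant(np.column_stack([x1, x2]))

    # A large VIF (a common rule of thumb is > 10) flags a predictor
    for i in range(1, X.shape[1]):      # skip the intercept column
        print(variance_inflation_factor(X, i))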
Model diagnostics: unusual points. What are these and what do they do to the model?
Outliers: extreme residuals
High leverage point: an observation with an unusual set of predictor values. The bj's are sensitive to these points, so a single such point can greatly affect the shape of the fitted model.
Model diagnostics: high dimensions, what does this mean?
There are too many predictors relative to the number of observations, so the model is too flexible and overfits the data.
Which of the following can challenge the interpretation of a regression coefficient?
Misspecified model equation, multicollinearity, high leverage points
Misspecified model equation does make interpreting the bj’s problematic.
Multicollinearity masks which predictors are actually meaningful to the model.
High leverage points have a strong influence over the bj’s.
Answer: all
What is the formula for leverage? In SLR?
Formula sheet
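For reference, the standard textbook forms (assuming the usual notation; verify against the course formula sheet):

    h_{ii} = \mathbf{x}_i^\top (X^\top X)^{-1} \mathbf{x}_i
    \text{SLR: } h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{k=1}^{n} (x_k - \bar{x})^2}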
High leverage point is given by what inequality?
h > 3((p+1)/n)
What is a studentized residual? They can be a realization of what distribution?
A unitless version of a residual: the raw residual divided by an appropriate standard error.
They can be realizations of a t distribution with df = n - p - 1.
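In the standard textbook notation (MSE is the residual mean square and h_ii the leverage; verify against the course formula sheet):

    r_i = \frac{e_i}{\sqrt{\mathrm{MSE}\,(1 - h_{ii})}}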
What is the formula for Cook's Distance? What does it measure, and what distribution is it a realization of? An observation has typical influence if D = ?
Formula sheet
Measures influence, realization of the F distribution with ndf = p+1 and ddf = n-p-1
Typical influence if D = 1/n
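The standard textbook form (assuming the usual notation; verify against the course formula sheet):

    D_i = \frac{e_i^2}{(p+1)\,\mathrm{MSE}} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}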
In the plot of e vs y-hat; what makes residuals well behaved?
- Points are randomly scattered and lacking trends. If the residuals seem to be acting as a function of y-hat, the model is likely missing a predictor that can explain the trend (ex. U-shaped: add a positive quadratic term).
- Non-zero average of residuals: check that points are equally spread above and below the 0 line.
- Heteroscedasticity: check for inconsistent spread in the residuals (cone-like shapes).
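A minimal sketch of the e vs. y-hat plot (made-up, well-behaved data):

    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(5)
    x = rng.uniform(0, 10, 100)
    y = 2 + 3 * x + rng.normal(0, 1, 100)
    res = sm.OLS(y, sm.add_constant(x)).fit()

    # Well behaved: random scatter around the 0 line with even spread
    plt.scatter(res.fittedvalues, res.resid)
    plt.axhline(0, color="red")
    plt.xlabel("fitted values (y-hat)")
    plt.ylabel("residuals (e)")
    plt.show()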
How may we solve the issue of heteroscedasticity?
Cone shape opening toward infinity: transform the response using log or square root (any concave function).
Cone shape opening toward 0: use weighted least squares.
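Sketches of both fixes (the WLS weights below are an assumption for illustration; in practice they come from whatever variance structure you believe holds):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    x = rng.uniform(1, 10, 200)
    y = np.exp(1 + 0.3 * x + rng.normal(0, 0.2, 200))
    X = sm.add_constant(x)

    # Spread grows with the response: concave transform of y
    fit_log = sm.OLS(np.log(y), X).fit()

    # Assumed Var(error_i) proportional to 1/x_i, so weight_i = x_i
    # (statsmodels weights are proportional to 1/variance)
    fit_wls = sm.WLS(y, X, weights=x).fit()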
How may we solve the issue of dependent errors?
Use a time series model.
How may we solve the issue of non-normal errors?
This typically occurs when the response is discrete in nature; use a model suited to the response type, such as a generalized linear model (e.g., logistic or Poisson regression).
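For a binary response, a logistic GLM sketch in statsmodels (made-up data):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    x = rng.normal(size=200)
    p = 1 / (1 + np.exp(-(0.5 + 2 * x)))
    y = rng.binomial(1, p)              # discrete 0/1 response

    # Logistic regression replaces the normal-errors assumption
    fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial()).fit()
    print(fit.params)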
How may we solve the issue of multicollinearity?
- Exclude all but one of those predictors from the model
- Combine the predictors
- Do nothing and report its presence
- Use orthogonal predictors, then we know that they are uncorrelated
What is a suppressor variable? Should we add these to our model?
This is a case where multicollinearity is acceptable.
This is a predictor that is weakly correlated with the response, but due to being related to other predictors, it enhances their usefulness. This means that adding a suppressor variable leads to a better model, even if it produces multicollinearity.
What happens when the residuals exhibit a predictable pattern from observation to observation?
Use a time series model
What is the e vs i plot used to detect?
Dependence of error terms
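The Durbin-Watson statistic is a quick numeric companion to this plot (a sketch on made-up data with autocorrelated errors; values near 2 suggest uncorrelated errors):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(8)
    x = np.arange(100, dtype=float)
    y = 2 + 0.5 * x + np.cumsum(rng.normal(0, 1, 100))  # dependent errors
    res = sm.OLS(y, sm.add_constant(x)).fit()

    # Near 2: no first-order autocorrelation; near 0 or 4: dependence
    print(durbin_watson(res.resid))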
Is forward selection a greedy approach?
Yes, because at each step it only adds the single next-best predictor to the current model, rather than searching for the best subset of predictors overall.
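A greedy forward-selection sketch (made-up data; using AIC as the criterion is an assumption, any score works):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(9)
    df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["x1", "x2", "x3", "x4"])
    y = 1 + 2 * df.x1 - df.x3 + rng.normal(0, 1, 100)

    selected, remaining, best_aic = [], list(df.columns), np.inf
    while remaining:
        # Greedy step: try each remaining predictor, keep the single best
        scores = {c: sm.OLS(y, sm.add_constant(df[selected + [c]])).fit().aic
                  for c in remaining}
        cand = min(scores, key=scores.get)
        if scores[cand] >= best_aic:
            break                       # no predictor improves the model
        best_aic = scores[cand]
        selected.append(cand)
        remaining.remove(cand)
    print(selected)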