CK020 - Linear Regression Flashcards
What are the assumptions of a linear regression model?
- Error terms are homoscedastic
- Error terms are uncorrelated
- Error terms are normally distributed
- Linearity
- No (multi)collinearity
What are ‘standardized residuals’ ?
Residuals divided by an estimate of their standard deviation, which removes the heteroscedasticity of the raw residuals (so they are homoscedastic residuals)
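A minimal sketch of how standardized residuals could be computed, assuming a simple linear regression with one predictor and the common definition e_i / (s * sqrt(1 - h_ii)). The function name and the data are made up for illustration.

```python
import math

def standardized_residuals(x, y):
    """Standardized residuals for the simple regression y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate with n - 2 degrees of freedom (two coefficients).
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    # Leverage h_ii; this closed form holds for simple linear regression only.
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Raw residual i has variance sigma^2 * (1 - h_ii), so divide by its estimate.
    return [e / math.sqrt(s2 * (1 - h)) for e, h in zip(resid, lev)]

# Illustrative data, not taken from the flashcards.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
r = standardized_residuals(x, y)
```

The division by sqrt(1 - h_ii) is what puts residuals at high- and low-leverage points on a common scale.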
What are ‘standardized residuals’ used for?
- Assessing homoscedasticity
- Assessing normality of the residuals
What are ‘studentized residuals’ ?
Residuals scaled so that they follow a known (t-) distribution when the model assumptions hold
What are ‘studentized residuals’ used for?
Identifying outliers
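A sketch of outlier screening with studentized residuals, assuming the externally studentized version (the variance is re-estimated without observation i) and a simple regression with one predictor. The cut-off of 3 is a rule of thumb chosen for illustration, not something the cards prescribe, and the data are made up.

```python
import math

def studentized_residuals(x, y):
    """Externally studentized residuals for the simple regression y = a + b*x."""
    n, p = len(x), 2  # p = number of estimated coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - p)
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    out = []
    for e, h in zip(resid, lev):
        # Leave-one-out variance estimate, obtained without refitting the model.
        s2_i = ((n - p) * s2 - e ** 2 / (1 - h)) / (n - p - 1)
        out.append(e / math.sqrt(s2_i * (1 - h)))
    return out

# Illustrative data: the sixth observation lies far above the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 2.1, 2.9, 4.2, 5.0, 9.0, 7.1, 7.9]
t_res = studentized_residuals(x, y)
# Hypothetical rule-of-thumb cut-off; a t quantile with n - p - 1 df is an alternative.
flagged = [i for i, t in enumerate(t_res) if abs(t) > 3.0]
```

Under the model assumptions each of these residuals follows a t-distribution with n - p - 1 degrees of freedom, which is what makes a cut-off interpretable.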
How to check the ‘normality assumption’ ?
Plot the ordered standardized residuals (y) against the theoretical normal quantiles (x) in a Q-Q plot and check whether the points fall on a straight line.
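A sketch of how the coordinates for such a normality plot could be built with the Python standard library. The plotting positions (i - 0.5) / n are one common convention and an assumption here; the residual values are made up.

```python
from statistics import NormalDist

def qq_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n; other conventions exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Made-up standardized residuals, for illustration only.
example = [-1.3, 0.4, -0.2, 1.1, 0.0, -0.7, 0.8, 1.6, -1.0, 0.3]
points = qq_points(example)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the Q-Q plot; strong curvature or heavy tails suggest a departure from normality.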
What is ‘homoscedasticity’ ?
The error terms have constant variance; in a residual plot, the standardized/studentized residuals are randomly spread around 0 with constant spread
How to check the ‘homoscedasticity assumption’ ?
By plotting the standardized/studentized residuals (y) against the fitted values/covariates (x)
What to do with ‘heteroscedasticity’ ?
- Variable transformation (not recommended)
- Weighted Least Squares
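A minimal sketch of Weighted Least Squares for a single predictor. The weights 1 / x^2 assume that the error variance grows proportionally to x^2; that choice, the function name, and the data are illustrative assumptions.

```python
def weighted_least_squares(x, y, w):
    """Fit y = a + b*x by minimising sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(w)
    # Weighted means of predictor and response.
    x_w = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_w = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_w) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_w) * (yi - y_w) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = y_w - b * x_w
    return a, b

# Illustrative data in which the scatter widens as x grows.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.2, 7.7, 11.5, 13.2, 18.1]
weights = [1 / xi ** 2 for xi in x]  # assumed variance proportional to x**2
a_wls, b_wls = weighted_least_squares(x, y, weights)
# Equal weights reduce the fit to ordinary least squares.
a_ols, b_ols = weighted_least_squares(x, y, [1.0] * len(x))
```

Observations with a larger assumed variance get a smaller weight, so they pull less on the fitted line.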
What is an ‘outlier’ ?
Any observation that does not ‘fit the model’
What is a ‘high leverage point’ ?
Observation with extreme predictor value(s)
What are ‘influential values’ ?
Observations that have excessive influence on the model
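A sketch that puts numbers on leverage and influence, assuming a simple regression with one predictor and using Cook's distance as the influence measure. Cook's distance is one common choice, not something the cards name, and the data are made up.

```python
def leverage_and_cooks(x, y):
    """Return (raw residuals, leverages, Cook's distances) for y = a + b*x."""
    n, p = len(x), 2  # p = number of estimated coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - p)
    # Leverage depends on the predictor values only, not on y.
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines residual size and leverage.
    cooks = [e ** 2 / (p * s2) * h / (1 - h) ** 2 for e, h in zip(resid, lev)]
    return resid, lev, cooks

# Illustrative data: the last observation has an extreme predictor value.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 15.0]
y = [1.2, 1.9, 3.1, 4.0, 5.1, 8.0]
resid, lev, cooks = leverage_and_cooks(x, y)
most_influential = max(range(len(x)), key=lambda i: cooks[i])
```

In this made-up data the point at x = 15 has the highest leverage and the largest Cook's distance even though its raw residual is not the largest, because it pulls the fitted line toward itself.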
Why should you do model diagnostics?
To evaluate whether the assumptions are reasonable for the data at hand