The bias in linear models (L7) Flashcards
What are standardized beta values?
They tell you the change in the outcome (in SDs) associated with a one-SD change in a predictor. Because every value is expressed in SD units, multiple predictors are easy to compare.
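As a rough illustration of the card above, a standardized beta can be obtained by rescaling the unstandardized slope b by the ratio of the predictor's SD to the outcome's SD. This is a minimal sketch for a single predictor; the data and the names `hours`, `score`, `slope` and `standardized_beta` are made up for the example.

```python
from statistics import mean, stdev

def slope(x, y):
    """Unstandardized least-squares slope b for one predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(b, x, y):
    """Rescale b so it is expressed in SD units: beta = b * SD(x) / SD(y)."""
    return b * stdev(x) / stdev(y)

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b = slope(hours, score)                    # change in score per 1 hour
beta = standardized_beta(b, hours, score)  # change in score SDs per 1 SD of hours
print(round(b, 3), round(beta, 3))
```

With only one predictor, the standardized beta equals the Pearson correlation between predictor and outcome; with several predictors the same rescaling is applied to each b separately.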
What is the hierarchical method of selecting predictors?
Experimenter driven: the experimenter decides the order in which predictors are entered into the model. Useful for theory testing. Generally, known predictors are entered into the model first.
What is forced entry method of selecting predictors?
Enter all the predictors into the model at once and see what happens.
What is stepwise method of selecting predictors?
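Predictors are selected statistically, based on their semi-partial correlation with the outcome; should be used only for exploratory analysis. The first predictor entered is the one with the highest simple correlation with the outcome; each later predictor is the one with the largest semi-partial correlation with the outcome, given the predictors already in the model.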
What are residual statistics?
Differences between the model's predictions and the observed data, used to assess the accuracy of the model.
What are influential cases?
Specific cases that have a disproportionately large effect on the model's parameters (NOT the same thing as outliers).
What percentage of standardized residuals should lie between +/- 1.96 SDs of the mean?
95%.
What percentage of standardized residuals should lie between +/- 2.5 SDs of the mean?
About 99% (strictly, 99% corresponds to +/- 2.58 SDs).
What is an outlier?
A case whose standardized residual has an absolute value greater than 3, i.e. it lies more than 3 SDs away from the mean.
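The residual rules of thumb on these cards can be checked with a few lines of code. The sketch below assumes the standardized residuals have already been computed; the values in `z_resid` are invented for illustration, and the cut-offs are parameters (defaulting to 1.96, the exact 99% value of 2.58, and 3).

```python
def residual_check(z_resid, cut95=1.96, cut99=2.58, cut_outlier=3.0):
    """Summarise how many standardized residuals exceed each cut-off."""
    n = len(z_resid)
    share_beyond_95 = sum(abs(z) > cut95 for z in z_resid) / n
    share_beyond_99 = sum(abs(z) > cut99 for z in z_resid) / n
    outliers = [i for i, z in enumerate(z_resid) if abs(z) > cut_outlier]
    return share_beyond_95, share_beyond_99, outliers

# Hypothetical standardized residuals, for illustration only
z_resid = [0.1, -0.5, 1.2, -2.1, 0.3, 3.4, -0.8, 0.9, -1.1, 0.2]

beyond_95, beyond_99, outliers = residual_check(z_resid)
# Expect roughly 5% beyond the first cut-off and 1% beyond the second;
# clearly larger shares suggest the model fits the data poorly.
print(beyond_95, beyond_99, outliers)
```

In this toy sample 20% of residuals fall beyond 1.96 and 10% beyond 2.58, far more than expected, and the case at index 5 would count as an outlier.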
Why are influential cases bad?
They alter the model (i.e. the gradient changes depending on whether they are included in the model).
What is Cook’s distance?
A value produced for every case to quantify its influence on the model (calculated by fitting the model with and without that case and assessing how much the b values, and so the predictions, change). Should be less than 1.
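The "with and without each case" idea can be sketched directly for a one-predictor model. The code below uses a common textbook definition of Cook's distance: the summed squared change in all fitted values when case i is dropped, divided by p times the mean squared error, with p = 2 parameters (intercept and slope). The data and function names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    mx, my = mean(x), mean(y)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b1 * mx, b1

def cooks_distances(x, y):
    """Cook's distance per case, by refitting without each case in turn."""
    n, p = len(x), 2  # p = number of parameters (b0 and b1)
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit the model with case i left out
        a0, a1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        # Total squared shift in predictions caused by removing case i
        shift = sum((fj - (a0 + a1 * xj)) ** 2 for fj, xj in zip(fitted, x))
        distances.append(shift / (p * mse))
    return distances

# Hypothetical data: the last case has an extreme predictor value
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 8.0]

d = cooks_distances(x, y)
flagged = [i for i, di in enumerate(d) if di > 1]
print(flagged)
```

Only the last case exceeds the threshold of 1 here: removing it changes the slope substantially, which is what makes it influential.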
What are the assumptions of linear models?
1) Must be a continuous outcome.
2) Predictor variables should be continuous (or categorical with two categories).
3) Non-zero variance (predictor values must vary)
4) Independence (errors should be uncorrelated).
5) No multicollinearity (predictors should not be highly correlated with each other).
What are ZRESID and ZPRED?
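ZRESID is the standardized residuals and ZPRED is the standardized predicted values. Plotting ZRESID against ZPRED assesses homogeneity of variance and linearity. Don't want to see any patterns in the plot. Funnel shaped = heteroscedasticity, boomerang (curved) = non-linearity.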
How do we diagnose multicollinearity?
Tolerance and VIF.
What is tolerance?
1/VIF; should be > 0.2 (values below 0.2 suggest a potential multicollinearity problem).
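For a model with exactly two predictors, tolerance and VIF follow directly from the correlation r between them: tolerance = 1 - r^2 and VIF = 1 / tolerance. The sketch below covers only that two-predictor case (with more predictors, r^2 is replaced by the R^2 from regressing each predictor on all the others); the data and names are invented for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2
    return tolerance, 1 / tolerance

# Hypothetical predictor values, for illustration only
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

tol, vif = tolerance_and_vif(x1, x2)
print(round(tol, 2), round(vif, 2))
```

Here the predictors correlate at r = 0.8, giving a tolerance of 0.36, which is above the 0.2 rule of thumb on the card.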