Model Assumptions Flashcards
What do we assume in an effects model?
The errors are i.i.d. N(0, sigma^2)
In increasing order of importance (least to most):
1) Residuals (errors) are normally distributed
2) Residuals (errors) have constant variance
3) Residuals (errors) are independent
Things to know about independence of error terms
- Very important! (but not our focus right now)
- Sometimes violated when data are collected sequentially or spatially
- Can check with a time series plot of the residuals (or a plot against another relevant ordering, such as spatial position)
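One numerical companion to the time-order plot is the lag-1 autocorrelation of the residuals. This is a minimal sketch, not a method from the cards: the function name is hypothetical, and it assumes the residuals are listed in collection order and average to roughly zero. Values near 0 are consistent with independence; values near +1 or -1 suggest serial dependence.

```python
def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of residuals given in collection order.

    Assumes the residuals are centered near zero (true for model residuals).
    """
    # Sum of products of each residual with the one collected just before it.
    num = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    # Total sum of squared residuals.
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residuals: the first alternates in sign, the second drifts.
alternating = [1.0, -1.0, 1.0, -1.0]
drifting = [-2.0, -1.0, 1.0, 2.0]
print(lag1_autocorrelation(alternating))
print(lag1_autocorrelation(drifting))
```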
Sometimes our error variance can depend on…
i, or the factor level!
Different factor levels can have different amounts of error variance (heteroskedasticity).
How do we assess our residuals (error term) for heteroskedasticity?
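Plot the residuals vs. the predicted values! (e_ij vs. y_ij-hat, where y_ij-hat = Y_i•-bar in a CRD)
If the spread of the residuals shows no trend across the predicted values, then our assumption is okay.
(Bad pattern: a megaphone shape, where the spread grows with the predicted value)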
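The residuals-vs-predicted check can be sketched numerically for a CRD, where each predicted value is the factor-level mean. The data and names below are hypothetical; the sketch only builds the (predicted, residual) pairs you would plot and reports the residual range at each predicted value.

```python
from statistics import mean

# Hypothetical CRD data: responses grouped by factor level.
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [20.0, 26.0, 23.0],
    "C": [40.0, 52.0, 46.0],
}


def residuals_vs_fitted(groups):
    """Return (fitted, residual) pairs; in a CRD the fitted value is the level mean."""
    pairs = []
    for ys in groups.values():
        level_mean = mean(ys)
        pairs.extend((level_mean, y - level_mean) for y in ys)
    return pairs


pairs = residuals_vs_fitted(data)

# Track the smallest and largest residual at each fitted value.
spread = {}
for fitted, e in pairs:
    lo, hi = spread.get(fitted, (e, e))
    spread[fitted] = (min(lo, e), max(hi, e))

# A residual range that grows with the fitted value is the megaphone pattern.
for fitted in sorted(spread):
    lo, hi = spread[fitted]
    print(f"fitted={fitted:6.2f}  residual range={hi - lo:5.2f}")
```

In this made-up data the range widens as the fitted value increases, which is the pattern to worry about.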
What is a modified Levene test? What does it test?
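A numerical (rather than graphical) test for constant error variance, based on absolute deviations from each factor level's median.
H_0: sigma_1^2 = sigma_2^2 = … = sigma_g^2
where sigma_i^2 is the error variance for factor level i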
When is the modified Levene test useful?
When we specifically want to test whether different factor levels have different variances. But for checking model assumptions, it’s best to just use graphical verification.
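A minimal sketch of the modified Levene statistic, assuming the usual construction: take absolute deviations from each factor level's median, then run a one-way ANOVA on those deviations. The function name and data are hypothetical. The sketch returns only the F statistic and its degrees of freedom; getting a p-value needs an F distribution, which is left out here.

```python
from statistics import mean, median


def modified_levene_F(groups):
    """One-way ANOVA F statistic on absolute deviations from level medians.

    groups: list of lists of responses, one inner list per factor level.
    Returns (F, numerator df, denominator df).
    """
    # d_ij = |y_ij - median_i|
    devs = [[abs(y - median(ys)) for y in ys] for ys in groups]
    g = len(devs)
    n_total = sum(len(ds) for ds in devs)
    grand = mean([d for ds in devs for d in ds])
    # Between-level and within-level sums of squares of the deviations.
    ss_between = sum(len(ds) * (mean(ds) - grand) ** 2 for ds in devs)
    ss_within = sum((d - mean(ds)) ** 2 for ds in devs for d in ds)
    f_stat = (ss_between / (g - 1)) / (ss_within / (n_total - g))
    return f_stat, g - 1, n_total - g


# Hypothetical data whose spread grows from level to level.
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [20.0, 26.0, 23.0, 29.0],
    [40.0, 52.0, 46.0, 58.0],
]
f_stat, df1, df2 = modified_levene_F(groups)
print(f_stat, df1, df2)
```

A large F relative to the F(g-1, N-g) reference distribution is evidence against H_0.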
To check the normality of error terms (residuals) we can do…
a normal probability plot (Q-Q plot) on the residuals.
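The points of a normal Q-Q plot can be computed by hand, as a rough sketch. The function name is hypothetical, and the plotting positions (i - 0.5)/n are one common convention, assumed here rather than taken from the cards. If the residuals are roughly normal, the pairs fall close to a straight line.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals.
for q, e in normal_qq_points([0.3, -1.2, 0.8, -0.1, 0.2]):
    print(f"{q:7.3f}  {e:6.2f}")
```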
What is an extreme form of non-normality?
Outliers!
• Check for these visually
• Be reluctant to throw out valid data
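What is RStudent?
Studentized residuals! (Each residual is divided by its estimated standard deviation; in SAS, RStudent computes that estimate with the observation itself left out.)
How do we check for outliers (visually)?
RStudent vs. Predicted Value
Look for RStudent values far outside the reference lines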
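A sketch of how these values could be computed for a one-way (CRD) model, assuming RStudent means the externally studentized residual (the error variance is re-estimated with the observation left out). In a CRD the leverage of an observation in level i is 1/n_i. The function name and data are hypothetical, and every level needs at least two observations.

```python
from math import sqrt
from statistics import mean


def rstudent_crd(groups):
    """Externally studentized residuals for a one-way (CRD) model.

    groups: list of lists of responses, one inner list per factor level.
    """
    g = len(groups)
    n_total = sum(len(ys) for ys in groups)
    resid = [[y - mean(ys) for y in ys] for ys in groups]
    sse = sum(e * e for es in resid for e in es)
    out = []
    for es in resid:
        h = 1.0 / len(es)  # leverage of each observation in this level
        for e in es:
            # Error variance estimate with this observation deleted.
            mse_deleted = (sse - e * e / (1.0 - h)) / (n_total - g - 1)
            out.append(e / sqrt(mse_deleted * (1.0 - h)))
    return out


# Hypothetical data; the last observation is deliberately unusual.
groups = [
    [10.0, 11.0, 12.0, 11.0],
    [20.0, 21.0, 19.0, 20.0],
    [30.0, 31.0, 29.0, 45.0],
]
for r in rstudent_crd(groups):
    print(round(r, 2))
```

Plotting these against the predicted values, with reference lines at an assumed conventional cutoff such as ±2, makes the unusual observation stand out.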
What do we do if the independence assumption is violated?
Identify and account for the dependence structure.
Then fit a different model.
What do we do if the constant variance or normality assumptions are violated?
1) Transform the response variable (Y)
What is the “power” family of transformations?
…, -Y^-2, -Y^-1, -Y^-1/2, log(Y), Y^1/2, Y, Y^2
Box-Cox rule for transformations
If the Box-Cox approach says to use lambda ≠ 0:
- use sign(lambda) * Y^lambda
If the Box-Cox approach says to use lambda = 0:
- use log(Y)
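The rule can be written as a small function. This is a sketch with a hypothetical name; it assumes a positive response and takes lambda as already chosen (it does not estimate lambda). Multiplying by the sign of lambda keeps the transformed values in the same order as the original Y.

```python
from math import copysign, log


def power_transform(y, lam):
    """Apply the power-family transformation for a chosen lambda.

    lam == 0 gives log(y); otherwise sign(lam) * y ** lam.
    Assumes y > 0.
    """
    if y <= 0:
        raise ValueError("the power family needs a positive response")
    if lam == 0:
        return log(y)
    return copysign(1.0, lam) * y ** lam


# Hypothetical responses transformed with lambda = -1: order is preserved.
ys = [2.0, 4.0, 8.0]
print([power_transform(y, -1) for y in ys])
```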