Assumptions Flashcards
What are the 3 main assumptions, and how are they different to anova?
1) Independent observations
2) observations on the DVs follow a MULTIVARIATE NORMAL DISTRIBUTION in each group (anova is just normal distribution on the DV in each group)
3) population covariance matrices of DVs for the groups are equal (HOMOGENEITY OF VARIANCE-COVARIANCE MATRICES) (anova is just homogeneity of variance)
1) independence assumption
When the treatment is independently administered, observations are independent (i.e. don’t want interaction between individuals, such as discussion method of group counselling where observations might influence each other) (Glass and Hopkins, 1984)
Why are independent observations important?
Important because:
A violation is serious
dependent observations happen a lot in social science research
:( a small amount of dependence among observations can cause the actual alpha (Type 1 error rate) to be much higher than the nominal alpha - dependence inflates the Type 1 error rate
What should be done with correlated (dependent) observations?
Test at a more stringent alpha level (alpha = .01 or lower), realising that the actual error rate will be .05 or higher
Use a multivariate analysis technique, like hierarchical/multilevel modelling
If there are several small groups (e.g. counselling) involved in each treatment, combine these and use the group mean as the unit of measurement (sample size will be reduced, but this won’t cause a drastic drop in power as the means are more stable than individual observations)
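The last option above (using the group mean as the unit of measurement) can be sketched in Python. This is a minimal illustration only: the treatment names, group names and scores are all made up for the example.

```python
from statistics import mean

# Hypothetical data: individual scores nested in small counselling groups,
# with several groups per treatment (all values are invented).
scores = {
    "treatment_A": {"group_1": [12, 14, 13], "group_2": [15, 17, 16]},
    "treatment_B": {"group_3": [9, 10, 11], "group_4": [8, 12, 10]},
}

# Replace each small group by its mean, so that the group (not the
# individual) becomes the unit of analysis.
group_means = {
    treatment: [mean(members) for members in groups.values()]
    for treatment, groups in scores.items()
}

for treatment, means in group_means.items():
    print(treatment, means)
```

Each treatment now contributes one value per counselling group, so the sample size per treatment drops from 6 individuals to 2 group means.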
2) Normality assumption
All the DVs have normal distributions
Any linear combo of DVs has a normal distribution
All subsets of the set of variables have a multivariate normal distribution
All pairs of variables must be bivariate normal
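The linear-combination property above can be written out explicitly. The symbols y, mu, Sigma, a and p do not appear in the cards and are introduced here only for illustration: if the p DVs are jointly multivariate normal, then any fixed nonzero weighting of them is univariate normal.

```latex
\mathbf{y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})
\;\Longrightarrow\;
\mathbf{a}^{\top}\mathbf{y} \sim
N\!\left(\mathbf{a}^{\top}\boldsymbol{\mu},\; \mathbf{a}^{\top}\boldsymbol{\Sigma}\,\mathbf{a}\right)
\quad \text{for any fixed } \mathbf{a} \in \mathbb{R}^{p},\ \mathbf{a} \neq \mathbf{0}.
```

Choosing a with a single 1 and zeros elsewhere recovers the first property (each DV on its own is normal).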
How is MV normality different to UV normality?
MV normality is much stricter than UV (violating UV normality only has a slight effect on level of significance or power in anova, the F statistic is only slightly affected)
How can you check for multivariate normality?
Use matrix scatterplots of the DVs, drawn separately for each level of the IV
Should be elliptical
Effect of Multivariate normality violation
:) Like with UV anova, there is not much effect on Type 1 error (for up to 10 variables, for small to moderate sample sizes)
:( affects power- the severity of the effect increases as the violation spreads from one to all groups
The effect is larger for smaller sample sizes
What are the 3 homogeneity assumptions?
1) Homogeneity of variance (anova and manova)
2) Homogeneity of variance-covariance matrices (manova)
3) Homogeneity of covariance (sphericity, repeated measures anova)
When is the F statistic robust against heterogeneous variances?
When group sizes are equal
What happens with unequal group sizes and very different population variances?
If the largest variance is in the small group, the F stat is liberal (actual alpha > nominal alpha)
If the largest variance is in the large group, the F stat is conservative (actual alpha < nominal alpha)
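The liberal and conservative patterns above can be checked with a small simulation. This is a rough sketch under assumed settings, not a result from the cards: the group sizes (5 and 25), the standard deviations (1 and 3), the number of simulations and the helper name rejection_rate are all chosen for illustration. Every group has the same population mean, so each rejection is a Type 1 error.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable


def rejection_rate(sizes, sds, n_sims=2000, alpha=0.05):
    """Share of simulated one-way anovas with p < alpha when H0 is true."""
    rejections = 0
    for _ in range(n_sims):
        # All population means are 0, so any rejection is a Type 1 error.
        groups = [rng.normal(0.0, sd, size=n) for n, sd in zip(sizes, sds)]
        if stats.f_oneway(*groups).pvalue < alpha:
            rejections += 1
    return rejections / n_sims


# Large variance in the small group
liberal = rejection_rate(sizes=(5, 25), sds=(3.0, 1.0))
# Large variance in the large group
conservative = rejection_rate(sizes=(5, 25), sds=(1.0, 3.0))
# Same variances, but equal group sizes
equal_n = rejection_rate(sizes=(15, 15), sds=(3.0, 1.0))

print(f"large variance in small group: {liberal:.3f}")
print(f"large variance in large group: {conservative:.3f}")
print(f"equal group sizes:             {equal_n:.3f}")
```

With these settings the first rate should come out well above the nominal .05 and the second well below it, while the equal-n rate stays close to .05.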
Which tests for homogeneity of variance?
:( Bartlett’s and Cochran’s are quite sensitive to non-normality
:) Levene’s test of equality of error variances is more robust against non-normality
Box’s test (a generalisation of Bartlett’s univariate homogeneity of variance test):
- quite sensitive to non-normality
- check for normality before using the test
- if it’s not normal, do a transformation THEN do the test
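For the univariate tests named above, a minimal sketch using SciPy might look like this. The three groups are simulated (not real data), with one group made deliberately more variable. Box’s test is not covered in this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical scores on one DV for three groups (simulated values).
group_a = rng.normal(50, 5, size=30)
group_b = rng.normal(50, 5, size=30)
group_c = rng.normal(50, 15, size=30)  # deliberately more variable

# Bartlett's test: relies on normality within each group.
bartlett_stat, bartlett_p = stats.bartlett(group_a, group_b, group_c)

# Levene's test centred on the group medians (SciPy's default);
# center="mean" would give the original mean-centred version.
levene_stat, levene_p = stats.levene(group_a, group_b, group_c, center="median")

print(f"Bartlett: stat = {bartlett_stat:.2f}, p = {bartlett_p:.4f}")
print(f"Levene:   stat = {levene_stat:.2f}, p = {levene_p:.4f}")
```

A small p value from either test suggests the group variances differ. Because Bartlett’s test is sensitive to non-normality, a small p there may reflect non-normal data rather than unequal variances.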
Homogeneity of variance-covariance
e.g. a 3x3 variance-covariance matrix
2 matrices are equal only if all corresponding elements are equal
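To make the 3x3 example concrete, here is a small sketch with NumPy. It assumes two hypothetical groups of 40 cases, each measured on 3 DVs. The data are simulated and the variable names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 40 cases per group measured on 3 DVs (simulated).
group_1 = rng.normal(size=(40, 3))
group_2 = rng.normal(size=(40, 3))

# rowvar=False treats columns as variables, giving one 3x3 matrix per group:
# variances on the diagonal, covariances off the diagonal.
cov_1 = np.cov(group_1, rowvar=False)
cov_2 = np.cov(group_2, rowvar=False)

print(np.round(cov_1, 2))
print(np.round(cov_2, 2))

# Sample matrices never match exactly; the assumption is about the
# population matrices, so these element-wise gaps are only descriptive.
print(np.round(np.abs(cov_1 - cov_2), 2))
```

Each matrix is symmetric, so a 3x3 matrix has 6 distinct elements (3 variances and 3 covariances) that must all match across groups.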
When is heterogeneity especially bad?
For very unequal group sizes, even mild heterogeneity can inflate Type 1 error
How do you assess the DVs?
Univariate anovas with statistical correction
If you have a priori priority order of the DVs, you can do a stepdown analysis (after the main effect in manova, do ancovas on low-priority DVs with high-priority DVs as covariates to see if low-priority DVs provide additional group separation beyond what the high-priority ones do)
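The first option above (univariate anovas with a statistical correction) could be sketched as follows. The cards do not say which correction is meant. Bonferroni is used here as one common choice, and that is an assumption of this example. The data are simulated and the DV names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical data: 3 groups of 20 cases on 2 DVs (simulated).
# The third group is shifted on the first DV only.
dv_names = ["dv_1", "dv_2"]
groups = [
    rng.normal([0.0, 0.0], 1.0, size=(20, 2)),
    rng.normal([0.0, 0.0], 1.0, size=(20, 2)),
    rng.normal([2.0, 0.0], 1.0, size=(20, 2)),
]

alpha = 0.05
# Bonferroni (assumed correction): split alpha evenly across the DVs.
adjusted_alpha = alpha / len(dv_names)

results = {}
for j, name in enumerate(dv_names):
    # One univariate anova per DV, using column j from every group.
    f_stat, p_value = stats.f_oneway(*(g[:, j] for g in groups))
    results[name] = (f_stat, p_value, p_value < adjusted_alpha)

for name, (f_stat, p_value, significant) in results.items():
    print(f"{name}: F = {f_stat:.2f}, p = {p_value:.4f}, significant = {significant}")
```

Each DV is tested at alpha divided by the number of DVs, which keeps the overall Type 1 error rate across the follow-up anovas at or below the original alpha.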