ANCOVA AND MANCOVA Flashcards
List 3 general points about ANCOVA
ANCOVA can be used with all types of ANOVA
Can even have a changing (time-varying) covariate in a repeated measures design (but not in SPSS)
ANCOVA is equivalent to multiple regression
What is a covariate?
- a potential covariate is any continuous variable (a categorical one would be added as a factor instead) that is significantly correlated with the outcome variable (DV)
- we assume a linear relationship between the covariate (x) and the DV (y)
(Null hypothesis: means on the DV do not differ across groups after adjustment for scores on one or more covariates)
What is a common design that would use an ANCOVA method?
In a two group pretest-posttest design, the pretest can be used as a covariate because how a subject scores before treatment is usually correlated with how they score after treatment
Or
Two groups compared on some score of achievement, with IQ as a covariate, because IQ may be correlated with achievement
If you remove the variance explained by a covariate from the error term underneath the F ratio, the F ratio will get bigger. What will happen to the t value?
It will get bigger too, which may then lead to a sig. result
What do you do to a correlation to get the amount of variance in the DV explained by the other variable (here, the covariate)?
In the context of ANCOVA: IQ and achievement
Square the correlation
E.g. the correlation between IQ and achievement is 0.80
Squared = 0.64, so 64% of the variance in achievement is explained by IQ
So if you remove this variance, your intervention no longer has to show up against 100% of the variance in achievement, only against the remaining 36%. The same effect is then proportionally bigger: say 5 percentage points out of 100 compared with 5 out of 36.
ANCOVA removes that portion from the error term and thus increases statistical power
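A small arithmetic sketch of the card above. The correlation of 0.80 comes from the example; the 5-point treatment effect is only an illustrative assumption.

```python
# Correlation between the covariate (IQ) and the DV (achievement), from the example.
r = 0.80

# Squaring the correlation gives the proportion of DV variance shared with the covariate.
variance_explained = r ** 2
variance_remaining = 1 - variance_explained

# Hypothetical effect worth 5% of the total DV variance:
# its share of the variance that is left once the covariate is removed.
effect = 0.05
effect_share_after = effect / variance_remaining

print(round(variance_explained, 2))   # proportion explained by IQ
print(round(variance_remaining, 2))   # proportion left for treatment + error
print(round(effect_share_after, 3))   # same effect, relative to the smaller pool
```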
For an ANCOVA analysis, what happens to the sums of squares error term from analysis 1 (without the covariate) to analysis 2 (with the covariate)?
It gets smaller
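A minimal sketch of why the error term shrinks, assuming one covariate and taking "analysis 1" as the ANOVA without the covariate and "analysis 2" as the ANCOVA. The scores and the helper name within_group_sums are made up for illustration.

```python
def within_group_sums(groups):
    """Pooled within-group SS for x, SS for y, and sum of cross-products."""
    ss_x = ss_y = sp_xy = 0.0
    for pairs in groups.values():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        ss_x += sum((x - mx) ** 2 for x, _ in pairs)
        ss_y += sum((y - my) ** 2 for _, y in pairs)
        sp_xy += sum((x - mx) * (y - my) for x, y in pairs)
    return ss_x, ss_y, sp_xy

# Hypothetical (covariate, DV) scores for two groups.
data = {
    "control":   [(90, 48), (100, 55), (110, 61), (120, 70)],
    "treatment": [(92, 56), (101, 60), (109, 69), (118, 75)],
}

ss_x, ss_y, sp_xy = within_group_sums(data)
n_total = sum(len(pairs) for pairs in data.values())
n_groups = len(data)

sse_anova = ss_y                          # analysis 1: error SS, no covariate
sse_ancova = ss_y - sp_xy ** 2 / ss_x     # analysis 2: covariate variance removed
df_anova = n_total - n_groups             # error df without the covariate
df_ancova = n_total - n_groups - 1        # one df is spent on the covariate

print(sse_anova, round(sse_ancova, 2), df_anova, df_ancova)
```

Note the trade-off: the error SS drops, but one error degree of freedom is spent per covariate.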
What is the error term underneath the F test?
F = effect / error, so the error term is the denominator of the F ratio
How are the effects of the IVs assessed in an ANCOVA?
By holding the covariate constant (i.e. treating each subject as if they scored at the mean of the covariate. So basically you adjust their achievement score to where it would be if they had scored at the mean of the covariate, IQ)
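The adjustment can be sketched as follows, assuming a single covariate and the usual pooled within-group slope: adjusted mean = group DV mean - slope x (group covariate mean - grand covariate mean). The data and the function name adjusted_means are hypothetical.

```python
def adjusted_means(groups):
    """Covariate-adjusted group means for one covariate (pooled within-group slope)."""
    sp = ss_x = 0.0
    stats = {}
    all_x = []
    for name, pairs in groups.items():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        sp += sum((x - mx) * (y - my) for x, y in pairs)
        ss_x += sum((x - mx) ** 2 for x, _ in pairs)
        stats[name] = (mx, my)
        all_x.extend(x for x, _ in pairs)
    slope = sp / ss_x                      # pooled within-group regression slope
    grand_mx = sum(all_x) / len(all_x)     # grand mean of the covariate
    # Move each group's DV mean to where it would sit at the grand covariate mean.
    return {name: my - slope * (mx - grand_mx) for name, (mx, my) in stats.items()}

# Hypothetical (IQ, achievement) pairs.
data = {
    "control":   [(90, 50), (100, 55), (110, 60)],
    "treatment": [(100, 62), (110, 67), (120, 72)],
}
print(adjusted_means(data))
```

Here the raw difference in achievement means is 12 points (67 vs 55), but part of it reflects the treatment group's higher IQ, so the adjusted difference is smaller.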
What is an additional test that ANCOVA produces?
Provides a test of significance for the regression of the DV on the covariate(s), ignoring group effects.
There would be no point doing an ANCOVA if there wasn’t a significant relationship between the covariate and the DV, as you would actually be adding noise to your analysis, not taking it away. You would probably have tested this already with a correlation
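A sketch of that preliminary correlation check, assuming a plain Pearson correlation tested with t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 df. The scores are invented; compare the t value against a t table.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def t_for_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical covariate (IQ) and DV (achievement) scores.
iq = [95, 100, 105, 110, 120, 125]
achievement = [50, 54, 53, 60, 66, 65]

r = pearson_r(iq, achievement)
print(round(r, 3), round(t_for_r(r, len(iq)), 2))
```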
What are the assumptions of ANCOVA?
Usual ANOVA assumptions
- absence of outliers (both univariate and multivariate outliers among DVs and covariates)
- homogeneity of variances for DV and covariate (not so important if you have equal group sizes and not wildly disparate variances)
Usual MR assumptions:
Eliminate highly correlated covariates (multicollinearity and singularity)
Relationship between DV and covariate, and between covariates, should be linear
Additional for ANCOVA
Covariate is independent of treatment (with random allocation this has to be true; if the groups did differ on the covariate it would be sampling error, and one time in 20 this would be the case).
Homogeneity of the regression slopes (slopes for 1 covariate, planes for 2 covariates, hyper-planes for multiple covariates)
The covariate is measured without (much) error (reliable covariate)
If the assumption of absence of outliers is violated in ANCOVA, what does this mean?
This is a serious violation
What can you do if the relationship between the DV and the covariate is curvilinear? i.e. the assumption that this relationship should be linear is violated
- if nothing is done, adjustment of the means will be improper: it will be biased because it assumes linearity
- transform the data (only possible when the relationship is monotonic)
- fit a polynomial ANCOVA model to the data
What if the assumption that the covariate should not have much error is violated? i.e. The covariate has high measurement error
For randomised designs, power is reduced
For nonrandomised designs, the effect is serious
What are the preliminary data checks for an ANCOVA?
- examination of histograms for the DV and the covariate
(All distributions should be approximately normal & there should be no extreme outliers)
- check the homogeneity of variance assumption, esp. if group sizes are very different
- examination of scatter plots between the DV and the covariate (should be approx. linear)
- covariate should be measured before the onset of treatment (this makes sure the covariate is independent of the treatment)
- covariate was measured reliably
How can you test the homogeneity of regression (slopes) in SPSS?
- remember that ANCOVA will fit a single regression slope to all groups, so if the slopes are different this can be a serious issue
- include covariate-by-IV interaction term(s) in the model, as well as main effects (basically just like a moderation analysis in regression)
- if these interactions are sig. then there is heterogeneity of regression and ANCOVA is inappropriate
- in SPSS, the model button allows you to specify the model
- note a ‘full factorial model’ (SPSS default) does not include interactions between covariates and IVs (this model is full in the context of the factors but it doesn’t include covariates), so you need to add the covariate-by-IV interaction term(s) yourself
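Outside SPSS, the same test can be sketched by hand for one covariate: compare the error SS when all groups share one slope with the error SS when each group gets its own slope. The function name and data below are hypothetical, and the F value still has to be compared against an F table with the returned df.

```python
def slope_homogeneity_f(groups):
    """F statistic for the covariate-by-group interaction (one covariate)."""
    ss_x_pool = ss_y_pool = sp_pool = 0.0
    sse_separate = 0.0
    n_total = 0
    for pairs in groups.values():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        ss_x = sum((x - mx) ** 2 for x, _ in pairs)
        ss_y = sum((y - my) ** 2 for _, y in pairs)
        sp = sum((x - mx) * (y - my) for x, y in pairs)
        sse_separate += ss_y - sp ** 2 / ss_x   # residual SS using this group's own slope
        ss_x_pool += ss_x
        ss_y_pool += ss_y
        sp_pool += sp
        n_total += len(pairs)
    sse_common = ss_y_pool - sp_pool ** 2 / ss_x_pool   # residual SS with one shared slope
    df_num = len(groups) - 1
    df_den = n_total - 2 * len(groups)
    f = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return f, df_num, df_den

# Hypothetical data: roughly parallel slopes vs. clearly opposite slopes.
parallel = {"a": [(1, 2.1), (2, 2.9), (3, 4.2), (4, 4.8)],
            "b": [(1, 4.0), (2, 5.1), (3, 5.9), (4, 7.2)]}
crossing = {"a": [(1, 2.1), (2, 2.9), (3, 4.2), (4, 4.8)],
            "b": [(1, 7.2), (2, 5.9), (3, 5.1), (4, 4.0)]}

print(slope_homogeneity_f(parallel))
print(slope_homogeneity_f(crossing))
```

A small F (parallel case) is consistent with homogeneous slopes; a large F (crossing case) signals heterogeneity of regression.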
How do you test homogeneity of variances in ANCOVA?
Levene’s test on the DV and the covariate
If sig. then this may be an issue, but if there is a good sample size, equal groups and the largest SD is less than twice the smallest then it’s okay (i.e. the largest variance must be less than 4 times the smallest)
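The rule of thumb on the variances can be checked directly. This sketch covers only the largest-to-smallest variance ratio, not Levene's test itself; the function name and scores are illustrative.

```python
import statistics

def variance_ratio(groups):
    """Largest group variance divided by the smallest group variance."""
    variances = [statistics.variance(scores) for scores in groups.values()]
    return max(variances) / min(variances)

# Hypothetical DV scores per group.
dv = {"control": [12, 15, 14, 10, 13], "treatment": [18, 11, 16, 21, 14]}

ratio = variance_ratio(dv)
# Rule of thumb from the card: a ratio of 4 or more (SD ratio of 2 or more) is a concern.
print(round(ratio, 2), "concern" if ratio >= 4 else "okay")
```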
What theoretical issues are there in the context of choice of covariates?
Every time you include a covariate you pay a price, and that price is that the DF for the error term decreases, which makes it harder to get a sig. effect. So there is a point of diminishing returns.
The ideal is a small number of orthogonal (uncorrelated) covariates, each correlated with the DV, which means they are explaining different bits of the DV.
Goal - maximum adjustment of the DV with minimum loss of DF
Covariates must be independent of treatment
(Data on covariates must be gathered before the treatment is started)
- covariates should be reliable (i.e. covariate measurement is not much contaminated by noise, since measurement noise distorts the resulting effects and is very problematic for small samples)
What are the 3 uses for (M)ANCOVA
- to increase power by reducing error term in experimental work (with random assignment to groups)
- to adjust for mismatch on nuisance variables in non experimental work (this is the tricky case)
- stepdown analyses to follow-up MANOVA
If the covariate was measured post-treatment, what might this do?
Lose independence
If the covariate is measured post-treatment and is affected by treatment, then the change in the covariate will be correlated with the DV
Covariate adjustment will then likely remove part of the treatment effect
How many covariates?
For small group sizes, max. two or three covariates
Rule of thumb: (number of covariates + number of groups - 1) / total sample size should be less than 0.10
(If too many covariates are used, the estimates of adjusted means will be unstable)
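The rule of thumb as a quick check. The function name and the example design sizes are assumptions for illustration.

```python
def covariate_ratio(n_covariates, n_groups, n_total):
    """(covariates + groups - 1) / N; the rule of thumb wants this below 0.10."""
    return (n_covariates + n_groups - 1) / n_total

# Hypothetical designs.
print(round(covariate_ratio(2, 3, 60), 3))   # 2 covariates, 3 groups, N = 60
print(round(covariate_ratio(3, 2, 30), 3))   # 3 covariates, 2 groups, N = 30
```

The first design stays under 0.10; the second exceeds it, so it would need fewer covariates or a larger sample.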
Discuss theoretical issues in ANCOVA surrounding random vs. nonrandom assignment
In random assignment (experimental) designs, group differences in covariates will be due to chance (as long as covariates are measured before assignment)
With nonrandom assignment (common in psychology) covariate differences may reflect meaningful substantive differences related to group membership (e.g. depression vs. controls with a covariate of anxiety: the groups are nonrandom and those with depression may have higher anxiety scores)
Why is ANCOVA invalid when groups differ on the covariate?
ANCOVA looks at the relationship between the DV and the group (IV)
You don’t know what group represents when covariate and group are related
The group variable is altered so that it may no longer measure what it was intended to measure
ANCOVA may remove part of the treatment effect or produce a spurious effect
What is Lord’s (1967) paradox?
E.g.
Weight of boys and girls at the beginning and end of an academic year
- observations: i) boys weigh more than the girls at the start and at the end, ii) the average weight of neither group changes over time
Question
Does diet affect boys and girls differentially?
Conclusion 1 - normal ANOVA: diet has no effect on weight
ANCOVA: initial weight as covariate
End weight as DV
Gender as IV
Conclusion 2: boys show more weight gain than girls when initial weight differences are adjusted
Basically ANCOVA ends up making an artificial comparison: the weight gain of underweight boys against the weight gain of overweight girls
You end up comparing on a group variable that no longer means what you are interested in, because assignment to groups is nonrandom
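The paradox can be reproduced with a tiny made-up dataset (not Lord's own numbers): neither group changes on average, yet the covariate-adjusted end weights differ because the groups start at different weights and the within-group slope is below 1.

```python
# Hypothetical (start weight, end weight) pairs in kg.
boys = [(60, 65), (70, 70), (80, 75)]
girls = [(40, 45), (50, 50), (60, 55)]

def mean(values):
    return sum(values) / len(values)

def mean_change(pairs):
    return mean([end - start for start, end in pairs])

def pooled_within_slope(*groups):
    """Regression slope of end weight on start weight, pooled within groups."""
    sp = ss_x = 0.0
    for pairs in groups:
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        sp += sum((x - mx) * (y - my) for x, y in pairs)
        ss_x += sum((x - mx) ** 2 for x, _ in pairs)
    return sp / ss_x

def adjusted_mean(pairs, slope, grand_mean_x):
    """Group end-weight mean, moved to the grand mean of start weight."""
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    return my - slope * (mx - grand_mean_x)

slope = pooled_within_slope(boys, girls)
grand_x = mean([x for x, _ in boys + girls])

change_boys, change_girls = mean_change(boys), mean_change(girls)
adj_boys = adjusted_mean(boys, slope, grand_x)
adj_girls = adjusted_mean(girls, slope, grand_x)

print(change_boys, change_girls)   # conclusion 1: mean change per group
print(adj_boys, adj_girls)         # conclusion 2: ANCOVA-adjusted end weights
```

Both analyses are arithmetically correct; they answer different questions, which is the point of the paradox.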
Can ANCOVA ever be valid with group differences on covariate?
If group differences arise by chance (e.g. in experiments with random assignment)
Overall and Woodward (1977): if group factor could not have caused the covariate differences
As a useful means of exploring the dataset and clarifying the relationship between the variables (but you can’t say anything about causation if you do this)
What are alternatives to ANCOVA?
You can incorporate the covariate into the analysis - no longer as a covariate but as another substantive variable
- propensity score matching (basically employs a predicted probability of group membership (e.g. treatment vs. control group) based on a set of observed covariates - requires large samples with substantial group overlap, hidden bias may remain)
You can also extend the regression equation in the control/healthy group for the DV and the covariate - analyse residual scores for other / patient group (extending the regression line requires broadening the range of covariate scores in the control group)
You can also do something called blocking
Subjects are measured on potential covariates and then grouped according to their scores (e.g. into groups of high, medium and low)
Groups then become levels of another full-scale IV that are crossed with the levels of the other IVs in a factorial design
(Advantages - none of the assumptions of ANCOVA or within subjects ANOVA - the relationship between covariate and DV need not be linear - can be extended to multiple covariates)
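A sketch of the blocking step, assuming equal-sized blocks formed by rank on the covariate. The cut rule, labels and scores are illustrative choices, not a prescribed procedure.

```python
def block_by_covariate(scores, labels=("low", "medium", "high")):
    """Assign each subject to a block by rank on the covariate (equal-sized blocks)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = [None] * len(scores)
    for rank, subject in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto block indices 0..k-1.
        blocks[subject] = labels[rank * len(labels) // len(scores)]
    return blocks

# Hypothetical IQ scores for six subjects.
iq = [100, 85, 130, 95, 115, 120]
print(block_by_covariate(iq))
```

The resulting block labels can then be entered as an extra factor crossed with the treatment factor.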
What are the assumptions for MANCOVA?
A sig. relationship between the set of DVs and the set of covariates
Homogeneity of the regression hyperplanes
Null hypothesis: equality of adjusted population mean vectors
What are the practical issues around MANCOVA?
MANCOVA serves as a noise-reducing operation where variance associated with covariates is removed from error variance
In non-experimental designs, MANCOVA provides statistical matching of groups (but beware of difficulties in interpretation, just as with ANCOVA)
Check the assumptions (linearity, homogeneity of regression etc) beforehand
Violation has serious effects
If covariates are not reliable, Type I and Type II error rates increase
Check for multicollinearity and singularity
What is specification error?
This is the term given to the idea that if preexisting groups differ systematically on more than the covariate, adjusting for the covariate will leave those other differences intact, thus biasing the estimate of the treatment effect. This is a problem with nonrandom assignment to groups
What is a common issue with interpretation on nonrandom group assignment?
It is impossible to determine whether a given pretreatment difference reflects random error or a true group difference. This uncertainty complicates interpretation of treatment effects because it’s impossible to distinguish a main effect of treatment from an interaction between the effects of treatment and the pretreatment difference, and from meaningful overlap (variance shared) between the treatment and the pretreatment characteristics.
Often overlooked is the overlap between pretreatment difference and the grouping factor
What was ANCOVA developed for?
To improve the power of the test of the independent variable, not to CONTROL for anything
How can you tell if the covariate and the grouping variable share variance and what sort of design is most likely to give this result?
They will be correlated
A quasi-experimental design (nonrandom assignment of groups)
Why is it important to measure the covariate as reliably as possible?
To maximise its ability to capture noise variance in the DV
To ensure that the adjusted DV (DVres) is not contaminated by noise associated with measurement of the covariate
This issue is particularly important in small sample sizes
If the assumption of homogeneity of regression slopes is violated, i.e. heterogeneity of regression slopes is found, how severe is the violation and what is a way around it?
Not that serious
The investigator is encouraged simply to frame the analysis as a hierarchical or simultaneous regression and in that context to include an interaction term consisting of the product of covariate x group
(The covariate then becomes a meaningful part of the analysis)
What does the observed power tell us?
The observed power (aka post hoc power) is the chance of getting a sig. Result if you reproduced the study, if there was a difference between the populations which was exactly the same as the difference you observe in the samples, and if everything else was the same about the experiment including sample size.
Basically it doesn’t tell you anything more than the significance test. If your observed power is low, p will not be sig.
Better to use an estimated effect size prior to the study to identify how much power you have to detect an effect of that assumed size by your experiment
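A rough a priori power sketch for a two-group comparison, under loudly stated assumptions: a normal approximation to the two-sided test (the negligible opposite tail is ignored), an assumed standardised effect size d, and a covariate that shrinks the error SD by sqrt(1 - r^2) without counting the lost degree of freedom. Dedicated power software will give more exact figures.

```python
from statistics import NormalDist

def approx_power_two_groups(d, n_per_group, alpha=0.05, covariate_r=0.0):
    """Approximate power to detect effect size d with n subjects per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Removing covariate variance shrinks the error SD, which inflates the effective d.
    d_effective = d / (1 - covariate_r ** 2) ** 0.5
    noncentrality = d_effective * (n_per_group / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)

# Assumed effect size d = 0.5 with 64 subjects per group.
print(round(approx_power_two_groups(0.5, 64), 2))
# Same design with a covariate correlating 0.8 with the DV.
print(round(approx_power_two_groups(0.5, 64, covariate_r=0.8), 2))
```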
When running an ANCOVA what aspects of the experimental design give you confidence to run the analysis?
Random allocation to groups gives confidence
Covariate tested before allocation
And check the correlation between the covariate and the DV: if they are highly correlated, this suggests that an ANCOVA may be useful.
Where do you look in the output for the homogeneity of regression assumption in ANCOVA?
You look in the between-subjects effects box (univariate output) and you look at the interaction term
What do the estimated marginal means show?
They show the group means after adjusting for the covariate
From the multivariate output of an ANCOVA (i.e. MANCOVA) can you determine whether there is the necessary homogeneity of regression for this analysis?
Yes, you look at the interaction term and if it’s not sig. then there is homogeneity of regression slopes
What is the Roy-Bargmann step-down analysis?
It is carried out after finding a sig. effect in MANOVA
The RB tries to find the group differences on each individual DV, allowing for group differences on the higher-priority DVs.
One has to have an a priori priority ordering of the DVs: begin with the highest priority and test it with a simple ANOVA (adjusting for the total number of comparisons), then test the next highest with the higher-priority DV(s) as covariates, and so on
Seeing if there is an effect on a particular DV even after removing the influence of a higher-priority DV (like hierarchical regression)