Repeated Measures ANOVA Flashcards
Treatment vs subject variability:
In a between-subjects design, different individuals participate in each treatment, so you would not expect correlations between pairs of treatments. In a within-subjects design, the same subjects are used in all treatment groups, and correlations between treatment pairs would not be equal to zero. For example, a subject who performs well under one condition would also be expected to perform well in other conditions. In repeated-measures ANOVA, we are specifically interested in removing subject differences across the treatments from our error term (the denominator of the F statistic). This gives an overall smaller error term, which means you are more likely to detect smaller treatment effects.
The way in which subject differences can impact a treatment effect is illustrated in the following example:
| Subject | Treatment 1 | Treatment 2 | Treatment 3 | Mean  |
|---------|-------------|-------------|-------------|-------|
| 1       | 2           | 4           | 7           | 4.33  |
| 2       | 10          | 12          | 13          | 11.67 |
| 3       | 22          | 29          | 30          | 27.00 |
| 4       | 30          | 31          | 34          | 31.67 |
| Mean    | 16          | 19          | 21          | 18.67 |
In the above example the dependent variable is the number of trials to criterion on some task (so lower scores reflect better performance). Note that there is a difference in the means across the three treatments (16, 19, and 21), but the difference is not large. There is, however, a much larger difference in the means for subjects (4.33, 11.67, 27.00, 31.67): Subject 1 has performed well under all conditions, but Subjects 3 and 4 have done very poorly.
If we could remove the subject differences, we would have a better (and smaller) estimate of error. It would be possible to eliminate individual differences from the analysis by subtracting each subject's mean from each of their scores, thus artificially creating scores for each subject that have the same mean (which would be zero). In other words, the mean performance of each subject becomes the same: we have eliminated this source of variability between scores. This computational procedure, while effective, would be laborious. Fortunately, a simpler procedure is used in repeated-measures ANOVA to remove the variability due to individual differences between subjects from the within-subjects sum of squares. This is done by treating subjects as an extra variable in a factorial-design ANOVA. The main effect of subjects can then be calculated, so that the variability due to subjects is treated like a separate treatment effect (SSB), and this is removed from SStotal.
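To make the arithmetic concrete, here is a minimal Python sketch, using the example data from the table above, of how the subject sum of squares (SSB) is computed and removed from the error term. The variable names are ours, chosen for readability; treat this as an illustration rather than the procedure from any particular textbook.

```python
import numpy as np

# Trials to criterion: rows = subjects, columns = treatments (from the table above)
scores = np.array([
    [2, 4, 7],
    [10, 12, 13],
    [22, 29, 30],
    [30, 31, 34],
], dtype=float)

n_subjects, k = scores.shape          # 4 subjects, 3 treatments
grand_mean = scores.mean()            # 18.67

# Total variability across all scores
ss_total = ((scores - grand_mean) ** 2).sum()

# Variability due to treatments (the effect we care about)
treat_means = scores.mean(axis=0)     # 16, 19, 21
ss_treatment = n_subjects * ((treat_means - grand_mean) ** 2).sum()

# Variability due to subjects (SSB) -- the part removed from the error term
subj_means = scores.mean(axis=1)      # 4.33, 11.67, 27.00, 31.67
ss_subjects = k * ((subj_means - grand_mean) ** 2).sum()

# What remains serves as the error term (treatment x subject interaction)
ss_error = ss_total - ss_treatment - ss_subjects

f_ratio = (ss_treatment / (k - 1)) / (ss_error / ((k - 1) * (n_subjects - 1)))
print(ss_treatment, ss_subjects, ss_error, f_ratio)
```

Notice how large ss_subjects is relative to ss_error: almost all of the non-treatment variability in this example is due to stable differences between subjects, which is exactly the variability a between-subjects design would have left in the denominator.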
The logic of repeated-measures ANOVA:
In ANOVA in general, we seek to partition all of the variance in a DV into variance due to treatment/group or model (this is the effect), and remaining/unexplained variance (in this context, equating to within-group variance). Field visually illustrates in Figure 15.3 (2018 edition, page 659) and Figure 14.3 (2013 edition, page 550) that, in repeated measures ANOVA, variance is partitioned a bit differently. This difference is due, primarily, to the fact that we have the same participants across conditions, meaning that some of the variance within one condition is also common across others!
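In symbols, the decomposition Field pictures can be sketched as follows (the subscript labels here are descriptive conventions of ours, not notation taken directly from Field):

```latex
\begin{aligned}
SS_{\text{total}} &= SS_{\text{between-subjects}} + SS_{\text{within-subjects}}\\
SS_{\text{within-subjects}} &= SS_{\text{conditions}} + SS_{\text{error}}
\end{aligned}
```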
Consequently, the formulae covered in this topic look slightly different from those covered in earlier ANOVA topics, yet the goal remains the same: calculate the effect of condition, and determine whether it is significant and meaningful.
Formulae for a one-way repeated-measures ANOVA:
There is really nothing new in carrying out a one-way repeated-measures ANOVA. It is just like a two-way ANOVA with subjects treated as the second variable:
- You do not compute an overall within-cells error term because there is only one score per cell, so the term cannot be calculated. The interaction of the treatment and subject factors is used as the error term instead.
- The main effect of subjects is not tested (although you do compute SSB). This is because there is no appropriate error term for this F-value (what we would need is an error term which did not include the treatment × subject interaction, and that cannot be calculated because there is only one score per cell), so the F-value would be biased (not accurate).
So the formulae we need for sums of squares and dfs are only a little different from those we used for a two-way ANOVA. They are:
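A standard statement of these, using k = number of treatments, n = number of subjects, X_ij = the score of subject i under treatment j, and X̄ = the grand mean (treat this as a sketch of the usual textbook versions rather than a quotation from Field), is:

```latex
\begin{aligned}
SS_{\text{total}}     &= \textstyle\sum_i \sum_j (X_{ij} - \bar{X})^2,           & df_{\text{total}}     &= nk - 1\\
SS_{\text{treatment}} &= \textstyle n \sum_j (\bar{X}_j - \bar{X})^2,            & df_{\text{treatment}} &= k - 1\\
SS_{\text{subjects}}  &= \textstyle k \sum_i (\bar{X}_i - \bar{X})^2 \;(= SS_B), & df_{\text{subjects}}  &= n - 1\\
SS_{\text{error}}     &= SS_{\text{total}} - SS_{\text{treatment}} - SS_B,       & df_{\text{error}}     &= (k-1)(n-1)\\[4pt]
F &= \frac{SS_{\text{treatment}}/(k-1)}{SS_{\text{error}}/\bigl((k-1)(n-1)\bigr)}
\end{aligned}
```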
Sphericity assumption:
Before an independent-groups ANOVA can be conducted, certain assumptions need to be fulfilled, the principal assumption being homogeneity of variance. For repeated-measures ANOVA, a slightly different assumption must be satisfied before an analysis can proceed: sphericity. The sphericity assumption is satisfied when there is equality of the variances of the differences between pairs of treatment levels. Field and others also note that when compound symmetry exists in your data (i.e., homogeneity of variances and covariances), you will necessarily meet the assumption of sphericity. It should be noted, however, that compound symmetry imposes a stricter set of requirements on the data (homogeneity of variances and covariances across time points) than is necessary to ensure that the sphericity assumption is met. So, think of the equality of variances of differences as the bare minimum requirement to satisfy the sphericity assumption of repeated-measures ANOVA.
That being said, what exactly is equality of the variances of the differences between treatment levels? Field (2018) gives a good demonstration in Table 15.2 (page 655) and Table 14.1 (page 546, Field, 2013) and online (http://www.statisticshell.com/docs/sphericity.pdf). Suppose we collected data from participants at three time points. We could then calculate differences in participant scores across the time points (Time 1 – Time 2, Time 1 – Time 3, and Time 2 – Time 3). From these data, we can obtain estimates of the variance of these difference scores (e.g., the variance of Time 1 – Time 2). The sphericity assumption holds if these variances of the differences between conditions (in this instance, time points) are roughly equal to each other:
Variance(Time 1 – Time 2) ≈ Variance(Time 1 – Time 3) ≈ Variance(Time 2 – Time 3)
Note that we are testing pairs against each other, using all possible combinations of pairings. We don't test Variance(Time 2 – Time 1), for example, since this variance estimate will be identical to Variance(Time 1 – Time 2) and hence wouldn't add anything meaningful. Note also that because we are comparing pairs of conditions, we need at least three time points for sphericity to become a relevant and essential assumption. If we only had two time points, we could calculate the variance of the difference scores for Time 1 – Time 2 (or, equivalently, Time 2 – Time 1), but what would we compare this value against? So, in a roundabout way, I have emphasized an important point: the sphericity assumption applies in repeated-measures designs when we have three or more time points.
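As a concrete illustration, here is a small Python sketch of the quantities being compared (the scores are made up purely for demonstration):

```python
import numpy as np

# Hypothetical scores for 6 participants at three time points
time1 = np.array([8, 12, 10, 14, 9, 11], dtype=float)
time2 = np.array([9, 13, 12, 15, 10, 12], dtype=float)
time3 = np.array([11, 14, 13, 18, 12, 15], dtype=float)

# Variances of the pairwise difference scores (ddof=1 gives the sample variance)
v12 = np.var(time1 - time2, ddof=1)
v13 = np.var(time1 - time3, ddof=1)
v23 = np.var(time2 - time3, ddof=1)

# Sphericity holds (roughly) if these three numbers are similar
print(v12, v13, v23)
```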
We test the sphericity assumption using Mauchly's W test. Mauchly's W tests the tenability of the assumption that the variances of the differences are equal. There is some leeway here, and the variances of the differences do not need to be perfectly equal to each other. We use p < .05 to determine whether there is sufficient violation of sphericity to be concerned, and to warrant correction. If p < .05, we have a significant violation of this assumption and must take steps to protect our results against the threats posed by the violation. If p > .05, we can happily proceed to interpret the results as is.
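In practice, software runs this test for you. If you happen to work in Python, for example, the third-party pingouin library provides a Mauchly test; the sketch below uses made-up long-format data (Jamovi reports the equivalent W and p-value within its repeated-measures ANOVA output):

```python
import pandas as pd
import pingouin as pg

# Made-up long-format data: one row per subject x time-point observation
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "time":    ["t1", "t2", "t3"] * 4,
    "score":   [8, 9, 11, 12, 13, 14, 10, 12, 13, 14, 15, 18],
})

# Mauchly's W: a significant p-value (< .05) indicates a violation of sphericity
res = pg.sphericity(df, dv="score", within="time", subject="subject")
print(res.W, res.pval)
```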
For ANOVAs of independent-groups designs, sphericity is not a problem because the groups are independent of each other, and therefore should be uncorrelated and have covariances which are all approximately zero (and hence all equal). However, for repeated-measures ANOVAs it is a problem, because the same subjects are used in each treatment and hence their scores are likely to be correlated. The effect of violating this assumption is that our F-ratio may not be valid, and this will lead to more Type I errors. If the assumption is violated, one means of continuing with the ANOVA involves a reduction in both the numerator (effect of time) and denominator (residual) degrees of freedom, which we use to work out the significance of the F-ratio. Lowering the degrees of freedom reduces the chance of a Type I error.
How to fix violations of sphericity:
Now, noting that you have a problem with sphericity in your data, an obvious next question is, ‘How do I fix this problem?’
There are various ways of making this adjustment:
- The Greenhouse-Geisser correction. This involves estimating ε̂ (epsilon hat), a number between 0 and 1 which is multiplied by the dfs and hence reduces them (a small sketch after this list shows the effect this has on the p-value). It is important to note that the Greenhouse-Geisser estimate ranges from 1 (perfect sphericity, i.e., no violation at all, as the variances of the differences between pairs are exactly equal) down to 1/(k − 1), where k = number of conditions/time points; the closer you get to 1/(k − 1), the worse the violation of sphericity.
However, when the value of ε̂ is greater than .75 (i.e., in the range of .75 to 1), Huynh and Feldt have shown that the Greenhouse-Geisser correction is inaccurate (too conservative), and suggest that in that case a different correction should be used, which is:
- The Huynh-Feldt correction factor ε̃ (epsilon tilde), which is similar to the Greenhouse-Geisser-Box epsilon and is used instead of it when ε̂ is greater than .75.
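To see what a correction actually does to the numbers, here is a minimal Python sketch; the F-ratio, design sizes, and ε̂ value are invented purely for illustration (this is, in essence, the adjustment the software applies behind the scenes):

```python
from scipy import stats

k, n = 3, 20                  # conditions and subjects (hypothetical)
f_ratio = 4.2                 # observed F (hypothetical)
epsilon = 0.68                # Greenhouse-Geisser epsilon-hat (hypothetical)

df_effect = k - 1             # uncorrected numerator df
df_error = (k - 1) * (n - 1)  # uncorrected denominator df

# Uncorrected p-value from the F distribution
p_uncorrected = stats.f.sf(f_ratio, df_effect, df_error)

# Corrected p-value: both dfs are multiplied by epsilon, making the test
# more conservative (the F-ratio itself does not change)
p_corrected = stats.f.sf(f_ratio, epsilon * df_effect, epsilon * df_error)

print(p_uncorrected, p_corrected)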
You will never have to calculate these corrections yourself because Jamovi does them for you. However, we do expect you to know when and why you might make the corrections, and which correction is appropriate in which situation.
Mixed designs:
The term “mixed” design is used by some statisticians to refer to designs which include at least one between-subjects variable and one within-subjects variable. An example of a design with one between-subjects variable and one within-subjects variable is given by Field. We will not focus on the calculations; however, over the pages that follow, Field presents and discusses the output from analysing these data using SPSS. It is important to note that this output contains tests of both within-subjects and between-subjects effects. Apart from this, the design resembles the two-factor design that we covered in Module 6. That is, there are two main effects and an interaction effect. In addition, if the interaction effect is significant, we can also examine simple effects. Again, although we are not focusing on the calculations for this design, we want you to gain an appreciation of how it builds on the simpler designs we have already covered.
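Although the calculations are not our focus, the following Python sketch (again using the third-party pingouin library, with made-up data, since Field's example data are not reproduced here) shows the shape of such an analysis: one between-subjects effect, one within-subjects effect, and their interaction:

```python
import pandas as pd
import pingouin as pg

# Made-up data: 2 groups (between-subjects) x 3 time points (within-subjects),
# 4 subjects per group, listed as [time1, time2, time3] scores per subject
group_scores = {
    "control": [[10, 11, 11], [11, 12, 13], [9, 10, 10], [12, 13, 14]],
    "treated": [[10, 14, 18], [9, 12, 17], [11, 15, 18], [12, 16, 21]],
}
rows = []
for group, subjects in group_scores.items():
    for s, scores in enumerate(subjects):
        for t, score in enumerate(scores, start=1):
            rows.append({"subject": f"{group}_{s}", "group": group,
                         "time": f"t{t}", "score": score})
df = pd.DataFrame(rows)

# One between-subjects factor (group) and one within-subjects factor (time)
aov = pg.mixed_anova(data=df, dv="score", within="time",
                     subject="subject", between="group")
print(aov)  # one row each for the group effect, the time effect, and the interaction
```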