Lecture 12 - ANOVA 1 Flashcards
Issues with multiple t-tests
Example: Are emotionally negative, positive and neutral images recognised with different speeds?
Potential approach: we could run 3 t-tests:
neutral vs. negative
neutral vs. positive
negative vs. positive
Problem: if each test has an alpha of 5%, the chance of at least one false positive across the three tests rises to just under 15% (1 - 0.95^3 ≈ 14.3%), which is too high
Alpha = the significance criterion we set: the probability of a false positive (Type I error) that we are willing to accept
Possible solution: we could reduce our alpha to 1.67% per test (5% / 3) so that the combined error stays at about 5%, but that reduces our power (makes it harder for any one of our tests to show a significant result)
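The arithmetic behind these two figures can be checked with a short sketch. It assumes the three tests are independent (only approximately true for pairwise tests that share groups), and the function names are illustrative only:

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def per_test_alpha(alpha: float, n_tests: int) -> float:
    """Alpha to use for each test so the combined error stays at or below alpha."""
    return alpha / n_tests


uncorrected = familywise_error(0.05, 3)      # three t-tests, each at 5%
reduced = per_test_alpha(0.05, 3)            # 5% shared between three tests
corrected = familywise_error(reduced, 3)     # combined error after the reduction

print(f"combined error, alpha 5% each:  {uncorrected:.4f}")
print(f"reduced alpha per test:         {reduced:.4f}")
print(f"combined error, reduced alpha:  {corrected:.4f}")
```

The first figure comes out just under 15% and the last just under 5%, matching the notes above.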
Analysis of Variance
Analysis of variance (ANOVA) is an extension of the t-test
Much more general – can cope with any number of conditions/factors/variables and levels within each one of those
It allows us to test whether 3 (or more) population means are the same, without reducing power
Assumptions of ANOVA
The scores were sampled randomly and are independent
Roughly normal distribution
Roughly equal number of participants in the groups (improved power if exactly the same – uses a different equation)
Roughly equal variance for each condition
These are true for all versions of the test
The basis of the test
Analysis of Variance is a way to compare multiple conditions in a single, powerful test
It was invented by Fisher (and so its test statistic is F)
It compares the amount of variance explained by our experiment with the variance that is unexplained
Between-groups ANOVA
The aim of ANOVA is to compare ‘the amount of variance explained by our experiment with the variance that is unexplained’
In other words: does the treatment affect participants (explained variance), or is most of the variation left unexplained (which would suggest a poor theory)?
For between-groups designs:
The explained variance is variance between groups (effect)
The unexplained variance is the variance within a group (noise)
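Variance formula: variance = sum of (score - mean)^2 / (N - 1), i.e. the sum of squares divided by the degrees of freedom
The more the scores differ from the mean, the larger the variance
In ANOVA a variance calculated this way is referred to as a mean square (MS); the unexplained one is the MS error
ANOVA is based on the F-ratio: F = MS between groups / MS within groups (explained variance / unexplained variance)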
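A minimal sketch of the F-ratio for a between-groups design, worked by hand on made-up scores. The data and the function name are assumptions for illustration; in practice SPSS does this calculation for you:

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, MS between, MS within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    n_groups = len(groups)
    n_total = len(all_scores)

    # Explained variance: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Unexplained variance: how far each score is from its own group mean
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (n_groups - 1)        # SS / df between
    ms_within = ss_within / (n_total - n_groups)    # SS / residual df
    return ms_between / ms_within, ms_between, ms_within


# Made-up scores for three groups (not from the lecture)
scores = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value, ms_between, ms_within = one_way_f(scores)
print(f"MS between = {ms_between}, MS within = {ms_within}, F = {f_value}")
```

Here the group means (5, 7, 9) are far apart relative to the spread inside each group, so F is large (12).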
Degrees of freedom
There are degrees of freedom associated with both variance values:
Degrees of freedom between conditions = number of conditions - 1
Residual degrees of freedom = total number of participants - number of conditions
ANOVA critical values require 2 df values, one for each aspect of the variance
Need multiple degrees of freedom depending on how many factors we have
Report both e.g. ‘we found a significant difference between the 3 groups’ scores (F(2,32)=15.3, p<.05)’
The degrees of freedom alter the critical value of F (just as they do for t)
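As a quick check on the two df values, here is a sketch using the usual between-groups formulas (conditions - 1, and participants - conditions). The function name is illustrative, and the total of 35 participants is inferred from the reported F(2,32), not stated in the notes:

```python
def anova_df(n_conditions: int, n_participants: int) -> tuple[int, int]:
    """Degrees of freedom for a one-way between-groups ANOVA."""
    df_between = n_conditions - 1                 # explained (between conditions)
    df_residual = n_participants - n_conditions   # unexplained (residual / error)
    return df_between, df_residual


# 3 groups and an assumed total of 35 participants give the reported F(2,32)
df_between, df_residual = anova_df(3, 35)
print(f"F({df_between},{df_residual})")
```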
Pair-wise comparisons
ANOVA tells us whether the groups differ, but not in what way they differ
How do we know which particular conditions differ?
Run the multiple comparisons (those we were trying to avoid)
Some of these are ‘planned comparisons’, some are ‘post-hoc’ tests
‘Planned comparisons’ = pairings we wanted to know about in advance
‘Post-hoc’ = comparisons we decide to look at afterwards (people change their minds once they have seen their data)
Correct for multiple comparisons. There are many options. Easiest to understand is ‘Bonferroni correction’ (divide alpha criterion by the number of tests)
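A sketch of how the Bonferroni correction is applied to a set of pairwise comparisons. The p-values below are hypothetical, chosen only to show one comparison that passes at 5% but fails once corrected:

```python
def bonferroni_significant(p_values: dict, alpha: float = 0.05) -> dict:
    """Flag each comparison as significant against alpha / number of tests."""
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical p-values for the three pairwise comparisons
p_values = {
    "neutral vs. negative": 0.004,
    "neutral vs. positive": 0.030,
    "negative vs. positive": 0.012,
}

results = bonferroni_significant(p_values)
for name, significant in results.items():
    print(f"{name}: {'significant' if significant else 'not significant'}")
```

With three tests the corrected criterion is about .0167, so the .030 comparison is no longer significant even though it is below .05.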
Versions of ANOVA
ANOVA:
One-way or one factor ANOVA
Multi-factor ANOVA (often referred to as e.g. 2x3 ANOVA)
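E.g. factors such as gender and amount of coffee
Each factor is one variable varying over multiple levels (2x3 = one factor with 2 levels and one with 3)
‘One-way’, ‘two-way’ etc. is about the number of things we are manipulating (the factors)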
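To make the ‘2x3’ label concrete, this sketch lists the cells of a two-factor design. The factor names follow the example above, but the specific levels are assumptions for illustration:

```python
from itertools import product

# Hypothetical levels for each factor (IV)
factors = {
    "gender": ["female", "male"],
    "coffee": ["none", "one cup", "two cups"],
}

# The design label is the number of levels of each factor, e.g. "2x3"
design = "x".join(str(len(levels)) for levels in factors.values())

# Every combination of levels is one cell (condition) of the design
cells = list(product(*factors.values()))

print(f"{design} ANOVA: {len(factors)} factors, {len(cells)} cells")
for cell in cells:
    print(cell)
```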
Multivariate Analysis of Variance (MANOVA)
Extension of ANOVA for multiple DVs
Run test for multiple things we measure
Analysis of Covariance (ANCOVA)
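Extension of ANOVA to handle continuous variables (covariates), the kind you would otherwise look at with e.g. correlations
Covariate = a continuous variable with a long range of values, rather than a few levels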
Post-hoc tests
Only run/report post-hoc tests if main test found a significant result
Great many choices, heavily debated area
Bonferroni’s method:
Uses a t-test for each pair and then divides alpha by a value based on the number of tests
Very strict: very safe but not very powerful (why some people don’t use it)
ANOVA in SPSS - start with a plot
Describe data before analysing it
Graphs -> (Legacy Dialogs) -> Error Bar
More than one variable (2 x IVs) = clustered
Summaries for groups of cases = between-groups design (summaries of separate variables = within-subjects design)
Y-axis = variable
X-axis = category axis
Look at the graph: e.g. points far apart with small error bars = more likely to show a significant difference
ANOVA in SPSS
For a single IV = Analyze -> Compare Means -> One-Way ANOVA
General approach = Analyze -> General Linear Model -> Univariate (use this one)
IV is a fixed factor
Also running post-hoc tests (put group in post hoc tests and tick Bonferroni)
Output
F(group df, error df) = F, p..
Error = variance we didn’t explain
Reporting results
The reported valence of the images, according to the group (Positive, Negative and Neutral), can be seen in Fig 1. A one-way analysis of variance showed that the difference between the mean valence ratings was significant (F(2,48)=191.6, p<.001). Post-hoc pairwise comparisons (Bonferroni-corrected) showed significant differences between all pairs of categories (p<.001).
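If you are copying values out of the SPSS output by hand, a small helper like this can keep the reporting format consistent. It is a sketch only: the function name is made up, and the exact p-value passed in is a placeholder, since the example above only reports p<.001:

```python
def format_f(df_group: int, df_error: int, f_value: float, p_value: float) -> str:
    """Build a report string in the form F(group df,error df)=F, p."""
    if p_value < 0.001:
        p_text = "p<.001"                            # very small p-values are reported as a bound
    else:
        p_text = "p=" + f"{p_value:.3f}".lstrip("0")  # drop the leading zero
    return f"F({df_group},{df_error})={f_value:.1f}, {p_text}"


# Placeholder p-value: any value below .001 gives the same string
report = format_f(2, 48, 191.6, 0.0001)
print(report)
```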
Glossary
General linear model (GLM) = the term SPSS often uses for ANOVA (the family of tests including ANOVA)
The unexplained variance is often referred to as the residual, or the error
One-way = one IV (becomes ‘two-way’ or ‘three-way’ etc. with more)
Univariate = one DV (becomes multivariate with more, as in MANOVA)
SS = sum of squares
MS = mean square