WEEK 5 - Multiple Group Designs and ANOVA Flashcards
Why more than 2 groups?
In the real world, independent variables can have any number of categories/levels
There are many situations where more than two groups are needed to give us full information about the relationship between IV and DV.
What is a one-way ANOVA?
'One-way' means one IV (for this week we discuss only one independent variable, but with multiple categories/groups on that variable)
Often, we know that a difference exists between groups, i.e. that our IV affects the DV (a benefit of ANOVA: it allows multiple levels of the IV)
➢ e.g., Several studies have shown that football players have poorer cognitive performance than non-contact athletic controls
➢ But this does not tell us why these differences exist
➢ We could seek to separate football players on the basis of the number of head injuries sustained; zero, one, or two +.
➢ Forming multiple groups can help
➢ refine our understanding of how an IV operates on our DV
➢ evaluate dose-response relationships
What are the common relationships between the IV and DV?
linear
curvilinear (curved)
quadratic
What is a linear relationship?
one where, as the independent variable increases, so does the dependent variable
- eg. effect of alcohol consumption on brain cell death
What is a curved (curvilinear) relationship?
A plateau function: as the independent variable increases, so does the dependent variable, but it eventually plateaus without much further change
- eg. effect of strength training on endurance performance
What is a quadratic relationship?
A relationship that goes up and then comes down again
* eg. effect of anxiety on exam performance
When designing an ANOVA, how do you choose the number of levels of your IV?
- determined by type of relationship expected
- linear - at least three points
- curvilinear – even more
When designing an ANOVA, how do you choose how far apart the levels should be? (this can vary a lot)
- proportionately across spectrum
- eg. Drug dose: 1mg, 4mg, 7mg, 10mg
- Allows for clear examination of levels of the IV
- Of course this only applies to IVs that are actually based on measurement, rather than categories.
Use the least number of categories that is still sensitive enough to detect what you are looking for
Philosophy of analysis (why can't we just continue to use t-tests?)
Why not just use multiple t-tests to test all the possible group differences?
* These add up very quickly!
* 3 groups: 3 separate t-tests (1&2, 1&3, 2&3)
* 4 groups: 6 separate t-tests (1&2, 1&3, 1&4, 2&3, 2&4, 3&4)
* 5 groups: 10 different comparisons
* 6 groups: 15 comparisons!
Most importantly, our Type 1 error rate would increase
* In each t-test, we are potentially wrong 5% of the time (if we use the typical 0.05 criterion)
* So with multiple t-tests, our actual error rate will be (much) greater than 0.05 – NOT GOOD
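To see how quickly this compounds, here is a small Python sketch (the group counts and the 0.05 criterion come from the notes above; the familywise formula assumes the tests are independent, which is a simplification):

```python
from math import comb

# Number of pairwise t-tests needed for k groups: k choose 2
for k in (3, 4, 5, 6):
    print(k, "groups ->", comb(k, 2), "t-tests")

# Familywise Type 1 error rate for m independent tests at alpha = 0.05:
# P(at least one false positive) = 1 - (1 - alpha)**m
alpha = 0.05
for m in (3, 6, 10):
    fwer = 1 - (1 - alpha) ** m
    print(m, "tests -> familywise error rate =", round(fwer, 3))
```

With just 3 comparisons the familywise error rate is already about 0.14, far above the intended 0.05.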
What does an Analysis of Variance (ANOVA) tell you?
*Tells you whether a difference exists somewhere among a set of group means
- If you find a significant difference you can follow up which groups differ specifically. These follow up tests will be explored in a later lecture.
Revision: what was the basic objective of the independent-groups t-test?
to determine whether the difference seen between two group means is large enough for us to be reasonably convinced that it is not due to random error or chance
What are multiple differences?
When you have more than two groups, we are not just looking at the difference between two things.
We are looking at multiple differences (e.g., A–B, B–C, A–D, etc.)
Multiple differences = variance
What is the foundation of ANOVA which is represented by the F ratio?
In an ANOVA we form a statistical ratio similar to the t-test but representing variance between groups rather than just a single difference
F = Between-groups variance / Within-groups variance
What does variance between groups (BG) represent in an ANOVA
representing variance due to the effect of IV i.e. the differences between our means of each condition
What does variance within groups (WG) represent in an ANOVA
representing the difference between individual scores within each condition.
➢ This isn’t to do with our manipulation, so it is simply ‘random’ variability (stuff we don’t understand) – aka ‘error’
How do ANOVA’s actually work?
We already know how to calculate the variance for all of our scores (our Total Variance). So what we will do is:
- Calculate the variance within each of the groups separately and pool them (to get a pooled Within-Groups variance)
- Calculate the variance between the group means to find the Between-Groups variance
How do you find the Between-Group Variability (BG) in ANOVA
- We need to calculate variance between group means
- Calculate this variance by looking at how group means vary around the grand mean (aka overall experiment mean)
- Find the variance of group means around the grand mean
How do you find the Within-Group Variability (WG) in ANOVA
The variability within the groups is very simply calculated within each one of the groups: take each person’s score and subtract the mean for their group (instead of the grand mean), then square and sum those deviations.
What does sums of squares being ‘additive’ mean?
Total variation = between-groups variation + within-groups variation
SStotal = SSbetween + SSwithin
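The additivity can be checked numerically. A minimal Python sketch with made-up scores (three groups of four, purely illustrative):

```python
# Verify SS_total = SS_between + SS_within on made-up data (3 groups of 4).
groups = [[3, 5, 4, 4], [6, 8, 7, 7], [9, 11, 10, 10]]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Total: every score's squared deviation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Between: each group mean's squared deviation from the grand mean,
# weighted by the group's size.
ss_between = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)

# Within: each score's squared deviation from its own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ss_total, ss_between, ss_within)  # 78 = 72 + 6
```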
What are the df for the total?
Total df = N-1 (big N represents everyone in the experiment)
What are the df for the between groups?
Between-groups df = a-1 (where a is the number of groups)
What are the df for Within-groups (error)
Within-groups (error) df = (N-1) – (a-1) –> simplified to N - a
*total df - between group df
What is a Mean Square (MS)?
Just another name for variance: MS = SS / df
What actually is F?
A statistic that represents the ratio of the between-groups variability and the within groups variability
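A sketch of turning sums of squares and df into Mean Squares and F, using the same made-up three-group data as above (all values illustrative):

```python
groups = [[3, 5, 4, 4], [6, 8, 7, 7], [9, 11, 10, 10]]
a = len(groups)                    # number of groups
N = sum(len(g) for g in groups)    # big N: everyone in the experiment

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / N

ss_between = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = a - 1   # 3 - 1 = 2
df_within = N - a    # 12 - 3 = 9

# A Mean Square is just a variance: SS divided by its df.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

F = ms_between / ms_within
print(round(F, 2))  # 72/2 = 36 over 6/9 = 0.667 -> F = 54.0
```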
What is the null hypothesis in ANOVA
*it is that NO mean differences exist in the population (all group means are exactly the same)
- all group means are equal
- IV has no effect on DV
What is the alternate hypothesis in ANOVA
Says that at least one of these means is different from the others
In ANOVA , if the null hypothesis is true what will the F ratio be?
Approximately 1, because both the numerator and the denominator then represent only sampling error
BG/WG = error/error ≈ 1
If the alternate hypothesis is true what will the F ratio be?
greater than 1
BG/WG = (error + treatment effect) / error > 1
F distribution
*If H0 is true (null hypothesis) , then F should equal 1 (BG=WG=Error)
*However with normal sample sizes there will always be some variation due to chance.
*Null hypothesis: Imagine we took 5 groups at a time from the same population, and for each set of 5 groups we recorded the F we observed. We could get a sampling distribution of that F.
*Just like t, there is a mathematical description of that distribution.
*Just like t, the F distributions are a family of distributions: 1 for each specific combination of dfbetween and dfwithin
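The thought experiment above can be roughly simulated in Python: draw 5 groups at a time from one population (so H0 is true by construction) and collect the F each sample produces (group size, seed, and iteration count are arbitrary choices):

```python
import random

random.seed(0)  # reproducible draws

def f_stat(groups):
    """One-way ANOVA F: between-groups MS over within-groups MS."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    scores = [x for g in groups for x in g]
    gm = sum(scores) / N
    ssb = sum(len(g) * ((sum(g) / len(g)) - gm) ** 2 for g in groups)
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ssb / (a - 1)) / (ssw / (N - a))

# Null-hypothesis world: all 5 groups come from the SAME normal population,
# so every F we record is pure sampling error.
fs = []
for _ in range(2000):
    sample = [[random.gauss(0, 1) for _ in range(10)] for _ in range(5)]
    fs.append(f_stat(sample))

print(round(sum(fs) / len(fs), 2))  # hovers near 1, as the notes predict
```

A histogram of `fs` would approximate the F distribution for df_between = 4, df_within = 45.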
What is sig in SPSS?
SPSS provides a null hypothesis test in the table – the number marked ‘Sig.’
Represents the probability of obtaining an F this large from purely random chance if the null hypothesis was true.
This is a probability (1 would represent certainty, 0 would represent something impossible). Here the value is 0.005 so less than a 1% chance of obtaining this result from random data.
In ANOVA, what is considered significant?
α = 0.05
How do you report F?
F(dfBetween, dfWithin) = value to 2 decimal places, p = value to 3 decimal places with no leading zero
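A small helper that produces that format (the `report_f` name and the example values are hypothetical, just to illustrate the 2 dp / 3 dp / no-leading-zero rules):

```python
def report_f(df_between, df_within, f_value, p_value):
    """Format an F result: F to 2 dp, p to 3 dp with no leading zero."""
    p_str = f"{p_value:.3f}"
    if p_str.startswith("0"):
        p_str = p_str[1:]  # drop the leading zero: 0.021 -> .021
    return f"F({df_between}, {df_within}) = {f_value:.2f}, p = {p_str}"

print(report_f(2, 27, 4.3512, 0.0213))  # F(2, 27) = 4.35, p = .021
```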
What are the important ethical principles of data retention?
- Avoid needlessly proliferating personal information about a participant
- Personal information should ideally be stored in just one place, and:
− securely
− offline (or password protected / encrypted)
- Personal information should be stored separately from the data for the participant:
− for instance, a consent form (which might include a participant’s name) should be stored separately from their data
− the data record for a participant can be coded using a unique identifier, which can be linked to their personal information only by those who hold the link