One-way ANOVA Flashcards
When we have multiple levels of the IV, why can’t we simply use multiple t-tests?
For example, say we have 4 IV levels
We would need to run 6 t-tests to compare all means two at a time
This increases the probability of TYPE 1 errors i.e. finding an effect when the null is actually true
Remember that each test has a 5% probability of giving a significant result when the null is actually true (alpha = 0.05), and if we conduct several such tests on a single data set the probability of at least one false positive increases
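The size of this increase can be checked with a quick calculation. This is a minimal sketch that assumes the 6 tests are independent (pairwise t-tests on the same data are not strictly independent, so treat the figure as an approximation); the variable names are illustrative.

```python
from math import comb

alpha = 0.05                  # Type 1 error rate for a single test
levels = 4                    # number of IV levels
n_tests = comb(levels, 2)     # pairwise comparisons among 4 means

# Probability of at least one Type 1 error across all tests,
# ASSUMING the tests are independent and every null is true
familywise = 1 - (1 - alpha) ** n_tests

print(n_tests)                # 6
print(round(familywise, 3))   # 0.265
```

So with 6 tests the chance of at least one false positive is roughly 26%, not 5%.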
When would we use a one-way ANOVA?
Independent design, testing for differences between 3 or more conditions of one independent variable
Requires at least interval-level data and fulfilment of the other parametric assumptions (e.g. normality, homogeneity of variance)
What are the null and alternative hypotheses?
NULL - the populations from which the samples are randomly drawn have equal means
ALTERNATIVE - at least one population mean differs from the others
How do ANOVAs differ from t-tests?
t-tests use DIFFERENCES BETWEEN SAMPLE MEANS but ANOVA uses the variance among the scores contributing to those means - if the variance within groups is very low compared with the variance of the group means, we can be more confident that differences among scores are caused by the different levels of the IV
What do ANOVAs do?
Compare variance BETWEEN groups with variance WITHIN groups - if there is a significant effect we expect within-group variation to be small compared to between-group variation (i.e. variation of the sample means)
What is within-group variance thought to be attributed to?
Individual differences and random factors rather than the IV - this is referred to as ERROR
We want this variance to be minimal so as not to cloud any possible effects of our IV
What is meant by DEVIATION SCORES?
The concepts of both error and variance depend on this notion - each SCORE in a set deviates from the GRAND MEAN by some amount
Important to distinguish here between sample mean and grand mean - the grand mean is the overall mean of all scores (equal to the mean of the sample means only when group sizes are equal), while we have one sample mean for each level of our IV/condition
We hope that the sample means differ significantly from each other
What can we do with deviation scores?
Can divide them into two parts e.g. say we have a score of 11 and the grand mean is 6.5, so the total deviation here is 4.5
But say we also have a sample mean of 9 - the error (within-group) deviation is the difference of 2 between the score and the sample mean, while the between-group deviation is the difference of 2.5 between the sample mean and the grand mean
In ANOVA we have to find the variance for each of these 3 components (total, between-group and within-group)
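The split of a single score's deviation can be written out directly. A small sketch using the numbers from this card; the variable names are illustrative.

```python
score = 11.0
sample_mean = 9.0
grand_mean = 6.5

total_dev = score - grand_mean            # 4.5: score vs grand mean
within_dev = score - sample_mean          # 2.0: error (within-group) part
between_dev = sample_mean - grand_mean    # 2.5: between-group part

# The two parts always add back up to the total deviation
print(total_dev == within_dev + between_dev)   # True
```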
What is the F-ratio statistic?
F=(variance between groups)/(variance within groups)
The denominator is an estimate of population variance using the average of the variances within groups - each sample provides its own estimate of the population variance, but we use the AVERAGE of all sample variances across our IV levels as a better estimate
Numerator estimates population variance from the variance between group means, using the sampling distribution of the mean (central limit theorem) - sample means vary with variance (population variance)/n, so n times the variance of the sample means estimates the population variance
If the null hypothesis is true and the samples are from the same population i.e. no differences exist, F should be close to 1, as both are estimates of the same variance
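The two estimates can be computed separately and then divided. This is a rough sketch assuming equal group sizes and a made-up data set of three groups of three scores; it uses the standard library's statistics.variance, which divides by n-1.

```python
import statistics

# Hypothetical scores for 3 IV levels, 3 participants each
groups = [[2.0, 3.0, 4.0], [4.0, 5.0, 6.0], [6.0, 7.0, 8.0]]
n = len(groups[0])                       # scores per group (equal n assumed)

# Denominator: average of the sample variances
within_estimate = statistics.mean(statistics.variance(g) for g in groups)

# Numerator: n times the variance of the sample means
sample_means = [statistics.mean(g) for g in groups]
between_estimate = n * statistics.variance(sample_means)

f_ratio = between_estimate / within_estimate
print(within_estimate, between_estimate, f_ratio)
```

Here the group means (3, 5, 7) are far apart relative to the spread inside each group, so F comes out well above 1.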
What is effect size in ANOVA?
Proportion of total variation accounted for by the treatment - in the one-way ANOVA this is the sum of the squared deviations (SUM OF SQUARES) between groups as a proportion of the total sum of squares (this measure is called eta squared)
SS(between) is also known as SS(effect)
SS(error) is SS(total)-SS(effect)
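As a sketch, the proportion can be computed from two sums of squares. The SS values below are hypothetical and the function name eta_squared is illustrative, not taken from any library.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variation accounted for by the treatment."""
    return ss_between / ss_total

ss_total = 30.0      # hypothetical total sum of squares
ss_between = 24.0    # hypothetical SS(between), also called SS(effect)
ss_error = ss_total - ss_between

effect_size = eta_squared(ss_between, ss_total)
print(ss_error)      # 6.0
print(effect_size)   # 0.8
```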
What are the effect size values commonly used for ANOVA?
Small 0.01, medium 0.06, large 0.14
How do we calculate variance?
Sum of squares i.e. deviations of scores from the mean (the grand mean, for total variance) squared and then the squares added up
Divided by degrees of freedom i.e. N-1
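The two steps can be sketched for a single set of scores. The data are made up for illustration; the result is checked against the standard library's statistics.variance, which also divides by N-1.

```python
import statistics

scores = [2.0, 4.0, 6.0, 8.0]     # hypothetical scores
n = len(scores)
mean = sum(scores) / n

# Step 1: sum of squares = squared deviations from the mean, added up
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Step 2: divide by degrees of freedom (N - 1)
variance = sum_of_squares / (n - 1)

print(sum_of_squares)                                        # 20.0
print(abs(variance - statistics.variance(scores)) < 1e-9)    # True
```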
What is the ANOVA procedure?
Find SStotal: square each score and add up the squares, then subtract (sum of all scores) squared divided by N
Find SSbetween: square each sample mean, multiply it by the n for its group, and add the results. Then subtract (sum of all scores) squared divided by N
Find SSerror - simply SStotal minus SSbetween
Find the variance of each component i.e. MEAN SUM OF SQUARES (MS) i.e. average of squared deviations; divide each component SS by its df (dfbetween = number of groups - 1, dferror = N - number of groups)
F=MSbetween/MSerror (F statistics are written with the df values in brackets, numerator df i.e. between first and then dferror second)
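The whole procedure can be sketched in a few lines. This is an illustrative implementation, not a library routine: the function name one_way_anova and the data are made up, SSbetween is computed as the sum of n times each squared sample mean minus (sum of all scores) squared over N, and no p value is calculated (F would still need to be compared with a critical value).

```python
def one_way_anova(groups):
    """Return SS, df, MS and F for a list of groups (each a list of scores)."""
    all_scores = [x for g in groups for x in g]
    big_n = len(all_scores)                       # N, total number of scores
    k = len(groups)                               # number of IV levels
    correction = sum(all_scores) ** 2 / big_n     # (sum of scores)^2 / N

    ss_total = sum(x ** 2 for x in all_scores) - correction
    ss_between = sum(len(g) * (sum(g) / len(g)) ** 2 for g in groups) - correction
    ss_error = ss_total - ss_between

    df_between, df_error = k - 1, big_n - k
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_between": ss_between, "ss_error": ss_error,
        "df_between": df_between, "df_error": df_error,
        "ms_between": ms_between, "ms_error": ms_error,
        "f": ms_between / ms_error,
    }

# Hypothetical scores for 3 IV levels
groups = [[2.0, 3.0, 4.0], [4.0, 5.0, 6.0], [6.0, 7.0, 8.0]]
result = one_way_anova(groups)
print(f"F({result['df_between']}, {result['df_error']}) = {result['f']:.2f}")
# F(2, 6) = 12.00
```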
What does a significant ANOVA result tell us?
Differences are present between means but we don’t know where those specific differences are - we can look at them using A PRIORI planned comparisons (essentially running t-tests on the few comparisons we have a theoretical prediction for) or POST-HOC comparisons, which take the probability of type 1 errors into account and so don’t run too high a risk of producing chance differences
What are family-wise error rates and what is the simplest way to control for them?
Probability of making at least one type 1 error when making multiple tests on same data, assuming H0 true
Lower the alpha level - divide it by k (the number of comparisons we wish to make) e.g. to make 2 comparisons in a data set, do 0.05/2 and use an alpha level of 0.025 for each comparison
However, printed t-test tables do not give precise probabilities for these adjusted alpha levels, so we could use Bonferroni t-tests instead
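Dividing alpha by the number of comparisons can be sketched as follows. The comparison labels and p values are hypothetical, chosen only to show one result that passes the adjusted level and one that does not.

```python
alpha = 0.05

# Hypothetical p values from 2 planned comparisons
p_values = {"A vs B": 0.030, "A vs C": 0.004}

k = len(p_values)             # number of comparisons
adjusted_alpha = alpha / k    # 0.05 / 2 = 0.025

# A comparison counts as significant only if p is below the adjusted alpha
significant = {name: p < adjusted_alpha for name, p in p_values.items()}

print(adjusted_alpha)   # 0.025
print(significant)      # {'A vs B': False, 'A vs C': True}
```

Note that "A vs B" would have been significant at the unadjusted 0.05 level but is not once alpha is lowered.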