One-way ANOVA Flashcards
When we have multiple levels of the IV, why can’t we simply use multiple t-tests?
For example, say we have 4 IV levels
We would need to run 6 t-tests to compare all means two at a time
This increases the probability of TYPE 1 errors i.e. finding an effect when the null is actually true
Remember that each test run at alpha = 0.05 has a 5% probability of giving a significant result even when the null is true, and if we conduct several such tests on a single data set the probability of at least one false positive increases
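The inflation can be illustrated with a quick calculation (a sketch that assumes the six tests are independent, which real pairwise t-tests on shared groups are not, so this is only an approximation):

```python
from math import comb

# Number of pairwise t-tests needed for k = 4 IV levels: C(4, 2) = 6
k = 4
n_tests = comb(k, 2)

# Family-wise Type 1 error rate if every test uses alpha = 0.05 and the
# tests were independent (a simplifying assumption for illustration)
alpha = 0.05
familywise = 1 - (1 - alpha) ** n_tests  # about 0.26, not 0.05
```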
When would we use a one-way ANOVA?
Independent design, testing for differences between 3 or more conditions of one independent variable
Requires interval data and requires fulfilment of all other parametric assumptions
What are the null and alternative hypotheses?
NULL - the populations from which the samples are randomly drawn have equal means
ALTERNATIVE - at least one population mean differs from the others
How do ANOVAs differ from t-tests?
t-tests use DIFFERENCES BETWEEN SAMPLE MEANS, but ANOVA uses the variance among the scores contributing to those means - if the variance within each group is very low compared with the differences between the means (the variance of the means), we can be more confident that differences among scores are caused by the different levels of the IV
What do ANOVAs do?
Compare variance BETWEEN groups with variance WITHIN groups - if there is a significant effect we expect within-group variation to be small compared to between-group i.e. variation of sample means
What is within-group variance thought to be attributed to?
Individual differences and random factors rather than the IV - this is referred to as ERROR
We want this variance to be minimal so as not to cloud any possible effects of our IV
What is meant by DEVIATION SCORES?
The concepts of both error and variance depend on this notion - each SCORE in a set deviates from the GRAND MEAN by some amount
Important to distinguish here between sample mean and grand mean - the grand mean is the overall mean of all scores (the mean of the sample means when group sizes are equal), while we have one sample mean for each level of our IV/condition
We hope that the sample means differ significantly from each other
What can we do with deviation scores?
We can divide them into two components, i.e. say we have a score of 11 and the grand mean is 6.5, so the total deviation here is 4.5
But say the sample mean for that score's group is 9 - the error (within-group) deviation is the difference of 2 between the sample mean and the score, while the between-group deviation is the difference of 2.5 between the sample mean and the grand mean; the two components add back up to the total deviation of 4.5
In ANOVA analysis we have to find these 3 variances
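The partition can be checked directly (same hypothetical numbers as above: score 11, grand mean 6.5, sample mean 9):

```python
# Hypothetical numbers: score 11, grand mean 6.5, sample mean 9
score, grand_mean, sample_mean = 11, 6.5, 9

total_dev = score - grand_mean          # total deviation: 4.5
error_dev = score - sample_mean         # within-group (error) part: 2
between_dev = sample_mean - grand_mean  # between-group part: 2.5

# The two components always add back up to the total deviation
assert error_dev + between_dev == total_dev
```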
What is the F-ratio statistic?
F=(variance between groups)/(variance within groups)
The denominator is an estimate of the population variance based on the variance within groups: each sample provides an estimate of the population variance, but we use the AVERAGE of all the sample variances across our IV levels as a better estimate
The numerator estimates the population variance from the variance between the group means, using the central limit theorem
If the null hypothesis is true and the samples come from the same population, i.e. no differences exist, F should be close to 1 as the two variance estimates should be roughly equal
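A minimal sketch of the two variance estimates, using made-up scores for three groups:

```python
from statistics import mean, variance  # variance() divides by n - 1

# Made-up scores for three equal-sized groups (n = 5 each)
groups = [
    [4, 5, 6, 5, 5],
    [7, 8, 6, 7, 7],
    [9, 10, 9, 11, 9],
]
n = len(groups[0])

# Denominator: average of the within-group variance estimates
ms_within = mean(variance(g) for g in groups)

# Numerator: population variance estimated from the variance of the sample
# means; by the central limit theorem var(means) = sigma^2 / n, so we
# scale back up by n
ms_between = n * variance([mean(g) for g in groups])

f_ratio = ms_between / ms_within  # well above 1 for these scores
```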
What is effect size in ANOVA?
Proportion of total variation accounted for by the treatment - in the one-way ANOVA this is the sum of the squared deviations (SUM OF SQUARES) between groups as a proportion of the total sum of squares (this proportion is known as eta-squared)
SS(between) is also known as SS(effect)
SS(error) is SS(total)-SS(effect)
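As a sketch with hypothetical sums of squares:

```python
# Hypothetical sums of squares from a one-way ANOVA table
ss_between = 26.0                 # SS(effect)
ss_error = 14.0
ss_total = ss_between + ss_error  # since SS(error) = SS(total) - SS(effect)

eta_squared = ss_between / ss_total  # 0.65: proportion of variation explained
```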
What are the effect size values commonly used for ANOVA?
Small 0.01, medium 0.06, large 0.14
How do we calculate variance?
Sum of squares i.e. deviations of scores from grand mean squared and then the squares added up
Divided by the appropriate degrees of freedom, e.g. N-1 for the total variance
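For a single set of scores (made-up data), the calculation looks like this:

```python
from statistics import mean

# Made-up scores; variance = sum of squared deviations / degrees of freedom
scores = [4, 7, 6, 3, 5]
m = mean(scores)

ss = sum((x - m) ** 2 for x in scores)  # sum of squares: 10
var = ss / (len(scores) - 1)            # df = N - 1, so variance = 2.5
```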
What is the ANOVA procedure?
Find SStotal by calculating the sum of the squared scores and subtracting (sum of scores)squared divided by N
Find SSbetween: square each sample mean, multiply it by the n for its group, and add the results. Then subtract the same (sum of scores)squared divided by N term used for SStotal
Find SSerror - simply the difference between the two previously calculated values
Find variance of each component i.e. MEAN SUM OF SQUARES (MS) i.e. average of squared deviations; divide each component SS by its df
F=MSbetween/MSerror (F statistics are written with the df values in brackets; the df for the numerator, i.e. between groups, comes first and then dferror second)
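The whole procedure can be sketched for three hypothetical groups of n = 5 (all scores are invented for illustration):

```python
from statistics import mean

# Three hypothetical groups of n = 5 scores each
groups = {
    "A": [4, 5, 6, 5, 5],
    "B": [7, 8, 6, 7, 7],
    "C": [9, 10, 9, 11, 9],
}
all_scores = [x for g in groups.values() for x in g]
N = len(all_scores)
k = len(groups)

# Correction term used by both SS formulas: (sum of scores)^2 / N
correction = sum(all_scores) ** 2 / N

# Step 1: SS(total) = sum of squared scores minus the correction term
ss_total = sum(x ** 2 for x in all_scores) - correction

# Step 2: SS(between) = each squared sample mean times its group n,
# summed across groups, minus the same correction term
ss_between = sum(len(g) * mean(g) ** 2 for g in groups.values()) - correction

# Step 3: SS(error) is simply the difference
ss_error = ss_total - ss_between

# Step 4: mean squares = SS / df
ms_between = ss_between / (k - 1)  # df(between) = k - 1
ms_error = ss_error / (N - k)      # df(error) = N - k

# Step 5: reported as F(2, 12) for this design
f_ratio = ms_between / ms_error
```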
What does a significant ANOVA result tell us?
Differences are present between the means but we don't know where those specific differences lie - we can look for them using A PRIORI planned comparisons (essentially running a t-test, justified because we have a theoretical prediction) or POST-HOC comparisons, which take into account the probability of type 1 errors and provide analysis that doesn't run too high a risk of producing chance differences
What are family-wise error rates and what is the simplest way to control for them?
Probability of making at least one type 1 error when making multiple tests on same data, assuming H0 true
Lower the alpha level - divide it by k (the number of comparisons we wish to make) e.g. if we want to make 2 comparisons in a data set use 0.05/2 and treat p values below 0.025 as significant
However, t-test tables no longer give precise probabilities at these adjusted levels, so we could use Bonferroni t-tests instead
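The alpha division itself is just:

```python
# Two planned comparisons on the same data set
alpha = 0.05
k = 2
per_test_alpha = alpha / k  # each comparison is tested at 0.025
```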
What is a Bonferroni t-test?
Alternative to lowering the alpha level when only making a few comparisons
Use the overall error term from the ANOVA: the standard error is the square root of (MSerror divided by the sample size of the first group, plus MSerror divided by the sample size of the second group) for the two samples being compared - remember MSerror is SSerror/dferror
This is the denominator of our equation - the numerator is the difference between the two means being compared i.e. the means of two of the samples we are looking at
Also use dferror when looking at critical value tables
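A sketch of the Bonferroni t computation (the two group means, group sizes, and MS(error) below are all hypothetical values):

```python
from math import sqrt

# Hypothetical values: two group means, group sizes, and the ANOVA's MS(error)
mean_a, mean_b = 5.0, 9.6
n_a = n_b = 5
ms_error = 0.6  # MS(error) = SS(error) / df(error)

# Denominator: overall error term pooled across the two groups being compared
se = sqrt(ms_error / n_a + ms_error / n_b)

# Numerator: the difference between the two sample means
t = (mean_a - mean_b) / se

# |t| is then compared with the critical value at df(error),
# using the Bonferroni-adjusted alpha
```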
What are a priori comparisons?
Before gathering data we might be able to predict on theoretical grounds what differences we expect to see
In this case, after checking the ANOVA result, we would want to test the two means we suspect to be the root of a difference
There is a risk of making a type 1 error with this but it is higher with post hoc tests because they test ALL possible combinations of means
What are post-hoc comparisons?
Made after inspecting the ANOVA result - we might decide to make ALL possible tests between pairs of means and each that comes out as significant we take as indicating that underlying population means are different
What are 3 possible post-hoc tests?
1) Newman-Keuls - error rate can get high where we have many conditions
2) Tukey-a/Tukey-b - the honestly significant difference (HSD) vs wholly significant difference tests; the "a" (HSD) test is considered safest, keeping the error rate down to 0.05
3) Scheffé test - even more conservative, keeping the error rate at 0.05 or less