ANOVA Flashcards
When might we use a one-way ANOVA? Examples:
- Is there a difference in the average spend per patient when treating different levels of depression severity (mild, moderate, or severe)?
- Does smoking history (non smoker, light smoker, heavy smoker) affect how far you can run?
- Is there a difference amongst age groups (children, adolescents, adults and older adults) in how long they attend to visual stimuli?
Features of the data for a one-way ANOVA
One independent variable with three (or more) groups (conditions/levels)
One dependent variable measured using normally distributed data
What is a one-way ANOVA?
ANOVA: ANalysis Of VAriance
A statistical technique that compares the variance within samples and the variance between samples in order to estimate the significance of differences between a set of means.
- One independent variable
A one-way ANOVA is used to examine the difference amongst three or more groups on one continuous, normally distributed variable.
A one-way ANOVA examines the variance between these groups relative to the variance within groups.
factors
Independent variable(s) e.g. depression severity
levels/conditions
Categories in each factor e.g. mild, moderate or severe
effects
Quantitative measure indicating the difference between levels/conditions.
Type 1 error
Incorrectly rejecting the null hypothesis (false positive)
In other words:
Getting a significant result when there’s no real effect in the population
More likely with multiple comparisons
e.g. Doctors find a significant effect of the drug on pain relief when in fact the drug does not relieve pain. The null hypothesis is incorrectly rejected.
Type 2 error
Incorrectly failing to reject a false null hypothesis (false negative)
In other words:
Not finding a significant effect when there actually is one in the population
More likely with small sample sizes
e.g. Doctors find no significant effect of the drug on pain relief when in fact the drug does relieve pain. The null hypothesis is incorrectly retained (we fail to reject it).
Multiple Comparisons – Why is multiple testing a problem?
● If we adopt an α level of 0.05 and the null hypothesis (H0) is true, then about 5% of statistical tests will show a significant difference or association by chance alone.
● The more tests we run, the greater the likelihood that at least 1 of those tests will be significant by chance (Type 1 error).
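A rough sketch of how quickly this risk grows. It assumes the tests are independent and that every null hypothesis is true (both are simplifying assumptions, not something stated in these notes); the function name is made up for illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes independent tests with every null hypothesis true.
    """
    # P(no false positive in one test) = 1 - alpha, so across n_tests
    # independent tests it is (1 - alpha) ** n_tests.
    return 1 - (1 - alpha) ** n_tests


for m in (1, 3, 10, 20):
    print(m, round(familywise_error_rate(0.05, m), 3))
# prints:
# 1 0.05
# 3 0.143
# 10 0.401
# 20 0.642
```

Under these assumptions, 10 tests at α = 0.05 already give roughly a 40% chance of at least one false positive.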
When can issues surrounding multiple testing arise?
● Looking for differences amongst groups on a number of outcome measures
● Analysing your data before data-collection has finished (and then re-analysing it at the end of data collection).
○ This practice is often used to check whether more data need to be collected to reach significance, but each extra look at the data is another test.
● Unplanned analyses i.e. conducting additional analyses to try and find something of interest…
How to address issues of multiple testing
- Avoid over-testing (plan your analyses in advance)
- Use appropriate tests
- In instances where multiple tests are run, adjust the α threshold (more on this next week).
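As one possible illustration of adjusting the α threshold, the sketch below uses the Bonferroni correction (divide α by the number of tests). This is one common adjustment and not necessarily the one these notes go on to cover; the p-values are made up.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical p-values from three planned comparisons (illustrative only).
p_values = [0.004, 0.020, 0.049]

threshold = bonferroni_alpha(0.05, len(p_values))  # 0.05 / 3
significant = [p <= threshold for p in p_values]

print(round(threshold, 4))  # 0.0167
print(significant)          # [True, False, False]
```

All three p-values fall below an unadjusted 0.05, but only the first survives the stricter per-test threshold.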
Multiple comparisons vs ANOVA
● Running multiple comparisons increases the risk of a Type 1 error (false positive).
● ANOVA allows us to examine the differences between multiple groups as a whole, rather than running lots of different tests.
○ If we find a significant effect in our ANOVA, we can then investigate further to find out where the difference lies, with adjustments for multiple comparisons (more on this next week).
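F statistic
● The F statistic is essentially a ratio of two variances: the variance between groups divided by the variance within groups. These variances are also referred to as mean squares.
● Mean squares are variances that account for the degrees of freedom (a sum of squares divided by its degrees of freedom).
● The F statistic, along with the degrees of freedom, is then used to calculate the p-value.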
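A minimal worked sketch of that ratio, using the standard one-way ANOVA sums of squares. The function name and the data are made up for illustration, and the p-value step is left out because it needs the F distribution from a statistics package.

```python
from statistics import mean


def one_way_f(groups):
    """Return the mean squares, degrees of freedom and F for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups (levels)
    n = len(all_values)  # total number of observations

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    ms_between = ss_between / df_between  # mean square between
    ms_within = ss_within / df_within     # mean square within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Made-up scores for three severity levels (illustrative only).
mild, moderate, severe = [1, 2, 3], [2, 3, 4], [6, 7, 8]
result = one_way_f([mild, moderate, severe])
print(result["F"])  # 21.0, with df = (2, 6)
```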
Low F statistic
The group means cluster together tightly relative to the variability within groups.
High F statistic
The group means spread out more than the variability within groups.
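To make the contrast between a low and a high F concrete, here is a small sketch with two made-up data sets: one where the group means sit close together and scores vary a lot within groups, and one where the group means are far apart and scores within groups are tight. The helper name and numbers are illustrative assumptions.

```python
from statistics import mean


def f_statistic(groups):
    """Between-groups mean square divided by within-groups mean square."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ms_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (len(groups) - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (len(scores) - len(groups))
    return ms_between / ms_within


# Group means 5, 6, 7 (close together); scores within each group vary widely.
overlapping = [[1, 5, 9], [2, 6, 10], [3, 7, 11]]
# Group means 5, 10, 15 (far apart); scores within each group are tight.
separated = [[4, 5, 6], [9, 10, 11], [14, 15, 16]]

low_f = f_statistic(overlapping)
high_f = f_statistic(separated)
print(low_f)   # 0.1875
print(high_f)  # 75.0
```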