Lecture 41 - ANOVA Flashcards
When is an ANOVA used?
Comparing the mean of a continuous response between multiple groups (three or more)
What is ANOVA doing in terms of the signal to noise ratio?
Signal = the difference between treatment means (between groups)
Noise = the variation within each of the treatment categories (within groups)
ANOVA quantifies both of these levels of variability
Why are pairwise comparisons (t-tests) for multiple groups unfavorable?
- It’s extra work (need to do for every pair combination)
- It can lead to lots of false positives. With every test there is a chance of incorrectly rejecting the null. Therefore, risk increases the more tests we do.
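A rough sketch of how quickly the false-positive risk grows. It assumes every pairwise test is run at a significance level of 0.05 and that the tests are independent, which real pairwise t-tests on shared groups are not, so treat the numbers as an approximation. The function name is made up for illustration.

```python
from math import comb

def familywise_error_rate(num_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes the tests are independent, which is only an approximation.
    """
    num_tests = comb(num_groups, 2)          # one t-test per pair of groups
    fwer = 1 - (1 - alpha) ** num_tests      # P(at least one wrong rejection)
    return num_tests, fwer

for k in (3, 5, 10):
    tests, fwer = familywise_error_rate(k)
    print(f"{k} groups: {tests} pairwise tests, "
          f"P(at least one false positive) is about {fwer:.2f}")
```

With 5 groups there are already 10 tests, and under these assumptions the chance of at least one false positive is roughly 40%.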
How is the notation set up for ANOVA?
Y(ij) means the jth response in the ith group
note: ij is in lower case
The number of different groups is denoted K, and the number of
responses in the ith group is denoted ni
What is the model for ANOVA?
Y(ij) = µi + eij
µi is the true mean response for the ith group at the population level.
eij is the error term for the jth response in the ith group
What is ‘special’ about the error term in the ANOVA model?
The error terms are assumed to be independent, and to follow a
N(0, σ²) distribution with constant variance.
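One way to see the model in action is to simulate from it: each response is its group's true mean plus an independent normal error with the same standard deviation in every group. The group means, group sizes, and sigma below are invented for illustration, as is the function name.

```python
import random

def simulate_anova_data(group_means, group_sizes, sigma, seed=0):
    """Draw Y(ij) = mu_i + e_ij with e_ij ~ N(0, sigma^2), independently."""
    rng = random.Random(seed)
    return [
        [mu + rng.gauss(0, sigma) for _ in range(n)]   # n_i responses in group i
        for mu, n in zip(group_means, group_sizes)
    ]

# Hypothetical example: K = 3 groups with true means 10, 12 and 15
data = simulate_anova_data([10, 12, 15], [5, 6, 7], sigma=2.0, seed=1)
for i, group in enumerate(data, start=1):
    sample_mean = sum(group) / len(group)
    print(f"group {i}: n = {len(group)}, sample mean = {sample_mean:.2f}")
```

The sample means will not equal the true means exactly; the gap is the noise contributed by the error terms.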
What is RSS?
A measure of the variation in the data that is not explained by
differences between groups.
What is TSS?
TSS is a measure of the total amount of variation in the data.
What is GSS?
GSS can be interpreted as a measure of the variation that
is explained by differences between groups.
This is the same as ESS
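A small sketch of how the three sums of squares relate. The cards do not give the formulas, so this uses the standard textbook definitions (squared deviations from the grand mean for TSS, from the group means for RSS, and of the group means from the grand mean, weighted by group size, for GSS). The toy data and function name are invented for illustration.

```python
def sums_of_squares(groups):
    """Return (TSS, GSS, RSS) for a list of groups of responses."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total variation: every response against the grand mean
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    # Between-group variation: each group mean against the grand mean
    gss = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within-group variation: every response against its own group mean
    rss = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g)
    return tss, gss, rss

toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
tss, gss, rss = sums_of_squares(toy_groups)
print(f"TSS = {tss:.2f}, GSS = {gss:.2f}, RSS = {rss:.2f}")
print(f"GSS + RSS = {gss + rss:.2f}")
```

For this toy data TSS is 32, GSS is 26 and RSS is 6 (up to floating-point rounding), so the total variation splits into an explained part and an unexplained part: TSS = GSS + RSS.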
What is the null hypothesis for ANOVA? What is the alternative hypothesis?
Null means there is no difference between the groups
H0 : µ1 = µ2 = · · · = µK
Alternative means there is a difference between the groups (note: not all the means have to be different; it only implies the null is not true)
HA : µ1, µ2, . . . , µK not all equal
What do we expect of the GSS when group means are very different as opposed to when group means are very similar?
From the previous discussion, we expect GSS to be relatively large
when the group means are very different.
We expect GSS to be relatively small when the group means are
similar.
Graphically what would a significant difference in group means look like?
GSS (green) would be large compared to RSS (red)
If there were no significant difference, the lines would be equal in length
How do you calculate the F statistic for the hypothesis test involved in an ANOVA?
Refer to slide 798, but basically…
F = group mean square (GMS) / residual mean square (RMS)
How are the various sum of squares and related quantities frequently displayed?
An ANOVA table
Using an ANOVA table what gives you the F statistic required for a hypothesis test?
F = GMS / RMS
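A minimal sketch of building an ANOVA table by hand and reading F off it. It assumes the usual degrees of freedom, K - 1 for groups and N - K for residuals (N being the total number of responses), which the cards do not state. The toy data, function name, and table layout are invented for illustration.

```python
def anova_table(groups):
    """Return the rows of a one-way ANOVA table as a dict of dicts."""
    k = len(groups)                                  # number of groups, K
    n_total = sum(len(g) for g in groups)            # total responses, N
    grand_mean = sum(y for g in groups for y in g) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    gss = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    rss = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g)

    gms = gss / (k - 1)           # group mean square
    rms = rss / (n_total - k)     # residual mean square
    return {
        "Group":    {"df": k - 1,       "SS": gss, "MS": gms, "F": gms / rms},
        "Residual": {"df": n_total - k, "SS": rss, "MS": rms},
        "Total":    {"df": n_total - 1, "SS": gss + rss},
    }

table = anova_table([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"{'Source':<10}{'df':>4}{'SS':>10}{'MS':>10}{'F':>10}")
for source, row in table.items():
    ms = f"{row['MS']:.2f}" if "MS" in row else ""
    f_stat = f"{row['F']:.2f}" if "F" in row else ""
    print(f"{source:<10}{row['df']:>4}{row['SS']:>10.2f}{ms:>10}{f_stat:>10}")
```

For this toy data GMS is 26 / 2 = 13 and RMS is 6 / 6 = 1, so F is 13 (up to floating-point rounding). To finish the test, F would be compared against an F distribution with those two degrees of freedom, which this sketch leaves out.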