Lecture 10: ANOVA Flashcards
What does ANOVA stand for
Analysis of variance
What is an F-distribution
A continuous probability distribution, most frequently used as a null distribution of the test statistic in analysis of variance
What makes ANOVA different from independent t-test
ANOVA allows for more than two groups, independent t-test only compares two groups
What is a grand mean
The mean of all observations, ignoring group membership; the best single-value guess that summarizes the data
What is the total sum of squares (SS)
The squared distances from each data point to the grand mean, added together
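A minimal sketch of the grand mean and total sum of squares in plain Python. The three groups and their scores are made-up illustration data, not values from the lecture.

```python
# Hypothetical scores for three groups (illustrative data only).
groups = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}

# Pool every observation, ignoring group membership.
scores = [x for values in groups.values() for x in values]

# Grand mean: the mean of all observations.
grand_mean = sum(scores) / len(scores)

# Total SS: squared distance of each score from the grand mean, summed.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

print(grand_mean, ss_total)  # 5.0 60.0
```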
What happens to the error sum of squares and accuracy when you add a parameter
The error SS goes down (the total SS itself stays the same) and your accuracy gets better
What are the 4 assumptions of a one-way independent ANOVA
- Continuous dependent variable
- Random sample with independent observations
- Normally distributed within each group (test with Shapiro-Wilk test or Q-Q plots)
- Equal variance across groups (test with Levene’s test)
What is the formula for the F-ratio
MSmodel/MSerror
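A hedged sketch of the F-ratio computed by hand. The function name `f_ratio` and the data are assumptions for illustration; each sum of squares is divided by its degrees of freedom (k-1 for the model, N-k for the error) to get the mean squares.

```python
def f_ratio(groups):
    """Return (F, ms_model, ms_error) for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    n_total, k = len(scores), len(groups)
    grand_mean = sum(scores) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Model SS: squared distance of each group mean from the grand mean,
    # counted once for every observation in that group.
    ss_model = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error SS: squared distance of each score from its own group mean.
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_model = ss_model / (k - 1)        # DFmodel = k - 1
    ms_error = ss_error / (n_total - k)  # DFerror = N - k
    return ms_model / ms_error, ms_model, ms_error


# Hypothetical data: three groups of three scores.
f, ms_model, ms_error = f_ratio([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f, ms_model, ms_error)  # 27.0 27.0 1.0
```

The sketch assumes at least some spread inside the groups; if every score equals its group mean, MSerror is 0 and the division fails.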
What does the F-ratio quantify
How much better your model predicts the data than the null model (the grand mean) does, relative to the error that remains
What does an F-ratio of 1 imply about the mean squares of the model and the mean squares of error
They are equal: the model explains as much variance as it leaves unexplained
How do the mean squares of model and error relate as the F-ratio increases
The mean squares of model grow relative to the mean squares of error; if the F-ratio is 5, the mean squares of model are 5 times the mean squares of error
T/F: F cannot be smaller than 0
True
Why can F not be smaller than 0
Because MSmodel and MSerror are built from squared distances, so neither can be negative
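T/F: The F-test is always two sided
True: F is built from squared distances, so it only tests whether the group means differ, not in which direction (the p-value itself is read from the upper tail of the F-distribution)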
What are contrasts
They are planned comparisons between groups
What is important when using contrasts and what are 2 advantages of them
The weights you assign to the groups always have to add up to 0
- They have higher precision
- They have higher power
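A small sketch of how contrast weights work, assuming three groups with made-up means. The helper `contrast_estimate` is hypothetical; it rejects weights that do not sum to 0 and returns the weighted sum of the group means.

```python
def contrast_estimate(group_means, weights):
    """Weighted sum of group means for one planned comparison."""
    if len(group_means) != len(weights):
        raise ValueError("need exactly one weight per group")
    # Small tolerance so fractional weights such as 1/3 still pass.
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must add up to 0")
    return sum(w * m for w, m in zip(weights, group_means))


means = [2.0, 5.0, 8.0]  # hypothetical group means

# Group 1 against groups 2 and 3 combined.
print(contrast_estimate(means, [-2, 1, 1]))  # 9.0
# Group 2 against group 3; group 1 gets weight 0 and drops out.
print(contrast_estimate(means, [0, -1, 1]))  # 3.0
```

An estimate of 0 would mean the two sides of the comparison have the same weighted mean.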
How can the total sum of squares be divided
Model sum of squares and error sum of squares
What is another word for model sum of squares? And error sum of squares?
Model accuracy/model error
What does it mean for the model prediction if the observed difference between groups is 0
It means that the model is no better at predicting than the grand mean: if the group means are all the same, each of them equals the grand mean
What is the error sum of squares
The squared distances from each data point to the mean of its own group, added together across all groups
What is the model sum of squares
It is the squared distance between the grand mean and each group mean, counted once per data point in that group and added together = how much less error there is when predicting with the group means instead of the grand mean
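A rough sketch that checks the split of the total sum of squares into model and error parts, using made-up data. The function name `partition_ss` is an assumption. The second data set has identical group means, so its model sum of squares comes out as 0.

```python
def partition_ss(groups):
    """Return (ss_total, ss_model, ss_error) for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]

    # Total: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    # Model: every group mean against the grand mean, once per score.
    ss_model = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error: every score against its own group mean.
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return ss_total, ss_model, ss_error


# Group means differ (2, 5, 8): most of the variation is model SS.
print(partition_ss([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # (60.0, 54.0, 6.0)

# Group means are all 2: the model adds nothing over the grand mean.
print(partition_ss([[1, 2, 3], [2, 2, 2], [0, 2, 4]]))  # (10.0, 0.0, 10.0)
```

In both cases the model and error parts add up to the total.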
What two things does the F-distribution depend on
- Sample size; DFerror=N-k (N = total number of observations)
- Number of groups; DFmodel=k-1 (k = number of groups)
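The two degrees of freedom above can be worked out from the group sizes alone. A minimal sketch, with a hypothetical helper name and made-up group sizes:

```python
def degrees_of_freedom(group_sizes):
    """Return (df_model, df_error) for a one-way independent ANOVA."""
    k = len(group_sizes)        # number of groups
    n_total = sum(group_sizes)  # total number of observations, N
    return k - 1, n_total - k


print(degrees_of_freedom([3, 3, 3]))        # (2, 6)
print(degrees_of_freedom([10, 12, 9, 11]))  # (3, 38)
```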