Lecture 1 - ANOVA Flashcards
What is ANOVA?
A type of statistical model used to test for differences between several means.
(ANOVA = Analysis of Variance)
- ANOVA statistic: F-statistic -> determines whether the difference between ANY TWO MEANS (doesn’t specify which ones) is significant (we’ll come back to this point later on again)
Why would we use ANOVA and not a t-test?
A t-test is used when the Independent Variable (IV) has two conditions, whereas ANOVA is used when it has more than two conditions
Experiment example
Experimenters wanted to test the effect of puppy-treatment on happiness (IV: puppy-treatment dose, DV: happiness). The IV has 3 conditions: 0 minutes, 15 minutes and 30 minutes. They took a sample of 15 participants and conducted their research.
(This example is used throughout the book constantly to explain the concepts, so I’ll use it as well to give examples throughout the whole chapter, and I’ll do the same with the next chapters as well)
Comparing means using the Linear model
What are the two general methods to compare means?
- Variance-ratio method: good for simple situations, but not good for complex ones (just mentioned, don’t need to remember anything else)
- LINEAR MODEL: good for complex situations
What is the equation for the general linear model?
outcome(i) = model + error(i)
In the case of the puppy-treatment example:
happiness(i) = b0(^) + b1(^)group(i) + e(i)
(For a better image of both, look at slide 1, Equations 1&2)
- This model states that we can estimate happiness from group membership and b0(^).
- ~ b0(^): amount of happiness when the group membership value is 0, in our case when Dose of puppies is 0 minutes
- ~ b1(^): the relationship between Happiness and Dose, in our case how being in the 15min or the 30min condition affects happiness
Dummy coding
Dummy coding
What is dummy coding?
When we use 0’s and 1’s to represent categories
Dummy coding
What is the process of dummy coding?
1) Count the number of groups you have and subtract 1 from them (0min, 15min, 30min: 3 groups, 3-1 = 2)
2) This number (2) is equal to the number of dummy variables we'll use (later steps)
3) Choose one baseline group to which all the other groups will be compared (This group is usually the control group)
- NOTE: if different conditions have unequal sample sizes, baseline must contain a large number of cases to ensure that b-estimates are reliable
4) Our baseline group is the 0min group. Assign 0 to all variables for this group (what this means is also explained more at the end of the flashcard)
5) 1st Dummy variable: assign 1 to one group, 0 to the others (E.g. 15min group, Variable 1 (LONG) = 0, Variable 2 (SHORT) = 1)
6) 2nd dummy variable: same process but the opposite (30min group, Variable 1 (LONG) = 1, Variable 2 (SHORT) = 0)
(Look at slide 2: There are two dummy variables, Dummy short and Dummy long. Each person has their group membership coded in terms of 0’s and 1’s. Somebody in the 0min condition, has 0 on both Dummy Variables. Somebody in the 15min condition, has 0 on the LONG variable and 1 on the SHORT variable. Someone in the 30min condition has 1 on the LONG variable and 0 on the SHORT variable)
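The coding on slide 2 can be sketched in a few lines of Python (a hypothetical construction: the group labels and sizes follow the example, but the ordering of participants is assumed):

```python
# Build the two dummy variables for the 15 participants of the example.
# Baseline (0min) participants get 0 on both dummies.
groups = ["0min"] * 5 + ["15min"] * 5 + ["30min"] * 5

dummy_short = [1 if g == "15min" else 0 for g in groups]  # SHORT dummy
dummy_long = [1 if g == "30min" else 0 for g in groups]   # LONG dummy

for g, s, l in zip(groups, dummy_short, dummy_long):
    print(g, "SHORT =", s, "LONG =", l)
```

Each row of the printout matches one row of the slide-2 table: 0min people are (0, 0), 15min people are (1, 0) on SHORT/LONG, and 30min people are (0, 1).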
Dummy coding
Based on dummy coding, how can we now re-write the linear model?
(Look at Slide 1, Equation 3)
(NOTE: There are two equations under Equation 3, the bottom one is missing the error term and has a hat (estimation hat) on top of happiness. When we try and predict (estimate) happiness through the model, we don’t actually collect data, so there can’t be any error, since we don’t have data. That’s why we omit the error term)
Dummy coding
Based on this dummy coding, how can we find the values of the b-estimates?
- If I’m in the 0min group: (Equation 4)
- If I’m in the 30min group: (Equation 5)
- If I’m in the 15min group: (Equation 6)
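Equations 4-6 boil down to: the intercept is the baseline group's mean, and each b(^) is a group mean minus the baseline mean. A minimal numeric check, using the group means quoted later in these notes (0min: 2.2, 15min: 3.2, 30min: 5.0):

```python
# Group means from the puppy-treatment example (quoted later in the notes).
mean_0, mean_15, mean_30 = 2.2, 3.2, 5.0

b0 = mean_0                 # Equation 4: baseline (0min) -> intercept
b_long = mean_30 - mean_0   # Equation 5: 30min group vs baseline
b_short = mean_15 - mean_0  # Equation 6: 15min group vs baseline

print("b0(^) =", round(b0, 1))        # 2.2
print("b_long(^) =", round(b_long, 1))   # 2.8
print("b_short(^) =", round(b_short, 1))  # 1.0
```

So predicting happiness for a 30min participant gives b0(^) + b_long(^) = 2.2 + 2.8 = 5.0, that group's mean, as the equations require.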
F-Statistic
F-statistic
What does a significant F-Statistic tell us?
A significant F-Statistic tells us that the group means are significantly different. It also gives us the overall fit of a linear model to a set of observed data:
- F = (how good the model is)/ (how bad the model is)
- F = (explained variation)/ (unexplained variation)
(2 different general equations for the F-statistic)
!!!!!!! DOESN’T TELL US which specific groups differ, or how large that difference is. It just tells us that at least two groups differ significantly from each other !!!!!!!
F-Statistic
((How do we determine if an F-statistic is significant or not?))
[not needed as a question necessarily]
E.g. assume we have an F = 5.12. To see if this F is significant we will compare it to the critical value for an F-distribution with the same degrees of freedom
- For df = (2, 12), the critical value of F is 3.89 (where p = 0.05)
- For the same df = (2, 12), the critical value of F is 6.93 (where p = 0.01)
So F = 5.12 is significant at the 0.05 level, but not at the 0.01 level
(Not much was expanded on this, no need to fully understand or read it. If you want a better explanation just ask me because it’s a bit more complex and not even mentioned in the book)
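The comparison can be reproduced directly (a sketch, assuming scipy is available; the df values 2 and 12 come from the example's model and residual degrees of freedom):

```python
# Look up critical F values for an F(2, 12) distribution and compare.
from scipy.stats import f

F_observed = 5.12
crit_05 = f.ppf(0.95, dfn=2, dfd=12)  # critical value at p = 0.05, ~3.89
crit_01 = f.ppf(0.99, dfn=2, dfd=12)  # critical value at p = 0.01, ~6.93

print(F_observed > crit_05)  # significant at the 0.05 level
print(F_observed > crit_01)  # not significant at the 0.01 level
```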
F-Statistic
What can tell us about the differences between specific means?
B(^)-values:
- e.g. say the mean happiness of the 0min group is 2.2, and the mean happiness of the 30min group is 5. The difference of means is 5 - 2.2 = 2.8. This is the b(^)-value for the 30min dummy; testing it with a t-statistic gives a p-value of 0.008 < 0.05, so the difference between the 0min group and the 30min group is significant
- Say the mean happiness of the 15min group is 3.2. Difference between 0min and 15min group: 3.2 - 2.2 = 1. The t-test on this b(^)-value gives a p-value of 0.282 > 0.05, so the difference between the 0min group and the 15min group is NOT significant
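These t-tests can be sketched by hand (assuming scipy; the raw scores below are illustrative values chosen only because they reproduce the group means quoted above, since the notes don't list the actual data):

```python
# t-tests on the b(^)-values: b divided by its standard error,
# with the residual mean square pooled across all three groups.
import math
from scipy.stats import t

g0 = [3, 2, 1, 1, 4]    # 0min group,  mean 2.2 (illustrative scores)
g15 = [5, 2, 4, 2, 3]   # 15min group, mean 3.2
g30 = [7, 4, 5, 3, 6]   # 30min group, mean 5.0

n, k = 15, 3  # total sample size, number of groups
ms_r = sum((x - sum(g) / len(g)) ** 2
           for g in (g0, g15, g30) for x in g) / (n - k)
se_b = math.sqrt(ms_r * (1 / 5 + 1 / 5))  # standard error of each b(^)

results = []
for b in (sum(g30) / 5 - sum(g0) / 5, sum(g15) / 5 - sum(g0) / 5):
    t_stat = b / se_b
    p = 2 * t.sf(abs(t_stat), df=n - k)  # two-tailed p-value, df = 12
    results.append((round(b, 1), round(p, 3)))
    print(f"b = {b:.1f}, t = {t_stat:.2f}, p = {p:.3f}")
```

With these scores the two p-values land at roughly the 0.008 and 0.282 quoted in the card: significant for the 30min dummy, not for the 15min one.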
F-Statistic
How can we apply all the above in Hypothesis Testing?
- Ho model: IV has no effect on the DV, the predicted value will always be that of the grand mean (always the same value)
- Ha model: IV has an effect on the DV, described by the equations mentioned previously
- ~ The bigger the coefficients (b1(^) and b2(^)) the greater the deviation of this model from the Ho
- ~ If the differences between the groups are large enough (if b1(^) and b2(^) are large enough) then the Ha model is a better fit to the data than the Ho
Sum of Squares
Sum of Squares
What is the total sum of squares?
The sum of the squared differences between each participant's score and the grand mean (the grand mean is the mean of all participants’ scores, or the mean of all group means)
(See Equation 7)
- Used to find the total amount of variation within our results
- df = total sample size - 1 = 14
Sum of Squares
What is the Model Sum of Squares (SSm)?
The amount of variation observed that the model can explain (how much of the variation can be explained by the fact that different scores come from different treatment conditions)
(See Equation 8)
- Differences between values predicted by the model (group means) and the grand mean
- df = number of groups - 1 = 3 - 1 = 2
Sum of Squares
What is the Residual Sum of Squares (SSr)?
The amount of variation that CAN’T be explained by the model
(See Equation 9)
More specifically, it represents how much a model deviates from the observed data.
- Differences between each point and the group to which that point corresponds
- df = dfT(total) - dfM(model) = 12
!!! See Slide 4 for another explanation of what difference each type of sum of squares represents !!!
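The three sums of squares can be computed directly (a sketch; the raw scores are illustrative values chosen only because they reproduce the group means quoted earlier, since the notes don't list the actual data):

```python
# SST: squared deviations from the grand mean (Equation 7)
# SSM: squared deviations of group means from the grand mean (Equation 8)
# SSR: squared deviations of scores from their own group mean (Equation 9)
groups = {
    "0min": [3, 2, 1, 1, 4],    # mean 2.2 (illustrative scores)
    "15min": [5, 2, 4, 2, 3],   # mean 3.2
    "30min": [7, 4, 5, 3, 6],   # mean 5.0
}
scores = [x for g in groups.values() for x in g]
grand_mean = sum(scores) / len(scores)

ss_t = sum((x - grand_mean) ** 2 for x in scores)
ss_m = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
           for g in groups.values())
ss_r = sum((x - sum(g) / len(g)) ** 2
           for g in groups.values() for x in g)

print(round(ss_t, 2), round(ss_m, 2), round(ss_r, 2))  # SST = SSM + SSR
```

Note the partition: the total variation splits exactly into the part the model explains (SSM) and the part it cannot (SSR).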
Sum of Squares
What are mean squares?
Since SSr and SSm are both sums, their size depends on the number of scores summed (SSm sums over the 3 group means, SSr over all 15 participants). To eliminate this inequality we calculate the mean squares, which are basically averaged sums of squares (each sum of squares divided by its degrees of freedom, so that the two become comparable)
- MSm: SSm/dfm -> average amount of variation explained by the model
- MSr: SSr/dfr -> average amount of variation not explained by the model
Sum of Squares
How can you represent the F-Statistic in terms of mean sum of squares?
F-Statistic = explained variation/unexplained variation
So, F = MSm/MSr
Implications of this:
- If F < 1, then unsystematic variance (MSr) > systematic variance (MSm)
- If F > 1, then systematic variance (MSm) > unsystematic variance (MSr)
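The arithmetic is a one-liner once the sums of squares are known (SS values assumed from the textbook example, which quotes F = 5.12 earlier in these notes):

```python
# F = MSm / MSr, with the degrees of freedom from the example.
ss_m, ss_r = 20.13, 23.60  # model / residual sums of squares (assumed)
df_m, df_r = 2, 12

ms_m = ss_m / df_m  # average variation explained by the model
ms_r = ss_r / df_r  # average variation left unexplained
F = ms_m / ms_r

print(round(F, 2))  # F > 1: systematic variance exceeds unsystematic
```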
Assumptions
Assumptions
What are the 3 assumptions when conducting an ANOVA?
- Independence
- Normality
- Homoscedasticity (Homogeneity of variances)