Lecture 1 - ANOVA Flashcards
What is ANOVA?
A type of statistical model used to test for differences between means.
(ANOVA = Analysis of Variance)
- ANOVA statistic: the F-statistic -> tells us whether ANY TWO MEANS (it doesn’t specify which ones) differ significantly (we’ll come back to this point later on)
Why would we use ANOVA and not a t-test?
A t-test is used when we have two conditions of the Independent Variable (IV), whereas ANOVA is used when there are more than two conditions
Experiment example
Experimenters wanted to test the effect of puppy-treatment on happiness (IV: puppy-treatment, DV: happiness). The IV has 3 conditions: 0 minutes, 15 minutes and 30 minutes. They took a sample of 15 participants and conducted their research.
(This example is used throughout the book constantly to explain the concepts, so I’ll use it as well to give examples throughout the whole chapter, and I’ll do the same with the next chapters as well)
Comparing means using the Linear model
What are the two general methods to compare means?
- Variance-ratio method: good for simple situations, but not good for complex ones (just mentioned, don’t need to remember anything else)
- LINEAR MODEL: good for complex situations
What is the equation for the general linear model?
outcome(i) = model + error(i)
In the case of the puppy-treatment example:
happiness(i) = b0(^) + b1(^)group(i) + e(i)
(For a better image of both, look at slide 1, Equations 1&2)
- This model states that we can estimate happiness from group membership and b0(^).
- ~ b0(^): the amount of happiness when the group-membership value is 0, in our case when the Dose of puppies is 0 minutes
- ~ b1(^): the relationship between Happiness and Dose, in our case how being in the 15min or the 30min condition (rather than 0min) changes happiness
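A quick Python sketch of the "outcome(i) = model + error(i)" idea for a single participant (all numbers are made up for illustration; 2.2 and 2.8 just echo the means quoted in the later cards):

```python
# Minimal sketch: decompose one observed happiness score into model + error.
# b0_hat, b1_hat, the group code and the observed score are hypothetical values.
b0_hat = 2.2      # predicted happiness when the group-membership value is 0
b1_hat = 2.8      # change in predicted happiness associated with group membership
group_i = 1       # dummy-coded group membership for participant i (see the dummy-coding cards below)
observed_i = 6.0  # participant i's observed happiness score

predicted_i = b0_hat + b1_hat * group_i   # the "model" part
error_i = observed_i - predicted_i        # the "error(i)" part
print(predicted_i, error_i)               # 5.0 and 1.0
```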
Dummy coding
Dummy coding
What is dummy coding?
When we use 0’s and 1’s to represent categories
Dummy coding
What is the process of dummy coding?
1) Count the number of groups you have and subtract 1 from them (0min, 15min, 30min: 3 groups, 3-1 = 2)
2) The result (2) equals the number of dummy variables we’ll need (used in the later steps)
3) Choose one baseline group to which all the other groups will be compared (this group is usually the control group)
- NOTE: if the conditions have unequal sample sizes, the baseline group must contain a large number of cases to ensure that the b-estimates are reliable
4) Our baseline group is the 0min group. Assign 0 to all dummy variables for this group (what this means is explained more at the end of this flashcard)
5) 1st dummy variable: assign 1 to one group and 0 to all the others (e.g. for the 15min group: Variable 1 (LONG) = 0, Variable 2 (SHORT) = 1)
6) 2nd dummy variable: same process for the remaining group (for the 30min group: Variable 1 (LONG) = 1, Variable 2 (SHORT) = 0)
(Look at slide 2: there are two dummy variables, Dummy SHORT and Dummy LONG. Each person has their group membership coded in terms of 0’s and 1’s. Somebody in the 0min condition has 0 on both dummy variables; somebody in the 15min condition has 0 on LONG and 1 on SHORT; someone in the 30min condition has 1 on LONG and 0 on SHORT)
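A small pandas sketch of this coding scheme (the column names SHORT and LONG, and the group labels, are my own, chosen to match the description above):

```python
import pandas as pd

# Hypothetical group labels for a few participants
df = pd.DataFrame({"dose": ["0min", "0min", "15min", "15min", "30min", "30min"]})

# Dummy coding with the 0min group as baseline:
# SHORT = 1 only for the 15min group, LONG = 1 only for the 30min group
df["SHORT"] = (df["dose"] == "15min").astype(int)
df["LONG"] = (df["dose"] == "30min").astype(int)

print(df)
# 0min rows get 0 on both dummies, 15min rows get SHORT = 1 and LONG = 0,
# and 30min rows get SHORT = 0 and LONG = 1, exactly as on the slide.
```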
Dummy coding
Based on dummy coding, how can we now re-write the linear model?
(Look at Slide 1, Equation 3)
(NOTE: There are two equations under Equation 3; the bottom one is missing the error term and has a hat (estimation hat) on top of happiness. When we use the model to predict (estimate) happiness we are not describing observed scores, so there is no error term to include, which is why it is omitted)
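My best reconstruction of what Equation 3 and its hatted version look like, assuming the slide writes the Long dummy before the Short dummy (check against the actual slide):

```latex
\text{happiness}_i = \hat{b}_0 + \hat{b}_1\,\text{Long}_i + \hat{b}_2\,\text{Short}_i + e_i
\qquad
\widehat{\text{happiness}}_i = \hat{b}_0 + \hat{b}_1\,\text{Long}_i + \hat{b}_2\,\text{Short}_i
```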
Dummy coding
Based on this dummy coding, how can we find the values of the b-estimates?
- If I’m in the 0min group: (Equation 4)
- If I’m in the 30min group: (Equation 5)
- If I’m in the 15min group: (Equation 6)
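A sketch of what Equations 4-6 work out to, obtained by plugging each group's dummy codes into the model reconstructed above (the notation is assumed, so check it against the slide):

```latex
% 0min group (Long = 0, Short = 0):
\bar{X}_{0\text{min}} = \hat{b}_0
% 30min group (Long = 1, Short = 0):
\bar{X}_{30\text{min}} = \hat{b}_0 + \hat{b}_1 \;\Rightarrow\; \hat{b}_1 = \bar{X}_{30\text{min}} - \bar{X}_{0\text{min}}
% 15min group (Long = 0, Short = 1):
\bar{X}_{15\text{min}} = \hat{b}_0 + \hat{b}_2 \;\Rightarrow\; \hat{b}_2 = \bar{X}_{15\text{min}} - \bar{X}_{0\text{min}}
```

So b0(^) is the mean of the baseline group, and each other b-estimate is the difference between that group's mean and the baseline mean.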
F-Statistic
F-statistic
What does a significant F-Statistic tell us?
A significant F-Statistic tells us that the group means are significantly different. The F-statistic is also a measure of the overall fit of a linear model to a set of observed data:
- F = (how good the model is)/ (how bad the model is)
- F = (explained variation)/ (unexplained variation)
(2 different general equations for the F-statistic)
!!!!!!! It DOESN’T TELL US which specific groups differ, or how large that difference is. It just tells us that at least two groups differ significantly from each other !!!!!!!
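A Python sketch of the "explained / unexplained" ratio, using made-up scores chosen so the three group means come out to 2.2, 3.2 and 5 (the values quoted in the later cards; for these particular numbers F works out to about 5.12):

```python
import numpy as np
from scipy import stats

# Hypothetical happiness scores, 5 participants per dose group
groups = {
    "0min":  np.array([3, 2, 1, 1, 4], dtype=float),
    "15min": np.array([5, 2, 4, 2, 3], dtype=float),
    "30min": np.array([7, 4, 5, 3, 6], dtype=float),
}

all_scores = np.concatenate(list(groups.values()))
grand_mean = all_scores.mean()

# Explained variation: how far each group mean is from the grand mean
ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups.values())
# Unexplained variation: how far each score is from its own group mean
ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

df_model = len(groups) - 1                    # k - 1 = 2
df_residual = len(all_scores) - len(groups)   # N - k = 12
F = (ss_model / df_model) / (ss_residual / df_residual)

print(F)                                 # ~5.12
print(stats.f_oneway(*groups.values()))  # same F (and its p-value) from scipy
```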
F-Statistic
((How do we determine if an F-statistic is significant or not?))
[not needed as a question necessarily]
E.g. assume we have an F = 5.12. To see if this F is significant we compare it to the critical value of an F-distribution with the same degrees of freedom (here df = 2 and 12):
- At p = 0.05, the critical value of F(2, 12) is 3.89
- At p = 0.01, the critical value of F(2, 12) is 6.93
So F = 5.12 is significant at the 0.05 level, but not at the 0.01 level
(Not much was expanded on this, no need to fully understand or read it. If you want a better explanation just ask me, because it’s a bit more complex and not even mentioned in the book)
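If you want to check these critical values yourself, a quick scipy sketch (the percent-point function is the inverse of the cumulative F-distribution):

```python
from scipy import stats

# Critical values of an F-distribution with 2 and 12 degrees of freedom
print(stats.f.ppf(0.95, dfn=2, dfd=12))  # ~3.89, the p = 0.05 cut-off
print(stats.f.ppf(0.99, dfn=2, dfd=12))  # ~6.93, the p = 0.01 cut-off
```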
F-Statistic
What can tell us about the differences between specific means?
b(^)-values:
- e.g. say the mean happiness of the 0min group is 2.2, and the mean happiness of the 30min group is 5. The difference of means is 5 - 2.2 = 2.8, and this is the b(^)-value for the 30min dummy variable. The t-statistic associated with this b(^) has a p-value of 0.008 < 0.05, so the difference between the 0min group and the 30min group is significant
- Say the mean happiness of the 15min group is 3.2. The difference between the 0min and 15min groups is 3.2 - 2.2 = 1 (the b(^)-value for the 15min dummy). The p-value of its t-statistic is 0.282 > 0.05, so the difference between the 0min group and the 15min group is NOT significant
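A statsmodels sketch that ties this back to the dummy-coded model: each b(^) is a difference from the baseline group's mean and gets its own t-test (same made-up scores as in the F-statistic sketch above, so the group means are 2.2, 3.2 and 5):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical scores in long format, with the two dummy variables from earlier
df = pd.DataFrame({
    "happiness": [3, 2, 1, 1, 4,  5, 2, 4, 2, 3,  7, 4, 5, 3, 6],
    "SHORT":     [0]*5 + [1]*5 + [0]*5,   # 1 only for the 15min group
    "LONG":      [0]*5 + [0]*5 + [1]*5,   # 1 only for the 30min group
})

model = smf.ols("happiness ~ SHORT + LONG", data=df).fit()
print(model.summary())
# Intercept = mean of the 0min group (2.2)
# SHORT     = 15min mean - 0min mean = 1.0, with its own t-statistic and p-value
# LONG      = 30min mean - 0min mean = 2.8, with its own t-statistic and p-value
```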
F-Statistic
How can we apply all the above in Hypothesis Testing?
- Ho model: IV has no effect on the DV, the predicted value will always be that of the grand mean (always the same value)
- Ha model: IV has an effect on the DV, described by the equations mentioned previously
- ~ The bigger the coefficients (b1(^) and b2(^)), the greater the deviation of this model from the Ho
- ~ If the differences between the groups are large enough (if b1(^) and b2(^) are large enough) then the Ha model is a better fit to the data than the Ho
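A compact way to write the two competing models (using the notation from the dummy-coding cards; this is my summary, not the slide's exact wording):

```latex
H_0:\;\; \widehat{\text{happiness}}_i = \hat{b}_0 \quad (\hat{b}_0 = \text{grand mean, the same prediction for everyone})
H_a:\;\; \widehat{\text{happiness}}_i = \hat{b}_0 + \hat{b}_1\,\text{Long}_i + \hat{b}_2\,\text{Short}_i
```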