Week 5: Comparing Means - One-way ANOVA Flashcards
What does ANOVA stand for?
Analysis of Variance
What is the decision tree for choosing a one-way ANOVA? - (5)
Q: What sort of measurement? A: Continuous
Q:How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: More than two
Q: Same or Different participants for each predictor level? A: Different
When should ANOVA be used?
if you are comparing more than 2 groups
Example of ANOVA RQ
Which is the fastest animal in a maze experiment - cats, dogs or rats?
We can’t do three separate t-tests to answer, for example, which is the fastest animal in a maze experiment - cats, dogs or rats - because
Doing separate t-tests inflates the Type 1 error (false positive - e.g., concluding a man is pregnant)
The repetition of multiple tests adds multiple chances of error, which may result in a larger α error level than the pre-set α level - familywise error
What is familywise or experimentwise error rate?
The error rate across the set of statistical tests conducted on the same experimental data
Family wise error is related to
type 1 error
What is the alpha level probability
probability of making a Type 1 error - wrongly rejecting a true null hypothesis (i.e., wrongly accepting the alternate hypothesis)
If we conduct 3 separate t-tests to compare which is the fastest animal in the experiment - cats, dogs or rats - with an alpha level of 0.05 - (4)
- Each test has a 5% chance of a Type 1 error (falsely rejecting H0)
- The probability of making no Type 1 error is 95% for a single test
- However, for multiple tests the probability of making no Type 1 error at all decreases: for 3 tests together, 0.95 × 0.95 × 0.95 = 0.857
- This means the probability of at least one Type 1 error increases: 1 - 0.857 = 0.143 (a 14.3% chance of making at least one Type 1 error)
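The arithmetic above can be sketched in Python (a minimal illustration of the familywise error rate, using the same three-t-test example):

```python
# Familywise (experimentwise) Type 1 error rate for k separate tests,
# each run at the same alpha level.
alpha = 0.05
k = 3  # e.g., three pairwise t-tests: cats vs dogs, cats vs rats, dogs vs rats

p_no_error = (1 - alpha) ** k  # probability of no Type 1 error across all k tests
familywise = 1 - p_no_error    # probability of at least one Type 1 error

print(round(p_no_error, 3))  # 0.857
print(round(familywise, 3))  # 0.143
```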
Much like the model for t-tests, we can write a general linear model for
ANOVA - 3 levels of the categorical variable coded with dummy variables
When we perform a t-test, we test the hypothesis that the two samples have the same
mean
ANOVA tells us whether three or more means are the same so tests H0 that
all group means are equal
An ANOVA produces an
F statistic or F ratio
The F ratio produced in ANOVA is similar to t-statistic in a way that it compares the
amount of systematic variance in data to the amount of unsystematic variance i.e., ratio of model to its error
ANOVA is an omnibus test which means it tests for and tells us - (2)
overall experimental effect
tells whether experimental manipulation was successful
An ANOVA is an omnibus test and its F ratio does not provide specific information about which
groups were affected by the experimental manipulation
Just like a t-test can be represented by a linear regression equation, ANOVA can be represented by a
multiple regression equation for three means, where the model accounts for the 3 levels of the categorical variable with dummy variables
As compared to independent samples t-test that compares means of two groups, one-way ANOVA compares means of
3 or more independent groups
In one-way ANOVA we use … … to test assumption of equal variances across groups
Levene’s test
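As a sketch of running these steps in Python rather than SPSS (the maze-time data below are made up for illustration; `scipy.stats.levene` and `scipy.stats.f_oneway` are the relevant functions):

```python
from scipy import stats

# Hypothetical maze times for three independent groups (made-up data).
cats = [12.1, 14.3, 11.8, 13.5, 12.9]
dogs = [10.2, 9.8, 11.1, 10.5, 9.9]
rats = [8.4, 7.9, 9.1, 8.8, 8.2]

# Levene's test checks the equal-variances assumption first.
lev_stat, lev_p = stats.levene(cats, dogs, rats)

# One-way ANOVA: a single F-test instead of three separate t-tests.
f_stat, p_value = stats.f_oneway(cats, dogs, rats)

print(f"Levene p = {lev_p:.3f}, F = {f_stat:.2f}, p = {p_value:.4f}")
```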
What is one-way ANOVA also called?
between-subjects or independent ANOVA (different participants in each condition)
What does this one-way ANOVA output show?
Levene’s test is non-significant so equal variances are assumed
What does this SPSS output show in one-way ANOVA?
F(2,42) = 5.94, p = 0.005, eta-squared = 0.22
How is effect size (eta-squared) calculated in one-way ANOVA?
Between groups sum of squares divided by total sum of squares
What is the eta-squared/effect size for this SPSS output and what does this value mean? - (2)
830.207/3763.632 = 0.22
22% of the variance in exam scores is accounted for by the model
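The calculation, using the sums of squares quoted from the SPSS output:

```python
# Eta-squared = between-groups SS divided by total SS
# (values taken from the SPSS output discussed above).
ss_between = 830.207
ss_total = 3763.632

eta_squared = ss_between / ss_total
print(round(eta_squared, 2))  # 0.22 -> 22% of variance explained by the model
```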
Interpreting eta-squared, what do values of 0.01, 0.06 and 0.14 mean? - (3)
- 0.01 = small effect
- 0.06 = medium effect
- 0.14 = large effect
What happens if the Levene’s test is significant in the one-way ANOVA?
then use the statistics from the Welch or Brown-Forsythe test
The Welch and Brown-Forsythe tests make adjustments to the DF, which affects the
statistics you get and whether the p value is significant or not
What does this post-hoc table of Bonferroni tests show in one-way ANOVA ? - (3)
- Full sleep vs partial sleep, p = 1.00, not sig
- Full sleep vs no sleep , p = 0.007 so sig
- Partial sleep vs no sleep = p = 0.032 so sig
Diagram of example of grand mean
Mean of all scores regardless of the participant’s condition
What are the total sum of squares (SST) in one-way ANOVA?
difference of the participant’s score from the grand mean squared and summed over all participants
What is model sum of squares (SSM) in one-way ANOVA?
difference of the model score from the grand mean squared and summed over all participants
What is residual sum of squares (SSR) in one-way ANOVA?
difference of the participant’s score from the model score squared and summed over all participants
The residuals sum of squares (SSR) tells us how much of the variation cannot be
explained by the model and amount of variation caused by extraneous factors
We divide each sum of squares by its
DF to calculate the mean squares
For SST its DF we divide by is
N-1
For SSM its DF we divide by is
number of groups (parameters), k, in the model minus 1
For SSM if we have three groups then its DF will be
3-1 = 2
For SSR we divide by its DF to calculate it, which will be the
total sample size, N, minus the number of groups, k
Formulas of dividing each sum of squares by its DF to calculate the mean squares - (3)
- MST = SST / (N - 1)
- MSR = SSR / (N - k)
- MSM = SSM / (k - 1)
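Working backwards from the sums of squares quoted earlier (assuming N = 45 and k = 3, consistent with the df of 2 and 42 in the reported output), the mean squares and F-ratio can be computed as:

```python
# Mean squares and F-ratio from sums of squares.
# N = 45 and k = 3 are assumptions matching df = (2, 42) in the output above.
N, k = 45, 3
ss_m = 830.207      # model (between-groups) sum of squares
ss_t = 3763.632     # total sum of squares
ss_r = ss_t - ss_m  # residual (within-groups) sum of squares

ms_m = ss_m / (k - 1)  # average systematic variation
ms_r = ss_r / (N - k)  # average unsystematic variation
f_ratio = ms_m / ms_r

print(round(f_ratio, 2))  # 5.94 -> matches the reported F(2,42) = 5.94
```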
SSM tells us the total variation that the
exp manipulation explains
What does MSM represent?
average amount of variation explained by the model (i.e. the systematic variation)
What does MSR represent?
average amount of variation explained by extraneous variables (the unsystematic variation).
The F ratio in one-way ANOVA can be calculated by
dividing the model mean squares by the residual mean squares: F = MSM / MSR
If F ratio in one-way ANOVA is less than 1 then it represents a
non-significant effect
Why does an F less than 1 in one-way ANOVA represent a non-significant effect?
An F ratio less than 1 means that MSR is greater than MSM = more unsystematic than systematic variation
If F is greater than 1 in one-way ANOVA then it shows the likelihood of an effect but doesn’t tell us - (2)
indicates that experimental manipulation had some effect above and beyond effect of individual differences in performance
Does not tell us whether F-ratio is large enough to not be a chance result
When F statistic is large in one-way ANOVA then it tells us that the
MSM is greater than MSR
To discover if F statistic is large enough not to be a chance result in one-way ANOVA then
compare the obtained value of F against the maximum value we would expect to get by chance if the group means were equal in an F-distribution with the same degrees
of freedom
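A sketch of this comparison in Python, using the F value and degrees of freedom reported in the output above (`scipy.stats.f` provides the F-distribution):

```python
from scipy import stats

# Compare the obtained F against the critical value of the F-distribution
# with the same degrees of freedom, at alpha = .05.
f_obtained = 5.94
df_model, df_residual = 2, 42  # df from the reported output

f_critical = stats.f.ppf(0.95, df_model, df_residual)    # max F expected by chance
p_value = stats.f.sf(f_obtained, df_model, df_residual)  # P(F >= obtained | H0)

print(f_obtained > f_critical)  # True -> unlikely to be a chance result
```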
High values of F are rare by
chance
Large values of F are more common in studies with a low number of
participants
The F-ratio in one-way ANOVA tells us whether the model fitted to the data accounts for more variation than extraneous factors, but does not tell us where
differences between groups lie
If F-ratio in one-way ANOVA is large enough to be statistically significant then we know
that one or more of the differences between means is statistically significant (e.g. either b2 or b1 is statistically significant)
It is necessary after conducting an one-way ANOVA to carry out further analysis to find out
which groups differ
The power of F statistic is relatively unaffected by
non-normality
when group sizes are not equal the accuracy of F is
affected by skew, and non-normality also affects the power of F in quite unpredictable ways
When group sizes are equal, the F statistic can be quite robust to
violations of normality
What tests do you do after performing a one-way ANOVA and finding significant F test? - (2)
- Planned contrasts
- Post-hoc tests
What do post-hoc tests do? - (2)
- compare all pairwise differences in mean
- Used if no specific hypotheses concerning differences has been made
What is the issue with post-hoc tests?
- because every pairwise combination is considered the type 1 error rate increases, so normally the type 1 error rate is reduced by modifying the critical value of p
Post-hoc tests are like two or one tailed hypothesis?
two-tailed
Planned contrasts are like one or two-tailed hypotheses?
One-tailed hypothesis
What is the most common modification of the critical value for p in post-hoc?
Bonferroni correction, which divides the standard critical value of p=0.05 by the number of pairwise comparisons performed
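A minimal sketch of the correction (with 3 groups there are 3 pairwise comparisons):

```python
# Bonferroni correction: divide the standard critical p by the number
# of pairwise comparisons performed.
k = 3                             # number of groups
n_comparisons = k * (k - 1) // 2  # all pairwise comparisons = 3

alpha_corrected = 0.05 / n_comparisons
print(round(alpha_corrected, 4))  # 0.0167 -> each comparison tested at this level
```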
Planned contrasts are used to investigate a specific
hypothesis
Planned contrasts do not test for every
pairwise difference so are not penalized as heavily as post hoc tests that do test for every difference
With planned contrasts you derive the hypotheses before the
data is collected
Diagram of planned contrasts
Contrast 1 = Treatment vs control
Contrast 2 = Treatment 1 vs Treatment 2
In planned contrasts when one condition has been used alone in a contrast it is
never used again
In planned contrasts the number of independent contrasts you can make can be defined with
k (number of groups) minus 1
How do planned contrasts work in SPSS?
Coefficients add to 0 for each contrast (-2 + 1 + 1), and once a group has been used alone in a contrast, subsequent contrasts set its coefficient to 0 (e.g., -2 to 0)
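These two rules can be checked numerically (the weights follow the treatment-vs-control example above):

```python
# Planned-contrast weights: each contrast's weights must sum to zero,
# and independent contrasts should be orthogonal (dot product = 0).
contrast1 = [-2, 1, 1]   # control vs the two treatment groups
contrast2 = [0, -1, 1]   # treatment 1 vs treatment 2 (control set to 0)

print(sum(contrast1), sum(contrast2))                    # 0 0 -> valid weights
print(sum(a * b for a, b in zip(contrast1, contrast2)))  # 0 -> orthogonal
```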
SPSS has a lot of built-in contrasts, but it is helpful if
you know what these contrasts do before entering the data, as they depend on the order in which you coded your variables
What are weights?
Values we assign to the dummy variables e.g., -2 in the box
One type of planned contrast is a polynomial contrast, which
tests for trends in the data and in its most basic form looks for a linear trend (i.e., group means increase proportionately)
Polynomial contrasts can also look at more complex trends other than linear, such as
quadratic, cubic and quartic
What does a linear trend represent?
simply proportionate change in the value of the dependent variable across ordered categories
What is a quadratic trend?
one change in the direction of the line (e.g. the line is curved in one place)
What is a cubic trend?
two changes in the direction of the trend
What is a quartic trend?
has three changes of direction
The Bonferroni post-hoc ensures that the type 1 error is below
0.05
While the Bonferroni correction reduces Type 1 error (being conservative in the Type 1 error rate for each comparison), it also
lacks statistical power (the probability of a Type II error [false negative] will be high), increasing the chance of missing a genuine difference in the data
What post hoc-tests to use if you have equal sample sizes and confident that your group variances are similar?
Use REGWQ or Tukey as good power and tight control over Type 1 error rate
What post hoc tests to use if your sample sizes are slightly different?
Gabriel’s procedure because it has greater power
What post-hoc tests to use if your sample sizes are very different?
Hochberg’s GT2
What post-hoc test to run if Levene’s test of homogeneity of variance is significant?
Games-Howell
What post-hoc test to use if you want guaranteed control over the Type 1 error rate?
Bonferroni
What does this ANOVA error line graph show? - (2)
- Linear trend as dose of Viagra increases so does mean level of libido
- Error bars overlap indicating no between group differences
What does the within-groups row give details of in the ANOVA table?
SSR (unsystematic variation)
The between groups label in ANOVA table tells us
SSM (systematic variation)
What does this ANOVA table demonstrate? - (2)
- Linear trend is significant (p = 0.008)
- Quadratic trend is not significant (p = 0.612)
When we do planned contrasts we arrange the weights in such that we compare any group with a positive weight
with a negative weight
What does this output show if we conduct two planned comparisons of:
one to test whether the control group was different to the two groups which received Viagra, and one to see
whether the two doses of Viagra made a difference to libido
- (2)
the table of weights shows that contrast 1 compares the placebo group against the two experimental groups,
contrast 2 compares the low-dose group to the high-dose group
What does this table show if Levene’s test is non-significant = equal variances assumed?
To test hypothesis that experimental groups would increase libido above the levels seen in the placebo group (one-tailed)
To test another hypothesis that a high dose of Viagra would increase libido significantly more than a low dose
- (2)
for contrast 1, we can say that taking Viagra significantly increased libido compared to the control group (p(one-tailed) = .029/2 = .0145)
The significance of contrast 2 tells us that a high dose of Viagra increased libido significantly more than a low dose (p(one-tailed) = .065/2 = .0325)
If making a few pairwise comparisons with an equal number of pps in each condition then use …; if making a lot then use … - (2)
Bonferroni
Tukey
Assumptions of ANOVA - (5)
- Independence of data
- DV is continuous; IV is categorical (3 or more groups)
- No significant outliers;
- DV approximately normally distributed for each category of the IV
- Homogeneity of variance = Levene’s test
ANOVA compares many means without increasing the chance of
type 1 error
In one-way ANOVA, we partition the total variance into
variance explained by the model (between-groups, SSM) and residual variance (within-groups, SSR)