Comparisons & Contrasts Flashcards
Type 1 error
Claiming an effect when there isn’t one (rejecting a true null)
Type 2 error
Retaining the null when there is an effect
Name the 2 types of type 1 error…
2 types of type 1 error:
1) Per comparison error rate (PC) – probability of making a type 1 error on any comparison
2) Familywise error rate (FW) – probability of making type 1 error in a family of comparisons
What is FW error rate if alpha=0.05 and 5 comparisons are made?
1 - (1 - .05)^5 ≈ .23
Roughly a 1 in 4 chance of making at least one type 1 error.
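A quick way to check this arithmetic, as a minimal sketch. The function name is made up for illustration, and the formula assumes the comparisons are independent:

```python
def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one type 1 error across independent comparisons."""
    # (1 - alpha) is the chance of NO type 1 error on a single comparison.
    return 1.0 - (1.0 - alpha) ** n_comparisons


fw = familywise_error_rate(0.05, 5)
print(round(fw, 3))  # 0.226
```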
what are the two possible ways of doing more specific analyses after an omnibus?
post hoc and a priori
what are the two myths about multiple comparisons?
1) The overall omnibus has to be significant before you make comparisons – not true; a non-significant omnibus is not the end.
2) Using planned comparisons means you don’t need corrections against type 1 error rate inflation – also not true.
if the overall F is sig then …
– Requiring overall significance changes the alpha level for FW errors, making the MCs conservative (when the overall F must be sig, some subsequent tests become more conservative)
– MCs often tackle the actual hypothesis more directly
Methods for a priori comparisons?
1) Multiple t tests
2) Linear contrasts
3) Weighting coefficients for comparisons
two types of comparisons ?
orthogonal
non-orthogonal
multiple t test issues?
Simple, but problematic with a large number of comparisons; the comparisons must be planned in advance
Needs homogeneity of variance; when this is violated or sample sizes are unequal, use the Welch test
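A rough sketch of what the Welch test computes for two groups: a t statistic that does not pool the variances, plus the Welch-Satterthwaite degrees of freedom. The function name and the two samples are made up for illustration; in practice a stats package would also give the p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's unequal-variance t test."""
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical scores: unequal n and clearly unequal variances.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print(t_stat, df)
```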
how does one assign weighting coefficients for comparisons?
Assign a weight to each mean. Use 0 for means left out of the comparison. At least two weights need to be non-0.
Means to be contrasted are assigned opposite weights, e.g. group 1 = -1, group 2 = +1.
Weights must sum to 0.
E.G. treatment vs placebo would be High (1/3) Med (1/3) Low (1/3) Placebo (-1), which sums to 0.
Always comparing two sets of means – each set can be made up of several groups as long as the weights sum to 0.
E.G. Comparing groups 1-3 and 4-5:
1 (1/3) 2 (1/3) 3 (1/3) 4 (-1/2) 5 (-1/2)
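The weighting rules can be checked with a small sketch. The group means here are made up for illustration, and `contrast_value` is a hypothetical helper, not a library function. Fractions are used so that 1/3 + 1/3 + 1/3 - 1/2 - 1/2 sums to exactly 0.

```python
from fractions import Fraction


def contrast_value(means, weights):
    """Weighted sum of group means for one linear contrast."""
    if len(means) != len(weights):
        raise ValueError("need one weight per group mean")
    if sum(weights) != 0:
        raise ValueError("contrast weights must sum to 0")
    return sum(w * m for w, m in zip(weights, means))


# Groups 1-3 vs groups 4-5, using the weights from the card above.
weights = [Fraction(1, 3)] * 3 + [Fraction(-1, 2)] * 2
means = [10, 12, 14, 8, 6]  # hypothetical group means

# Mean of groups 1-3 (12) minus mean of groups 4-5 (7).
value = contrast_value(means, weights)
print(value)  # 5
```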
what is an orthogonal comparison?
Orthogonal = independent of one another – can’t single out the same group twice. Can’t compare IV1 against IV2 and then IV1 against IV3.
Number of orthogonal contrasts in a set is k - 1 (the between-groups df), e.g. 3 groups = 2 orthogonal contrasts; 8 groups = 7 contrasts.
orthogonal comparison rules?
RULE: The comparisons have to analyse non-overlapping variance, i.e. the set of contrasts must be mutually independent of one another. If a group is singled out in one comparison, it should not reappear in another comparison. Each contrast must compare only 2 chunks of variance.
First comparison = EXP vs control groups
Second Comparison = Within EXP or control groups
E.G. contrast 1 = (low and high dose vs placebo), contrast 2 = (low vs high dose)
If one is significant, it has no bearing on the rest.
If you want to do more comparisons than this, corrections need to be made.
what is a non-orthogonal comparison?
Contrasts NOT independent of each other
E.G
Contrast 1: group 1 vs group 2 (excluding group 3)
Contrast 2: group 1 vs group 3 (excluding group 2)
Because group 1 is used in both contrasts, the p-values may be correlated
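One common way to check whether two contrasts are orthogonal is to multiply their weights pairwise and sum the products: a sum of 0 means orthogonal. This is a minimal sketch that assumes equal group sizes; the helper name is made up for illustration.

```python
from fractions import Fraction


def are_orthogonal(weights_a, weights_b):
    """True if the cross-products of the two weight sets sum to 0 (equal n assumed)."""
    return sum(a * b for a, b in zip(weights_a, weights_b)) == 0


# Group order: low dose, high dose, placebo.
dose_vs_placebo = [Fraction(1, 2), Fraction(1, 2), Fraction(-1)]
low_vs_high = [1, -1, 0]
print(are_orthogonal(dose_vs_placebo, low_vs_high))  # True

# Group order: group 1, group 2, group 3. Group 1 appears in both contrasts.
g1_vs_g2 = [1, -1, 0]
g1_vs_g3 = [1, 0, -1]
print(are_orthogonal(g1_vs_g2, g1_vs_g3))  # False
```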
corrections for non-orthogonal ?
BONFERRONI t
Boole’s inequality: The probability of occurrence of at least one of a set of events can never exceed the sum of their individual probabilities. The Bonferroni correction uses this inequality as an upper bound on the FW error rate.
Bonferroni corrects the alpha level based on the number of comparisons & evaluates t against Dunn’s table. So the alpha level is divided by the number of comparisons, e.g. from .05 to .01 for 5 comparisons. An alternative is the Dunn-Sidak test, a variation on Bonferroni that is so similar there is no clear benefit, although Dunn-Sidak has slightly greater statistical power with large numbers of comparisons.
Multistage Bonferroni = Holm procedure – for controlling FW error rates across multiple hypotheses.
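A minimal sketch of the three corrections, to show how close Bonferroni and Dunn-Sidak are and how the Holm step-down works. The function names and p-values are made up for illustration. The Sidak formula assumes independent comparisons.

```python
def bonferroni_alpha(alpha, m):
    """Per-comparison alpha under Bonferroni: alpha / m."""
    return alpha / m


def sidak_alpha(alpha, m):
    """Per-comparison alpha under Dunn-Sidak: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / m)


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: test the smallest p at alpha/m, the next at alpha/(m-1), ..."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


print(bonferroni_alpha(0.05, 5), sidak_alpha(0.05, 5))

p_values = [0.010, 0.040, 0.015, 0.005]  # hypothetical p-values
holm = holm_reject(p_values)
plain = [p <= bonferroni_alpha(0.05, len(p_values)) for p in p_values]
print(holm)   # [True, True, True, True]
print(plain)  # [True, False, False, True]
```

With these made-up p-values, Holm rejects all four nulls while plain Bonferroni rejects only two, which is why the multistage version is less conservative.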
best not to quibble over tiny variations in tests and just….
JUST GET GOOD POWER TO BEGIN WITH
what is the FALSE DISCOVERY RATE (FDR)?
FDR is a more recent alternative to controlling the FW error rate (I think mostly used in cog neuro, where there are large numbers of analyses)
what does the FDR do?
– It controls the expected proportion of falsely rejected hypotheses (Type I errors) among all the rejected null hypotheses, i.e. of all the nulls you rejected, what proportion are expected to be false rejections.
Details of the FDR?
A little bit more liberal than Bonferroni, and better tuned to the data
– If all tested null hypotheses are true, controlling the FDR controls the traditional FW error rate
– When many of the tested null hypotheses are rejected, it is preferable to control the proportion of errors rather than the probability of making even one error
– We can bear more errors when many null hypotheses are rejected, but can tolerate fewer errors when fewer nulls are rejected
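The cards above don’t name a specific FDR method. One widely used one is the Benjamini-Hochberg step-up procedure, sketched here as an assumption about what is meant. The function name and p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= (k / m) * q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values from 8 tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
fdr = benjamini_hochberg(p_values)
bonf = [p <= 0.05 / len(p_values) for p in p_values]
print(sum(fdr), sum(bonf))  # 2 1
```

With these made-up values the FDR procedure rejects two nulls where Bonferroni rejects one, which illustrates the "a little more liberal" point.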
you should try…. POST HOC COMPARISONS
Try to avoid them whenever possible by having an idea a priori. They can be more powerful than the Bonferroni t-test, but never use them purely because they are more liberal.
post hoc comps have a trade off?
There is a trade-off between controlling the family-wise error rate and loss of statistical power.
– A stricter condition (lower alpha) reduces type 1 errors
– But being too conservative leads to type 2 errors
three guiding questions of post hoc comps?
Three guiding questions:
1) Does test control for type 1?
2) Does test control for type 2?
3) Is the test reliable when the assumptions of ANOVA (normality / homogeneity of variance / equal sample sizes) have been violated?
types of post hoc comps?
Fisher’s least significant difference (LSD) test – functionally similar to multiple t tests – requires the overall ANOVA to be sig.
Student-Newman-Keuls (SNK) test – good power / does not provide confidence intervals
Tukey’s honestly significant difference (HSD) test
– Possibly the safest test for multiple pairwise comparisons while keeping the familywise error rate down
– Conservative (weak statistical power)
– More powerful than Bonferroni for a larger number of comparisons but less powerful for a smaller number
Scheffe test
• Unlike many post hoc tests, not restricted to pairwise comparisons
• Valid for any (unplanned) comparison as long as expressible in contrasts (most flexible)
• Very low statistical power
Ryan, Einot, Gabriel, and Welsch Q procedure (REGWQ)
• Stronger statistical power
• Tighter control over Type I error rate
• Only suitable for equal sample sizes
post hoc in general
- Relatively robust against non-normality
- Perform poorly with unequal group sizes
- Perform poorly when population variances are different
- For equal sample sizes and equal population variances: Tukey’s HSD test
- To guarantee control over Type I error: Bonferroni
- For strongly unequal sample sizes: Hochberg’s GT2
- For unequal variances: Games-Howell procedure
- Don’t choose based on the outcome!
What is trend analysis?
Better for repeated measures / looking at trends over time (like time series)