Week 7 - Planned Comparisons and Post Hoc Tests Flashcards

1
Q

Why does the F ratio not paint the whole picture?

A

The F ratio only tells us that there is a difference somewhere between the means. We need an analysis that helps determine where the difference(s) are.

2
Q

What are the two basic approaches to comparisons?

A
  • A priori (or planned) comparisons
  • Post hoc comparisons
3
Q

What are a priori (or planned) comparisons?

A
  • If we have a strong theoretical interest in certain groups and a specific, evidence-based hypothesis regarding these groups, then we can test these differences up front
  • Come up with these before you do your study
  • Seek to compare only the groups of interest
  • There is no real need to do the overall ANOVA; we do it because of tradition. Hence, reports often start with the F test and progress to planned comparisons

It is better to be in the a priori (planned) situation than in the post hoc one.

4
Q

What are post hoc comparisons?

A
  • If you cannot predict exactly which means will differ, then you should do the overall ANOVA first to see if the IV has an effect
  • Then perform post hoc comparisons (post hoc = after the fact/after the ANOVA)
  • These seek to compare all groups to each other to explore differences.
  • Less refined – more exploratory.
5
Q

What are the two types of a priori/planned comparisons?

A

Simple

Complex

6
Q

What is a simple a priori comparison?

A

comparing one group to just one other group

7
Q

What is a complex a priori comparison?

A

comparing a set of groups to another set of groups

*In SPSS we create complex comparisons by assigning weights to different groups

8
Q

How do you conduct an a priori comparison (how do you weight it)?

A

Create 2 sets of weights

  • 1 for the first set of means
  • 1 for the second set of means
  • Assign a weight of zero to any remaining groups
  • Set 1 gets positive weights
  • Set 2 gets negative weights
  • They must sum to 0

A simple rule that always works: the weight for each group is equal to the number of groups in the other set (see the sketch below).
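A minimal sketch of this weighting rule in plain Python; the group labels A, B, C, D are hypothetical, not from the lecture:

```python
# Hypothetical example: compare groups {A, B} against group {C},
# with a fourth group D excluded from the contrast.
set1 = ["A", "B"]   # gets positive weights
set2 = ["C"]        # gets negative weights
others = ["D"]      # gets zero weights

# Rule: each group's weight = number of groups in the *other* set
weights = {g: len(set2) for g in set1}          # A = +1, B = +1
weights.update({g: -len(set1) for g in set2})   # C = -2
weights.update({g: 0 for g in others})          # D = 0

print(weights)                      # {'A': 1, 'B': 1, 'C': -2, 'D': 0}
assert sum(weights.values()) == 0   # weights must sum to zero
```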

9
Q

What are the assumptions of a priori/planned comparisons?

A
  • Planned comparisons are subject to the same assumptions as the overall ANOVA, particularly homogeneity of variance, as we use a pooled error term.
  • Fortunately, when SPSS runs the t-tests for our contrasts it gives us output for both homogeneity assumed and homogeneity not assumed
  • If homogeneity is not assumed, SPSS adjusts the df of our F critical to control for any inflation of Type I error
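Outside SPSS, the same two outputs can be reproduced with SciPy's independent-samples t-test, which has both a pooled form and a Welch (homogeneity not assumed) form. A minimal sketch with made-up scores:

```python
from scipy import stats

# Hypothetical scores for two groups of interest
g1 = [12, 15, 14, 10, 13]
g2 = [22, 25, 19, 24, 21]

# Pooled error term (homogeneity of variance assumed)
t_pooled = stats.ttest_ind(g1, g2, equal_var=True)

# Welch correction (homogeneity NOT assumed): the df are adjusted
# to control inflation of the Type I error rate
t_welch = stats.ttest_ind(g1, g2, equal_var=False)

print(t_pooled, t_welch, sep="\n")
```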
10
Q

What are orthogonal contrasts?

A
  • One particularly useful kind of contrast analysis is one where each contrast tests something completely different from the other contrasts

Principle:
Once you have compared one group (e.g., A) with another (e.g., B) you don’t compare them again.

Example
Groups 1,2,3,4
Contrast 1 = 1,2 vs 3,4
Contrast 2 = 1 vs 2
Contrast 3 = 3 vs 4
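A quick sketch verifying the example above: two contrasts are orthogonal when the sum of the products of their weights is zero.

```python
# Weight vectors for groups 1-4, matching the example above
c1 = [1, 1, -1, -1]   # contrast 1: groups 1,2 vs 3,4
c2 = [1, -1, 0, 0]    # contrast 2: group 1 vs 2
c3 = [0, 0, 1, -1]    # contrast 3: group 3 vs 4

def orthogonal(a, b):
    # Orthogonal if the sum of the element-wise products is zero
    return sum(x * y for x, y in zip(a, b)) == 0

assert orthogonal(c1, c2) and orthogonal(c1, c3) and orthogonal(c2, c3)
print("All three contrasts are mutually orthogonal")
```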

11
Q

Cool things about orthogonal contrasts

A
  • A set of k-1 orthogonal contrasts (where k is the number of groups) accounts for all of the differences between groups
  • According to some authors, a set of k-1 planned contrasts can be performed without adjusting the Type I error rate
12
Q

Post-Hoc comparisons

A
  • Let’s say we had good reason to believe that sleep deprivation would impact performance but did not know at exactly what level of sleep deprivation this would occur. So, we had no specific hypothesis about what difference would emerge between which conditions.
  • In this case, planned comparisons would not be appropriate
  • Here you would perform the overall F analysis first
  • If overall F is significant, we need to perform post-hoc tests to determine where the differences actually are
13
Q

What do post hoc comparisons seek to compare?

A

Post-hoc tests seek to compare all possible combinations of means
* This will lead to many pair-wise comparisons
* e.g., with 4 groups, 6 comparisons
* 1v2, 1v3, 1v4, 2v3, 2v4, 3v4
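A small sketch of where the 6 comes from: with k groups there are k(k − 1)/2 pairwise comparisons.

```python
from itertools import combinations

groups = [1, 2, 3, 4]
pairs = list(combinations(groups, 2))

print(pairs)   # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
k = len(groups)
assert len(pairs) == k * (k - 1) // 2   # = 6 for k = 4
```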

14
Q

How do post hoc comparisons increase the risk of Type I errors?

A
  • So, as we know, when we find a significant difference there is an alpha (α) chance that we have made a Type I error.
  • The more tests we conduct, the greater the Type I error rate
15
Q

What is the error rate per experiment (PE)?

A

the total number of Type I errors we are likely to make in conducting all the tests required in our experiment.
* The PE error rate ≤ α × number of tests
* ≤ means it could be as high as that value
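A sketch of the bound in Python. The exact familywise rate shown assumes the tests are independent; α × number of tests is the upper bound the card refers to.

```python
alpha = 0.05
n_tests = 6   # e.g., all pairwise comparisons among 4 groups

upper_bound = alpha * n_tests           # PE error rate <= 0.30
exact = 1 - (1 - alpha) ** n_tests      # ~0.265, assuming independent tests

print(f"bound = {upper_bound:.3f}, exact (independent tests) = {exact:.3f}")
assert exact <= upper_bound
```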

16
Q

How do you restore the Type I error rate back to .05 (5%) when conducting multiple tests?

A

So when we need to conduct several tests, what should we do about the rising Type I error rate?
* If many tests are required, then a Bonferroni-adjusted alpha level may be used

17
Q

What is a Bonferroni adjustment?

A
  • Divide α by the number of tests to be conducted (e.g., .05/2 = .025 if 2 tests are to be conducted).
  • Assess each follow-up test using this new α level (i.e., .025)
  • Maintains the PE error rate at .05, but this will reduce the power of your comparisons a lot!

Remember: as we decrease alpha (by making our test more conservative) we also decrease power (the chance of detecting a true effect)
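A minimal sketch of the adjustment in Python (the p-values are made up):

```python
alpha = 0.05
p_values = [0.030, 0.012, 0.041]   # hypothetical follow-up test results

adjusted_alpha = alpha / len(p_values)   # .05 / 3 ≈ .0167

for i, p in enumerate(p_values, start=1):
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"test {i}: p = {p:.3f} -> {verdict} at adjusted alpha {adjusted_alpha:.4f}")
```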

18
Q

What are alternatives to the Bonferroni test (alternatives for controlling the Type I error rate)?

A
  • There are several statistical tests that systematically compare all means whilst controlling the Type I error rate
  • LSD (least significant difference): actually no adjustment, where you just ignore the problem -> not recommended
  • Tukey’s HSD (Honestly Significant Difference): popular as the best balance between control of the experimentwise (EW) error rate and power (i.e., Type I vs Type II error); see the sketch below
  • Newman-Keuls: gives more power but less stringent control of the EW error rate
  • Scheffé test: the most stringent control of the EW error rate, as it controls for all possible simple and complex contrasts
  • And many others you can find out about at your leisure
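For reference outside SPSS, SciPy ships a Tukey HSD implementation (scipy.stats.tukey_hsd, SciPy 1.8+). A sketch with made-up data:

```python
from scipy import stats

# Hypothetical scores for three sleep-deprivation conditions
g0h = [24, 27, 25, 28, 26]    # 0 hours deprived
g24h = [22, 20, 23, 21, 22]   # 24 hours deprived
g48h = [15, 17, 14, 16, 18]   # 48 hours deprived

# Compares every pair of groups while controlling the
# experimentwise Type I error rate
result = stats.tukey_hsd(g0h, g24h, g48h)
print(result)   # pairwise statistics, p-values, confidence intervals
```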
19
Q

What is the best one of these tests to use?

A

Tukey’s test is very common and recommended.

20
Q

What to do with post hoc tests (when do you use them and how)?

A
  • If your hypothesis predicts specific differences between means:
    • Assess assumptions
    • Perform the ANOVA
    • Consider what comparisons will test your specific hypotheses
    • Perform the planned comparisons needed to test these predictions
  • If your hypothesis does not predict specific differences between means (a code sketch of this branch follows below):
    • Assess assumptions
    • Perform the ANOVA
    • If the ANOVA is significant, then perform post-hoc tests
    • If the ANOVA is not significant, then don’t do post-hoc tests
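A sketch of the second branch (no specific predictions) in SciPy, with made-up data:

```python
from scipy import stats

# Hypothetical data for three conditions
groups = [
    [24, 27, 25, 28, 26],
    [22, 20, 23, 21, 22],
    [15, 17, 14, 16, 18],
]

# Step 1: overall ANOVA
f_stat, p = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p:.4f}")

# Step 2: post-hoc tests only if the overall F is significant
if p < 0.05:
    print(stats.tukey_hsd(*groups))
else:
    print("Overall F not significant: no post-hoc tests")
```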
21
Q

What is a meta-analysis?

A

A researcher gathers the many papers published on a specific topic, extracts each paper’s individual statistics into a spreadsheet, aggregates those statistics, and then performs a statistical test on the aggregated data.

22
Q

Effect size philosophy

A

A significant F simply tells us that there is a difference between means, i.e., that the IV has had some effect on the DV
  • It does not tell us how big this difference is.
  • It does not tell us how important this effect is.
  • An F significant at .01 does not necessarily imply a bigger or more important effect than an F significant at .05.
  • The significance of F depends on the sample size and the number of conditions, which determine the F comparison distribution
23
Q

What does effect size tell us?

A

If I took the overall variability in my criterion variable (here, target accuracy), how much of that variability could I explain on the basis of how much sleep deprivation you’ve had?

Effect size summarizes the strength of the treatment effect:

  • Eta squared (η²)
  • Indicates the proportion of the total variability in the data accounted for by the effect of the IV.
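A sketch computing η² by hand as SS_between / SS_total (the scores are made up):

```python
# Hypothetical accuracy scores for three sleep-deprivation groups
groups = [
    [24, 27, 25, 28, 26],
    [22, 20, 23, 21, 22],
    [15, 17, 14, 16, 18],
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# SS_total: variability of every score around the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS_between: variability of the group means around the grand mean
ss_between = sum(
    len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
)

eta_squared = ss_between / ss_total
print(f"eta squared = {eta_squared:.2f}")   # proportion of variability due to the IV
```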
24
Q

What does η² (eta squared) tell us?

A
  • This result says that __% of the variability in errors is due to the effect of manipulating our IV

For example, one could say that 65% of the variability in errors is due to the effect of manipulating sleep deprivation.

25
Q

What are the limitations of η² (eta squared) given by SPSS?

A
  • It is a descriptive statistic, not an inferential statistic, so it is not the best indicator of the effect size in the population
  • It tends to be an overestimate of the effect size in the population
26
Q

Criteria for assessing eta squared

A

Cohen (1977) proposed the following scale for effect size:
* .01 = small effect (1%)
* .06 = medium effect (6%)
* >.14 = large effect (14%)
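A tiny sketch applying these cut-offs; the "negligible" label for values below .01 is my addition, not Cohen’s:

```python
def interpret_eta_squared(eta_sq: float) -> str:
    # Cohen (1977): .01 small, .06 medium, .14 large
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"

print(interpret_eta_squared(0.65))   # "large" (65% of variability explained)
```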

27
Q

Interpreting effect size

A
  • The effect sizes typically observed in psychology may vary from area to area.
  • The levels of the IV used are important in determining the observed effect size.
  • A theoretically important IV may still only account for a small proportion of the variability in the data.
  • A theoretically unimportant IV may account for a large proportion of variability in the data
28
Q

What is power?

A
  • Sensitivity is the ability of an experiment to detect a treatment effect when one actually exists.
  • Power is a quantitative index of sensitivity which tells us the probability that our experiment will detect this effect.
29
Q

What is the ideal power?

A

  • Keppel (1992) argues that ideally power should be > .80 to ensure an experiment can pick up a moderate effect.
  • Ensuring adequate power is a research design issue
30
Q

Power is a function of…?

(what are the things you can tweak/change that will change your experiment’s overall power?)

A
  1. The size of the treatment effect (we have better power to detect stronger effects)
  2. The size of the error variance (the more noise in the data, the harder to detect an effect)
  3. The alpha level (the more conservative the test, i.e., the smaller the alpha, the greater the chance you will fail to detect a true effect)
  4. Sample size
    * Greater sample size – greater power
31
Q

How do you know what the power of your experiment is?

A

The power of your experiment is 1 minus the Type II error rate: power = 1 − β.

32
Q

Why does sample size affect the amount of power you have?

A

When the df are larger (i.e., a larger N, with more people), there is less overall error, so F becomes larger.

33
Q

Why not use the largest n possible?

A
  • It is not always cheap or easy to use large samples
  • We need to know the acceptable minimum sample size to pick up a specific effect
34
Q

What are the two main situations in which we are concerned about power?

A

  • When we do not find a significant effect but there is evidence that we may have made a Type II error
  • When we are planning a new experiment and wish to ensure that we have adequate power to pick up the effect of our IV
35
Q

Power and sample size

A
  • Ideally, we should determine the sample size that will give our experiment adequate power (> .8) before we run it.
  • Conducting a study without an indication of the sample required to achieve desired power may be an expensive waste of time
36
Q

What are the ways to estimate the required sample size?

A

To do this you need an estimate of the magnitude of the treatment effect.

You can get this either from:
  • past research
  • a pilot study
  • an estimate of the minimum difference between means that you consider relevant or important (often used in clinical experiments)
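One way to turn such an estimate into a required sample size is statsmodels’ ANOVA power solver; Cohen’s f = 0.25 ("medium") and three groups are assumptions for illustration:

```python
from statsmodels.stats.power import FTestAnovaPower

# Estimate of the treatment effect, e.g., from past research or a
# pilot study (Cohen's f = 0.25 is conventionally a "medium" effect)
analysis = FTestAnovaPower()
n_total = analysis.solve_power(
    effect_size=0.25,  # estimated magnitude of the treatment effect
    k_groups=3,        # number of conditions in the planned experiment
    alpha=0.05,        # Type I error rate
    power=0.80,        # desired power (Keppel's > .80 criterion)
)

print(f"Total N required: {n_total:.0f}")
```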