Week 3 Flashcards
What are the 3 assumptions of ANOVA?
- homogeneity of variance
- normality of scores
- independence of observations
How can we test the assumption of normality, non statistically?
Plot the data and inspect the shape of the distribution.
How can we test the assumption of normality, statistically? What does it mean if this test is significant?
The Shapiro-Wilk test.
If significant, this is BAD - significant deviation from a normal distribution.
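A minimal sketch of the Shapiro-Wilk check in Python, assuming SciPy and NumPy are available. The course itself uses Jamovi; the two samples below are made-up data for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical scores: one roughly normal sample and one strongly skewed sample.
samples = {
    "normal": rng.normal(loc=50, scale=10, size=50),
    "skewed": rng.exponential(scale=10, size=200),
}

results = {}
for label, scores in samples.items():
    w_stat, p_value = stats.shapiro(scores)
    results[label] = (w_stat, p_value)
    # A significant p value means the sample deviates from normality (BAD).
    verdict = "deviates from normal" if p_value < 0.05 else "no evidence of non-normality"
    print(f"{label}: W = {w_stat:.3f}, p = {p_value:.4f} -> {verdict}")
```

A non-significant result does not prove the scores are normal; it only means the test found no significant deviation.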
What is one way to see what is going wrong in a violation of the assumption of normality?
Quantile-quantile (Q-Q) plots, which show skew.
A straight line is good. Check the slides for evaluating kurtosis.
What is homogeneity of variance?
Equal variation in each of the groups in your study (i.e., equal standard deviation).
How can we test homogeneity of variance?
Using Levene’s test
What does it mean if we have a significant homogeneity of variance score, a SIGNIFICANT Levene’s test?
If p is significant, this is BAD. Your data does not meet the assumption of homogeneity of variance.
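A minimal sketch of Levene's test in Python, assuming SciPy and NumPy are available. The three groups are made-up data in which one group is deliberately given a much larger spread. Note that `scipy.stats.levene` centres on the median by default; passing `center="mean"` gives the mean-centred version.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable

# Hypothetical groups: same mean, but group_c has a much larger standard deviation.
group_a = rng.normal(loc=0, scale=1, size=100)
group_b = rng.normal(loc=0, scale=1, size=100)
group_c = rng.normal(loc=0, scale=5, size=100)

levene_stat, levene_p = stats.levene(group_a, group_b, group_c)

# A significant p value means the group variances differ (assumption violated).
if levene_p < 0.05:
    print(f"Levene W = {levene_stat:.2f}, p = {levene_p:.4g}: variances differ (BAD)")
else:
    print(f"Levene W = {levene_stat:.2f}, p = {levene_p:.4g}: no evidence of unequal variances")
```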
When would hefty violations of the homogeneity of variance (using Levene’s test) be okay and why?
If you have the same number of people in each group. ANOVA is robust to unequal variances when group sizes are equal, so it will still give you trustworthy results.
If you do get a homogeneity of variance violation (using Levene’s test) you might want to hunt down where this violation is occurring. How can we do this?
Looking at density plots.
What if we are not sure about violations of the assumptions of homogeneity of variance and normality?
Then maybe do a non-parametric version of analysis.
If we are not sure about violations of the assumptions of homogeneity of variance and normality and we do a non-parametric version of the analysis, how do we interpret the results if they ARE similar?
If the results are similar to the ANOVA, we can trust the results.
If we are not sure about violations of the assumptions of homogeneity of variance and normality and we do a non-parametric version of the analysis, how do we interpret the results if they are NOT similar?
Then we have to say we are not that confident with the results of the ANOVA.
How do we practically make sure not to violate the assumption of independence of residuals?
By allocating people to different groups and making sure they don’t interact too much.
When will the assumption of independence of residuals be violated? (2)
If the same participants take part in each of the IV conditions.
Violated if there are relationships between people in groups.
How do we test for the assumption of independence of residuals?
We don’t usually, we build it into the design.
What is power in statistics?
The power of a statistical test is defined as the probability of correctly rejecting the null hypothesis. Therefore, it’s the probability of finding a difference between means, if it is there.
What is a type 1 error?
Accepting a relationship as significant when in fact it is due to something else (bias, sampling error, or chance).
What is a type 2 error?
Refers to when H0 is false, but we have decided to retain it.
When can a type 2 error occur?
When we have set the confidence limit too strictly (e.g., a
Why is it difficult to uncover a type 2 error?
Because we have to know the true variability of the population for the treatment group.
What is the model of power?
Power = 1 - (the chance of making a type 2 error)
What is another way to define power?
Probability to replicate a genuine significant finding using exactly the same treatments and the same population of subjects.
What is a priori power estimate?
An estimate of power BEFORE we start the experiment. Helps ensure we have adequate numbers of participants.
What is a post-hoc power estimate?
Estimates power AFTER the experimental data has been gathered: estimates the likelihood of being able to replicate results. This adds validity.
What can the value size be between?
0 and 1.
Which type of power estimate is easier to calculate?
post hoc
What program can you use to determine post hoc and a priori power estimates?
G*Power
What do you need in G*Power in order to calculate power estimates?
means, average variance (MSE) and sample size.
Estimation of power usually begins with an estimation of:
effect size.
What are two ways to figure out a priori power estimate?
Look at previous studies to find their effect size.
If it is an entirely new study, how do we go about getting a priori power estimate?
We might say that we want a large effect for the treatment to be considered useful, or moderate, then figure out the sample size we need.
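A rough sketch of going from a chosen effect size to a sample size, using only the Python standard library. Everything here is an assumption for illustration: it covers a two-group comparison only, uses a normal (z) approximation, and treats Cohen's d of 0.5 as moderate and 0.8 as large. G*Power uses exact distributions, so its answers can differ slightly. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison (z approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected test statistic under the true effect
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def n_for_power(d, target_power=0.80, alpha=0.05):
    """Smallest n per group whose approximate power reaches the target."""
    n = 2
    while approx_power(d, n, alpha) < target_power:
        n += 1
    return n

n_moderate = n_for_power(0.5)  # moderate effect needs more participants
n_large = n_for_power(0.8)     # large effect needs fewer participants
print(f"moderate effect: about {n_moderate} per group")
print(f"large effect: about {n_large} per group")
```

The smaller the effect we want to be able to detect, the more participants we need, which is why sample size is the lever most often adjusted at the design stage.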
The most useful aspect of estimating a priori power is at the ____ stage to make sure that the study will be powerful enough to detect a required degree of difference between the groups.
Design.
Of the factors affecting power (sample size, variance, magnitude of difference etc), what is considered to be the most readily manipulated?
sample size
What makes your type 1 error rate greatly increase?
Doing lots of tests.
What is the per comparison error rate?
The probability of making a type 1 error on any given comparison. We set this at alpha. (p
What is a family wise error rate?
The probability that a ‘family’ of comparisons will contain at least one type 1 error.
What are pre programmed contrasts?
Pre programmed contrasts in Jamovi allow you to only test the comparisons that you absolutely have to.
If each per comparison error rate is 5% (p<0.05), what is the error rate of 4 tests for example?
Approximately 5% × 4 = 20%, or 0.20.
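Multiplying the per comparison rate by the number of tests gives an upper bound on the family wise error rate. If the tests are assumed to be independent, the exact rate is 1 - (1 - alpha)^c, which is a little lower. A small sketch (function names are hypothetical):

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one type 1 error across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons

def additive_upper_bound(alpha, n_comparisons):
    """The simple 'alpha times number of tests' figure, capped at 1."""
    return min(1.0, alpha * n_comparisons)

exact = familywise_error_rate(0.05, 4)
bound = additive_upper_bound(0.05, 4)
print(f"exact (independent tests): {exact:.4f}")
print(f"upper bound (alpha x tests): {bound:.2f}")
```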
How do we calculate the Bonferroni adjustment?
The probability you want to keep the family wise error rate at (5%) divided by the number of comparisons you are doing.
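The Bonferroni calculation can be sketched in a few lines of Python. The p values below are made up for illustration, and the function name is hypothetical.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per comparison alpha that keeps the family wise rate at family_alpha."""
    return family_alpha / n_comparisons

# Hypothetical p values from four comparisons.
p_values = [0.003, 0.020, 0.012, 0.041]

critical_p = bonferroni_alpha(0.05, len(p_values))
significant = [p <= critical_p for p in p_values]

print(f"critical p = {critical_p}")
for p, sig in zip(p_values, significant):
    print(f"p = {p}: {'significant' if sig else 'not significant'}")
```

In this made-up example, two comparisons that would pass at p < 0.05 no longer count as significant, which illustrates the cost in power.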
Bonferroni is really simple to do, however, it does increase your chance of making what?
Type 2 errors
Benjamini and Hochberg thought that Bonferroni may’ve been too conservative. Instead of looking at the family wise error rate, they suggested looking at what?
The false discovery rate
What is the false discovery rate?
What proportion of the significant results are false discoveries (type 1 errors)?
How do we calculate the false discovery rate?
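Rank the comparisons in order of significance, from the smallest p value (rank 1) to the largest. For each comparison, new critical p = (rank of the comparison DIVIDED by the number of comparisons made) TIMES the desired false discovery rate, typically 0.05.
P = (i/c) × α, where i is the rank of the comparison, c is the number of comparisons made, and α is the desired false discovery rate.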
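A minimal sketch of the Benjamini-Hochberg procedure in plain Python, applying the (i/c) × alpha rule above. The p values are made up, and the function name is hypothetical. The procedure finds the largest rank whose p value falls at or below its critical value, then treats that comparison and every comparison with a smaller p value as significant.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the comparison is declared significant."""
    c = len(p_values)
    # Indices of the comparisons, ordered from smallest to largest p value.
    order = sorted(range(c), key=lambda idx: p_values[idx])

    # Find the largest rank i with p_(i) <= (i / c) * fdr.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / c) * fdr:
            largest_rank = rank

    # Everything up to and including that rank is significant.
    significant = [False] * c
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant

# Hypothetical p values from four comparisons.
bh_result = benjamini_hochberg([0.003, 0.020, 0.012, 0.041])
print(bh_result)
```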
The Tukey HSD test for difference between all pairs of means:
Good for equal number of people within each group,
good for between 3-5 levels. Controls for the overall type 1 error rate independently of whether an F test is significant, but loses some power in the process.
The Ryan Procedure:
Retains more power than Tukey, but doesn’t work so well if unequal number of participants in each group.
Holm uses a similar approach to ____ but the critical value ___ according to the index of the comparison.
Bonferroni
changes
This is calculated automatically in Jamovi.
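Although Jamovi does this automatically, the logic can be sketched in plain Python. This is an illustrative version with made-up p values and a hypothetical function name: the smallest p value is tested against alpha / c, the next against alpha / (c - 1), and so on, stopping at the first comparison that fails.

```python
def holm(p_values, alpha=0.05):
    """Return a list of booleans: True where the comparison is declared significant."""
    c = len(p_values)
    # Indices of the comparisons, ordered from smallest to largest p value.
    order = sorted(range(c), key=lambda idx: p_values[idx])

    significant = [False] * c
    for step, idx in enumerate(order):
        critical_p = alpha / (c - step)  # critical value loosens at each step
        if p_values[idx] <= critical_p:
            significant[idx] = True
        else:
            break  # once one comparison fails, all remaining ones fail too
    return significant

# Hypothetical p values from four comparisons.
holm_result = holm([0.003, 0.020, 0.012, 0.041])
print(holm_result)
```

Because only the first comparison faces the full Bonferroni critical value, Holm never finds fewer significant results than Bonferroni on the same p values.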
What is the Scheffe test for complex comparisons and data snooping (don’t really have a hypothesis)?
-Controls the family wise type 1 error rate at α for all possible linear contrasts (so loses a lot of power)
Very conservative and not recommended for simple pair wise comparisons.
If there is a small number of a priori contrasts, use what for multiple comparisons?
Bonferroni adjusted t-tests
IF testing several post hoc comparisons, use what for multiple comparisons?
Tukey or Holm.
If there are unequal sample sizes or there is a violation of the homogeneity of variance assumption (or robust t-tests), which test of multiple comparisons should we use?
Games-Howell
Only resort to Scheffe if examining:
multiple, complex, and post hoc (unplanned) comparisons
Is Holm or Tukey slightly better at protecting against type 2 errors?
Holm; however, most people use Tukey.