Week 3: ANOVA assumptions, power and mean comparison Flashcards
what is the underlying ANOVA equation?
Score = Grand mean + treatment effect + residual error (measurement and individual differences)
Essentially, how does ANOVA break down variance? and what does the comparison of this breakdown produce?
Into 2 parts
- Variance due to treatment
- Error variance
Comparison of these two variances produces the F STATISTIC
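The F statistic can be computed directly; a minimal sketch in Python using scipy (the group scores below are made-up, illustrative numbers, not course data):

```python
from scipy import stats

# Hypothetical scores for three treatment groups (invented for illustration)
group1 = [4, 5, 6, 5, 4]
group2 = [7, 8, 6, 7, 8]
group3 = [5, 6, 5, 6, 5]

# One-way between-subjects ANOVA:
# F = variance due to treatment (between groups) / error variance (within groups)
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```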
What are the assumptions of a between ANOVA?
Homogeneity of variance: SD for all groups is about the same
Normality: error is normally distributed
Independence of observation: truly between design
How do you test the assumption of homogeneity of variance?
Levene's test
- if statistically significant, the null hypothesis of equal variances is rejected. This means there is a difference in variance somewhere, so homogeneity is violated
What do you do if Levenes test is significant?
You can look at density plots to find the culprit
The assumption can be violated without grave consequences as long as group sample sizes are equal
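A sketch of Levene's test in Python with scipy (group scores invented for illustration):

```python
from scipy import stats

# Hypothetical group scores; group2 is deliberately more spread out
group1 = [4, 5, 6, 5, 4]
group2 = [2, 9, 5, 10, 1]
group3 = [5, 6, 5, 6, 5]

# Levene's test (scipy centres on the median by default, a robust choice)
w_stat, p_value = stats.levene(group1, group2, group3)
# p < .05 -> reject the null of equal variances -> homogeneity violated
print(f"W = {w_stat:.2f}, p = {p_value:.3f}")
```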
What can you use to test the assumption of normality?
Shapiro-Wilk test
If significant, assumption of normality is violated
can also look at histograms, distribution plots, skewness and kurtosis
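A sketch of the Shapiro-Wilk test in Python with scipy (scores are illustrative, invented values):

```python
from scipy import stats

# Hypothetical scores (invented for illustration)
scores = [4, 5, 6, 5, 4, 7, 5, 6, 5, 6, 4, 5, 7, 6, 5]

w_stat, p_value = stats.shapiro(scores)
# p < .05 -> assumption of normality is violated
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
```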
What happens if normality is violated?
Typically its not a big issue
ANOVA still tends to be robust in terms of normality violations
How do you report that you have used a non-parametric version of an ANOVA?
‘Similar results were found using non-parametric….’
What are quantile-quantile plots?
They assess the normality assumption.
The data are chopped into quantiles, and the number of scores in each quantile is compared against what we would expect for that quantile in a normal distribution.
Points on a straight line = perfectly normal
Deviations from the line = departures from normality
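The quantile-quantile idea can be sketched in Python with scipy; `probplot` computes the quantile pairs and the fitted straight line (the scores here are simulated, not real study data):

```python
import numpy as np
from scipy import stats

# Simulated scores (assumed example); real data would come from the study
rng = np.random.default_rng(0)
scores = rng.normal(loc=50, scale=10, size=100)

# probplot returns the theoretical vs ordered sample quantiles
# plus a straight line fitted through them
(theoretical_q, ordered_scores), (slope, intercept, r) = stats.probplot(scores, dist="norm")
# r near 1 -> points lie close to the straight line -> normality plausible
print(f"fit to straight line: r = {r:.3f}")
```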
What happens to the ANOVA if assumptions are violated?
If you arent confident, can use non-parametric versions of the test
This can then increase confidence in results
What is skewness of data?
Deviations from typical bell curve of data distribution
- asymmetrical
Explain positive and negative data skewness
Negative skew: a long tail towards low values in the data (more extreme low scores than a normal distribution would produce)
Positive skew: a long tail towards high values (more extreme high scores than expected)
No skew value?
0 - perfectly symmetrical
What values are considered moderately skewed?
Between -1 and -0.5
or
Between 0.5 and 1
What values are considered highly skewed?
Less than -1 or greater than +1
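A sketch of computing a skewness value in Python with scipy, using simulated right-skewed data (exponential data has a long tail toward high values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Exponential data: long tail toward high values -> positive skew
right_skewed = rng.exponential(scale=2.0, size=500)

g1 = stats.skew(right_skewed)
# rough cut-offs: |skew| < 0.5 ~symmetrical, 0.5-1 moderate, > 1 high
print(f"skewness = {g1:.2f}")
```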
Skewness in Quantile-Quantile plots?
If the points curve below the line, the data are skewed to the left (negatively skewed)
If the points curve above the line, the data are skewed to the right (positively skewed)
What is kurtosis?
It is a measure of tailedness in data distribution curve
how light or heavy-tailed the data is
What is leptokurtic distribution (kurtosis)?
Scores cluster tightly around the average with a sharp peak, but the tails are heavy (more extreme scores than a normal distribution)
What is platykurtic distribution (kurtosis)?
A wider, flatter spread of scores with light (thin) tails - fewer extreme scores than a normal distribution
What is the optimal kurtosis value?
It is reported in terms of excess kurtosis (kurtosis beyond that of a normal distribution) - so the optimal value is 0!!!
How do you know when to reject normality in terms of kurtosis?
You are given a standard error as well as a kurtosis value - these can be used to determine a z-score
z score = value of kurtosis / standard error of kurtosis
If z score is less than -2 or more than +2, reject normality
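The z-score rule can be sketched in Python with scipy; the scores are invented, and the standard-error formula used here is a rough large-sample approximation (an assumption, since jamovi reports its own SE):

```python
import math
from scipy import stats

# Hypothetical scores (invented for illustration)
scores = [4, 5, 6, 5, 4, 7, 5, 6, 5, 6, 4, 5, 6, 7, 5, 6, 4, 5, 6, 5]
n = len(scores)

excess_kurt = stats.kurtosis(scores)   # Fisher definition: normal = 0
se_kurt = math.sqrt(24 / n)            # rough large-sample SE (approximation)
z = excess_kurt / se_kurt
# |z| > 2 -> reject normality on kurtosis grounds
print(f"excess kurtosis = {excess_kurt:.2f}, z = {z:.2f}")
```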
What is power?
The probability of correctly rejecting the null
finding a difference between means if it is there
What is power associated with?
Type 2 errors
when do type 2 errors typically occur?
when the alpha level is too strict
or
an outlier increases error variance, making it hard to see the treatment effect clearly
What is the power equation?
power = 1 - β
The probability of finding a real difference = 1 - the probability of missing a real difference (β, the Type 2 error rate)
What increases power?
- Increase in magnitude of difference between means (effect size): makes it more obvious
- Alpha level (more generous): increases power but also increases type 1 errors or false positives
- less variance in scores: increases power, since too much variance makes effects harder to see clearly
- larger sample size: gives more accurate estimates of the means, so effects can be seen more clearly
A priori power estimate?
Estimate before the experiment is run
- helps to ensure have adequate numbers of participants to detect any real differences between treatments
Post hoc power estimate?
Estimate after data has been gathered
Estimates the likelihood of being able to replicate a significant difference if repeated
- value between 0 and 1
eg 0.82 - 82%
How can you obtain a post hoc power estimate?
G*power - REFER TO SLIDES
Can provide you with a power estimate if you feed it means, average variance (MSE) and sample size
- Tell it what kind of test you’ve done
- gives an effect size that can then be used to find power
How do we find a priori power estimates?
Estimate the effect size (Cohen's d or f) - or use the effect size from other studies
With this effect size you can compute a non-centrality parameter (used to find the power of an experiment for any effect size)
Can then plan sample size etc (can use g*power again)
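The non-centrality parameter idea can be sketched without G*Power; a hand-rolled power calculation in Python with scipy (the effect size f = 0.25 and the candidate sample size are assumed example values, not from the slides):

```python
from scipy import stats

f_effect = 0.25        # Cohen's f, a "medium" effect (assumed)
k = 3                  # number of groups (assumed)
n_per_group = 53       # candidate sample size per group (assumed)
alpha = 0.05

N = k * n_per_group
df1, df2 = k - 1, N - k
nc = (f_effect ** 2) * N                     # non-centrality parameter
f_crit = stats.f.ppf(1 - alpha, df1, df2)    # critical F at this alpha
# power = probability the noncentral F exceeds the critical value
power = 1 - stats.ncf.cdf(f_crit, df1, df2, nc)
print(f"power = {power:.2f}")
```

Varying `n_per_group` until power reaches the target (e.g. 0.80) gives the a priori sample-size plan.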
What happens when a statistically significant comparison has more than 2 means?
Need further tests to determine which of the means actually differ from each other
- typically multiple comparisons
What are the two approaches to follow up tests?
A priori: chosen before data is collected
Post hoc: planned after data is examined
Errors in multiple comparison tests?
Major type 1 error rate risks
Per comparison error rates vs. family-wise error rates?
Per comparison:
the probability of making a T1 error on any comparison - this is alpha
Family-wise:
the probability that a family of comparisons will contain at least 1 T1 error
- per comparison rates stack up
- roughly the error rate per comparison × the number of comparisons
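The stacking-up of error rates can be checked with a quick calculation (the six comparisons are an assumed example, e.g. all pairwise comparisons among four groups):

```python
alpha = 0.05       # per comparison Type 1 error rate
m = 6              # e.g. all pairwise comparisons among 4 groups (assumed)

fwer_exact = 1 - (1 - alpha) ** m   # exact if comparisons are independent
fwer_approx = alpha * m             # the "stacking up" approximation
print(f"family-wise rate ~ {fwer_exact:.3f} (approx {fwer_approx:.2f})")
```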
A priori comparisons?
Usually when have very specific hypothesis - creates a smaller number of comparisons so T1 probability is reduced
- contrasts
How to do a priori comparisons in jamovi?
These are ‘contrasts’, a tab within jamovi under the analysis set-up
What are the different types of contrasts?
Deviation - each level (except the first) to the grand mean
Simple - each level to the first (reference) level
Difference - each level to the mean of the previous levels
Helmert - each level to the mean of all subsequent levels combined
Repeated - each level to the level that follows it
T1 error rates in contrast tests?
Need to apply a Bonferroni adjustment
alpha/number of comparisons
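The adjustment is a one-line calculation (four contrasts is an assumed example):

```python
alpha = 0.05
n_comparisons = 4   # assumed example: four planned contrasts

bonferroni_alpha = alpha / n_comparisons
# each contrast must now reach p < .0125 to count as significant
print(f"adjusted alpha = {bonferroni_alpha}")
```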
Criticisms of Bonferroni?
Claim it may be too conservative and we may make T2 errors instead and miss actual effects
What can we do here if we are worried that Bonferroni is too conservative?
Linear step up!
What is linear step up?
Focuses on the false discovery rate (FDR) rather than the family-wise error rate
Rank the comparisons by p value (smallest first) and divide each rank by the total number of comparisons
Then multiply this by the FDR level, which is alpha
This gives each comparison a critical p value to test its significance against
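The linear step-up procedure can be sketched by hand in Python (the five p values are invented, illustrative comparisons):

```python
# Illustrative p values from 5 hypothetical comparisons, sorted ascending
pvals = [0.001, 0.008, 0.020, 0.041, 0.300]
m = len(pvals)
fdr = 0.05                      # the FDR level (alpha)

# critical value for rank i is (i / m) * fdr
passes = [p <= (i / m) * fdr for i, p in enumerate(pvals, start=1)]
# step-up rule: everything up to the LARGEST passing rank is significant
last = max((i for i, ok in enumerate(passes) if ok), default=-1)
significant = [i <= last for i in range(m)]
print(significant)  # -> [True, True, True, False, False]
```

Note the fourth comparison (p = .041) would pass an unadjusted .05 test but fails its step-up critical value of .04.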
What other post hoc tests may be appropriate when comparing means?
Tukey or Holm