Lecture 5 - Statistical Power Flashcards
What are three examples of post-hoc tests?
Tukey HSD
t-test
Scheffé
What does HSD stand for, as in the Tukey HSD test?
Honestly significant difference
What does the Tukey HSD test do?
It establishes the smallest difference between two means that counts as significant.
It uses a critical difference value, computed from an equation based on the studentized range statistic (q).
Any mean difference greater than the critical difference is significant.
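A rough sketch of the critical-difference logic in Python (all numbers are made up for illustration, and the q value is an assumed entry from a studentized-range table, not from the lecture):

```python
import math
from itertools import combinations

# Illustrative Tukey HSD critical difference:
# CD = q * sqrt(MS_within / n)  (all numbers assumed for illustration)
k = 3            # number of groups
n = 10           # participants per group (Tukey needs equal n)
ms_within = 4.0  # error mean square from the ANOVA table
q = 3.51         # q(.05, k=3, df=27), read from a table (assumed value)

critical_difference = q * math.sqrt(ms_within / n)

means = {"A": 10.0, "B": 12.5, "C": 10.8}
for g1, g2 in combinations(means, 2):
    diff = abs(means[g1] - means[g2])
    verdict = "significant" if diff > critical_difference else "ns"
    print(g1, g2, round(diff, 2), verdict)
```

With these numbers the critical difference comes out around 2.22, so only the A vs B pair would be flagged.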
How can a t test be used as a post hoc test?
The alpha level is modified: each comparison is tested at 0.05/c, where c is the number of comparisons being made.
This is called a bonferroni correction
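A minimal sketch of the correction (illustrative numbers): it is the alpha level, not t itself, that gets divided by the number of comparisons.

```python
# Bonferroni correction sketch: with k groups there are k*(k-1)/2
# pairwise t-tests, and each is run at alpha / c instead of alpha.
alpha = 0.05
k = 4                        # number of groups (illustrative)
c = k * (k - 1) // 2         # number of pairwise comparisons
alpha_per_test = alpha / c   # each t-test must beat this stricter alpha
print(c, alpha_per_test)
```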
What do post hoc tests mean for error rates?
They’re conservative, which means they reduce the chance of a type one error but greatly increase the chance of a type two error.
This means we can be very confident when we do find an effect.
But it does mean null results are hard to interpret: there may still be an effect that we just can’t detect (low power).
What are the four assumptions of the F-ratio?
Independence of numerator and denominator
Random sampling
Homogeneity of variance
Normality (normally distributed populations)
How can we test the assumptions of an ANOVA?
Independence and random sampling are down to the experimenter so we assume they’ve been met
But we can test homogeneity of variance and normality
How can we test for homogeneity of variance?
In a between groups design:
Hartley’s F-Max
Bartlett
Cochran’s C
Within or mixed designs:
Box’s M
How would you do Box’s M by hand?
You’d divide the largest variance by the smallest variance.
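In practice the full Box's M statistic is left to software; the by-hand variance ratio described above is really the logic behind Hartley's F-max. A sketch with made-up data:

```python
from statistics import variance

# Largest-to-smallest variance ratio (Hartley's F-max logic),
# computed on illustrative data with the sample variance (n-1 denominator).
groups = {
    "g1": [4, 5, 6, 5, 4],
    "g2": [2, 8, 1, 9, 5],
    "g3": [5, 5, 6, 4, 5],
}
variances = {name: variance(scores) for name, scores in groups.items()}
f_max = max(variances.values()) / min(variances.values())
print(round(f_max, 2))  # a large ratio suggests heterogeneous variances
```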
What are the three most common tests of normality and how do they work?
Skew
Lilliefors
Shapiro-Wilk
(The bottom two are very hard to do by hand but SPSS has them)
They compare the actual distribution of data to a model of normal distribution
They are all pretty sensitive, and more so with large samples
How do we test skew?
We can test if the skew is significantly different from 0. If everything was perfectly normally distributed skew would be 0.
We use a z-score distribution to do this
If the z-score is greater than +1.96 or less than -1.96, the sample is significantly different from a normal distribution.
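A sketch of that z-test for skew, using the common large-sample approximation SE ≈ sqrt(6/N) (data and formula choice are illustrative, not from the lecture):

```python
import math
from statistics import mean

# z-test for skew: z = skew / SE_skew, with SE_skew ~ sqrt(6 / N)
# (a common large-sample approximation; data are illustrative).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 14]   # right-skewed
n = len(data)
m = mean(data)
sd = math.sqrt(sum((x - m) ** 2 for x in data) / n)   # population SD
skew = sum(((x - m) / sd) ** 3 for x in data) / n
z = skew / math.sqrt(6 / n)
significantly_skewed = abs(z) > 1.96
print(round(skew, 2), round(z, 2), significantly_skewed)
```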
What is a transformation and why would we use one?
Mathematical operations that we can apply to the data before we conduct an ANOVA.
We use them if the data don’t meet the assumptions of an ANOVA but we really want to perform one.
What are the three circumstances where no transformation will make the data fit the ANOVA assumptions?
Heterogeneous (different) variances
Heterogeneous distributions
Both of the above
What is defined as moderate, substantial and severe skew?
Moderate - z from 1.96 to 2.33
Substantial - z from 2.34 to 2.56
Severe - z above 2.56
What transformation would you use for moderate positive skew
Square root
What transformation would you use for moderate negative skew
Square root (K-X)
In a transformation what is K?
The largest number in the data plus one
What transformation would you use for substantial positive skew
Logarithm
What transformation would you use for substantial negative skew
Logarithm (K-X)
What transformation would you use for severe positive skew
Reciprocal
What transformation would you use for severe negative skew?
Reciprocal (K-X)
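The transformation cards above, sketched on made-up positive scores (note that log and reciprocal need scores greater than zero, and that for negative skew the same operation is applied to K - X):

```python
import math

data = [1.0, 2.0, 3.0, 10.0]   # illustrative scores, all > 0
K = max(data) + 1              # K = largest score plus one

sqrt_t  = [math.sqrt(x) for x in data]    # moderate positive skew
log_t   = [math.log10(x) for x in data]   # substantial positive skew
recip_t = [1 / x for x in data]           # severe positive skew

# For negative skew, apply the same operation to (K - x):
sqrt_neg = [math.sqrt(K - x) for x in data]
print(K, sqrt_t, sqrt_neg)
```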
How does transforming the data affect error chances?
Increases type one error rate but decreases type two error rate
What do we do if we can’t transform the data? (Eg those three situations)
We proceed with analysis but we take care to caution the reader when we are interpreting the results.
What should you say instead of saying you can accept H0?
You must say you ‘fail to reject the null hypothesis’; it sounds much better and is more accurate!
What is statistical power?
The probability of detecting an effect when one is present (so basically the probability of NOT making a type two error)
It’s given by 1 - β, where β is the probability of making a type two error.
What are the three things that power depends upon?
Alpha level
Sample size
Effect size
What happens to power if we make alpha less strict? Such as 0.1 instead of 0.05
Power is increased as we are less likely to miss an effect if there is one
But of course the type one error chance increases as a result
How does sample size affect power?
Small samples have less power than large ones
It does plateau at a certain point, though
What can we do with power to help us plan our study?
We can actually use power to calculate the ideal sample size for a piece of research
There are many different formulae for this, depending on the experimental design.
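A sketch of both ideas (power rising then plateauing with sample size, and solving for the n that hits a target power), using a simple two-group normal-approximation formula rather than any specific formula from the lecture:

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def two_group_power(d, n, z_crit=1.96):
    """Approximate power of a two-sided two-group z-test:
    d = standardised effect size, n = per-group sample size.
    (Normal approximation, assumed here for illustration.)"""
    ncp = d * math.sqrt(n / 2)
    return (1 - normal_cdf(z_crit - ncp)) + normal_cdf(-z_crit - ncp)

# Power grows with n but plateaus (d = 0.5, a medium-ish effect):
for n in (10, 30, 60, 100, 200):
    print(n, round(two_group_power(0.5, n), 3))

# Planning: smallest per-group n that reaches 80% power.
n_needed = next(n for n in range(2, 10_000) if two_group_power(0.5, n) >= 0.80)
print("n per group for 80% power:", n_needed)
```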
How does variability affect power?
Power decreases as variability around the mean increases - and you get more power the further apart two means are (naturally)
What are the measures of effect size for an ANOVA?
Measures of association:
Eta-squared (η²)
R-squared
Omega squared (ω²)
Measures of difference:
d (Cohen’s d)
f (Cohen’s f)
What is eta squared?
A measure of effect size for ANOVA
A measure of association
It is the proportion of total variance that can be attributed to an effect
What is partial eta squared?
The proportion of the effect + error variance that is attributable to the effect
It is a measure of association and a measure of effect size for ANOVA
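Both proportions can be sketched from ANOVA sums of squares (illustrative values, not from the lecture):

```python
# Eta-squared vs partial eta-squared from illustrative sums of squares.
ss_effect = 30.0
ss_error = 70.0
ss_total = ss_effect + ss_error   # one-way case: only one effect + error

eta_sq = ss_effect / ss_total                        # share of TOTAL variance
partial_eta_sq = ss_effect / (ss_effect + ss_error)  # share of effect + error
print(eta_sq, partial_eta_sq)
# In a one-way ANOVA the two coincide; in designs with several effects,
# ss_total also contains the other effects, so the two values differ.
```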
What is r squared?
A measure of association used to measure ANOVA effect size
It treats the ANOVA as a regression-like model, and it is the proportion of variance that this model is able to explain
What is omega squared?
A measure of association used to measure ANOVA effect size
It is an estimate of the dependent variable’s population variance accounted for by the independent variable
What is d?
A measure of difference used to measure ANOVA effect size
It is used when there are only two groups, and it is the standardised difference between these two groups
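A sketch of d with made-up data, using the pooled standard deviation (one common definition; variants exist):

```python
import math
from statistics import mean, variance

# Cohen's d: standardised difference between two groups
# (illustrative data; pooled SD uses the n-1 sample variances).
g1 = [10, 12, 11, 13, 14]
g2 = [8, 9, 10, 9, 9]
n1, n2 = len(g1), len(g2)
pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
d = (mean(g1) - mean(g2)) / math.sqrt(pooled_var)
print(round(d, 2))
```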
What is f (Cohen’s f)?
A measure of difference used to measure ANOVA effect size
An average, standardised difference between the 3 or more levels of the IV
small effect f=0.1
medium effect f=0.25
large effect f=0.4
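One standard conversion (assumed here, not necessarily the lecture's) derives f from eta-squared as f = sqrt(eta² / (1 - eta²)), after which the benchmarks above can label the effect:

```python
import math

# Cohen's f from eta-squared: f = sqrt(eta_sq / (1 - eta_sq)).
# (A standard conversion, assumed here; values are illustrative.)
def f_from_eta_sq(eta_sq):
    return math.sqrt(eta_sq / (1 - eta_sq))

def label(f):
    # Benchmarks from the card above.
    if f >= 0.4:
        return "large"
    if f >= 0.25:
        return "medium"
    if f >= 0.1:
        return "small"
    return "negligible"

f = f_from_eta_sq(0.06)
print(round(f, 3), label(f))
```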
What are the two strategies we may use to estimate effect size?
1 - deciding what effect size we want based on previous research in this area (best option)
2 - based on theoretical importance - do we want a small, medium or large effect size theoretically? (not as good)
Sometimes there is no previous research, so we can’t do 1 and must do 2.
What size of effect do we report?
Any! Especially when the result is non-significant.
Define retrospective justification
Claiming that a result is non-significant because power was low, or that there can’t be an effect because power was high, so we would have found one if there was one.
When do I use Tukey and when do I use Scheffé?
Tukey can only be used when sample sizes are equal between groups
Scheffé can be used either way, but is normally used for unequal samples.