L9 - Statistical Power 2 Flashcards
(31 cards)
How would we plan an a priori power analysis?
(4 steps)
Use the relationship of sample size (N), effect size (ES), and our specified alpha (α) to calculate how many participants we need:
Decide on your α-level (Type I error rate)
Decide on an acceptable level of power (where power = 1 − β, the Type II error rate)
Figure out the effect size (ES) you are looking for
Calculate N
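The four steps above can be sketched numerically. This is a minimal illustration using the normal-approximation formula for a two-sided, two-sample t-test; the function name and the t-test design are assumptions for illustration, not from the notes:

```python
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group N for a two-sided, two-sample t-test
    via the normal approximation (hypothetical helper)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # step 1: chosen alpha level
    z_beta = norm.ppf(power)           # step 2: chosen power level
    # step 3: effect_size is the standardised effect (Cohen's d)
    # step 4: calculate N
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2

# Medium effect (d = .5), alpha = .05, power = .80:
print(round(n_per_group(0.5)))  # about 63 per group
```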
Why would we not set our level of power at .99 when designing our study?
Our alpha (α) level is fixed and our effect size (ES) is fixed, which means we would need an incredibly large sample size (N).
The higher our desired power level, the more participants we need.
If we want higher power and have fixed effect size and fixed alpha, what would this mean for our sample size (N)?
We would need a larger sample size
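This relationship can be made concrete with a small sketch (normal approximation for a two-sample t-test; the fixed d = .5 and α = .05 are illustrative assumptions). Per-group N grows quickly as the desired power rises:

```python
from scipy.stats import norm

d, alpha = 0.5, 0.05  # fixed effect size and fixed alpha
for power in (0.80, 0.95, 0.99):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    n = 2 * z ** 2 / d ** 2  # per-group N, normal approximation
    print(f"power {power}: n per group ~ {round(n)}")
```

Roughly 63, 104, and 147 per group: demanding power of .99 more than doubles the sample needed at .80.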
We don’t actually know the effect size before we run the study, so how can we determine the effect size before we run the study? (3 ways)
1. Base it on substantive knowledge
i.e. What you know about the situation and scale of measurement
2. Base it on previous research
What have others in your field used?
3. Use conventions
i.e. what is usually used
What is the difference between a two-tailed and a one-tailed test?
Two-tailed: the α is split between the two tails of the distribution (2.5% in each tail)
One-tailed: significance is tested in one direction only (the full 5% in one tail)
If we are doing a two-tailed test and the results fall just outside the significance range, can we switch to a one-tailed test post hoc, since we could have chosen one at the start anyway?
No. It is bad science and a form of p-hacking.
Cohen believes that an acceptable level of power is ___. This means we are happy to accept a type 2 error rate of ___.
This is a convention for an acceptable level of power that many use.
Power = .8
Type 2 = .2
Which of these is the null distribution and which is the alternative distribution and why?
The null distribution is on the left, because the significance range falls in its α zone
The alternative distribution is on the right, because the Type II error zone is denoted by β
What 3 statistics do we need to know before we can conduct a post hoc power analysis?
- What was the sample size?
- What was the α level (Type I error rate)?
- What was the effect size (ES)?
With these 3, we can determine the power of the study.
Example in notes and W9 powerpoint.
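As a sketch of what such a calculation looks like (normal approximation for a two-sided, two-sample t-test; the input values are made-up examples, not from the notes):

```python
from math import sqrt
from scipy.stats import norm

# The three inputs to a post hoc power analysis:
n_per_group = 64  # what was the sample size?
alpha = 0.05      # what was the alpha level?
d = 0.5           # what was the effect size?

# Approximate achieved power (ignoring the negligible opposite tail):
power = norm.cdf(d * sqrt(n_per_group / 2) - norm.ppf(1 - alpha / 2))
print(round(power, 2))  # about 0.81
```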
When writing up your discussion, if you didn't have enough power, should you write that:
- you needed more power, or
- how big a study you would need to have acceptable power?
How big a study you would need in order to have acceptable power.
What does this graph tell us about the level of power and the correlation significance?
Assuming a power of .8:
- with a sample size of 20, we can detect correlations of about .6
- with 40, correlations of about .4
- etc.
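The pattern in the graph can be approximated with Fisher's z transformation; this reconstruction is an assumption about how such a curve is generated, and the helper name is hypothetical:

```python
from math import sqrt, tanh
from scipy.stats import norm

def detectable_r(n, alpha=0.05, power=0.80):
    """Smallest correlation detectable with sample size n,
    via the Fisher z approximation (hypothetical helper)."""
    z = (norm.ppf(1 - alpha / 2) + norm.ppf(power)) / sqrt(n - 3)
    return tanh(z)

print(round(detectable_r(20), 2))  # ~0.59 (about .6, as the graph shows)
print(round(detectable_r(40), 2))  # ~0.43 (about .4)
```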
It is really important to think of statistical power in terms of the ____ to the things that determine it.
Relationships
The more stringent the significance level the ____ the necessary sample size
Greater
All other figures being equal
The smaller the effect size the ____ the necessary sample size
Larger
Why do we need bigger samples to detect smaller effects?
Because the 4 statistics (N, ES, alpha, power) are all related to each other
If power is held constant at .8 and alpha at .05, specifying a small effect means we need a bigger sample than if we had specified a big effect
It's an "all else being equal" scenario, as all these things are related to each other
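The "all else being equal" point can be sketched directly (normal approximation, two-sample t-test, purely illustrative), holding power at .8 and α at .05 while the effect size shrinks:

```python
from scipy.stats import norm

alpha, power = 0.05, 0.80  # held constant
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
for d in (0.8, 0.5, 0.2):  # Cohen's large, medium, small effects
    n = 2 * z ** 2 / d ** 2  # per-group N
    print(f"d = {d}: n per group ~ {round(n)}")
```

Roughly 25, 63, and 392 per group: halving the effect size roughly quadruples the sample needed.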
The higher the power required the ___ the necessary sample size
Larger
The smaller the sample size the ___ the power
Lower
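Running the same approximation in reverse (assumed two-sample design with d = .5 and α = .05; the sample sizes are illustrative) shows power falling as the sample shrinks:

```python
from math import sqrt
from scipy.stats import norm

d, alpha = 0.5, 0.05
for n in (100, 50, 25):  # per-group sample sizes
    power = norm.cdf(d * sqrt(n / 2) - norm.ppf(1 - alpha / 2))
    print(f"n = {n}: power ~ {round(power, 2)}")
```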
What is the risk of having journals that demand statistically significant results and low powered studies?
Articles that get published being Type I errors
This is the replication crisis
What are the 4 relationships that are all related to each other in power analysis?
Power
Effect Size
Sample Size
Significance level
What is power analysis good for?
1. Sample size calculation
Before you begin your study
2. Evaluation of study results
- Published literature
  - Useful when planning studies (ES estimates)
  - Replicability?
- Your own results
  - What power did your study have? (be careful here)
  - Given the observed ES, what size study would reliably detect it? (more useful)
  - Was my study too small? Or too large?
  - Ethical implications?
Why can having a sample size that is too small be considered unethical?
A sample size that is too small means you will never have enough power for your results to be relevant.
This means you have wasted the participants' time, which is unethical.
One limitation of power analysis is that it uses the frequentist NHST decision making framework.
Why is this a limitation?
If you’re using power analysis outside this framework, none of it makes sense
The concepts of Type I and Type II error at the heart of power analysis depend on what many consider to be the pathological nature of NHST.
Why is this so?
Because we are basing it off an arbitrary level of alpha. This is an arbitrary cutoff point.
Why should we treat p = .051 differently from p = .049?
The arbitrary cutoff implies that one study is meaningful whereas the other is meaningless.
What is more important for significance, effect size or p values?
Effect size
P values are arbitrary, whereas effect size is much more meaningful, particularly when standardised