14 | DW-4 | Power Flashcards by Stevie Davies

Background / p-value interpretation – example with drugs A and B, A seems more efficacious.
What is p-value for?

A p-value is a number between 0 and 1 that quantifies how confident we should be that Drug A is different from Drug B.
The closer the p-value is to 0, the more confident we are that the drugs are different.
It helps us decide whether or not to reject H_0.
A small p-value does not imply a big difference between A and B; it only tells us how certain we can be that there is a difference at all (small or big).

Background / p-value interpretation – example with drugs A and B, A seems more efficacious.
What is a commonly used threshold for the p-value? Explain in words what it means for the example.

0.05
If there is no difference between Drug A and Drug B, and we repeated this exact experiment many times, then only 5% of those experiments would result in the wrong decision (a p-value below 0.05, i.e. a false positive).
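The 5% claim can be checked with a small simulation. This is only a sketch under stated assumptions: both "drugs" are drawn from the same normal distribution (so H_0 is true), and a two-sample z-test with known standard deviation stands in for whatever test the real experiment would use. The function names, group size and seed are illustrative choices, not part of the original card.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(a, b, sigma=1.0):
    """Two-sided p-value of a two-sample z-test (assumes known, equal sigma)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=2000, n_per_group=20, alpha=0.05, seed=0):
    """Share of experiments that wrongly reject H_0 although H_0 is true."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_experiments):
        # Both groups come from the SAME distribution: there is no real difference.
        drug_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        drug_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if z_test_p_value(drug_a, drug_b) < alpha:
            wrong += 1
    return wrong / n_experiments


rate = false_positive_rate()
print(rate)  # should land close to the threshold alpha
```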

Background / p-value interpretation – example with drugs A and B, A seems more efficacious.
What would a large p-value mean, e.g. 0.9?

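We fail to see a difference between the two groups. This does not prove that the drugs are the same, only that the data give no evidence of a difference.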

Background / p-value calculation – example with coin that lands on heads twice in a row.
What would the null hypothesis be?

The coin is not special, it’s no different from a regular coin.
If we reject H_0, we conclude that the coin is special.

Background / p-value calculation – example with coin that lands on heads twice in a row.
How are p-values determined? What does that mean for the coin example?

By adding up the probabilities of the observed outcome and of all outcomes that are equally rare or rarer, assuming H_0 is true.
For the coin: determine the probabilities of the different outcomes if H_0 is true: heads, heads / heads, tails / tails, heads / tails, tails → each one is 0.25 (so 1 heads and 1 tails, in either order, is 0.5).

Background / p-value calculation – example with coin that lands on heads twice in a row.
What 3 parts is a p-value composed of? (What would they be for the coin example?)

The probability that random chance would result in the observation (two heads: 0.25).
The probability of observing something else that is equally rare (two tails: 0.25).
The probability of observing something rarer or more extreme (nothing is rarer: 0).
→ p-value = 0.25 + 0.25 + 0 = 0.5

Background / p-value calculation. You throw a coin 5 times and get 4 heads and 1 tails. Is the coin special? How would you calculate a p-value?

There are 32 outcomes:
All heads – 1 way
4 heads, 1 tail – 5 ways
3 heads, 2 tails – 10 ways
2 heads, 3 tails – 10 ways
1 heads, 4 tails – 5 ways
All tails – 1 way
The probability that random chance would result in the observation: 4 heads, 1 tail = 5/32
The probability of observing something else that is equally rare: 1 heads, 4 tails = 5/32
The probability of observing something rarer or more extreme: all heads or all tails = 1/32 + 1/32
→ p-value = (5+5+1+1)/32 = 12/32 = 3/8 = 0.375 → greater than 0.05, so we fail to reject H_0: no evidence that the coin is special
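The three-part recipe used for both coin cards can be sketched in a few lines. The function name `coin_p_value` is made up for illustration, and a fair coin is assumed under H_0. "Equally rare or rarer" is taken to mean "every outcome whose probability is no larger than that of the observation".

```python
from math import comb


def coin_p_value(n_flips, n_heads):
    """Two-sided p-value for seeing n_heads in n_flips of a fair coin (H_0)."""
    # Probability of each possible number of heads under H_0.
    probs = [comb(n_flips, k) / 2 ** n_flips for k in range(n_flips + 1)]
    observed = probs[n_heads]
    # Observation + equally rare outcomes + rarer outcomes.
    return sum(p for p in probs if p <= observed)


print(coin_p_value(2, 2))  # two heads in a row
print(coin_p_value(5, 4))  # 4 heads, 1 tail
```

The two calls reproduce the hand calculations on the cards: 0.5 for two heads in a row and 0.375 for 4 heads in 5 throws.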

Background / normal distribution.
How is the width of the curve defined?

Standard deviation

Background / normal distribution.
How is the standard deviation useful?

For a normal distribution, about 95% of the measurements fall within ±2 standard deviations of the mean.
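The "95% within 2 standard deviations" rule can be verified with Python's standard library. A brief sketch; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist


def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


within_2sd = share_within(2)
print(round(within_2sd, 4))
print(round(share_within(1.96), 4))
```

Two standard deviations cover slightly more than 95%; the exact 95% cut-off is at about 1.96 standard deviations, which is why "2" works as a rule of thumb.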

Background / central limit theorem
What is the central limit theorem all about?

Means obtained from a distribution (whatever type it is) through random sampling will be approximately normally distributed, provided the samples are large enough.
There are some exceptions, e.g. the Cauchy distribution, which has no defined mean, but these are rarely encountered in practice.
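A quick simulation illustrates the theorem. It is a sketch under assumptions that are not in the original card: the data come from an exponential distribution (clearly not normal), each sample has 30 values, and the seed is arbitrary. If the sample means behave like a normal distribution, they should centre on the true mean (1), have a spread near 1/√30, and roughly 95% of them should lie within 2 standard deviations of their centre.

```python
import random
from statistics import mean, stdev


def sample_means(n_per_sample, n_samples=5000, seed=1):
    """Draw n_samples samples from an exponential distribution; return their means."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n_per_sample)])
        for _ in range(n_samples)
    ]


means = sample_means(n_per_sample=30)
centre = mean(means)
spread = stdev(means)
# Share of sample means within 2 standard deviations of their centre.
share_within_2sd = sum(abs(m - centre) <= 2 * spread for m in means) / len(means)
print(round(centre, 3), round(spread, 3), round(share_within_2sd, 3))
```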

Background / central limit theorem
What are the practical implications?

We don’t know what distribution our data come from, but we know the sample means will be (approximately) normally distributed →
We can use the normal distribution of the means to make confidence intervals, do t-tests, do ANOVA, and pretty much any statistical test that uses the sample mean.
Rule of thumb: the sample size should be at least 30

Statistical Power (SQ) |
Example: two sets of mice, some on a normal diet and some on a special diet. Their weights have different distributions (though both normal and both same height/width): special diet has a lower mean, there is not much overlap between the distributions. We collect a small sample from each population.
If we collect a small sample from each population, we would get a ____ p-value, which would cause us to …

A small p-value (< 0.05), which would cause us to correctly reject the null hypothesis that both sets of data come from the same distribution.

Statistical Power (SQ) |
Example: two sets of mice, some on a normal diet and some on a special diet. Their weights have different distributions (though both normal and both same height/width): special diet has a lower mean, there is not much overlap between the distributions. We collect a small sample from each population.
What would a large p-value mean? Is this likely to happen?

If we repeat the experiment a bunch of times, most of the repeats should correctly give us a small p-value.
But every now and then we will get a result that does not make it clear that the populations have different distributions because we will have sampled mice from the overlap.
→ we will get a large p-value
→ we can’t reject the H_0, even though H_0 is false

Statistical Power (SQ) |
Example: two sets of mice, some on a normal diet and some on a special diet. Their weights have different distributions (though both normal and both same height/width): special diet has a lower mean, there is not much overlap between the distributions. We collect a small sample from each population.
What is power? What can you say about the power in this experiment?

Power is the probability that we will correctly reject the null hypothesis
Power is the probability that we will correctly get a small p-value
We have a large amount of power, because we have a high probability of correctly getting a small p-value and being able to (correctly) reject the null hypothesis

Statistical Power (SQ) |
Example: two sets of mice, some on a normal diet and some on a special diet. Their weights have different distributions (though both normal and both same height/width): special diet has a lower mean, there is not much overlap between the distributions. We collect a small sample from each population.
Does the concept of power apply here? When would the concept of power (not) apply?

It would not apply if the mice were all from one distribution
There is no such thing as ‘correctly rejecting’ the null hypothesis, because the null hypothesis is true.

Statistical Power (SQ) |
Example: two sets of mice, some on a normal diet and some on a special diet. Their weights have different distributions (though both normal and both same height/width): special diet has a lower mean, there IS much overlap between the distributions, they’re almost the same but not quite. We collect a small sample from each population.
Does power apply here? What can you say about the power?

Yes, power applies, because the two distributions really are different (H_0 is false).
But we are more likely to get a high p-value and be unable to reject the null hypothesis.
Even if we repeat the experiment many times, most of the time we will get a high p-value.
→ when there is a lot of overlap between the distributions and we have a small sample size, we have relatively low power.
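The effect of overlap and sample size on power can be made concrete with a formula-based sketch. Several things here are assumptions, not part of the cards: overlap is expressed as an effect size (difference in means divided by the common standard deviation, so a small effect size means a lot of overlap), the test is a two-sided two-sample z-test with known standard deviation, and the name `z_test_power` is hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test (normal approximation, known sigma)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # How far the test statistic is shifted, on average, when H_0 is false.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


little_overlap = z_test_power(effect_size=3.0, n_per_group=3)
much_overlap = z_test_power(effect_size=0.2, n_per_group=3)
much_overlap_big_n = z_test_power(effect_size=0.2, n_per_group=400)
print(little_overlap, much_overlap, much_overlap_big_n)
```

With little overlap, even 3 mice per group give high power; with a lot of overlap, 3 mice per group give very low power, and only a much larger sample brings it back up.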

Statistical Power (SQ) | How can we increase power?

By increasing the number of measurements we collect
A power analysis can tell us how many measurements to collect to have a good amount of power.

Background
What problem does power analysis solve? Examples?

How many samples do we need to test to arrive at a statistically sound conclusion?
E.g.: performing an experiment to test whether or not there is a difference between samples (drug has an effect, temperature influences gene expression, etc.)
E.g.: Hypothesis: there is a significant difference in height between men and women in the German population. How many men and women do we need to sample? → Answering this question is said to “power an analysis”
What is Power?
What is power? Basic formula
Power = 1 − β, where β is the probability of a Type II error (failing to reject a false null hypothesis)

What is power?
Power = the probability of rejecting __________ if it is ______.

The null hypothesis if it is false

What is power?
Power = probability of __________________ when the null hypothesis is false.

Making a correct decision

What is power?
Power = probability that a test of significance will __________________that is present.

Pick up an effect

What is power?
Power = probability that a test of significance will …

detect a deviation from the null hypothesis, should such a deviation exist.

What is power?
Power = probability of avoiding …

a Type II error.

Power analysis
Power analysis (SQ) |
Example: 2 drugs A and B. We sample and see that people who take drug A seem to recover faster, but the p-value is 0.06 → we can’t reject the null hypothesis that the populations come from the same distribution. How could we use power analysis here? What would the alternative be?

We could do a power analysis to determine what sample size will ensure a high probability that we correctly reject the null hypothesis that there is no difference between the two groups → we’ll know that regardless of the p-value, we used enough data to make a good decision.
The alternative would be to keep sampling until we get a lower p-value → this is wrong! P-hacking.
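A power analysis of this kind can be sketched with the standard normal-approximation formula for comparing two group means. The function name `n_per_group`, the 80% power target and the effect sizes are assumptions chosen for illustration; a real analysis based on a t-test gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, power=0.8, alpha=0.05):
    """Approximate sample size per group for a two-sided two-sample z-test.

    effect_size is the difference in means divided by the common standard
    deviation (an assumed way of measuring how much the groups overlap).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cut-off for the significance threshold
    z_power = z.inv_cdf(power)          # cut-off for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5))  # moderate difference between drugs A and B
print(n_per_group(0.2))  # small difference: many more patients needed
```

The sample size is fixed before collecting the data, which is what separates this approach from adding patients until the p-value drops below 0.05.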

Power analysis (SQ) | What are the two main factors that affect power?

* How much overlap there is between the two distributions we want to identify with our study.
* The sample size, i.e. the number of samples we collect from each group.