Sample Size and Power Flashcards

1
Q

Critical value

A

The value of the estimated effect that corresponds exactly to the significance level: an estimate beyond it is statistically significant.

2
Q

If we are testing whether the effect size is bigger than 0 at the 95% confidence level (5% significance level), then the critical value is…

A

The value of the estimate at which exactly 5% of the area under the null-hypothesis curve lies to the right
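
As a rough illustration of this card, the critical value can be read off the standard normal distribution. This is only a sketch: it assumes the estimate is approximately normal under the null hypothesis, and the standard error of 0.1 is a made-up number.

```python
from statistics import NormalDist

# Assumption: under the null hypothesis the estimate is approximately
# normal and centered at zero.
z_crit = NormalDist().inv_cdf(0.95)   # 5% of the area lies to the right

# Hypothetical standard error of the estimated effect (illustrative only).
standard_error = 0.1
critical_value = z_crit * standard_error

print(f"critical z: {z_crit:.3f}")                 # about 1.645
print(f"critical estimate: {critical_value:.4f}")  # about 0.1645
```

Any estimate above the printed critical estimate would be called significant in this one-sided test.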

3
Q

Usually in program evaluation, instead of “presumption of innocence,” the rule is:

A

“presumption of zero”

The ‘burden of proof’ is showing that there was an impact

4
Q

Null hypothesis

A

The intervention had no impact

5
Q

When do you reject the null hypothesis?

A

If it is very unlikely (less than a 5% probability) that a difference this large would arise solely by chance if the program had no impact. At this point, we could say “our program has a statistically significant impact.”

6
Q

Type 1 Errors

A

False positive.

At the 5% significance level, in 5% of cases where a program truly has no impact we will still conclude that it does

7
Q

Statistical power

A

The probability that, if the true effect is of a given size, our proposed experiment will be able to distinguish the estimated effect from zero (i.e. you find that it is significant); in other words, the probability of avoiding a type II error

Graphically, it is the proportion of the alternative-hypothesis curve (the one centered on the true effect) that lies to the right of the critical value for the null hypothesis
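
The graphical definition can be turned into a small calculation. The sketch below assumes a one-sided test, a normal sampling distribution and a known standard error; the function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def power_one_sided(true_effect, standard_error, alpha=0.05):
    """Share of the alternative curve to the right of the critical value.

    Assumes a one-sided test of 'effect > 0' and a normal sampling
    distribution with a known standard error.
    """
    # Critical value of the null curve (centered at zero).
    critical_value = NormalDist().inv_cdf(1 - alpha) * standard_error
    # Alternative curve: same width, centered on the true effect.
    alternative = NormalDist(mu=true_effect, sigma=standard_error)
    return 1 - alternative.cdf(critical_value)

# Hypothetical numbers: a true effect of 0.25 with a standard error of 0.1.
print(round(power_one_sided(0.25, 0.1), 3))
```

When the true effect is zero, the same function returns the significance level, which is the type I error rate.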

8
Q

Type 2 Errors

A

False negative.

Traditionally, we aim for 80% power, i.e. a 20% chance of a Type 2 error (some aim for 90%)

Low power means we may not find a significant effect even though an effect exists

9
Q

What are the four possible results from hypothesis testing?

A

No error, when there is no effect = True negative
No error, when there is an effect = True positive
Error, when there is no effect = False positive (Type 1)
Error, when there is an effect = False negative (Type 2)

10
Q

When curves overlap, does this suggest higher or lower power?

A

Lower power: the more the curves overlap, the lower the power

11
Q

How does effect size relate to the size & shape of the distribution curves? How does effect size relate to power?

A

Effect size changes how far apart the curves are, not their shape. If we expect a small effect size, the curves will be closer together - so power will be lower

If we expect a large effect size, the curves will be farther apart - so power will be higher (the effect is a larger multiple of the SE)

Bigger effect size = more power

12
Q

What if we have a large effect size, but a low rate of take-up? How will this affect power & the distribution curves?

A

A low take-up will dilute the average effect size. Even if those who take up the treatment experience a massive effect, if only a few people take it up, then it’s going to have a negative effect on power.

The average effect size will drop, the curves will be closer or overlapping & we will be back to having low power
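
A minimal sketch of the dilution, assuming people who do not take up the program are unaffected, so the average effect is take-up times the effect on takers. All numbers are hypothetical, and the test is one-sided with a normal approximation.

```python
from statistics import NormalDist

def power_one_sided(effect, standard_error, alpha=0.05):
    """Power of a one-sided test under a normal approximation."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_crit - effect / standard_error)

effect_on_takers = 0.5    # hypothetical effect for people who take up
standard_error = 0.1      # hypothetical standard error of the estimate

for take_up in (1.0, 0.5, 0.2):
    # Assumption: non-takers are unaffected, so the average effect in
    # the treatment group is diluted in proportion to take-up.
    average_effect = take_up * effect_on_takers
    power = power_one_sided(average_effect, standard_error)
    print(f"take-up {take_up:.0%}: average effect {average_effect:.2f}, power {power:.2f}")
```

With these made-up numbers, power falls from nearly 1 at full take-up to well under a half at 20% take-up.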

13
Q

What happens to the distribution curves as you increase the sample size? Does this increase accuracy or precision?

A

By increasing sample size, you are increasing precision. The curves will become narrower, because estimates from larger samples cluster more tightly around the true average
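
To put numbers on the narrowing, the sketch below uses the textbook standard error of a sample mean, sigma / sqrt(n), with a hypothetical outcome standard deviation of 1.

```python
import math

sigma = 1.0  # hypothetical standard deviation of the outcome

def standard_error_of_mean(sigma, n):
    """Standard error of a sample mean from n independent observations."""
    return sigma / math.sqrt(n)

for n in (100, 400, 1600):
    print(n, standard_error_of_mean(sigma, n))
```

Each quadrupling of the sample halves the standard error, so the curves narrow ever more slowly as the sample grows.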

14
Q

How does variance relate to power?

A

Lower variance makes estimates more tightly clustered, which means higher power.

As variance goes down, the curves become narrower, so they overlap less & the critical value is closer to zero = higher power

As variance goes up, estimates will be more dispersed. The curves get wider and overlap more - meaning the critical value will be farther to the right and power will be lower

15
Q

If there’s not a perfect 50/50 split between treatment and control, what happens to the shape of the curves?

A

The curves become uneven (the smaller group’s curve is wider); less power

It is more efficient to increase the sample size in both groups & make both curves narrower

16
Q

Allocation ratio

A

The fraction of the total sample allocated to the treatment group

Usually, for a given sample size, power is maximized when half the sample is allocated to treatment and half to control

There is a diminishing marginal benefit to precision from adding sample to either group, so it is best to add to both groups equally

17
Q

What is the key difference in achieving high power in cluster-level RCTs vs. individual-level RCTs?

A

In a cluster-level RCT you need a bigger sample size to achieve the same power (unless the ICC is zero)

19
Q

Intra-cluster correlations (ICC, rho) - high vs. low ICC?

A

Intra-cluster correlations refer to how similar people are WITHIN clusters. If there are many different types of people within clusters, then the ICC is low. When there are very similar types of people within clusters, you’ll have a very high ICC.

20
Q

Intra-cluster correlations: Definition & equation

A

The proportion of total variation explained by between-cluster level variance

ICC = between-cluster variance / (between-cluster variance + within-cluster variance)
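
The equation translates directly into code. This is a small sketch with made-up variance components; the function name is hypothetical.

```python
def intra_cluster_correlation(between_var, within_var):
    """Share of total variance that comes from differences between clusters."""
    total_var = between_var + within_var
    if total_var <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total_var

# Hypothetical variance components (illustrative only).
print(intra_cluster_correlation(between_var=2.0, within_var=8.0))   # 0.2
print(intra_cluster_correlation(between_var=8.0, within_var=2.0))   # 0.8
```

The first case has varied people within clusters (low ICC); the second has very similar people within clusters (high ICC).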

21
Q

Total variance definition (two segments)

A

Can be divided into within-cluster variance and between-cluster variance

When within-cluster variance is high relative to between-cluster variance, people within a cluster are less alike, so the intra-cluster correlation is low

22
Q

How does ICC impact power?

A

For a given N we have less power when we randomize by cluster (unless ICC is zero)

There are diminishing returns to surveying more people per cluster. Usually the number of clusters is the key determinant of power, not the number of people per cluster
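
One common way to quantify this is the variance inflation factor 1 + rho(m-1), often called the design effect (a term not used in these cards). The sketch below assumes equal-sized clusters and a hypothetical ICC of 0.1.

```python
def design_effect(rho, cluster_size):
    """Variance inflation from clustering: 1 + rho * (m - 1)."""
    return 1 + rho * (cluster_size - 1)

def effective_sample_size(n_clusters, cluster_size, rho):
    """Number of independent observations the clustered sample is 'worth'."""
    total_n = n_clusters * cluster_size
    return total_n / design_effect(rho, cluster_size)

rho = 0.1  # hypothetical ICC
# Same total N = 2000, split into clusters in different ways.
for n_clusters, cluster_size in ((2000, 1), (200, 10), (20, 100)):
    ess = effective_sample_size(n_clusters, cluster_size, rho)
    print(f"{n_clusters} clusters x {cluster_size}: effective N = {ess:.0f}")
```

Under these assumptions, 20 clusters can never be worth more than 20 / 0.1 = 200 independent observations, however many people are surveyed per cluster.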

23
Q

If ICC is high, what is the more efficient way of increasing power?

A

Include more clusters. Adding people within each cluster also increases power, but by much less when the ICC is high.

24
Q

Power with clustering equation

A

The equation is the same as the non-clustered power equation, except that we add two things: the ICC & the number of people per cluster (average cluster size). These two additions can have a huge impact on power
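
Assembled from the individual terms reviewed in the later cards, the clustered equation can be written as below. This is a reconstruction that assumes the four terms multiply together, with 1-κ the power, α the significance level, P the allocation ratio, σ² the outcome variance, N the total sample size, ρ the ICC and m the cluster size.

```latex
\[
\mathrm{MDE} = \left(t_{1-\kappa} + t_{\alpha}\right)
\sqrt{\frac{1}{P(1-P)}}
\sqrt{\frac{\sigma^{2}}{N}}
\sqrt{1 + \rho\,(m-1)}
\]
```

Setting ρ = 0 or m = 1 makes the last term equal to 1, which gives back the non-clustered equation.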

25
Q

Minimum effect size

A

The aim is to establish the smallest true effect for which we can detect a statistically significant impact at the chosen power. Ideally, we want a small minimum detectable effect (MDE), but how small it needs to be depends on the kind of intervention.

26
Q

$(t_{1-\kappa} + t_{\alpha})$

A

This term contains the t-statistics for the power of the test ($1-\kappa$) and for the probability of a type I error ($\alpha$). As we saw in class, the power of the test is the probability of rejecting the null hypothesis when it is false (finding an effect, or true positive), while the type I error is rejecting the null when in fact there was no effect (false positive).

The two are linked for given null and alternative distributions: raising power also raises the probability of a type I error, while lowering the probability of a type I error also lowers the ability to detect an effect (power). So as not to compromise too much on either, we usually fix them externally at $1-\kappa = 0.8$ and $\alpha = 0.05$. We then try to minimize the MDE with the other terms.
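
For a sense of scale, the sketch below evaluates this term at the conventional values. It assumes a large sample, so that standard normal quantiles stand in for the t-statistics, and it shows both a one-sided and a two-sided version since the card does not say which is meant.

```python
from statistics import NormalDist

# Assumption: with a large sample the t-statistics are close to standard
# normal quantiles, so NormalDist is used as an approximation.
std_normal = NormalDist()

power = 0.80   # 1 - kappa
alpha = 0.05   # significance level

z_power = std_normal.inv_cdf(power)
z_alpha_one_sided = std_normal.inv_cdf(1 - alpha)
z_alpha_two_sided = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-sided multiplier: {z_power + z_alpha_one_sided:.2f}")   # about 2.49
print(f"two-sided multiplier: {z_power + z_alpha_two_sided:.2f}")   # about 2.80
```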

27
Q

Sqrt (1/(P(1-P)))

A

As we saw in class, this term is minimized for an equal allocation between treatment and control (P = proportion of the sample allocated to treatment). The more unequal the allocation, the larger this term is and thus the larger the MDE (meaning we can only detect large effects). Graphically, this shows up as one curve being fatter than the other.
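
One way to see why the equal split minimizes the term is the short derivation below (added for illustration, not taken from the class notes):

```latex
\[
P(1-P) = \tfrac{1}{4} - \left(P - \tfrac{1}{2}\right)^{2} \le \tfrac{1}{4}
\quad\Longrightarrow\quad
\sqrt{\frac{1}{P(1-P)}} \ge 2,
\qquad \text{with equality only at } P = \tfrac{1}{2}.
\]
```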

28
Q

Sqrt (sigma^2/N)

A

As we saw in class, this term shrinks as we increase the sample size (N) or decrease the variance. Conceptually, a large variance makes small effects harder to distinguish from zero: the larger the variance, the larger the MDE. Graphically, a larger variance means fatter curves. Conversely, a large sample size makes the estimate converge to the true mean, so that with a large enough sample even a small effect can be statistically significant. Graphically, the curves become narrower, allowing us to increase our power and decrease the probability of type I error for a given MDE (meaning we could potentially find that a small effect is statistically significant).

29
Q

Sqrt(1+rho(m-1))

A

As we can see, a larger ICC increases the MDE through an increase in the variance. On the other hand, for a given total sample size, increasing the number of clusters (and so shrinking the cluster size m) decreases the MDE through a decrease in the variance. In cluster power calculations, this additional term inflates the variance in our formula. That is why, when we increase the sample size in a cluster design, we should also pay attention to how the increase affects the ICC and the number of clusters added
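
Putting the four terms from these cards together gives a rough MDE calculator. This is a sketch under stated assumptions, not a definitive implementation: equal-sized clusters, a two-sided test, normal quantiles in place of t-statistics, and hypothetical function and parameter names.

```python
import math
from statistics import NormalDist

def mde_clustered(sigma, n_clusters, cluster_size, rho,
                  allocation=0.5, power=0.80, alpha=0.05):
    """Minimum detectable effect for a cluster-randomized design.

    Assumptions: equal-sized clusters, a two-sided test, and normal
    quantiles standing in for the t-statistics.
    """
    std_normal = NormalDist()
    multiplier = std_normal.inv_cdf(power) + std_normal.inv_cdf(1 - alpha / 2)
    total_n = n_clusters * cluster_size
    allocation_term = math.sqrt(1 / (allocation * (1 - allocation)))
    variance_term = math.sqrt(sigma ** 2 / total_n)
    cluster_term = math.sqrt(1 + rho * (cluster_size - 1))
    return multiplier * allocation_term * variance_term * cluster_term

# Hypothetical design: 100 clusters of 20 people, outcome SD of 1.
for rho in (0.0, 0.05, 0.3):
    print(f"rho = {rho}: MDE = {mde_clustered(1.0, 100, 20, rho):.3f}")
```

Holding the total sample at 2000, calling the function with 200 clusters of 10 instead of 100 clusters of 20 gives a smaller MDE whenever rho is above zero.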

30
Q

Our intervention, in truth, is effective. However, due to low take-up during our study, we were unable to detect an effect. This is an example of - type 1 or type 2 error?

A

This is a Type II error. Recall that a Type I error occurs when a study shows an intervention as being effective even though the intervention actually has no effect. This is known as a false positive. A Type II error occurs when a study does not find an intervention to be effective even though the intervention actually has an effect. This could happen due to insufficient take-up, a sample that is too small, the intervention effect being very small, etc.

31
Q

What will increased variation in the underlying population do to our estimates?

A

Reduce precision of estimate

32
Q

What does increased variation in population do to our distribution of estimates (of our null and research hypotheses)?

A

Make them fatter.

The spread of the sampling distribution of estimates is a function of the sample size (and the size of each of the groups) and the underlying population variance. Increased variance in the population increases the spread of the sampling distributions by increasing the standard error, thus making them “fatter”.

33
Q

All uneducated people live in one village. People with only primary education live in another. College grads live in a third, etc. The ICC (rho) on education will be…

A

The ICC is a measure of how similar individuals within a cluster or unit are on a given outcome. In this case if one were to conceive of the village as a cluster of individuals, the ICC on education is likely to be quite high since people of similar education levels live in the same village, with very little heterogeneity within villages.

34
Q

If ICC is high, and we could add 100 more individuals, what would increase power by more?

A

Including more clusters in the sample.

If ICC is high, adding an individual to a cluster would be akin to adding less than one individual to the sample since a large part of the variance in outcomes would likely be explained by the fact that individuals within a cluster are similar. Adding more clusters is likely to increase power by more since a new cluster introduces different individuals to the sample who are less likely to behave like individuals currently in clusters in the sample.
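
A worked comparison with made-up numbers: start from 50 clusters of 10 and a hypothetical ICC of 0.5, then add 100 people either to the existing clusters or as new clusters. The sketch assumes equal-sized clusters and uses the variance inflation term 1 + rho(m-1) to convert the sample into an equivalent number of independent observations.

```python
def effective_sample_size(n_clusters, cluster_size, rho):
    """Total N deflated by the variance inflation 1 + rho * (m - 1)."""
    return n_clusters * cluster_size / (1 + rho * (cluster_size - 1))

rho = 0.5  # hypothetical high ICC

baseline = effective_sample_size(n_clusters=50, cluster_size=10, rho=rho)
# Option A: 100 extra people spread over the existing 50 clusters (2 each).
more_per_cluster = effective_sample_size(n_clusters=50, cluster_size=12, rho=rho)
# Option B: 100 extra people as 10 new clusters of 10.
more_clusters = effective_sample_size(n_clusters=60, cluster_size=10, rho=rho)

print(f"baseline:          {baseline:.1f}")
print(f"more per cluster:  {more_per_cluster:.1f}")
print(f"more clusters:     {more_clusters:.1f}")
```

In this example the new clusters add roughly 18 effective observations, while the same 100 people added to existing clusters add fewer than 2.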

35
Q

Increasing take-up (compliance) of our program will move distribution curves how?

A

Move curves further apart

Take-up affects the effect size that one is able to observe. Lower take-up reduces the ITT estimate. Recall that the effect size affects how far apart the treatment and control group (i.e. research versus null hypothesis) sampling distributions are; a larger effect size means that the distributions will be further apart. Thus, increasing take-up pushes the curves further apart by increasing the observable effect that the program is likely to produce.

36
Q

Our study is in an area of 100 villages. Initially, our NGO was going to work in 50 villages, and the remaining would be in the control. However, due to budget cuts, it can now only operate in 40 villages. We have not yet randomized. What will this likely do to our power (assume the variance in both groups is equal)?

A

It will reduce our power

Recall that for a sample of a given size, power is maximized by an equal split between the treatment and control groups (i.e. a 50:50 allocation ratio). Even if one were to maintain the sample at 100 villages by now having 40 villages randomly assigned to the treatment group and 60 villages to the control group, power would thus be less than it would have originally been if 50 villages were assigned to each of the two groups.
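
The size of the loss can be gauged from the allocation term sqrt(1/(P(1-P))) of the MDE formula. The sketch below assumes everything else (total sample, variance, ICC) stays the same; the function name is hypothetical.

```python
import math

def allocation_term(p):
    """The sqrt(1 / (P(1-P))) factor in the MDE formula."""
    return math.sqrt(1 / (p * (1 - p)))

equal = allocation_term(0.5)      # 50 treatment, 50 control villages
unequal = allocation_term(0.4)    # 40 treatment, 60 control villages

print(f"50:50 term: {equal:.3f}")
print(f"40:60 term: {unequal:.3f}")
print(f"MDE increase: {unequal / equal - 1:.1%}")
```

Under these assumptions the MDE rises by about 2%, so power does fall, but only modestly for a 40:60 split.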

37
Q

In a cluster randomized evaluation, doubling the sample size will

A

…not enough information

Doubling the sample will likely reduce the standard error, but without knowing how the extra sample is split between additional individuals in existing clusters versus additional clusters (or some combination of the two), it is difficult to know exactly what impact it would have on the standard error of the sampling distributions or on the power of the study.