Why Randomize Flashcards
The primary purpose of showing that different methods produce different results is to demonstrate that:
…which method we use matters
It is not the case that any one method is inherently more or less likely to produce an accurate estimate of impact; that will depend on the sources, extent, and direction of the biases involved. Rather, the fact that different methods produce different estimates shows that the choice of method matters. Depending on the context, program, and availability of data, some methods may make more sense than others. What is important to note is that each non-experimental method may control for some biases but not all of them, and may therefore produce a different impact estimate.
In the Balsakhi example, for a pre-post method to produce a valid measure of impact, the key assumption is that:
Student test scores would not have changed over time in the absence of the program
Since the comparison is being made between Balsakhi students before and after the program, the key assumption is that absent the Balsakhi program, those students’ test scores would not have changed over the time period that the program was implemented, i.e., any change in scores is attributable to the Balsakhi program.
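The pre-post assumption above can be illustrated with a minimal Python sketch. All numbers here (the baseline scores, the 3-point program effect, the 5-point time trend) are invented for illustration and are not taken from the Balsakhi study; the function names are hypothetical as well.

```python
from statistics import fmean

def pre_post_estimate(scores_before, scores_after):
    """Pre-post impact estimate: mean score after minus mean score before."""
    return fmean(scores_after) - fmean(scores_before)

# Hypothetical numbers, invented for illustration only.
TRUE_EFFECT = 3  # assumed gain caused by the program
baseline = [20, 25, 30, 35, 40]

def scores_after_program(baseline_scores, time_trend):
    # Later score = baseline + change that would have happened anyway + program effect.
    return [s + time_trend + TRUE_EFFECT for s in baseline_scores]

# Case 1: the key assumption holds (no change in scores absent the program).
estimate_no_trend = pre_post_estimate(
    baseline, scores_after_program(baseline, time_trend=0)
)

# Case 2: the assumption fails (scores would have risen 5 points anyway).
estimate_with_trend = pre_post_estimate(
    baseline, scores_after_program(baseline, time_trend=5)
)

print(estimate_no_trend)    # 3.0, equal to the assumed true effect
print(estimate_with_trend)  # 8.0, the true effect plus the time trend
```

In the second case the pre-post estimate attributes the entire 8-point change to the program, even though only 3 points of it are assumed to come from the program.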
In the Balsakhi example, for a (non-randomized) simple difference method to produce a valid measure of impact, the key assumption is that:
Other than receiving the program, students are similar along observable and unobservable characteristics
Since the only comparison being made here is between post-program scores of Balsakhi students and non-Balsakhi students, it is not necessary to assume that student test scores would not have changed over time even absent the program. However, since the counterfactual to Balsakhi students is constructed using students who did not receive the program, it is necessary to assume that those students are otherwise similar to Balsakhi students along observable and unobservable characteristics. Otherwise, any differences in scores observed between the two groups might be due to other characteristics along which the two groups differ, rather than to the program.
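A short Python sketch, again with invented numbers, may help show both halves of this reasoning: a time trend shared by both groups cancels out of the simple difference, but a pre-existing gap between the groups does not. The baseline scores, the 3-point effect, and the 5-point trend are assumptions made purely for illustration.

```python
from statistics import fmean

def simple_difference(treated_after, comparison_after):
    """Simple difference: mean endline score of program students
    minus mean endline score of comparison students."""
    return fmean(treated_after) - fmean(comparison_after)

# Hypothetical numbers, invented for illustration only.
TRUE_EFFECT = 3  # assumed gain caused by the program
TIME_TREND = 5   # assumed change every student experiences anyway

def endline(baseline_scores, in_program):
    effect = TRUE_EFFECT if in_program else 0
    return [s + TIME_TREND + effect for s in baseline_scores]

program_baseline = [20, 25, 30]

# Case 1: comparison students are similar to program students.
similar_baseline = [20, 25, 30]
estimate_similar = simple_difference(
    endline(program_baseline, in_program=True),
    endline(similar_baseline, in_program=False),
)

# Case 2: comparison students started out 10 points higher.
stronger_baseline = [30, 35, 40]
estimate_dissimilar = simple_difference(
    endline(program_baseline, in_program=True),
    endline(stronger_baseline, in_program=False),
)

print(estimate_similar)     # 3.0, the trend cancels and the effect is recovered
print(estimate_dissimilar)  # -7.0, the 10-point gap swamps the 3-point effect
```

In the second case the program appears harmful only because the comparison group was different to begin with.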
Selection bias is a concern with what types of evaluation methods?
Non-randomized simple difference & the difference-in-differences method
Selection bias is a concern when the comparison group is made up of individuals whose “potential” (i.e. likely) outcomes differ from those of the treatment group, usually because of different underlying characteristics. Since a pre-post evaluation looks only at program recipients, comparing their outcomes before and after the program, their underlying characteristics do not change, and so it is not affected by selection bias (although there may be other factors introducing bias into the estimates of impact). By randomly assigning who receives a program and who doesn’t, a randomized evaluation is able to eliminate selection bias by balancing the groups, on average, on observable and unobservable characteristics.
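One way to see why difference-in-differences is still exposed to selection bias is the sketch below. It uses invented numbers and assumes the standard difference-in-differences formula (change in the treatment group minus change in the comparison group). A fixed gap in levels between the groups cancels out, but if the groups would have followed different trends anyway, the estimate is biased.

```python
from statistics import fmean

def diff_in_diff(t_before, t_after, c_before, c_after):
    """Difference-in-differences: change in the treatment group
    minus change in the comparison group."""
    return (fmean(t_after) - fmean(t_before)) - (fmean(c_after) - fmean(c_before))

# Hypothetical numbers, invented for illustration only.
TRUE_EFFECT = 3
treated_before = [20, 25, 30]
comparison_before = [30, 35, 40]  # comparison group starts 10 points higher

def later_scores(before, trend, effect=0):
    return [s + trend + effect for s in before]

# Case 1: groups differ in level but share the same 5-point trend.
estimate_same_trend = diff_in_diff(
    treated_before, later_scores(treated_before, trend=5, effect=TRUE_EFFECT),
    comparison_before, later_scores(comparison_before, trend=5),
)

# Case 2: groups also differ in trend (2 points vs. 5 points).
estimate_diff_trend = diff_in_diff(
    treated_before, later_scores(treated_before, trend=2, effect=TRUE_EFFECT),
    comparison_before, later_scores(comparison_before, trend=5),
)

print(estimate_same_trend)  # 3.0, the level gap cancels out
print(estimate_diff_trend)  # 0.0, the differing trends hide the assumed effect
```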
When we take our study sample, and randomly assign individuals to a treatment and control group:
we do not expect individuals in the treatment group to be statistically different from the study sample
we do not expect individuals from the control group to be statistically different from the treatment group
we do not expect individuals from the control group to be statistically different from the study sample
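The three statements above can be checked informally with a small simulation. This is only a sketch under stated assumptions: the study sample is a hypothetical set of 10,000 baseline scores drawn from a normal distribution, the seed is arbitrary, and comparing group means is a rough stand-in for a formal statistical test.

```python
import random
from statistics import fmean

rng = random.Random(0)  # arbitrary seed so the sketch is reproducible

# Hypothetical study sample: 10,000 baseline test scores (invented data).
sample = [rng.gauss(50, 10) for _ in range(10_000)]

# Randomly assign half of the sample to treatment and half to control.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment, control = shuffled[:half], shuffled[half:]

sample_mean = fmean(sample)
treatment_mean = fmean(treatment)
control_mean = fmean(control)

print(f"study sample mean: {sample_mean:.2f}")
print(f"treatment mean:    {treatment_mean:.2f}")
print(f"control mean:      {control_mean:.2f}")
```

With a sample this large, the three means should come out very close to one another, which is what random assignment leads us to expect for both observable and unobservable characteristics.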