Regression discontinuity Flashcards
Internal and external validity of RD?
RD design is generally regarded as having the greatest internal validity of all quasi-experimental methods. Its external validity may be limited, since the estimated treatment effect is local to the discontinuity.
4 assumptions needed to hold for RD estimation to be causal?
- The assignment variable, x_i (often also called the “running” or “forcing” variable), cannot be caused or influenced by the treatment. In other words, the assignment variable is measured prior to the start of treatment or is a variable that can never change.
- The cut-point is determined independently of the assignment variable (i.e. exogenous), and assignment to treatment is entirely based on the assignment variable and the cut-point.
- Nothing other than treatment status is discontinuous in the analysis interval. That is, there are no other relevant ways in which observations on one side of the cut-point are treated differently from those on the other side.
- The functional form representing the relationship between the assignment variable and the outcome, which is included in the estimation model and can be represented by f(x_i), is continuous throughout the analysis interval absent the treatment and is specified correctly.
Why is it important that individuals cannot manipulate the assignment variable in the vicinity of the threshold?
So that falling just below or just above the cutoff is (almost) as good as random.
When selecting students for a scholarship, suppose the selection committee can look at which students received high scores and set the cut-point so that certain students are included in the scholarship pool, or can give scholarships to students who did not meet the threshold. Is this OK for RD?
No. This violates assumption 2: the cut-point is not set independently of the assignment variable, and assignment is not based entirely on the assignment variable and the cut-point.
Using the potential outcomes framework, write out the ATE at the cutoff point for RD.
ATE at the cutoff = E[Y_1i - Y_0i | x_i = c]
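A sketch of why this quantity can be estimated from the jump at the cutoff, assuming E[Y_0i | x_i] and E[Y_1i | x_i] are both continuous at c and that treatment switches on for x_i >= c. The symbol \tau_{RD} is introduced here only for illustration:

```latex
\begin{aligned}
\tau_{RD} &= \lim_{x \downarrow c} E[Y_i \mid x_i = x] \;-\; \lim_{x \uparrow c} E[Y_i \mid x_i = x] \\
          &= E[Y_{1i} \mid x_i = c] \;-\; E[Y_{0i} \mid x_i = c] \\
          &= E[Y_{1i} - Y_{0i} \mid x_i = c]
\end{aligned}
```

Only treated outcomes are observed just above c and only untreated outcomes just below it, so continuity is what lets each limit stand in for the unobserved potential outcome at c.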
How is RD different from an RCT?
First, the functional form of the curves E[Y_0i |x_i] and E[Y_1i |x_i] need not be flat (or linear or monotonic)
Second, it may be possible for units (individuals, etc.) to alter their assignment to treatment by manipulating the forcing variable in a way that is not possible when it is assigned at random by the investigator.
One plausible approach to finding differences for those right above and right below the cutoff would be to just take a difference in means. What’s one advantage of this approach? One challenge?
Advantage: you don’t need to know the functional form
Challenge: you need a lot of data for those just above and just below the cutoff
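A minimal sketch of the difference-in-means approach, assuming made-up noise-free data with a true jump of 2.0. The function name `diff_in_means` and the window width `h` are hypothetical names used only for this illustration:

```python
def diff_in_means(xs, ys, cutoff, h):
    """Return (mean above - mean below, number of units used) within h of the cutoff."""
    above = [y for x, y in zip(xs, ys) if cutoff <= x < cutoff + h]
    below = [y for x, y in zip(xs, ys) if cutoff - h <= x < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cutoff")
    estimate = sum(above) / len(above) - sum(below) / len(below)
    return estimate, len(above) + len(below)


# Hypothetical noise-free data: true jump of 2.0 at x = 0, slope 0.5 on both sides.
xs = [i / 100 for i in range(-100, 100)]
ys = [1.0 + 0.5 * x + (2.0 if x >= 0 else 0.0) for x in xs]

wide, n_wide = diff_in_means(xs, ys, cutoff=0.0, h=0.5)
narrow, n_narrow = diff_in_means(xs, ys, cutoff=0.0, h=0.05)
print(f"h=0.50: estimate={wide:.3f} using {n_wide} units")
print(f"h=0.05: estimate={narrow:.3f} using {n_narrow} units")
```

The narrow window lands closer to the true jump because the slope in x contributes less to the difference, but it uses far fewer units, which is the data challenge named on the card.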
In this OLS formula, what is the causal effect of interest? (Ti is treatment and xi is the assignment variable.)
Yi = B0 + B1Ti + B2xi + ei
B1
True or false: in RD, you need to recenter x so that it equals 0 at the cutoff
True: centering x at the cutoff ensures that the coefficient on T is the treatment effect at the cutoff, which matters once x is interacted with T
Is it possible to allow for different functions on each side of the cutoff? If so, how?
Yes: include interactions between x and T
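A minimal sketch of allowing a different linear function on each side, assuming made-up noise-free data with the cutoff at 50. Fitting a separate line on each side gives the same jump as an interacted model such as Yi = B0 + B1Ti + B2xi + B3Ti*xi + ei with x centered at the cutoff (the B3 label and the function names are hypothetical, chosen for this illustration):

```python
def fit_line(xs, ys):
    """OLS for y = a + b*x; returns (intercept a, slope b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_linear(xs, ys, cutoff):
    """Fit a separate line on each side of the cutoff, with x centered at the cutoff."""
    centered = [x - cutoff for x in xs]
    left = [(x, y) for x, y in zip(centered, ys) if x < 0]
    right = [(x, y) for x, y in zip(centered, ys) if x >= 0]
    a0, b0 = fit_line([x for x, _ in left], [y for _, y in left])
    a1, b1 = fit_line([x for x, _ in right], [y for _, y in right])
    # Because x is centered, each intercept is the fitted outcome at the cutoff.
    return {"jump": a1 - a0, "slope_below": b0, "slope_above": b1}


# Hypothetical noise-free data: jump of 3.0 at x = 50, slope 0.2 below and 0.5 above.
xs = list(range(30, 71))
ys = [10 + 0.2 * (x - 50) if x < 50 else 13 + 0.5 * (x - 50) for x in xs]

result = sharp_rd_linear(xs, ys, cutoff=50)
print(result)
```

Without centering, the two intercepts would be fitted values at x = 0, far from the cutoff, and their difference would no longer be the jump at the cutoff.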
What are two drawbacks of using OLS to estimate the discontinuity?
Global estimates of the regression function use data far from cutoff
Might look like a discontinuity when really you just have the wrong functional form
What is one way to reduce the likelihood of detecting a jump when there isn’t one?
Look at data only in neighborhood around discontinuity
What are implications for bias and variance of reducing bandwidth?
Decreased bias but increased variance
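A minimal sketch of the bias side of this tradeoff, assuming made-up data with no true jump and a curved relationship (y = x**3). A linear fit on each side is deliberately the wrong functional form; `local_linear_jump` and `h` are hypothetical names for this illustration:

```python
def fit_line(xs, ys):
    """OLS for y = a + b*x; returns (intercept a, slope b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def local_linear_jump(xs, ys, cutoff, h):
    """Jump at the cutoff from separate linear fits using only units within h of it."""
    left = [(x - cutoff, y) for x, y in zip(xs, ys) if cutoff - h <= x < cutoff]
    right = [(x - cutoff, y) for x, y in zip(xs, ys) if cutoff <= x <= cutoff + h]
    a0, _ = fit_line([x for x, _ in left], [y for _, y in left])
    a1, _ = fit_line([x for x, _ in right], [y for _, y in right])
    return a1 - a0, len(left) + len(right)


# Hypothetical data with NO true jump: y = x**3 is smooth through the cutoff at 0.
xs = [i / 100 for i in range(-100, 101)]
ys = [x ** 3 for x in xs]

wide_jump, n_wide = local_linear_jump(xs, ys, cutoff=0.0, h=1.0)
narrow_jump, n_narrow = local_linear_jump(xs, ys, cutoff=0.0, h=0.2)
print(f"h=1.0: spurious jump={wide_jump:.4f} using {n_wide} units")
print(f"h=0.2: spurious jump={narrow_jump:.4f} using {n_narrow} units")
```

The wide bandwidth reports a sizeable jump that does not exist, while the narrow one is close to zero. The data here are noise-free, so the variance cost shows up only as the smaller number of units used.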
What happens to bias and standard errors if you use extra regressors?
Likely decrease in bias (if you have a large bandwidth) and reduction in standard errors (greater precision)
If the probability of treatment changes discontinuously at the threshold, but not from 0 to 1, can you still use RD?
Yes: fuzzy RD
What does the fuzziness refer to in fuzzy RD?
The probability of T changes discontinuously at the cutoff, but by less than 1 (not from 0 to 1)
In sharp RD, how do we obtain causal estimate?
estimate jump in outcome at cutoff
In fuzzy RD, how do we obtain Wald estimate?
(jump in outcome at the cutoff) / (jump in probability of T at the cutoff)
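A minimal sketch of the Wald ratio, assuming a handful of made-up units close to the cutoff and a constant effect of 3.0 for anyone treated. The tuple layout `(above_cutoff, treated, outcome)` and the function name are hypothetical:

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimate(units):
    """units: list of (above_cutoff, treated, outcome) for observations near the cutoff."""
    y_above = mean([y for z, _, y in units if z == 1])
    y_below = mean([y for z, _, y in units if z == 0])
    p_above = mean([t for z, t, _ in units if z == 1])
    p_below = mean([t for z, t, _ in units if z == 0])
    outcome_jump = y_above - y_below    # what a sharp RD would report
    treatment_jump = p_above - p_below  # less than 1 in a fuzzy design
    if treatment_jump == 0:
        raise ValueError("no jump in treatment probability at the cutoff")
    return outcome_jump, treatment_jump, outcome_jump / treatment_jump


# Hypothetical units: below the cutoff 2 of 10 are treated, above it 8 of 10 are.
units = []
for z, n_treated in ((0, 2), (1, 8)):
    for i in range(10):
        t = 1 if i < n_treated else 0
        units.append((z, t, 5.0 + 3.0 * t))

outcome_jump, treatment_jump, wald = wald_estimate(units)
print(f"outcome jump={outcome_jump:.2f}, treatment jump={treatment_jump:.2f}, Wald={wald:.2f}")
```

The raw outcome jump understates the effect because only part of the sample changes treatment status at the cutoff; dividing by the jump in treatment probability rescales it.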
What is a Type I fuzzy RD?
some T members don’t get T (no-shows)
What is a Type II fuzzy RD?
Some T members don’t get T, and some C members do get T (cross-overs)
If Pr(T|xi=c)=.8, what kind of RD is this?
Fuzzy, Type I
If we accidentally estimated a fuzzy RD using a sharp RD design, what kind of estimates do we get?
Intent to treat (ITT): the average impact for those offered treatment, whether or not they participated (includes never-takers and always-takers)
How do we recover the treatment effect using fuzzy RD?
(jump in the outcome-assignment relationship) / (jump in the T status-assignment relationship)
What kind of effect are we able to estimate using RD?
LATE: the impact of the program at the cutoff on compliers, i.e. individuals who participated because they were assigned
If Ti=1 but Di=0, what do we call this? (Reading Ti as treatment received and Di as assignment by the cutoff.)
Always-taker
If Ti=0 but Di=1, what do we call this? (Same notation.)
Never-taker
What are the steps in fuzzy RD?
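Estimate the first stage using OLS: regress T on an indicator for being above the cutoff (along with f(x))
Use the predicted values of T in place of T in the second-stage regression of the outcome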
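The two steps above can be sketched as follows, assuming made-up units in a narrow window around the cutoff so that f(x) is left out for simplicity (a real application would include it in both stages). The names `two_stage`, `z`, `t`, and `y` are hypothetical:

```python
def fit_line(xs, ys):
    """OLS for y = a + b*x; returns (intercept a, slope b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage(z, t, y):
    """z: above-cutoff indicator, t: treatment received, y: outcome."""
    # First stage: regress treatment received on the above-cutoff indicator.
    a, b = fit_line(z, t)
    t_hat = [a + b * zi for zi in z]
    # Second stage: regress the outcome on predicted treatment.
    _, effect = fit_line(t_hat, y)
    return b, effect


# Hypothetical Type I case: nobody below the cutoff is treated, 15 of 20 above are.
# Simplifying assumption: a constant effect of 4.0 for anyone treated.
z = [0] * 20 + [1] * 20
t = [0] * 20 + [1] * 15 + [0] * 5
y = [2.0 + 4.0 * ti for ti in t]

first_stage_jump, effect = two_stage(z, t, y)
print(f"first-stage jump={first_stage_jump:.2f}, second-stage effect={effect:.2f}")
```

With a single above-cutoff indicator this reproduces the Wald ratio. Standard errors taken from a manually run second stage are not valid, so in practice a dedicated IV routine is the safer choice.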
What is the parametric approach to ensuring that functional form is specified correctly?
Try out a number of polynomial functions and choose the one that fits the data best
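A minimal sketch of trying several polynomial orders, assuming made-up data with a cubic relationship and a true jump of 2.0. The card does not say how "fits best" is judged; AIC is used here as one possible criterion, and the small `ols` solver is a hypothetical helper written for this example:

```python
import math
import random


def ols(rows, ys):
    """Solve the normal equations (X'X) b = X'y; returns (coefficients, residual sum of squares)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for row in range(col + 1, k):
            factor = xtx[row][col] / xtx[col][col]
            for j in range(col, k):
                xtx[row][j] -= factor * xtx[col][j]
            xty[row] -= factor * xty[col]
    # Back substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(xtx[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (xty[i] - tail) / xtx[i][i]
    rss = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2 for r, y in zip(rows, ys))
    return coef, rss


# Hypothetical data: cubic in x, true jump of 2.0 at x = 0, small uniform noise.
rng = random.Random(0)
xs = [i / 100 for i in range(-100, 100)]
ys = [1 + 2 * x ** 3 + (2.0 if x >= 0 else 0.0) + rng.uniform(-0.1, 0.1) for x in xs]

results = {}
for order in range(1, 5):
    # Columns: treatment dummy, intercept, x, ..., x**order.
    rows = [[1.0 if x >= 0 else 0.0] + [x ** p for p in range(order + 1)] for x in xs]
    coef, rss = ols(rows, ys)
    n, k = len(ys), len(coef)
    results[order] = {"jump": coef[0], "aic": n * math.log(rss / n) + 2 * k}

best_order = min(results, key=lambda o: results[o]["aic"])
for order, res in results.items():
    print(f"order {order}: jump={res['jump']:.3f}, AIC={res['aic']:.1f}")
print("order chosen by AIC:", best_order)
```

In this setup the linear specification misses the curvature and gives a jump well away from 2.0, while the order chosen by AIC recovers it closely, which is the reason for checking functional form before trusting the estimate.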