Regression Discontinuity Flashcards
What is the basic idea behind a regression discontinuity?
Yi = α + βDi + εi
Human behaviour is constrained by rules such as test scores, date cut-offs and emission regulations. These rules can be a useful tool for addressing selection bias.
How is the treatment Di related to Ri?
Di is a discontinuous function of an observed running variable Ri such that
Di = 1 if Ri ≥ c
Di = 0 if Ri < c
We can use this cut-off to define the treatment group.
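The assignment rule on this card can be sketched in a few lines of Python. This is only an illustration: the function name `assign_treatment` and the test-score data are hypothetical and not taken from the notes.

```python
def assign_treatment(running, cutoff):
    """Sharp RD rule: D = 1 when the running variable is at or above the cut-off."""
    return [1 if r >= cutoff else 0 for r in running]


# Hypothetical test scores with a cut-off of 50
scores = [42.0, 49.9, 50.0, 50.1, 73.5]
treated = assign_treatment(scores, cutoff=50.0)
print(treated)  # [0, 0, 1, 1, 1]
```

A score of exactly 50 counts as treated, because the rule uses "at or above" the cut-off.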
When are the treatment and control group observable in a regression discontinuity?
Treatment group: Y1(r) is observed when Ri ≥ c
Control group: Y0(r) is observed when Ri < c
The counterfactual is unobservable, but if the trend is smooth we can extend the pattern across the cut-off to estimate the counterfactual.
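One way to picture "continuing the pattern" is to fit a straight line on each side of the cut-off and compare the two fitted values at the cut-off. The sketch below assumes a linear trend and uses made-up, noise-free data; `fit_line` is a hypothetical helper, not a library function.

```python
def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


cutoff = 5
r = list(range(10))  # hypothetical running variable 0..9
# Hypothetical outcomes: smooth trend 1 + 0.5r, plus a jump of 3 when treated
y = [1 + 0.5 * ri + (3 if ri >= cutoff else 0) for ri in r]

left = [(ri, yi) for ri, yi in zip(r, y) if ri < cutoff]
right = [(ri, yi) for ri, yi in zip(r, y) if ri >= cutoff]

a0, b0 = fit_line(*zip(*left))   # trend of the untreated, Y0(r)
a1, b1 = fit_line(*zip(*right))  # trend of the treated, Y1(r)

# Extend both trends to the cut-off; the gap is the estimated jump
y0_at_c = a0 + b0 * cutoff
y1_at_c = a1 + b1 * cutoff
jump = y1_at_c - y0_at_c
print(round(jump, 6))  # 3.0
```

Here `y0_at_c` is the extrapolated counterfactual: what the treated units at the cut-off would have looked like without treatment, if the control-side trend had simply continued.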
Di is a function of Ai such that…
Treatment status is a deterministic function of the variable a, so once we know a we know Di.
Treatment status is a discontinuous function of a, because no matter how close a gets to the cut-off, Di remains unchanged until the cut-off is reached.
How is RD different from IV?
RD uses the following regression:
Ya = α + βDa + γa + εa
Da is solely determined by a
By controlling for a, there is NO omitted variable correlated with Da left in the error term
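A minimal sketch of the RD regression on simulated data follows, using only the standard library. Everything here is an assumption made for illustration: the cut-off of 50, the true jump of 5, the noise level, and the small `ols` helper (a plain normal-equations solver, not a library routine).

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda p: abs(A[p][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [v - factor * w for v, w in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(0)
cutoff = 50.0
true_jump = 5.0  # hypothetical treatment effect used to simulate the data

a = [random.uniform(0, 100) for _ in range(500)]
D = [1.0 if ai >= cutoff else 0.0 for ai in a]  # D is fully determined by a
Y = [2.0 + true_jump * di + 0.3 * ai + random.gauss(0, 1)
     for ai, di in zip(a, D)]

# Regress Y on a constant, D and the running variable a
X = [[1.0, di, ai] for di, ai in zip(D, a)]
alpha_hat, beta_hat, gamma_hat = ols(X, Y)
print(round(beta_hat, 2), round(gamma_hat, 2))
```

Because D depends on nothing but a, including a as a control leaves only noise in the error term, so `beta_hat` should land close to the simulated jump (not exactly on it, because of the noise).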
IV uses two regressions:
Reduced form
Yi = α + ρZi + εi
First stage
Di = α + φZi + μi
Di is determined by Zi, but it does not have to be solely determined by Zi
Independence assumption: Zi is uncorrelated with the error term. This is a stronger assumption than RD requires.
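The two IV regressions combine into one estimate: with a single instrument, the IV estimate is the reduced-form coefficient divided by the first-stage coefficient, ρ/φ. The sketch below shows this on a tiny made-up data set; the `slope` helper and all the numbers are hypothetical.

```python
def slope(x, y):
    """Slope from a bivariate regression of y on x (covariance / variance)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var


# Hypothetical data: Z shifts D for some units but does not fully determine it
Z = [0, 0, 0, 0, 1, 1, 1, 1]
D = [0, 0, 0, 1, 1, 1, 1, 0]
Y = [1 + 2 * d for d in D]  # outcome built with a true effect of 2

rho = slope(Z, Y)  # reduced form: effect of Z on Y
phi = slope(Z, D)  # first stage: effect of Z on D
iv_estimate = rho / phi
print(rho, phi, iv_estimate)  # 1.0 0.5 2.0
```

Note the contrast with RD: here two units have a D that disagrees with Z, which could never happen in a sharp RD, where the running variable fixes treatment status completely.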
What is a robustness test for RD?
- try different polynomial orders
- narrow the window around the cut-off
- assign higher weights to data points closer to the cut-off
- placebo tests: check outcomes unrelated to the cut-off
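The windowing and weighting checks can be sketched with a triangular kernel. The kernel is one common choice and an assumption here, since the notes do not name a specific weighting scheme; `triangular_weights` and the data are hypothetical.

```python
def triangular_weights(running, cutoff, bandwidth):
    """Weight 1 at the cut-off, falling linearly to 0 at the edge of the window."""
    return [max(0.0, 1.0 - abs(r - cutoff) / bandwidth) for r in running]


r = [40, 45, 50, 55, 60, 70]  # hypothetical running-variable values
for h in (10, 20):            # a narrower and a wider window
    print(h, triangular_weights(r, cutoff=50, bandwidth=h))
# 10 [0.0, 0.5, 1.0, 0.5, 0.0, 0.0]
# 20 [0.5, 0.75, 1.0, 0.75, 0.5, 0.0]
```

Points outside the window get weight zero, so shrinking the bandwidth both narrows the window and concentrates weight near the cut-off. If the estimated jump is stable across bandwidths, that is reassuring.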
If the relationship between the outcome and a is linear, what would the regression look like?
Ya = α + βDa + γa + εa
If the relationship between the outcome and a is non-linear, what would the regression look like?
Ya = α + βDa + γa + ρa^2 + εa
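Why the functional form matters can be seen in a small experiment: if the true trend is curved and there is no jump at all, a linear specification can still report one. The sketch below uses made-up, noise-free data with the running variable scaled to [0, 1) and a hypothetical cut-off of 0.7; the `ols` helper is a plain normal-equations solver written for this example.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda p: abs(A[p][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [v - factor * w for v, w in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


cutoff = 0.7
r = [i / 100 for i in range(100)]        # hypothetical running variable
D = [1.0 if ri >= cutoff else 0.0 for ri in r]
Y = [1.0 + 10.0 * ri ** 2 for ri in r]   # curved trend with NO jump

# Linear specification: constant, D, a
_, beta_linear, _ = ols([[1.0, d, ri] for d, ri in zip(D, r)], Y)
# Quadratic specification: constant, D, a, a^2
_, beta_quad, _, _ = ols([[1.0, d, ri, ri ** 2] for d, ri in zip(D, r)], Y)

print(beta_linear, beta_quad)
```

In this setup `beta_linear` comes out clearly different from zero even though no treatment effect was simulated, while `beta_quad` is zero up to rounding error. This is the reason for trying different polynomial orders.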
How can the running variable be manipulated?
The running variable can be manipulated if program participants can precisely influence it and know the program assignment rule.
This violates the requirement that treatment status is solely determined by the running variable.
E.g. poverty scores and means testing.
RD runs under the assumption that the running variable cannot be manipulated, but this is not always the case.
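A crude, informal way to look for manipulation is to compare how many observations sit just below and just above the cut-off. A pile-up on the favourable side is a warning sign. This is only a rough sketch, not a formal density test, and the function name, scores and eligibility rule are all hypothetical.

```python
def counts_near_cutoff(running, cutoff, width):
    """Count observations just below and just above the cut-off."""
    below = sum(1 for r in running if cutoff - width <= r < cutoff)
    above = sum(1 for r in running if cutoff <= r < cutoff + width)
    return below, above


# Hypothetical poverty scores; suppose eligibility requires a score below 50
scores = [46, 47, 48, 48.5, 49, 49.2, 49.5, 49.7, 49.8, 49.9,
          50.1, 50.6, 51, 52, 53]
below, above = counts_near_cutoff(scores, cutoff=50, width=1)
print(below, above)  # 6 2
```

Six scores fall in the band just under the threshold and only two just over it, which is the kind of bunching one would expect if people could nudge their score to qualify.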
RD LATE?
RD estimates a local average treatment effect (LATE). The closer we get to the cut-off, the more credible and valid our results are, and the results are only valid within the window.
The assumption is that treatment is determined only by the running variable.
RD is a localised randomised experiment: D is as good as randomly assigned as long as you control for the running variable.
Validity of RD?
Gaining internal validity might mean sacrificing some external validity.
With a big sample size we may be able to claim a very credible causal effect, but it is very local: that is the price of reducing selection bias.