F8 Regression discontinuity design Flashcards
What are the running variable and the cutoff?
The running variable is an observed confounder that determines treatment status: units are treated or not depending on which side of a specific value, the cutoff, they fall. It is typically continuous.
What is an RDD?
Regression discontinuity design. It exploits a natural cutoff/threshold that assigns units to either the control or the treatment group.
What does RDD estimate?
Local average treatment effect: LATE.
What is the challenge with LATE?
The effect is identified only for the individuals around the cutoff. It cannot be generalized to the rest of the population without further assumptions.
Internal validity > external validity
What is a key advantage of RDD?
Superior in handling unobserved confounders. It convincingly eliminates selection bias if its assumptions hold.
What is the key assumption?
Continuity. Potential outcomes are continuous functions of the running variable and pass smoothly through the cutoff.
All confounders are assumed to be continuous at the cutoff.
At the cutoff, treatment becomes independent of potential outcomes. D is the only variable affecting the outcome that jumps discontinuously at the cutoff.
No simultaneous treatments and the cutoff cannot be endogenous.
What are examples of RDD?
Test scores (SAT), geographical borders, time, close elections and policy changes
What must the assignment rule/cutoff be?
Known, precise and free of manipulation (if the RDD is sharp)
Draw the DAG for RDD (both of them)
Squares.
D –> Y: The causal relation of interest.
X –> Y: Confounder. The running variable affects the outcome (independently of D). Out of influence under RDD
U: Unobserved confounder causing bias. Out of influence under RDD
What is the mathematical formula?
Homogeneous effects: Y_i = α + βx_i + δD_i + ε_i (treatment changes only the intercept).
Heterogeneous effects: Y_i = α + βx_i + δD_i + θD_i x_i + ε_i (the interaction term lets the slope, not just the intercept, differ on each side of the cutoff).
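The two specifications above can be sketched on simulated data. This is a minimal illustration, not a recommended estimation routine: the sample size, the cutoff at 0, the true effect of 2.0 and all other coefficient values are made up, and the running variable is centered at the cutoff (an assumption) so that δ is read as the jump at the cutoff.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
cutoff = 0.0        # hypothetical cutoff
true_delta = 2.0    # hypothetical jump in the outcome at the cutoff

x = rng.uniform(-1.0, 1.0, n)       # running variable
d = (x >= cutoff).astype(float)     # sharp assignment: treated at or above the cutoff
xc = x - cutoff                     # centered running variable (assumption)

# Simulated outcome with different slopes on each side (theta = 0.8)
y = 1.0 + 0.5 * xc + true_delta * d + 0.8 * d * xc + rng.normal(0.0, 0.5, n)

def ols(columns, y):
    """Least-squares coefficients for an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Homogeneous effects: Y = alpha + beta*x + delta*D + eps
coef_homog = ols([xc, d], y)
delta_homog = coef_homog[2]

# Heterogeneous effects: Y = alpha + beta*x + delta*D + theta*D*x + eps
coef_heter = ols([xc, d, d * xc], y)
delta_heter = coef_heter[2]
theta_hat = coef_heter[3]

print(f"delta (homogeneous):   {delta_homog:.3f}")
print(f"delta (heterogeneous): {delta_heter:.3f}")
print(f"theta (slope change):  {theta_hat:.3f}")
```

In the heterogeneous model, δ is the jump at the cutoff only because x is centered there; with an uncentered x, the coefficient on D would instead be the gap between the two lines at x = 0.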
What are the potential outcomes framework for RDD?
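The effect is defined through one-sided limits at the cutoff c: δ = lim_{x↓c} E[Y_i | x_i = x] − lim_{x↑c} E[Y_i | x_i = x] = E[Y_i(1) − Y_i(0) | x_i = c] (assuming units at or above c are treated; the last equality uses continuity).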
What are the two types of RDDs?
Sharp RDD: Probability of treatment changes from 0 to 1 at the cutoff (deterministic). No common support (relies on extrapolation).
Fuzzy RDD: Probability of treatment changes gradually with the running variable and jumps discontinuously at the cutoff, but by less than 1 (not deterministic).
What happens with the estimator in a fuzzy RDD?
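It’s scaled by the jump in the probability of being treated at the cutoff: the jump in the outcome is divided by the jump in the treatment probability.
Wald estimator (special case of the IV estimator with a binary instrument, here being above the cutoff, and some degree of non-compliance).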
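The scaling described above can be sketched as a ratio of two jumps. This is a rough illustration on simulated data: the treatment probabilities (0.2 below and 0.7 above the cutoff), the true effect of 1.5, the bandwidth and the helper name `jump_at_cutoff` are all hypothetical, and a simple local linear fit stands in for whatever estimator is used in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40000
cutoff = 0.0        # hypothetical cutoff
true_delta = 1.5    # hypothetical treatment effect
bandwidth = 0.25    # hypothetical window around the cutoff

x = rng.uniform(-1.0, 1.0, n)       # running variable
z = (x >= cutoff).astype(float)     # instrument: being at or above the cutoff

# Fuzzy assignment: the treatment probability jumps from 0.2 to 0.7 at the cutoff
p_treat = 0.2 + 0.5 * z
d = (rng.uniform(size=n) < p_treat).astype(float)

y = 1.0 + 0.4 * x + true_delta * d + rng.normal(0.0, 0.5, n)

def jump_at_cutoff(values, x, z, cutoff, bandwidth):
    """Local linear estimate of the discontinuity in E[values | x] at the cutoff."""
    keep = np.abs(x - cutoff) <= bandwidth
    xc = x[keep] - cutoff
    zk = z[keep]
    X = np.column_stack([np.ones(xc.size), xc, zk, zk * xc])
    coef, *_ = np.linalg.lstsq(X, values[keep], rcond=None)
    return float(coef[2])   # coefficient on the above-cutoff indicator

jump_y = jump_at_cutoff(y, x, z, cutoff, bandwidth)   # jump in the outcome
jump_d = jump_at_cutoff(d, x, z, cutoff, bandwidth)   # jump in treatment probability
wald = jump_y / jump_d                                # scaled estimate

print(f"jump in outcome:               {jump_y:.3f}")
print(f"jump in treatment probability: {jump_d:.3f}")
print(f"Wald estimate:                 {wald:.3f}")
```

In a sharp design the denominator would be 1, so the ratio reduces to the jump in the outcome.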
Bandwidth
Narrow: Loss of statistical power
Broad: Risk of specification bias
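The bandwidth trade-off can be illustrated with a small simulation. This is a sketch under made-up assumptions: the true relationship is cubic, the true jump is 1.0, and a linear fit on each side is deliberately the wrong functional form far from the cutoff. The helper name `local_linear_jump` and the three bandwidths are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
cutoff = 0.0        # hypothetical cutoff
true_delta = 1.0    # hypothetical jump at the cutoff

x = rng.uniform(-1.0, 1.0, n)
d = (x >= cutoff).astype(float)
# The outcome is cubic in the running variable, so a straight line
# is only a good approximation close to the cutoff
y = 3.0 * (x - cutoff) ** 3 + true_delta * d + rng.normal(0.0, 0.5, n)

def local_linear_jump(y, x, d, cutoff, bandwidth):
    """Fit separate lines on each side within the bandwidth; return (n used, jump)."""
    keep = np.abs(x - cutoff) <= bandwidth
    xc = x[keep] - cutoff
    dk = d[keep]
    X = np.column_stack([np.ones(xc.size), xc, dk, dk * xc])
    coef, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
    return int(keep.sum()), float(coef[2])

results = {h: local_linear_jump(y, x, d, cutoff, h) for h in (0.05, 0.2, 1.0)}

for h, (n_used, estimate) in results.items():
    print(f"bandwidth {h:>4}: n = {n_used:>5}, estimated jump = {estimate:.3f}")
```

The narrow window uses few observations, so its estimate is noisier; the broad window uses all observations but forces a linear fit onto a curved relationship, which pulls the estimate away from the true jump.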
What is the main challenge to RDD?
Sorting.
If the cutoff is known and units can manipulate the running variable, self-selection into treatment or control is possible. The continuity assumption then no longer holds.
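Sorting tends to show up as a pile-up of units just on one side of the cutoff. The sketch below is a crude check in that spirit, not a formal density test: it compares the number of units in a narrow window below and above the cutoff, and under a smooth density the two counts should be roughly equal. The window width, the share of movers and the helper name `side_count_z` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10000
cutoff = 0.0    # hypothetical cutoff
window = 0.1    # hypothetical window on each side of the cutoff

# Running variable without manipulation
x_clean = rng.uniform(-1.0, 1.0, n)

# Running variable with sorting: 80% of the units just below the cutoff
# push themselves just above it
x_sorted = x_clean.copy()
just_below = (x_sorted < cutoff) & (x_sorted > cutoff - 0.05)
movers = just_below & (rng.uniform(size=n) < 0.8)
x_sorted[movers] = cutoff + rng.uniform(0.0, 0.05, movers.sum())

def side_count_z(x, cutoff, window):
    """Counts below/above the cutoff within the window, plus a z-score for imbalance."""
    below = int(np.sum((x >= cutoff - window) & (x < cutoff)))
    above = int(np.sum((x >= cutoff) & (x < cutoff + window)))
    total = below + above
    # With no sorting, each unit in the window is about equally likely
    # to fall on either side (normal approximation to the binomial)
    z = (above - 0.5 * total) / np.sqrt(0.25 * total)
    return below, above, float(z)

below_clean, above_clean, z_clean = side_count_z(x_clean, cutoff, window)
below_sorted, above_sorted, z_sorted = side_count_z(x_sorted, cutoff, window)

print(f"no sorting:   below = {below_clean}, above = {above_clean}, z = {z_clean:.2f}")
print(f"with sorting: below = {below_sorted}, above = {above_sorted}, z = {z_sorted:.2f}")
```

A large z-score signals more units on one side than a smooth density would produce, which is a warning sign for the continuity assumption rather than proof of manipulation.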