Important keywords for each method Flashcards
How do matching, RDD, IV, DiD and panel data handle confounders (observed and unobserved)?
Matching: Uses observed confounders to match treated and control units, so the confounders must be observed. Cannot deal with unobserved confounders.
RDD: Assumes continuity at the cutoff in both observed and unobserved confounders.
IV: Assumes the instrument is independent of both observed and unobserved confounders.
DiD: All time-invariant confounders are accounted for because of first-differencing. All time-varying confounders are assumed to follow parallel trends, so DiD accounts for time-varying effects that are common to both groups.
Panel data: All time-invariant confounders are accounted for. Time-varying confounders pose a problem.
What is the temporal dimension in matching, RDD, IV, DiD and panel data?
Matching: A confounder can have a temporal dimension
RDD: Temporal running variable (policy implementation date)
IV: The instrument must be measured before the key independent variable
DiD: Simplest form of panel data, with at least two time periods
Panel data: Relies on a temporal dimension for causal identification.
What are Jan's three pet topics?
Try different methods, approaches and measurements and see if you can estimate the same effect. Arrive at the same conclusion in many ways (just like Angrist & Pischke).
Studies generally under-theorize confounders. You need to argue that the residual will be systematically random and not just random.
When an equation includes a squared term, we are penalized heavily for outliers.
What is an empirical example for matching, RDD, IV, DiD and panel data?
Matching:
RDD: Your synopsis + SAT scores and university on earnings
IV: Distance to college as an instrument for schooling when estimating earnings returns (Card 1995), or the draft lottery as an instrument for military service (Angrist 1990)
DiD: John Snow (cholera London) or Card & Krueger 1994 (PA/NJ minimum wages)
Panel data:
Multilevel model: Forecasting US
What type of effect is estimated in matching, RDD, IV, DiD and panel data?
Matching: ATE or ATT (depending on common support)
RDD: LATE
IV: ATE (homogeneous effects) or LATE (heterogeneous effects)
DiD: ATT
Panel: ATE
What is important to mention for DiD?
Assumption: Parallel trends
Different kinds: 2x2, triple DiD, staggered DiD and two-way fixed effects.
DiD is a subtype of two-way fixed effects regression and therefore a simple form of panel data.
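The 2x2 case can be illustrated with a small sketch. The numbers below are made up purely for illustration, and the names `outcomes` and `did_2x2` are hypothetical. The estimate is the change in the treated group minus the change in the control group, which is only a causal effect if parallel trends holds.

```python
from statistics import mean

# Hypothetical outcomes (invented for illustration): (group, period) -> values.
outcomes = {
    ("control", "pre"): [10.0, 12.0, 11.0],
    ("control", "post"): [13.0, 15.0, 14.0],
    ("treated", "pre"): [20.0, 22.0, 21.0],
    ("treated", "post"): [28.0, 30.0, 29.0],
}


def did_2x2(data):
    """Difference-in-differences from the four group-by-period means."""
    change_treated = mean(data[("treated", "post")]) - mean(data[("treated", "pre")])
    change_control = mean(data[("control", "post")]) - mean(data[("control", "pre")])
    # The control group's change stands in for the treated group's
    # counterfactual change (the parallel trends assumption).
    return change_treated - change_control


did_estimate = did_2x2(outcomes)
print(did_estimate)  # treated change 8 minus control change 3
```

Note that the level difference between the groups (about 10 in the pre-period) drops out, which is how time-invariant confounders are removed.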
What is important to mention for matching?
Define closeness: Exact or approximate matching
Define matching method: Nearest neighbor or propensity score matching.
Assumptions: Conditional independence (CIA) + common support
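A minimal sketch of approximate nearest-neighbor matching on a single covariate, assuming matching with replacement and a caliper. The data, the caliper value and the function name `nn_match_att` are all hypothetical choices for illustration, not a recommended specification.

```python
from statistics import mean

# Hypothetical units as (covariate x, outcome y), invented for illustration.
treated = [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0)]
control = [(1.1, 3.0), (2.2, 4.0), (2.9, 6.0), (8.0, 20.0)]


def nn_match_att(treated_units, control_units, caliper=0.5):
    """ATT from 1-nearest-neighbor matching with replacement.

    Treated units without a control within the caliper are dropped,
    which is a crude way of enforcing common support.
    """
    effects = []
    for x_t, y_t in treated_units:
        # Closest control unit in terms of the observed covariate.
        x_c, y_c = min(control_units, key=lambda unit: abs(unit[0] - x_t))
        if abs(x_c - x_t) <= caliper:
            effects.append(y_t - y_c)
    return mean(effects), len(effects)


att, n_matched = nn_match_att(treated, control)
print(att, n_matched)
```

The control unit at x = 8.0 is never used because no treated unit is close to it. The estimate is only causal under CIA, that is, if x captures all confounding.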
What is important to mention for IV?
Two kinds: Homogeneous effects (exclusion restriction + nonzero first stage) and heterogeneous effects (additionally SUTVA + independence of the instrument + monotonicity)
Two important assumptions:
(1) Exclusion restriction: Cov(Z, u) = 0. Difficult to argue because it cannot be tested directly.
(2) The strength of the first stage/relevance: Cov(Z, X) different from 0
Deals with observed and unobserved confounders.
With weak instruments the IV estimate is biased (toward OLS) and has high SE, and even small violations of the exclusion restriction produce large inconsistency.
The reduced form, Cov(Z, Y), must be nonzero for there to be an effect, but Z cannot affect Y by itself: it may affect Y only through X.
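The relationship between the first stage, the reduced form and the IV estimate can be sketched with simulated data. Everything here is an assumption made for illustration: the data-generating process, the true effect of 2.0, and the variable names. With one instrument and one regressor the IV estimate is simply the reduced form divided by the first stage.

```python
import random

random.seed(42)
N = 20000
TRUE_EFFECT = 2.0  # assumed homogeneous effect of x on y


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


z, x, y = [], [], []
for _ in range(N):
    z_i = random.gauss(0, 1)  # instrument, independent of the confounder
    u_i = random.gauss(0, 1)  # unobserved confounder
    x_i = z_i + u_i + random.gauss(0, 1)  # relevance: z moves x
    # Exclusion: z enters y only through x.
    y_i = TRUE_EFFECT * x_i + 3.0 * u_i + random.gauss(0, 1)
    z.append(z_i)
    x.append(x_i)
    y.append(y_i)

first_stage = cov(z, x) / cov(z, z)   # effect of z on x
reduced_form = cov(z, y) / cov(z, z)  # effect of z on y
iv_estimate = reduced_form / first_stage
ols_estimate = cov(x, y) / cov(x, x)  # confounded by u

print(round(first_stage, 2), round(reduced_form, 2))
print(round(iv_estimate, 2), round(ols_estimate, 2))
```

In this setup OLS is pulled away from 2.0 by the confounder, while the IV ratio stays close to it. If the first stage were close to zero, the ratio would become unstable, which is the weak-instrument problem.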
What is important to mention for RDD?
Sharp (deterministic) and fuzzy (probabilistic)
RDD assumes continuity in potential outcomes around the cutoff (as if randomization around the cutoff).
Requires no precise manipulation/sorting around the cutoff and no other treatments that change at the same cutoff.
Potential weakness: Bias/variance trade-off with bandwidth and specification. Data greedy.
Strength: Resembles an experiment with regard to internal validity.
Effects are very local (LATE): they apply only to units close to the cutoff.
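A minimal sketch of a sharp RDD estimate, assuming a noiseless, made-up outcome function with a jump of 3.0 at the cutoff. The bandwidth, the functional form and the names `ols_line` and `outcome` are hypothetical. A separate line is fitted on each side within the bandwidth, and the effect is the gap between the two fitted values at the cutoff.

```python
CUTOFF = 0.0
BANDWIDTH = 0.5  # arbitrary choice; in practice this is the bias/variance knob


def ols_line(xs, ys):
    """Intercept and slope of a simple least-squares line."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys)) / sum(
        (xi - mean_x) ** 2 for xi in xs
    )
    return mean_y - slope * mean_x, slope


def outcome(r):
    """Hypothetical outcome: smooth in r except for a jump of 3.0 at the cutoff."""
    treated = r >= CUTOFF  # sharp design: treatment is deterministic in r
    return 1.0 + 2.0 * r + ((3.0 + 1.0 * r) if treated else 0.0)


running = [i / 100 for i in range(-100, 101)]
window = [r for r in running if abs(r - CUTOFF) <= BANDWIDTH]
left = [r for r in window if r < CUTOFF]
right = [r for r in window if r >= CUTOFF]

# Center the running variable so each intercept is the prediction at the cutoff.
left_intercept, _ = ols_line([r - CUTOFF for r in left], [outcome(r) for r in left])
right_intercept, _ = ols_line([r - CUTOFF for r in right], [outcome(r) for r in right])

rdd_estimate = right_intercept - left_intercept
print(round(rdd_estimate, 6))
```

Only observations inside the window are used, which is why the design is data greedy and why the estimate speaks only to units near the cutoff.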
What different elements can you use as an identification strategy?
Time: FE, DiD
Instrument: IV
Discontinuities: RDD
CIA: Matching
What is important to mention for dynamic panel data?
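Respect the data-generating process (DGP). Autocorrelation in the dependent variable (Wawro).
Including a lagged dependent variable -> correlation with the residual
First difference -> Fixed effects drop out, but the differenced lag is still correlated with the differenced residual
Anderson-Hsiao: Simple. Instruments the differenced lag with an earlier lag.
GMM: Uses many instruments, which increases efficiency but also bias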
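The first-difference problem and the Anderson-Hsiao fix can be sketched on simulated data. The AR(1) data-generating process, the true coefficient of 0.5, the sample sizes and all names are assumptions for illustration. The sketch uses the level y[t-2] as the instrument for the differenced lag, which is one common variant of the approach.

```python
import random

random.seed(0)
RHO = 0.5  # assumed true autoregressive coefficient
N_UNITS, N_PERIODS, BURN_IN = 2000, 6, 20

# Simulate y_it = RHO * y_i,t-1 + alpha_i + eps_it for each unit.
panel = []
for _ in range(N_UNITS):
    alpha = random.gauss(0, 1)  # unit fixed effect
    y = alpha / (1 - RHO)       # start at the unit's long-run mean
    series = []
    for t in range(BURN_IN + N_PERIODS):
        y = RHO * y + alpha + random.gauss(0, 1)
        if t >= BURN_IN:
            series.append(y)
    panel.append(series)

num_ols = den_ols = num_ah = den_ah = 0.0
for series in panel:
    for t in range(2, N_PERIODS):
        dy = series[t] - series[t - 1]          # differencing removes alpha
        dy_lag = series[t - 1] - series[t - 2]  # still contains eps at t-1
        num_ols += dy_lag * dy
        den_ols += dy_lag * dy_lag
        # y[t-2] is correlated with dy_lag but not with the differenced error.
        num_ah += series[t - 2] * dy
        den_ah += series[t - 2] * dy_lag

fd_ols = num_ols / den_ols         # biased: regressor correlates with residual
anderson_hsiao = num_ah / den_ah   # simple IV estimate of RHO

print(round(fd_ols, 2), round(anderson_hsiao, 2))
```

In this setup OLS on first differences lands far below 0.5, while the instrumented estimate is close to it. GMM estimators extend the idea by using further lags as additional instruments.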
What is important to mention for panel data?
Multiple units measured at multiple times.
Identification strategy: Unit fixed effects
Serial correlation: Use cluster-robust SE
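How unit fixed effects remove time-invariant confounders can be sketched with a tiny noiseless panel. The data are made up so that each unit's level (its fixed effect) is correlated with its x values, and the true within-unit slope is 2.0. The names `units`, `slope` and `demean` are hypothetical.

```python
# Hypothetical panel: unit -> (fixed effect, x values over three periods).
units = {
    "A": (0.0, [1.0, 2.0, 3.0]),
    "B": (10.0, [4.0, 5.0, 6.0]),
    "C": (20.0, [7.0, 8.0, 9.0]),
}
TRUE_SLOPE = 2.0


def slope(xs, ys):
    """Least-squares slope of ys on xs (with an intercept)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys)) / sum(
        (xi - mean_x) ** 2 for xi in xs
    )


def demean(values):
    """Subtract the unit's own mean (the within transformation)."""
    unit_mean = sum(values) / len(values)
    return [v - unit_mean for v in values]


pooled_x, pooled_y, within_x, within_y = [], [], [], []
for fixed_effect, xs in units.values():
    ys = [fixed_effect + TRUE_SLOPE * xi for xi in xs]
    pooled_x += xs
    pooled_y += ys
    within_x += demean(xs)  # the unit's fixed effect drops out here
    within_y += demean(ys)

pooled_estimate = slope(pooled_x, pooled_y)  # confounded by the unit levels
within_estimate = slope(within_x, within_y)  # fixed effects estimator

print(pooled_estimate, within_estimate)
```

The pooled slope mixes the within-unit effect with differences between units, while the within estimator uses only variation over time inside each unit. It does nothing about confounders that change over time.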
What is the difference between internal and external validity?
Internal validity means our strategy identified a causal effect for the population we studied.
External validity means the study's findings also apply to other populations (not in the study).