RD Flashcards
NATE
E(Y1 I D=1) - E(Y0 I D=0)
ATT
E(Y1 I D=1) - E(Y0 I D=1)
ATC
E(Y1 I D=0) - E(Y0 I D=0)
ATE
E(Y1) - E(Y0)
ATE WITH DIFFERENT GROUP SIZES
π x ATT + (1 - π) x ATC
BASELINE BIAS (BB)
E(Y0 I D=1) - E(Y0 I D=0)
DTEB
(1-π) x [[E(Y1 I D=1) - E(Y0 I D=1)] - [E(Y1 I D=0) - E(Y0 I D=0)]]
TOTAL BIAS
NATE - ATE
BB + DTEB
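A small numeric check of the cards above, showing that NATE - ATE equals BB + DTEB. The potential-outcome means and π are made-up illustration values (an assumption, not from the course material).

```python
# Hypothetical values, chosen only to illustrate the decomposition.
pi = 0.3                      # share of treated units, P(D=1)
y1_d1, y0_d1 = 10.0, 6.0      # E(Y1 | D=1), E(Y0 | D=1)
y1_d0, y0_d0 = 8.0, 5.0       # E(Y1 | D=0), E(Y0 | D=0)

nate = y1_d1 - y0_d0          # naive comparison of observed group means
att = y1_d1 - y0_d1           # effect on the treated
atc = y1_d0 - y0_d0           # effect on the controls
ate = pi * att + (1 - pi) * atc

bb = y0_d1 - y0_d0            # baseline bias
dteb = (1 - pi) * (att - atc) # differential treatment effect bias
total_bias = nate - ate

print(f"NATE={nate}, ATE={ate:.2f}, BB={bb}, DTEB={dteb:.2f}, total bias={total_bias:.2f}")
```

With these numbers the total bias is the sum of the two bias terms, which is the identity on the TOTAL BIAS card.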
IPW
D x 1 / P(D=1 I X=x) + (1-D) x 1 / P(D=0 I X=x)
FIRST COMPUTE THE IPW FOR EACH OUTCOME GROUP. EX: NUMBER OF (Y1 I D=1, X=1) = 4
THE WHOLE X=1 POPULATION = 6, SO THE PROPENSITY IS 4/6
THE WEIGHT IS THE INVERSE OF THAT: IPW = 6/4
COMPUTE IT SEPARATELY FOR EACH GROUP:
(Y1 I D=1, X=1)
(Y1 I D=1, X=0)
(Y0 I D=0, X=1)
(Y0 I D=0, X=0)
THEN E(Y1) = (SUM OF ALL OBSERVATIONS FOR (Y1 I D=1, X=1) x IPW FOR (Y1 I D=1, X=1) + SUM OF ALL OBSERVATIONS FOR (Y1 I D=1, X=0) x IPW FOR (Y1 I D=1, X=0)) / (NUMBER OF OBSERVATIONS FOR (Y1 I D=1, X=1) x IPW FOR (Y1 I D=1, X=1) + NUMBER OF OBSERVATIONS FOR (Y1 I D=1, X=0) x IPW FOR (Y1 I D=1, X=0))
DO THE SAME WITH THE IPW FOR Y0. THE ATE IS THEN THE DIFFERENCE:
E(Y1) - E(Y0)
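A minimal sketch of the IPW recipe above. The ten (x, d, y) rows are invented for illustration; only the X=1 group size (6 units, 4 of them treated) follows the card. The helper names `weight` and `weighted_mean` are hypothetical.

```python
from collections import Counter

# Hypothetical data: one (x, d, y) tuple per unit.
data = [
    (1, 1, 5), (1, 1, 6), (1, 1, 7), (1, 1, 6),  # X=1, treated (4 units)
    (1, 0, 3), (1, 0, 4),                        # X=1, control (2 units)
    (0, 1, 4),                                   # X=0, treated (1 unit)
    (0, 0, 2), (0, 0, 3), (0, 0, 1),             # X=0, control (3 units)
]

n_x = Counter(x for x, d, y in data)          # units per X stratum
n_xd = Counter((x, d) for x, d, y in data)    # units per (X, D) cell

def weight(x, d):
    """Inverse of the estimated P(D=d | X=x), e.g. 6/4 for (x=1, d=1)."""
    return n_x[x] / n_xd[(x, d)]

def weighted_mean(d_value):
    """Weighted outcome mean among units with D=d_value."""
    rows = [(y, weight(x, d)) for x, d, y in data if d == d_value]
    return sum(y * w for y, w in rows) / sum(w for _, w in rows)

e_y1 = weighted_mean(1)
e_y0 = weighted_mean(0)
ate_ipw = e_y1 - e_y0
print(f"E(Y1)={e_y1:.2f}, E(Y0)={e_y0:.2f}, ATE={ate_ipw:.2f}")
```

Dividing by the sum of the weights rather than by N follows the normalization described on the card.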
INTERNAL VALIDITY
ABILITY TO IDENTIFY CAUSAL EFFECT IN STUDY SAMPLE
EXTERNAL VALIDITY
ABILITY TO GENERALIZE RESULTS TO OTHER CONTEXTS
ATE (EFFECT STANDARDIZATION)
SUM OVER x OF [E(Y1 I D=1, X=x) - E(Y0 I D=0, X=x)] x P(X=x)
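A short sketch of effect standardization: compute the treated-minus-control difference within each X stratum, then weight by the stratum share and add up. The stratum means and shares are hypothetical illustration values.

```python
# Hypothetical stratum summaries: observed means and P(X=x).
strata = {
    1: {"mean_treated": 6.0, "mean_control": 3.5, "share": 0.6},
    0: {"mean_treated": 4.0, "mean_control": 2.0, "share": 0.4},
}

# Weighted sum of the within-stratum differences.
ate_std = sum(
    (s["mean_treated"] - s["mean_control"]) * s["share"]
    for s in strata.values()
)
print(f"Standardized ATE = {ate_std:.2f}")
```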
CONFOUNDER
WHEN BOTH OF Z'S ARROWS POINT OUTWARD (D <- Z -> Y)
AKA: Z AFFECTS BOTH D AND Y
MEDIATOR
WHEN A VARIABLE IS ON THE PATH BETWEEN D & Y
AKA: ONE ARROW POINTS IN TO (Z) AND ANOTHER OUT OF (Z)
COLLIDER
WHEN BOTH ARROWS POINT INTO Z
AKA: OTHER VARIABLES AFFECT Z
MEANS THE PATH IS CLOSED UNLESS WE CONDITION ON Z
DIRECT PATH
AN OPEN, DIRECT, CAUSAL PATH BETWEEN D AND Y
NON-CAUSAL OPEN PATH
WHEN THERE IS AN OPEN PATH BETWEEN D AND Y BUT IT’S NOT A CAUSAL PATH, AKA A BACKDOOR PATH
CONDITIONING
INTRODUCE INFORMATION ABOUT A VARIABLE. CLOSES THE PATH IF THE VARIABLE IS NOT A COLLIDER, OPENS IT IF THE VARIABLE IS A COLLIDER
HOW TO ESTIMATE COUNTERFACTUALS
TAKE MEAN FROM OBSERVED RESULTS:
EX: THE OBSERVED MEAN OF (Y1 I D=1) IS 4, SO USE 4 AS THE MEAN OF THE UNOBSERVED (Y1 I D=0) FOR ALL OBSERVATIONS
INTENT TO TREAT (ITT)
E(Y I Z=1) - E(Y I Z=0)
P IN EQUATIONS
THE SHARE OF THE STUDY POPULATION THAT BELONGS TO THE GROUP
EX, WHOLE STUDY POP IS 10, 6 OF THEM ARE X=1. P= 6/10
LOCAL AVERAGE TREATMENT EFFECT (LATE)
[E(Y I Z=1) - E(Y I Z=0)] / [E(D I Z=1) - E(D I Z=0)]
E(Y1 - Y0 I COMPLIER)
TO GET THE SHARE OF COMPLIERS, LOOK AT HOW MANY SUBJECTS GOT TREATMENT IN EACH GROUP: IF 90% OF THE T-GROUP WERE TREATED, E(D I Z=1) = 0.9; IF 40% OF THE C-GROUP WERE TREATED, E(D I Z=0) = 0.4 (COUNT ONLY THE TREATED, DISREGARD THE UNTREATED)
EX: E(Y I Z=1) = 9, E(Y I Z=0) = 5
LATE = (9 - 5) / (0.9 - 0.4) = 8
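The LATE ratio on this card can be sketched as a small function. The function name `late` and its argument names are hypothetical; the inputs are the numbers from the example (outcome means 9 and 5, treated shares 0.9 and 0.4).

```python
def late(y_z1, y_z0, d_z1, d_z0):
    """ITT effect on Y divided by the effect of Z on treatment take-up."""
    first_stage = d_z1 - d_z0  # share of compliers if there are no defiers
    if first_stage == 0:
        raise ValueError("Z does not move D, so the ratio is undefined")
    return (y_z1 - y_z0) / first_stage

late_example = late(y_z1=9, y_z0=5, d_z1=0.9, d_z0=0.4)
print(f"LATE = {late_example:.2f}")
```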
METHOD OF BOUNDS
COMPUTE ATE(MIN) AND ATE(MAX) WHEN WE DO NOT KNOW THE OUTCOME FOR ALL SUBJECTS
ATE = E(Y1) - E(Y0)
IN TREATMENT GROUP: 60% Y=1, 20% Y=0 AND 20% Y=?
IN CONTROL GROUP: 20% Y=1, 60% Y=0 AND 20% Y=?
FOR ATE(MIN) ASSUME Y=? WAS Y=0 IN THE TREATMENT GROUP AND Y=1 IN THE CONTROL GROUP. FOR ATE(MAX) ASSUME THE REVERSE
ATE(MIN) = 0.6 - (0.2 + 0.2) = 0.2
ATE(MAX) = (0.6 + 0.2) - 0.2 = 0.6
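A minimal sketch of the bounds calculation for a binary outcome, using the shares from the example. The function name `ate_bounds` and its parameter names are hypothetical.

```python
def ate_bounds(t_y1, t_unknown, c_y1, c_unknown):
    """Worst-case and best-case ATE when some outcomes are unknown.

    t_y1 / c_y1: share with Y=1 in the treatment / control group.
    t_unknown / c_unknown: share with unknown Y in each group.
    """
    # Minimum: unknown treated count as Y=0, unknown controls as Y=1.
    ate_min = t_y1 - (c_y1 + c_unknown)
    # Maximum: unknown treated count as Y=1, unknown controls as Y=0.
    ate_max = (t_y1 + t_unknown) - c_y1
    return ate_min, ate_max

ate_min, ate_max = ate_bounds(t_y1=0.6, t_unknown=0.2, c_y1=0.2, c_unknown=0.2)
print(f"ATE lies between {ate_min:.2f} and {ate_max:.2f}")
```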
ATE (ONE-ON-ONE MATCHING)
SUM OVER x OF [E(Y I D=1, X=x) - E(Y I D=0, X=x)] x P(X=x)
CONDITIONAL INDEPENDENCE
E(Y0 I D=0, X=x) = E(Y0 I D=1, X=x)
AND
E(Y1 I D=1, X=x) = E(Y1 I D=0, X=x)
GIVES THE MATCHING ESTIMATOR:
SUM OVER x OF [E(Y1 I D=1, X=x) - E(Y0 I D=1, X=x)] x P(X=x I D=1) = ATT
SIMPLE REGRESSION
Yi = B0 + B1 x Di + ri
B0 = E(Y0 I D=0)
B0 + B1 = E(Y1 I D=1)
B1 = E(Y1 I D=1) - E(Y0 I D=0) = NATE
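A quick check that the slope from a regression of Y on a binary D equals the difference in observed group means (the NATE). The seven data points are hypothetical, and the OLS formulas are written out by hand to keep the sketch dependency-free.

```python
# Hypothetical data: treatment indicator and observed outcome.
d = [1, 1, 1, 0, 0, 0, 0]
y = [7, 9, 8, 4, 5, 6, 5]

mean_d = sum(d) / len(d)
mean_y = sum(y) / len(y)

# OLS slope = cov(D, Y) / var(D); intercept = mean(Y) - slope * mean(D).
b1 = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y)) / sum(
    (di - mean_d) ** 2 for di in d
)
b0 = mean_y - b1 * mean_d

# Direct group means for comparison.
mean_treated = sum(yi for di, yi in zip(d, y) if di == 1) / sum(d)
mean_control = sum(yi for di, yi in zip(d, y) if di == 0) / (len(d) - sum(d))
print(f"B0={b0:.2f}, B1={b1:.2f}, NATE={mean_treated - mean_control:.2f}")
```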
FIXED EFFECTS MODEL
(Yit - MEAN OF Yi) = Bfe x (Dit - MEAN OF Di) + B2 x (Xit - MEAN OF Xi) + (Wit - MEAN OF Wi)
EX, FOR i=1 MEAN OF Y = 8. Y YEAR 1 = 8, Y YEAR 2 = 9, Y YEAR 3 = 7
Y(FE) = 8 - 8 = 0 FOR YEAR 1
Y(FE) = 9 - 8 = 1 FOR YEAR 2
Y(FE) = 7 - 8 = -1 FOR YEAR 3
FIRST DIFFERENCE MODEL
(Yit - Yi(t-1)) = Bfd x (Dit - Di(t-1)) + B2 x (Xit - Xi(t-1)) + (Wit - Wi(t-1))
EX, FOR i=1 MEAN OF Y = 8. Y YEAR 1 = 8, Y YEAR 2 = 9, Y YEAR 3 = 7
Y(FD) IS UNDEFINED FOR YEAR 1 (NO EARLIER YEAR TO SUBTRACT)
Y(FD) = 9 - 8 = 1 FOR YEAR 2
Y(FD) = 7 - 9 = -2 FOR YEAR 3
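The two transformations in the fixed effects and first difference examples can be sketched for a single unit as follows, using the three yearly Y values from the cards. Variable names are hypothetical.

```python
# Outcome for unit i=1 in years 1, 2 and 3 (values from the example).
y = [8, 9, 7]

# Fixed effects (within) transformation: subtract the unit's own mean.
mean_y = sum(y) / len(y)
y_fe = [value - mean_y for value in y]

# First difference: subtract the previous year; year 1 has no predecessor.
y_fd = [y[t] - y[t - 1] for t in range(1, len(y))]

print("Y(FE) per year:", y_fe)
print("Y(FD) from year 2:", y_fd)
```

Both transformations remove anything that is constant over time for the unit; the first difference loses one observation per unit.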
DIFFERENCE-IN-DIFFERENCE MODEL
Bdd = [E(Ypost I D=1) - E(Yante I D=1)] - [E(Ypost I D=0) - E(Yante I D=0)]
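A minimal numeric sketch of the difference-in-difference formula. The four group means are hypothetical illustration values, and the dictionary keys are made-up labels.

```python
# Hypothetical mean outcomes before (ante) and after (post) the intervention.
means = {
    ("treated", "ante"): 8.0,
    ("treated", "post"): 12.0,
    ("control", "ante"): 7.0,
    ("control", "post"): 9.0,
}

change_treated = means[("treated", "post")] - means[("treated", "ante")]
change_control = means[("control", "post")] - means[("control", "ante")]

# The control group's change stands in for the trend the treated group
# would have followed without treatment.
b_dd = change_treated - change_control
print(f"Bdd = {b_dd:.2f}")
```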
RDD (SHARP)
ATEsrd = E(Y1 I D=1, X=C) - E(Y0 I D=0, X=C)
LATEfrd
[E(Y I Z=1, X=C) - E(Y I Z=0, X=C)] / [E(D I Z=1, X=C) - E(D I Z=0, X=C)]
INSTRUMENTAL VARIABLES
Z > D > Y
CALCULATE % OF COMPLIERS, ALWAYS-TAKERS AND NEVER-TAKERS
[N(D=1, Z=1) / [N(D=1, Z=1) + N(D=0, Z=1)]] - [N(D=1, Z=0) / [N(D=1, Z=0) + N(D=0, Z=0)]]
N = NUMBER OF OBSERVATIONS IN EACH CELL:
N(D=1, Z=1) = 865
N(D=0, Z=1) = 1915
N(D=1, Z=0) = 1372
N(D=0, Z=0) = 5948
[865 / (865 + 1915)] - [1372 / (1372 + 5948)] = 0.311 - 0.187
= 0.124
COMPLIERS: 0.124
ALWAYS-TAKERS: 0.187 [N(D=1, Z=0) / [N(D=1, Z=0) + N(D=0, Z=0)]]
NEVER-TAKERS: 1 - (0.124 + 0.187) = 0.689 [1 - (COMPLIERS + ALWAYS-TAKERS)]
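A sketch of the compliance-type calculation, treating the four numbers on this card as cell counts of observations (an assumption) and assuming there are no defiers. The dictionary layout and the helper `share_treated` are hypothetical.

```python
# Cell counts keyed by (d, z), taken from the card.
n = {(1, 1): 865, (0, 1): 1915, (1, 0): 1372, (0, 0): 5948}

def share_treated(z):
    """Share of units with D=1 among those with instrument value z."""
    return n[(1, z)] / (n[(1, z)] + n[(0, z)])

always_takers = share_treated(0)                    # treated even when Z=0
compliers = share_treated(1) - share_treated(0)     # treated only because Z=1
never_takers = 1 - share_treated(1)                 # untreated even when Z=1

print(f"compliers={compliers:.3f}, always-takers={always_takers:.3f}, "
      f"never-takers={never_takers:.3f}")
```

The three shares add up to 1, which is a quick sanity check on the arithmetic.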
PATE
SUM OVER z OF [E(Y I D=1, S=1, Z=z) - E(Y I D=0, S=1, Z=z)] x P(Z=z) IN THE TARGET POPULATION
SAMPLE POPULATION EDUCATION (HI: 0.5, LOW: 0.5)
TARGET POPULATION EDUCATION (HI: 0.2, LOW: 0.8)
AVERAGE EFFECT RESULTS IN SAMPLE (HI: 2, LOW: 6)
PATE = (0.2 x 2) + (0.8 x 6) = 5.2
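The reweighting in the education example can be sketched as below, using the numbers from the card. The sample-weighted average is included only for comparison; the variable names are hypothetical.

```python
# Stratum-specific effects estimated in the sample, by education level.
effects = {"high": 2.0, "low": 6.0}

# Education shares in the sample and in the target population.
sample_shares = {"high": 0.5, "low": 0.5}
target_shares = {"high": 0.2, "low": 0.8}

# Weight the same stratum effects by two different distributions.
sample_average = sum(effects[k] * sample_shares[k] for k in effects)
pate = sum(effects[k] * target_shares[k] for k in effects)

print(f"Sample average effect = {sample_average:.2f}, PATE = {pate:.2f}")
```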
SUTVA
ASSUMPTION IN CAUSAL INFERENCE
- NO SPILLOVER EFFECTS
UNIT HOMOGENEITY ASSUMPTION
Y1i = Y1j, Y0i = Y0j. FOR ANY i AND j
NATE = ATE
ATE = ITE
D IS THE ONLY CAUSAL VARIABLE THAT AFFECTS Y
INDEPENDENCE ASSUMPTION
THINK OF CONDITIONAL INDEPENDENCE, BUT WITHOUT CONDITIONING ON X
EXCLUSION RESTRICTION
IN IV
Z DOES NOT HAVE A DIRECT EFFECT ON Y, ONLY ON D
PROBLEMS WITH CAUSAL INFERENCE
We cannot observe both potential outcomes for the same person; we only observe one.
Therefore we cannot directly observe the ATE = E[Y1 − Y0].
Which graphical criterion implies independence of D and Y1, Y0?
BACKDOOR CRITERION
D-SEPARATION
ALL PATHS BETWEEN TWO VARIABLES ARE CLOSED
We can deduce from a graph what correlations in the data are implied
We can find testable implications of our assumptions
Conditions needed for a perfect randomized experiment
Random sample, randomization of treatment, large N, subjects need to comply, subjects should not drop out, perfect measurement
Limits of randomized experiments
Costs (of performing equivalent randomized experiments to test each treatment of interest may be prohibitive)
Estimates based on results may be delayed for years
Ethical concerns (Harmful treatments, deception)
Realism and size of study population in field experiments (ethical concerns)
Some variables cannot be manipulated
Randomization is infeasible if we are interested in the effects of particular events in the past
Randomized experiments can break down: noncompliance, attrition
Conditional randomization
Conditional randomization means that we form groups of units with similar X and then actually randomize D within these groups.
Matching
Matching is a data-analysis algorithm that finds observations with the same or similar X but different treatment D.
Under which assumptions is the IV estimate equal to the LATE?
Exclusion restriction (no direct effect of Z on Y), (as-if) random assignment (no back-door paths from Z to D or Y), no defiers.
MMD
Mill's method of observed difference
Looking for cause of effects. Look for cases with different outcomes and look for the cause of the difference.
POF
Potential outcomes framework
Looking for effects of causes. Look for cases with different D (causes) and look at the difference in outcome.
In-time placebo
Apply the method to dates when the intervention
did not occur (e.g., change dataset so that Germany was unified in 1980
instead of 1990, and estimate the “effect”)
In-space placebo
Re-assignment of the intervention to control units (e.g.,
change dataset so that Italy was unified in 1990, while Germany was not,
and estimate “effect” on Italy)
LATE assumptions
- Relevance (Z creates variation in D)
- Exogeneity (Z is randomly assigned)
- Exclusion restriction (Z affects the outcome only through D)
- Monotonicity (There are no defiers)
ATE WHEN WE DO NOT KNOW THE ATC OR ATT
SUM OVER x OF [E(Y1 I D=1, X=x) - E(Y0 I D=0, X=x)] x (NUMBER OF OBSERVATIONS WITH X=x) / N
BASICALLY THIS IS
SUM OVER x OF [E(Y I D=1, X=x) - E(Y I D=0, X=x)] x P(X=x)
Assumptions for POF
Exchangeability: of participants included in the study and members of the target population, possibly conditional on pre-treatment characteristics Z
Positivity: the probability of treatment in the sample population is greater than 0 within all strata of Z (it does not have to be 1, or equal for all subjects)
No interference: within the study and the target population
ATE WHEN WE LACK COUNTERFACTUALS (AND HAVE AN X VALUE)
[E(Y1 I D=1, X=1) - E(Y0 I D=0, X=1)] x N(X=1)/N + [E(Y1 I D=1, X=0) - E(Y0 I D=0, X=0)] x N(X=0)/N