F7 Matching and subclassification Flashcards
What is Jan’s strategy?
Try different types of matching and see whether the results hold. You want as many matches for each treated unit as possible.
What is subclassification, and what are the problems with this method?
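Stratify on the confounders, take the difference in mean outcomes between the treatment and control group within each stratum, and sum these differences weighted by strata-specific weights.
Problem: Curse of dimensionality (difficult with many confounders, because the number of strata grows quickly and some strata end up without treated or control units). The method then becomes impractical.
Solution: Collapse groups (do not impute data).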
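The weighted sum described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name subclassification_ate, the toy rows, and the choice of weights (each stratum's share of the full sample, which targets the average treatment effect) are assumptions made for the example.

```python
from collections import defaultdict

def subclassification_ate(rows):
    """rows: iterable of (stratum, treated, outcome), with treated in {0, 1}."""
    groups = defaultdict(lambda: {0: [], 1: []})
    for stratum, treated, outcome in rows:
        groups[stratum][treated].append(outcome)

    n_total = sum(len(g[0]) + len(g[1]) for g in groups.values())
    estimate = 0.0
    for stratum, g in groups.items():
        # An empty cell is exactly the curse-of-dimensionality problem.
        if not g[0] or not g[1]:
            raise ValueError(f"stratum {stratum!r} lacks treated or control units")
        diff = sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
        weight = (len(g[0]) + len(g[1])) / n_total  # assumed: sample share
        estimate += weight * diff
    return estimate

# Hypothetical toy data: one discrete confounder with two strata.
rows = [
    ("young", 1, 10.0), ("young", 1, 12.0), ("young", 0, 8.0), ("young", 0, 6.0),
    ("old", 1, 20.0), ("old", 0, 17.0),
]
ate_estimate = subclassification_ate(rows)
```

Here the "young" stratum contributes a difference of 4 with weight 4/6 and the "old" stratum a difference of 3 with weight 2/6.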
What is exact matching?
Match each treated unit with control units that have exactly the same values on all confounders (demanding on data: the confounders must be discrete).
You quickly run into the curse of dimensionality.
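A small sketch of exact matching, assuming the target is the average treatment effect on the treated and that each treated unit is compared with the mean outcome of all controls sharing its covariate values. The function name exact_match_att and the toy units are hypothetical.

```python
from collections import defaultdict

def exact_match_att(units):
    """units: list of (covariates_tuple, treated, outcome)."""
    controls = defaultdict(list)
    for x, d, y in units:
        if d == 0:
            controls[x].append(y)

    effects = []
    unmatched = 0
    for x, d, y in units:
        if d == 1:
            if x in controls:
                # Compare with the mean of all exactly matching controls.
                effects.append(y - sum(controls[x]) / len(controls[x]))
            else:
                unmatched += 1  # no control with identical covariates
    att = sum(effects) / len(effects) if effects else float("nan")
    return att, unmatched

# Hypothetical toy data with two discrete confounders.
units = [
    (("f", 1), 1, 5.0),
    (("f", 1), 0, 3.0),
    (("f", 1), 0, 4.0),
    (("m", 0), 1, 9.0),  # treated unit without an exact control match
    (("m", 1), 0, 2.0),
]
att_exact, n_unmatched = exact_match_att(units)
```

The count of unmatched treated units shows how quickly exact matching loses observations as confounders are added.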
There are two kinds of approximate matching. What are they?
Approximate matching is a method where you minimize the distance between the control and the treated unit.
Nearest neighbor
Propensity score matching
What is nearest neighbor matching?
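For each treated unit, we pick the control unit that minimizes the distance across all confounding variables.
To account for different scaling, the normalized Euclidean distance is used: each squared difference is divided by the variance of that variable, so variables with greater dispersion do not dominate the distance.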
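As an illustration of the normalized Euclidean distance, the sketch below matches each treated unit to its single nearest control (with replacement) and averages the outcome differences. The function names, the use of sample variances from the pooled data, and the toy numbers are assumptions for the example.

```python
import math
import statistics

def normalized_euclidean(a, b, variances):
    # Each squared difference is scaled by that covariate's variance.
    return math.sqrt(sum((ai - bi) ** 2 / v for ai, bi, v in zip(a, b, variances)))

def nearest_neighbor_att(treated, controls):
    """treated, controls: lists of (covariates_tuple, outcome)."""
    all_x = [x for x, _ in treated + controls]
    # Assumes every covariate has non-zero variance in the pooled sample.
    variances = [statistics.variance(col) for col in zip(*all_x)]
    effects = []
    for x_t, y_t in treated:
        _, y_match = min(
            controls, key=lambda c: normalized_euclidean(x_t, c[0], variances)
        )
        effects.append(y_t - y_match)
    return sum(effects) / len(effects)

# Hypothetical toy data: covariates are (age, income), on very different scales.
treated_units = [((30, 40000), 10.0), ((50, 60000), 20.0)]
control_units = [((31, 41000), 8.0), ((49, 58000), 17.0), ((70, 20000), 5.0)]
att_nn = nearest_neighbor_att(treated_units, control_units)
```

Without the variance scaling, income would dominate the distance simply because it is measured in larger units.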
What is propensity score matching?
Confounders are collapsed into a single dimension: a propensity score in [0,1] indicating the probability of being in the treatment group conditional on the confounders.
So PSM reduces the problem of finding a suitable match to one dimension: p(X) = Pr(D=1|X).
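A rough sketch of the two steps, assuming the propensity score is estimated with a logit model on a single confounder and that each treated unit is matched to the control with the closest score. The logit is fitted by plain gradient ascent to keep the example self-contained; in practice a statistics library would be used. All names and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(xs, ds, lr=0.1, steps=5000):
    """Fit Pr(D=1|x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(d - sigmoid(a + b * x) for x, d in zip(xs, ds))
        grad_b = sum((d - sigmoid(a + b * x)) * x for x, d in zip(xs, ds))
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def match_on_score(scores, ds):
    """Map each treated index to the control index with the closest score."""
    controls = [i for i, d in enumerate(ds) if d == 0]
    return {
        i: min(controls, key=lambda j: abs(scores[i] - scores[j]))
        for i, d in enumerate(ds)
        if d == 1
    }

# Hypothetical toy data: treatment becomes more likely as x increases.
xs = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
ds = [0, 0, 1, 0, 1, 0, 1, 1]
a_hat, b_hat = fit_logit(xs, ds)
scores = [sigmoid(a_hat + b_hat * x) for x in xs]
matches = match_on_score(scores, ds)
```

The treated unit at x = 2 has no control with the same score and is matched to the nearest one, at x = 1.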
What are two important assumptions for matching?
Common support: At all values of the propensity score (or over the relevant range), we want units in both the control and treatment group.
Conditional independence assumption: Conditional on X (the confounders), Y^1 and Y^0 are independent of treatment D. That is, once we factor out the confounders, treatment is as good as random.
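The common support condition can be checked directly on estimated propensity scores. The sketch below assumes one simple rule, keeping only units whose score lies in the range where treated and control scores overlap; other trimming rules exist. The function name and the scores are hypothetical.

```python
def common_support(scores, ds):
    """Return the overlap region (lo, hi) and a keep-flag per unit."""
    treated = [s for s, d in zip(scores, ds) if d == 1]
    controls = [s for s, d in zip(scores, ds) if d == 0]
    lo = max(min(treated), min(controls))
    hi = min(max(treated), max(controls))
    keep = [lo <= s <= hi for s in scores]
    return lo, hi, keep

# Hypothetical propensity scores: first four units are controls, last four treated.
example_scores = [0.05, 0.2, 0.4, 0.6, 0.3, 0.5, 0.8, 0.95]
example_ds = [0, 0, 0, 0, 1, 1, 1, 1]
lo, hi, keep = common_support(example_scores, example_ds)
n_dropped = keep.count(False)
```

Half of the toy sample falls outside the overlap region, which illustrates the sample-size cost of enforcing common support.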
What are advantages of propensity score matching?
No curse of dimensionality.
It’s possible to check the significance level of the confounders, i.e. whether they matter or not.
What does Jan think about matching?
It can be successful. It’s superior to multivariate regression because you check the common support assumption, especially with exact matching.
What is CIA?
Conditional independence assumption.
With a cross-sectional dataset we are forced to assume that D is independent of the potential outcomes once we condition on covariates.
What does garbage in, garbage out mean?
The quality of the propensity score/matching is intimately linked to the quality of the covariates.
What is one advantage of matching?
It casts light on important concepts such as common support, which are normally hidden in regression.
What are criticisms of matching?
Enforcing common support means you reduce the sample size (bias-variance trade-off).
Propensity score matching: Two units can have the same score (e.g. 0.8) for very different reasons, i.e. with very different covariate values.
The larger the distance to the match, the larger the bias (the bias converges to zero as the sample size grows). You could use the bias-corrected estimator.
Unobserved confounders.
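The bias-correction idea mentioned above can be sketched as follows. The assumptions are loud ones: a single confounder, a linear regression of the outcome on that confounder fitted among controls only, and one nearest-neighbor match per treated unit. The matched difference is then adjusted by the predicted outcome gap caused by the remaining covariate distance. Function names and data are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of a simple linear regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def bias_corrected_att(treated, controls):
    """treated, controls: lists of (x, y) with a single confounder x."""
    slope = ols_slope([x for x, _ in controls], [y for _, y in controls])
    raw, corrected = [], []
    for x_t, y_t in treated:
        x_c, y_c = min(controls, key=lambda c: abs(c[0] - x_t))
        raw.append(y_t - y_c)
        # Subtract the predicted control-outcome gap due to x_t != x_c.
        corrected.append((y_t - y_c) - slope * (x_t - x_c))
    return sum(raw) / len(raw), sum(corrected) / len(corrected)

# Hypothetical toy data: control outcomes follow y = 2x exactly.
bc_controls = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
bc_treated = [(2.4, 10.0)]
att_raw, att_corrected = bias_corrected_att(bc_treated, bc_controls)
```

In this toy case the nearest control sits at x = 2 rather than x = 2.4, so the raw matched difference overstates the effect and the correction removes the part explained by the covariate gap.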