PSM Flashcards

1
Q

What is exact matching?

A

Comparing treated and untreated individuals whose values of x are identical.

Rarely an option in practice, since it is often difficult to find T and C units with identical values, especially when x contains many or continuous covariates.

2
Q

What is the purpose of matching?

A

To construct, from the non-treated units, a comparison group that reproduces the treatment group's observed characteristics

3
Q

What two conditions must be met to implement matching estimators?

A
  1. Conditional independence assumption (CIA): There exists a set x of observable covariates such that after controlling for these covariates, the potential outcomes are independent of T status
  2. Common support assumption: For each value of x, there is a positive probability of being both treated and untreated (you can find a treated unit to match with an untreated unit)
4
Q

What is the CIA?

A

Conditional independence assumption (CIA): There exists a set x of observable covariates such that after controlling for these covariates, the potential outcomes are independent of T status

5
Q

What is the common support assumption?

A

Common support assumption: For each value of x, there is a positive probability of being both treated and untreated (you can find a treated unit to match with an untreated unit)

6
Q

How is the CIA used to construct a counterfactual for the treatment group?

A

It implies that after controlling for x, the assignment of units to T is “as good as random”, so untreated units with the same x can serve as the counterfactual for treated units

7
Q

What assumption does the CIA require?

A

That all variables that jointly influence treatment assignment and the potential outcomes are observed and included in x (no selection on unobservables)

8
Q

Why is PSM called a “data-hungry” method?

A

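It needs both many covariates and many observations: a rich set of observed variables to make the CIA plausible and to estimate the propensity score, and a large sample so that close matches exist for the treated units within the region of common support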

9
Q

What is the propensity score?

A

The probability that a unit in the combined sample of treated and untreated units receives the T, given a set of observed variables

10
Q

What does the propensity score theorem say?

A

You only need to control for the probability of treatment: if, conditional on xi, Ti and (Y1i, Y0i) are independent, then conditional on the propensity score p(xi), Ti and (Y1i, Y0i) are also independent

11
Q

What is the formula for the ATE conditional on propensity score?

A

ATE conditional on propensity score = E[Y1i - Y0i | p(xi)]

12
Q

Three steps for estimating program impact using PSM?

A
  1. Estimate propensity score
  2. Choose matching algorithm
  3. Estimate the impact of the intervention with the matched sample
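
The three steps above can be sketched end to end in a few lines. This is only an illustrative sketch under stated assumptions: a single covariate, a hand-rolled logit fitted by gradient ascent, 1-nearest-neighbor matching with replacement, and toy data with a true effect of 2.0. The helper names (`fit_logit`, `att_nearest_neighbor`) are hypothetical, not from any library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(xs, ts, lr=0.1, steps=5000):
    """Step 1: fit P(T=1 | x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        grad_a = sum(t - sigmoid(a + b * x) for x, t in zip(xs, ts))
        grad_b = sum((t - sigmoid(a + b * x)) * x for x, t in zip(xs, ts))
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def att_nearest_neighbor(scores, ts, ys):
    """Steps 2 and 3: match each treated unit to the untreated unit with the
    closest score (with replacement) and average the outcome differences."""
    treated = [i for i, t in enumerate(ts) if t == 1]
    controls = [i for i, t in enumerate(ts) if t == 0]
    diffs = []
    for i in treated:
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        diffs.append(ys[i] - ys[j])
    return sum(diffs) / len(diffs)

# Toy data (assumed for illustration): outcome = x + 2 * treatment.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ts = [0, 0, 1, 0, 1, 0, 1, 1]
ys = [x + 2.0 * t for x, t in zip(xs, ts)]

a, b = fit_logit(xs, ts)
scores = [sigmoid(a + b * x) for x in xs]
att = att_nearest_neighbor(scores, ts, ys)
```

With so few units the matches are imperfect, so the estimate differs somewhat from the true effect of 2.0; that gap is the matching bias the later cards discuss.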
13
Q

True or false: Use flexible functional form to estimate propensity score

A

True: you want to allow for possible nonlinearities in the participation model (e.g., include higher-order terms and interaction terms)
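
As a rough illustration of a "flexible functional form", the covariates fed to the participation model can be expanded with squares and pairwise interactions. The function name `expand_features` is hypothetical, and which terms to include is a modelling choice, not something the card prescribes.

```python
from itertools import combinations

def expand_features(row):
    """Return the original covariates plus their squares and pairwise interactions."""
    squares = [v * v for v in row]
    interactions = [u * v for u, v in combinations(row, 2)]
    return list(row) + squares + interactions

# Two covariates (x1, x2) become (x1, x2, x1^2, x2^2, x1*x2).
expanded = expand_features([2.0, 3.0])
```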

14
Q

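With or without replacement: which is better?

A

Without replacement: each comparison unit can be matched with only one treated unit.

With replacement: a comparison unit can be matched to several treated units. Estimators are more stable if a number of comparison cases are considered for each treated case, so matching with replacement is usually preferred.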

15
Q

What is nearest neighbor matching?

A

The individual from the comparison group with the closest propensity score is chosen as the match. This can be done with or without replacement.
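
A minimal sketch of nearest-neighbor matching on already-estimated scores, showing how the replacement choice changes the matches. The greedy processing order (treated units in the order given) and the function name `nn_match` are assumptions made for illustration.

```python
def nn_match(treated, controls, replacement=True):
    """Match each treated score to the closest available control score.

    Returns (treated_index, control_index) pairs. Without replacement, a
    control is removed from the pool once it has been used."""
    available = set(range(len(controls)))
    pairs = []
    for i, p in enumerate(treated):
        if not available:
            break  # no controls left to match
        # Ties are broken by the lower control index.
        j = min(available, key=lambda c: (abs(controls[c] - p), c))
        pairs.append((i, j))
        if not replacement:
            available.remove(j)
    return pairs

treated = [0.60, 0.62, 0.80]
controls = [0.61, 0.40, 0.75]
with_rep = nn_match(treated, controls, replacement=True)
without_rep = nn_match(treated, controls, replacement=False)
```

In this toy data the control at 0.61 is reused when replacement is allowed. Without replacement, the last treated unit (0.80) is left with the distant control at 0.40, which illustrates the worse matches mentioned in later cards.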

16
Q

What is radius matching?

A

Specify a caliper (maximum propensity score difference) and match each treated unit with all comparison units whose scores fall within that radius

17
Q

Implications for bias and variance of reducing the caliper?

A

Reduces the bias

Increases variance
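
The caliper trade-off in radius matching can be seen in a toy example. This sketch assumes scores are already estimated; `radius_match` is a hypothetical helper name and the score values are made up.

```python
def radius_match(treated, controls, caliper):
    """For each treated score, list the indices of all control scores within the caliper."""
    return {
        i: [j for j, q in enumerate(controls) if abs(q - p) <= caliper]
        for i, p in enumerate(treated)
    }

treated = [0.30, 0.55, 0.90]
controls = [0.28, 0.33, 0.50, 0.58, 0.70]

wide = radius_match(treated, controls, caliper=0.10)
narrow = radius_match(treated, controls, caliper=0.025)
```

With the wider caliper, two of the treated units each get two matches. With the narrower one, only the closest pair survives: the remaining match is better (less bias) but far fewer observations contribute (more variance), and treated units with no match drop out.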

18
Q

How do you implement the kernel method?

A

Choose a kernel function, specify bandwidth parameter

Compare each treated unit to a weighted average of the outcomes of all untreated units, with higher weights placed on untreated units with scores closer to that of treated individual
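
A sketch of the weighting idea, assuming a Gaussian kernel (the card does not say which kernel to use) and scores that are already estimated. The function names and the default bandwidth are illustrative choices only.

```python
import math

def kernel_counterfactual(p, control_scores, control_outcomes, bandwidth):
    """Weighted average of untreated outcomes, with weights decaying in score distance."""
    weights = [math.exp(-0.5 * ((q - p) / bandwidth) ** 2) for q in control_scores]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, control_outcomes)) / total

def kernel_att(treated_scores, treated_outcomes, control_scores, control_outcomes,
               bandwidth=0.1):
    """Average over treated units of (own outcome - kernel-weighted counterfactual)."""
    diffs = [
        y - kernel_counterfactual(p, control_scores, control_outcomes, bandwidth)
        for p, y in zip(treated_scores, treated_outcomes)
    ]
    return sum(diffs) / len(diffs)

# A control at the same score dominates one that is far away.
cf = kernel_counterfactual(0.5, [0.5, 0.9], [0.0, 10.0], bandwidth=0.1)

# If all untreated outcomes are 1.0 and all treated outcomes are 3.0, the ATT is 2.0.
att = kernel_att([0.4, 0.6], [3.0, 3.0], [0.3, 0.5, 0.7], [1.0, 1.0, 1.0])
```

A larger bandwidth spreads weight over more distant untreated units; a smaller one concentrates it on the closest units.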

19
Q

Implications for bias and efficiency of choosing only one neighbor for nearest neighbor matching?

A

Minimizes bias by using only the most similar observation

Ignores information from other close comparison units, which reduces efficiency

20
Q

Conventional method for calculating standard errors from PSM estimates?

A

Bootstrapping: resample from the analysis sample with replacement, re-estimate the impact on each replicate, repeat many times, and use the spread of the estimates as the standard error
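
A minimal sketch of the bootstrap idea. It assumes the propensity scores are taken as given (a fuller version would re-estimate them inside every replicate), uses a 1-nearest-neighbor ATT as the estimator, and runs on made-up data. The function names are hypothetical.

```python
import random
import statistics

def att_nn(sample):
    """1-nearest-neighbor ATT (with replacement) on (score, treated, outcome) rows."""
    treated = [(p, y) for p, t, y in sample if t == 1]
    controls = [(p, y) for p, t, y in sample if t == 0]
    if not treated or not controls:
        return None  # estimator is undefined for this resample
    diffs = []
    for p, y in treated:
        _, y0 = min(controls, key=lambda c: abs(c[0] - p))
        diffs.append(y - y0)
    return sum(diffs) / len(diffs)

def bootstrap_se(data, estimator, reps=200, seed=0):
    """Standard deviation of the estimator across resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        resample = [rng.choice(data) for _ in data]
        est = estimator(resample)
        if est is not None:
            estimates.append(est)
    return statistics.stdev(estimates)

# Made-up rows: (propensity score, treatment indicator, outcome).
data = (
    [(0.1 * k, 0, 1.0 * k) for k in range(1, 7)]
    + [(0.1 * k + 0.05, 1, 1.0 * k + 2.0) for k in range(3, 9)]
)
point = att_nn(data)
se = bootstrap_se(data, att_nn)
```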

21
Q

You need to be sure that the measures used to generate the propensity score are not confounded with outcomes or anticipation of treatment. What types of measures should you use?

A
  • stable over time or
  • deterministic (e.g., age) or
  • measured before participation
22
Q

How to check specification of your model re CIA?

A

Balancing tests (does the estimated propensity score adequately balance characteristics between T and C group units?)
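
One common balance diagnostic is the standardized mean difference of each covariate between the matched T and C units. The card does not name a specific test, so treat this as one possible check rather than the prescribed one; the function name is hypothetical.

```python
import statistics

def standardized_difference(treated_x, control_x):
    """Difference in means divided by the pooled standard deviation of the covariate."""
    mean_diff = statistics.mean(treated_x) - statistics.mean(control_x)
    pooled_sd = ((statistics.variance(treated_x) + statistics.variance(control_x)) / 2) ** 0.5
    return mean_diff / pooled_sd

# Covariate "age" in a well-balanced and a poorly balanced matched sample (made-up values).
balanced = standardized_difference([30, 40, 50], [30, 40, 50])
unbalanced = standardized_difference([40, 50, 60], [30, 40, 50])
```

Values near zero after matching suggest the scores balance that covariate; large values suggest the propensity score model should be respecified.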

23
Q

How to check specification of your model re common support?

A
  • visual inspection of densities of propensity scores
  • comparison test such as Kolmogorov-Smirnov
  • are there big differences between maxima and minima of density distributions?
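
The minima and maxima comparison in the last bullet can be sketched as follows. It assumes the scores are already estimated; the rule of dropping units outside the overlapping range is one simple way to impose common support, and the helper names are hypothetical.

```python
def common_support_bounds(treated_scores, control_scores):
    """Overlap region: from the larger of the two minima to the smaller of the two maxima."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high

def on_support(scores, low, high):
    """Keep only the scores that fall inside the overlap region."""
    return [s for s in scores if low <= s <= high]

treated_scores = [0.3, 0.5, 0.9]
control_scores = [0.1, 0.4, 0.7]

low, high = common_support_bounds(treated_scores, control_scores)
treated_kept = on_support(treated_scores, low, high)
control_kept = on_support(control_scores, low, high)
```

Here the treated unit at 0.9 has no untreated unit with a comparable score, so it falls outside the common support.
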
24
Q

What are we doing when we use propensity score to calculate ATT?

A

For each propensity score, we calculate the difference in mean outcomes for the treated and untreated with that p(X)

We then take a weighted average of these over the different propensity score values

25
Q

Two advantages of PSM over regression that controls for x?

A
  1. Matching does not require assumptions about functional form (e.g., a linear relationship)
  2. Regression runs risk of extrapolating onto a space where there is little common support
26
Q

5 requirements for covariate selection

A
  1. Choose x’s so that unconfoundedness holds
  2. Should be correlated with treatment (Di) and outcome
  3. Selection should be based on theory
  4. x’s should be measured before treatment and not affected by it
  5. x’s should not be too good at predicting treatment, since we are relying on common support
27
Q

Implications for bias and standard errors of implementing nearest neighbor with replacement?

A

Better matches, so possibly less bias

Higher standard errors

28
Q

Implications for bias and standard errors of implementing nearest neighbor without replacement?

A

Worse matches, so possibly more bias

Lower standard errors

29
Q

Why would NN matching without replacement lead to lower standard errors?

A

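It uses more variation: each comparison unit is used only once, so the estimate draws on more distinct comparison observations, whereas matching with replacement may reuse a few units heavily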

30
Q

Formula for propensity score matching estimator for ATT?

A

E[Y(1)|D=1, P(x)] - E[Y(0)|D=0, P(x)]

(treated minus untreated). Note that the second term substitutes for the unobserved term that we really want to know, which is E[Y(0)|D=1, P(x)]
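
Written out in the card's own notation, and assuming the CIA and common support hold, the substitution and the averaging over the treated units' propensity scores look roughly like this:

```latex
\mathrm{ATT}
  = E_{P(x)\mid D=1}\Big[\, E[Y(1)\mid D=1, P(x)] - E[Y(0)\mid D=0, P(x)] \,\Big],
\qquad
E[Y(0)\mid D=1, P(x)] = E[Y(0)\mid D=0, P(x)] \quad \text{(by the CIA)}.
```

The outer expectation is the weighted average over propensity score values among the treated.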

31
Q

What is stratification (interval) matching?

A

Partitions the common support into intervals (strata) and then calculates mean differences in outcomes between treated and untreated units within these strata
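
A rough sketch of the idea, under several assumptions: equal-width strata over [0, 1] (rather than over an estimated common support), strata lacking either group are dropped, and the within-stratum differences are weighted by the number of treated units to give an ATT. The function name and data are made up.

```python
def stratified_att(scores, treated, outcomes, n_strata=5):
    """Average within-stratum mean differences, weighted by treated counts."""
    strata = {}
    for p, t, y in zip(scores, treated, outcomes):
        k = min(int(p * n_strata), n_strata - 1)  # stratum index for this score
        strata.setdefault(k, {0: [], 1: []})[t].append(y)

    total_diff, total_treated = 0.0, 0
    for groups in strata.values():
        y1, y0 = groups[1], groups[0]
        if not y1 or not y0:
            continue  # no overlap in this stratum, so it is skipped
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        total_diff += len(y1) * diff
        total_treated += len(y1)
    return total_diff / total_treated

scores = [0.10, 0.15, 0.45, 0.50, 0.55, 0.90]
treated = [1, 0, 1, 1, 0, 1]
outcomes = [5.0, 3.0, 8.0, 10.0, 6.0, 12.0]

att = stratified_att(scores, treated, outcomes)
```

In this toy data the first stratum contributes a difference of 2 (one treated unit), the middle stratum a difference of 3 (two treated units), and the top stratum is skipped because it has no untreated unit.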