DiD & Synthetic Controls Flashcards

1
Q

What are the main identifying assumptions in the DiD setting?

A
  • Assumption 1: The treatment effect has an additive structure. Specifically, the potential outcome in the
    absence of treatment evolves as:
    $$
    y^0_{it}=\eta_i+\delta_t+\epsilon_{it}
    $$
    This is a technical assumption that lets us describe the different scenarios in terms of the coefficients. It ensures that the difference in expectations = ATE + bias.
  • Assumption 2: Parallel trends.
2
Q

What, explicitly, is the parallel trends assumption?

A

In the absence of treatment, the potential outcomes in the treated and control groups would evolve in parallel. This assumption ensures that bias = 0, so our DiD estimate = ATE (strictly speaking, the average effect on the treated, ATT).
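
A minimal sketch of the 2x2 DiD calculation, to make the "bias = 0" point concrete. The function name `did_estimate` and all numbers are made up for illustration: both groups are given the same time trend (+2), so parallel trends holds by construction, and the treated group gets an extra +3 after treatment.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences from four lists of outcomes."""
    def mean(xs):
        return sum(xs) / len(xs)

    change_treated = mean(treated_post) - mean(treated_pre)
    change_control = mean(control_post) - mean(control_pre)
    # The control group's change stands in for the treated group's
    # counterfactual change; this is only valid under parallel trends.
    return change_treated - change_control


# Hypothetical data: common trend of +2, treatment effect of +3.
control_pre = [1.0, 2.0, 3.0]
control_post = [3.0, 4.0, 5.0]
treated_pre = [5.0, 6.0, 7.0]       # different level than controls (allowed)
treated_post = [10.0, 11.0, 12.0]   # pre + 2 (trend) + 3 (effect)

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

Note that the two groups have different levels before treatment; DiD only needs the changes to be comparable, not the levels.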

3
Q

Use the potential outcomes framework to write the DiD estimand as ATT + selection bias.

A

See econometrics 1
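
One standard way to write the decomposition, as a reminder. The notation is my own assumption and not taken from the course notes: two periods ($t=0$ before and $t=1$ after treatment), $D_i=1$ for treated units, $y^1$ and $y^0$ the potential outcomes with and without treatment, and no anticipation (so treated units have $y_{i0}=y^0_{i0}$).

$$
\underbrace{E[y_{i1}-y_{i0}\mid D_i=1]-E[y_{i1}-y_{i0}\mid D_i=0]}_{\text{DiD}}
=\underbrace{E[y^1_{i1}-y^0_{i1}\mid D_i=1]}_{\text{ATT}}
+\underbrace{E[y^0_{i1}-y^0_{i0}\mid D_i=1]-E[y^0_{i1}-y^0_{i0}\mid D_i=0]}_{\text{selection bias}}
$$

The bias term is the difference in untreated trends between the two groups, so it is zero exactly when parallel trends holds.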

4
Q

What problem do synthetic controls resolve?

A

The problem of choosing the right control group.

One could use some type of matching to get a better control group. Then, you can run simple DiD on the matched sample. Exact matching on characteristics seems like a good idea, but exact matching with many covariates is impossible.

The synthetic control method helps us choose a control group in a transparent, non-subjective way.

5
Q

Describe the idea behind the synthetic control method.

A

The synthetic control method helps us choose a control group in a transparent, non-subjective way. Useful when no natural control group exists.

This method constructs an artificial counterfactual that serves as the control group by taking weighted averages of existing entities.

The idea behind synthetic control is to weight units in the control group so that a “synthetic” unit matches the treated unit based on $Z_i$ and the pre-treatment outcome, $y_{it}$. As a result, the synthetic control unit and the treated unit will be aligned by construction, and differences post-treatment are interpreted as the (dynamic) treatment effect.

6
Q

What are the identifying assumptions in synthetic control?

A

I don't understand this.

7
Q

How is the synthetic control constructed?

A

We construct the synthetic control as a weighted average of the control units. A different set of weights will give rise to a different SC.

If conditions are met, the synthetic control replicates the missing counterfactual.
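
A rough, dependency-free sketch of how such weights could be found. Assumptions on my part: we match on pre-treatment outcomes only (no covariates $Z_i$), and I use projected gradient descent with a simplex projection as the solver, which is my choice for a short example and not necessarily what standard software does. The names `project_to_simplex`, `sc_weights` and the data are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold from the largest valid k
    return [max(x - theta, 0.0) for x in v]


def sc_weights(y_treated_pre, donors_pre, n_iter=5000):
    """Weights minimising squared pre-treatment error, with w >= 0, sum(w) = 1."""
    n_donors = len(donors_pre)
    n_periods = len(y_treated_pre)
    # Conservative step: 1 / (2 * sum of squared donor outcomes) is never
    # larger than 1 / L, with L the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(y * y for donor in donors_pre for y in donor))
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(n_iter):
        residual = [
            y_treated_pre[t] - sum(w[j] * donors_pre[j][t] for j in range(n_donors))
            for t in range(n_periods)
        ]
        grad = [
            -2.0 * sum(residual[t] * donors_pre[j][t] for t in range(n_periods))
            for j in range(n_donors)
        ]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_donors)])
    return w


# Hypothetical pre-treatment outcomes for three donor units.
donors = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [10.0, 0.0, 10.0, 0.0],
]
# Treated unit built as 0.25 * donor 0 + 0.75 * donor 1, so those weights fit exactly.
treated = [4.0, 5.0, 6.0, 7.0]

weights = sc_weights(treated, donors)
print([round(x, 3) for x in weights])
```

Because the weights are forced onto the simplex, the synthetic unit is always an interpolation of the donors, never an extrapolation.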

8
Q

What is the unbiased synthetic control estimator?

A

$$
\hat \alpha_{1t} = y_{1t}-\sum_{j=2}^{J+1}w^*_j y_{jt}
$$

Alpha is the parameter of interest (usually we use beta).

The estimator is simply the difference between the observed outcome and the synthetic control.

SC is thus a generalisation of the DiD model to the case in which the unobserved confounders are allowed to vary over time.
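
The estimator is just "observed minus weighted donors", period by period. A tiny sketch with made-up numbers; the function name `sc_gaps` is hypothetical and the weights are taken as given:

```python
def sc_gaps(y_treated, y_donors, weights):
    """Gap per period: treated outcome minus the weighted average of donors."""
    gaps = []
    for t, y1 in enumerate(y_treated):
        synthetic = sum(w * donor[t] for w, donor in zip(weights, y_donors))
        gaps.append(y1 - synthetic)
    return gaps


# Hypothetical data: two pre-treatment periods, one post-treatment period.
y_treated = [10.0, 11.0, 15.0]
y_donors = [
    [8.0, 9.0, 10.0],
    [12.0, 13.0, 14.0],
]
weights = [0.5, 0.5]  # assumed already estimated on the pre-period

gaps = sc_gaps(y_treated, y_donors, weights)
print(gaps)
```

In this toy case the gaps are zero before treatment (perfect fit) and the post-treatment gap is read as the effect in that period.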

9
Q

What do we mean by overfitting in the case of synthetic control (SC)?

A

Using many pre-treatment variables leads to a better control, but might also lead to overfitting! It could instead be better to use a summary index of the variables.

10
Q

How does SC compare to a weighted regression, where we simply give different control units different weights?

A

With weighted regression, we can get negative regression weights. Synthetic control avoids this problem: its weights are constrained to be non-negative (0 at the lowest), so we avoid extrapolation. It is also a more transparent way of choosing weights.

11
Q

What type of inference do we do with SC?

A

Large sample asymptotic inference is not possible with SC. We instead use a permutation method for inference to calculate a p-value.

12
Q

When is the SC estimator consistent?

A

When $T \to \infty$.

So we need a relatively large pre-treatment window.

13
Q

Describe the permutation method used for inference in SC

A

  1. Estimate a “placebo” treatment effect for each unit in the “donor pool” via SC methods: e.g. California goes into the control pool and each of the 38 states becomes treated in turn.
  2. Calculate an empirical p-value for the effect estimated on the treated unit.
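
A sketch of step 2, assuming each unit's estimated effect has already been summarised in a single test statistic (for instance a post/pre MSPE ratio). One common convention, which I assume here, is to count the treated unit among the permutations, so the smallest possible p-value is 1/(number of units). The function name and numbers are hypothetical.

```python
def permutation_p_value(treated_stat, placebo_stats):
    """Share of all units whose statistic is at least as large as the treated unit's."""
    all_stats = [treated_stat] + list(placebo_stats)
    n_extreme = sum(1 for s in all_stats if s >= treated_stat)
    return n_extreme / len(all_stats)


# Hypothetical statistics: one treated unit and four placebo (donor) units.
treated_stat = 9.0
placebo_stats = [1.0, 2.5, 0.7, 12.0]

p_value = permutation_p_value(treated_stat, placebo_stats)
print(p_value)
```

With few donors the p-value is coarse: with 38 placebo units plus the treated one, it cannot go below 1/39.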
14
Q

What is one way to measure fit for SC?

A

Sometimes SC fails to find a good synthetic control, so the pre-treatment match is poor. One way to measure fit is to calculate the pre-treatment mean squared prediction error (MSPE) for each unit and discard those with an MSPE more than 2, 5 or 10 times that of the treated unit. To avoid choosing this threshold arbitrarily, one could also calculate the MSPE for the post period and look at the distribution of the post/pre MSPE ratio.
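
A small sketch of the MSPE and the post/pre ratio for one unit. Names and data are made up, and `n_pre` (the number of pre-treatment periods) is an assumed input:

```python
def mspe(actual, synthetic):
    """Mean squared prediction error between a unit and its synthetic control."""
    return sum((a - s) ** 2 for a, s in zip(actual, synthetic)) / len(actual)


def post_pre_mspe_ratio(actual, synthetic, n_pre):
    """Post-treatment MSPE divided by pre-treatment MSPE."""
    pre = mspe(actual[:n_pre], synthetic[:n_pre])
    post = mspe(actual[n_pre:], synthetic[n_pre:])
    return post / pre  # undefined if the pre-treatment fit is exactly perfect


# Hypothetical unit: three pre-treatment periods, two post-treatment periods.
actual = [1.0, 2.0, 3.0, 8.0, 9.0]
synthetic = [1.5, 2.5, 2.5, 4.0, 5.0]

ratio = post_pre_mspe_ratio(actual, synthetic, n_pre=3)
print(ratio)
```

A large ratio means the post-treatment gap is big relative to how well the unit was fitted beforehand, which is why the ratio is comparable across units with different fit quality.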

15
Q

What are the pros and cons of SC?

A

Pros:

  • No extrapolation: all weights are non-negative and sum to one.
  • Contrary to DiD, it allows for time-varying individual heterogeneity.
  • It is a data-driven approach to constructing counterfactuals, and leaves researchers fewer degrees of freedom.
  • Works even when there is just one treated unit.
  • Allows nice graphical analysis.

Cons:

  • Hard to distinguish between a random shock to the treated unit and a treatment effect if the outcome is very volatile.
  • Need to find a suitable donor pool.
  • Need a large pre-treatment window (and possibly post-) for the matching to work well.
  • Asymptotic results are not available, so we need to use permutation methods for inference (s.e., testing).
16
Q

What is the problem with event designs?

A

Estimating an event design has been shown not to be a good idea (the general “new” critique). The problem is that with staggered treatment, OLS sometimes chooses negative weights, which is weird! This problem is avoided only if we assume: i) all individuals follow parallel trends in the absence of treatment, and ii) treatment effects are the same for all individuals. The way to solve this is to use one of the newly proposed estimators.

Luca also mentions that there are problems with pre-trend testing and issues arising from binning of the end points.