Lecture 12 (DiD extensions) Flashcards

1
Q

Summarize the problem with TWFE under staggered treatment adoption. What goes wrong, and what do we end up estimating?

A

We need to be careful when extrapolating from the basic 2x2 DiD setup to more general cases. Econometricians have recently shown that applying TWFE to settings with staggered treatment adoption causes serious problems. The main source of these problems is the use of already-treated units as controls in the presence of dynamic treatment effects. TWFE then does not estimate the ATT, but a weighted ATT plus a weighted non-parallel-trends bias plus a heterogeneity bias. Even under randomization, the estimate can have the opposite sign of the true effect. New estimators for different versions of staggered DiD have been developed and are easy for practitioners to implement.

2
Q

What are the real assumptions we need to make in the staggered TWFE setting?

A

The parallel trends assumption as we have formulated it is not enough for TWFE to be unbiased when treatment adoption is staggered (differential timing). TWFE assigns weights that are a function of the sample size of each comparison group and the variance of the treatment dummy for that group.

The real TWFE assumptions are:

  • The variance-weighted parallel trends term is zero
  • No dynamic treatment effects

Under these assumptions, TWFE estimates the variance-weighted ATT, a weighted average of all possible ATTs.

3
Q

What comparisons do we do in TWFE with differential timing?

A

TWFE with differential timing uses already-treated groups as controls (unlike some other estimators), and this leads to trouble.

4
Q

What are the comparison groups in TWFE with differential timing? Which comparison is problematic?

A
  • Early treated vs untreated (never treated)
  • Late treated vs untreated (never treated)
  • Early treated vs late treated (late group as not-yet-treated control)
  • Late treated vs early treated (early group as already-treated control)

The TWFE estimate is a weighted average of these four 2x2 DiDs.

The last comparison is the problematic one: the "forbidden" comparison.
If the treatment effects for the early-treated group are dynamic, they contaminate that group's trend, so using it as the counterfactual for the late-treated group biases our estimate.
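The forbidden comparison can be illustrated with a small noise-free sketch. Everything in it is an assumption made for illustration: two groups treated in periods 2 and 6, a common linear trend (so parallel trends hold exactly), and a treatment effect that grows by one unit per period of exposure.

```python
import numpy as np

# Hypothetical setup: 10 periods, early group treated from t=2, late from t=6.
periods = np.arange(10)
g_early, g_late = 2, 6

def outcome(first_treated):
    """Return (observed outcome, treatment effect) for a group."""
    y0 = periods.astype(float)              # common trend: parallel trends hold
    exposure = periods - first_treated + 1  # periods since treatment started
    effect = np.where(periods >= first_treated, exposure, 0).astype(float)
    return y0 + effect, effect

def did_2x2(y_treat, y_ctrl, pre, post):
    """Plain 2x2 difference-in-differences on group means."""
    return (y_treat[post].mean() - y_treat[pre].mean()) - (
        y_ctrl[post].mean() - y_ctrl[pre].mean()
    )

y_early, eff_early = outcome(g_early)
y_late, eff_late = outcome(g_late)

# Clean comparison: early treated vs late group while it is not yet treated.
pre_clean = periods < g_early
post_clean = (periods >= g_early) & (periods < g_late)
did_early_vs_late = did_2x2(y_early, y_late, pre_clean, post_clean)

# Forbidden comparison: late treated vs already-treated early group.
pre_forb = (periods >= g_early) & (periods < g_late)
post_forb = periods >= g_late
did_late_vs_early = did_2x2(y_late, y_early, pre_forb, post_forb)

true_att_late = eff_late[post_forb].mean()
print(did_early_vs_late, did_late_vs_early, true_att_late)
```

In this setup the clean comparison recovers the early group's average effect, while the forbidden comparison returns a negative number even though the late group's true effect is positive, because the early group's still-growing effect is differenced out as if it were a trend.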

5
Q

What are the takeaway messages regarding the weights in TWFE with differential timing?

A

TWFE assigns weights that are a function of the sample size of each group and the variance of the treatment dummy for that group. Goodman-Bacon (2021) shows that TWFE estimates a parameter that is a weighted average of all 2x2 DiDs in the sample.

The TWFE estimate is thus a weighted combination of each group's respective 2x2.

Main takeaways:

  • The more units in a group, the larger its 2x2 weight
  • The variance of the treatment dummy weights a group's 2x2 up or down
  • The largest weights go to groups treated in the middle of the panel
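The last takeaway follows from the variance of a 0/1 treatment dummy, which is D(1 - D) when a group is treated for a share D of the periods. A minimal sketch, assuming a hypothetical 10-period panel and looking only at the variance part of the weight (group sizes are ignored here):

```python
# Hypothetical panel length; g is the first treated period (0-indexed).
T = 10

# Share of periods in which a group first treated at g is treated.
share_treated = {g: (T - g) / T for g in range(1, T)}

# Variance of the treatment dummy over time for each timing group.
variance = {g: d * (1 - d) for g, d in share_treated.items()}

# Timing with the largest treatment variance.
best = max(variance, key=variance.get)
print(best, variance[best])
```

The variance peaks for the group that spends half the panel treated, which is why groups treated in the middle of the panel get the most weight, all else equal.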
6
Q

What is the treatment effect that we estimate in the TWFE with differential timing?

A

In the comparison between the late-treated group and the already-treated group we get

ATT + non-parallel-trends bias + heterogeneity bias.

The DD estimate thus consists of

  1. A variance-weighted ATT
  2. A variance-weighted non-parallel-trends bias
  3. The bias from treatment effect dynamics
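The three components can be written compactly in the notation of Goodman-Bacon (2021). The abbreviations are that paper's, not this deck's: VWATT is the variance-weighted ATT, VWCT the variance-weighted common (parallel) trends term, and ΔATT the weighted change in treatment effects over time within already-treated groups.

```latex
\operatorname{plim}_{N \to \infty} \hat{\beta}^{DD}
  = \mathrm{VWATT} + \mathrm{VWCT} - \Delta\mathrm{ATT}
```

Note that the dynamics term enters with a negative sign, so treatment effects that grow over time pull the estimate down.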
7
Q

What do we need to do if we don't have an untreated group in our data?

A

If we do not have untreated groups in our data, we need to drop the last periods in our sample to be able to identify the effect. That is, we drop the periods from the point where the last group becomes treated, which forces the last-treated group to act as the untreated group.
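A minimal sketch of this trimming step, assuming a hypothetical mapping from group name to first treated period (the function name and the data are illustrative, not from the lecture):

```python
def trim_to_identified_periods(first_treated, periods):
    """Keep only periods before the last group becomes treated.

    first_treated: dict mapping group -> first treated period (hypothetical).
    periods: iterable of all periods in the sample.
    """
    last_start = max(first_treated.values())
    return [t for t in periods if t < last_start]

# Hypothetical example: three groups, none of them never treated.
first_treated = {"A": 3, "B": 5, "C": 8}
kept = trim_to_identified_periods(first_treated, range(1, 11))
print(kept)
```

In the trimmed sample group C is untreated in every remaining period, so it can serve as the control for groups A and B.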
