3.DD Flashcards

1
Q

What is meant by selection on observables?

A

Selection on observables: matching and regression (OLS)

We may not have a controlled experiment, but the treated group and the non-treated group differ only by a set of observable characteristics.

The assumption that justifies a causal interpretation of our estimates in those cases is called the Conditional Independence Assumption, or selection on observables.
X1 is independent of the population error term u, conditional on the factors W. Both regression and matching require conditional mean independence (CMI).

Thus, we need to observe the W’s (selection-on-observables) to get an unbiased and consistent estimate of β1. The choice of how to specify the CEF, E[Y|X], and its functional form are also key requirements. We therefore have to take a stance on the right specification (cf. choice of neighbourhood or peer group summary statistics). Economic theory may inform this choice.

In other words, it is about accounting for the things we know about, by controlling for them or matching on them.

2
Q

When do we control for ”selection on unobservables”?

What assumptions do we make about our confounders?

A

When we have panel data and follow people over time.

What we control for then are things we do not observe, but that we know vary across individuals and not over time.

  1. Our confounding variables do not vary over time, i.e. Wit = Wi
  2. They enter the model linearly.

We can then control for an ”individual fixed effect”: characteristics that vary across individuals but not over time.

3
Q

Which two methods are used to control for ”individual fixed effects”?

Controlling for unobservables.

A

First difference transformation (FD):

Here we take the regression equation for the current period and subtract the regression equation for the previous period.
The difference between the two equations is our first-difference equation.
The constant that did not vary over time has then dropped out.

Fixed effect transformation (FE):
Demeaning. The lecturer considers this one harder.
We take each unit's mean across time and subtract it from every observation.

With only two time periods the two methods are identical; otherwise they differ somewhat.
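
The two-period equivalence of FD and FE can be checked numerically. The following is a minimal sketch, assuming a made-up panel of three units observed twice and a model without time effects; the data and the helper name `slope_through_origin` are hypothetical and only serve as illustration.

```python
# Hypothetical two-period panel (made-up numbers): per unit, x and y in periods 1 and 2.
panel = {
    "a": {"x": (1.0, 3.0), "y": (2.0, 7.0)},
    "b": {"x": (2.0, 2.5), "y": (9.0, 10.5)},
    "c": {"x": (0.0, 4.0), "y": (-1.0, 6.0)},
}


def slope_through_origin(xs, ys):
    """OLS slope of y on x without an intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# First difference (FD): period 2 minus period 1 removes the unit constant.
dx = [u["x"][1] - u["x"][0] for u in panel.values()]
dy = [u["y"][1] - u["y"][0] for u in panel.values()]
beta_fd = slope_through_origin(dx, dy)

# Fixed effects (FE): subtract each unit's own time mean (demeaning).
x_dm, y_dm = [], []
for u in panel.values():
    x_bar = sum(u["x"]) / 2
    y_bar = sum(u["y"]) / 2
    x_dm += [x - x_bar for x in u["x"]]
    y_dm += [y - y_bar for y in u["y"]]
beta_fe = slope_through_origin(x_dm, y_dm)

print(beta_fd, beta_fe)  # the two estimates coincide when T = 2
```

With T = 2 each demeaned value is plus or minus half the first difference, which is why the two slopes agree; with more periods the transformations weight the data differently.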

4
Q

What is meant by strict exogeneity and sequential exogeneity?

A

Strictly exogenous means the error term in period t is unrelated to any instance of the variable X; past, present, and future. X is completely unaffected by Y.

Sequentially exogenous means that the error term in period t is unrelated to past and present instances of the variable X, but not necessarily to future ones.

”Sequential exogeneity (past and present) means that the regression specification has the right dynamic specification. For example, two lags of X are sufficient to capture the dynamic response of the treatment effect”

Importantly, the panel data approach requires strict exogeneity, i.e., the error term has mean zero, given all past, present and future values of X: E(ut | ..., Xt+1, Xt, Xt-1, ...) = 0

5
Q

Give an example of how strict exogeneity is formulated with the FD transformation

A

Yit - Yit-1 = β(Xit - Xit-1) + (uit - uit-1), where E[∆uit | ∆Xit] = 0

Present (contemporaneous) exogeneity: Cov(Xit,uit)=0 and
Cov(Xit-1,uit-1)=0

Past exogeneity: Cov(Xit,uit-1)=0

Future exogeneity: Cov(Xit-1,uit)=0
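
To see why all three conditions are needed, it may help to expand the covariance between the differenced regressor and the differenced error. The sketch below uses the same symbols as the card; the grouping labels under the braces are added only for illustration.

```latex
\begin{aligned}
\operatorname{Cov}(\Delta X_{it}, \Delta u_{it})
 &= \operatorname{Cov}(X_{it} - X_{it-1},\; u_{it} - u_{it-1}) \\
 &= \underbrace{\operatorname{Cov}(X_{it}, u_{it}) + \operatorname{Cov}(X_{it-1}, u_{it-1})}_{\text{present}}
  \;-\; \underbrace{\operatorname{Cov}(X_{it}, u_{it-1})}_{\text{past}}
  \;-\; \underbrace{\operatorname{Cov}(X_{it-1}, u_{it})}_{\text{future}}
\end{aligned}
```

If every term on the right is zero, the covariance on the left is zero as well, which is what OLS on the differenced equation relies on.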

6
Q

Yit =αi + βtXit + βt-1Xit-1 + βt-2Xit-2 +uit

What are t-1 and t-2? What do they capture in this equation?

A

The lags. The effect in the next period and the one after that, i.e. the dynamic response of the treatment effect.

7
Q

Which combination of fixed effects and lags or leads is not allowed?

A

You cannot combine a fixed-effects model with a lag of the Y variable; the estimate is then biased.

8
Q

At what level of data does DD generally operate?

A

At the group level

In a difference-in-difference (DD) approach, the treatment occurs at the group level. Thus, DD is based on grouped-data regressions.

9
Q

Consider a micro (e.g., individuals) regression model Yig= a + bXg + vig

where Xg is a discrete regressor, taking on G different values.

What does grouping and weighting solve?

A

The grouped-data regression, weighted by cell size, gives the same coefficient as OLS on the micro data. The standard errors are of course different, but grouping solves the Moulton problem, i.e., that outcomes are correlated within groups.
The lecturer's point: without weighting the betas are not identical; with weighting they are.

If the regressor is discrete then the regressor defines the groups. The
grouped data regression is OLS because all the variation in Xg is only at the
group level. By aggregating Yig we are not changing anything about Xg. As a
result, we do not need to have micro data but only grouped-data.
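
The claim about weighting can be checked on a toy data set. This is a minimal sketch with made-up numbers and a hypothetical helper `wls_slope`; it compares OLS on the micro data with the grouped regression, once weighted by cell size and once unweighted.

```python
# Hypothetical micro data (made-up numbers): each discrete x value defines a group.
micro = {0.0: [1.0, 2.0, 3.0], 1.0: [4.0], 2.0: [5.0, 9.0]}


def wls_slope(xs, ys, ws):
    """Slope from a weighted regression of y on x with an intercept."""
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    return s_xy / s_xx


# OLS on the micro data: every observation gets weight 1.
xs_micro = [x for x, ys in micro.items() for _ in ys]
ys_micro = [y for ys in micro.values() for y in ys]
b_micro = wls_slope(xs_micro, ys_micro, [1.0] * len(ys_micro))

# Grouped data: one row per group, with the group mean of y as the outcome.
xs_group = list(micro.keys())
ys_group = [sum(ys) / len(ys) for ys in micro.values()]
cell_sizes = [float(len(ys)) for ys in micro.values()]

b_weighted = wls_slope(xs_group, ys_group, cell_sizes)              # weights = cell sizes
b_unweighted = wls_slope(xs_group, ys_group, [1.0] * len(xs_group))  # ignores cell sizes

print(b_micro, b_weighted, b_unweighted)
```

In this toy example the weighted grouped slope matches the micro slope, while the unweighted one does not, because the groups have different sizes.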

10
Q

How can you check whether you have a problem with endogenous sampling or a misspecified model?

A

Run an ordinary OLS regression and a WLS regression at the group level.

You should get the same beta. If you do not, something is wrong.

11
Q

At what level are the confounding factors in a DD? How do we remove them?

A

In a DD approach, the confounding factors are at the group level. We can control for unobservable (time-constant) factors at the group level by differencing away the group fixed effect (or equivalently by conditioning on a group fixed effect).

12
Q

What does DD do?

A

The DD requires at least two years of data, either as pooled cross sections (i.e., a new random sample is taken from the population each year) or as panel data (i.e., observations on the same individuals, families, firms, cities, states, or whatever, across time). We then divide the data into group/period means.

In the simplest possible DD setting there are four group/period means: the treatment group before, the treatment group after, the control group before and the control group after (X=1 if treatment group, X=0 if control group, T=0 if period before, T=1 if period after).

13
Q

What is the DD estimator?

What is the difference between an RCT and a DD?

A

The differences-in-differences estimator is the average change in y for those in the treatment group, minus the average change in y for those in the control group

Here we compare changes, whereas an RCT compares levels. DD therefore allows for differences in mean levels between the groups.

It can also be written in regression form.
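
As an illustration of both views, the sketch below computes the DD from the four group/period means and then writes the same numbers as coefficients of the saturated regression y = alpha + gamma*X + lam*T + delta*X*T, where delta is the DD estimate. The data are made up, and the coefficient names are assumptions chosen for this example, not notation from the course.

```python
# Hypothetical observations (made-up numbers): (X, T, y),
# X = 1 for the treatment group, T = 1 for the period after.
rows = [
    (0, 0, 10.0), (0, 0, 12.0),   # control, before
    (0, 1, 13.0), (0, 1, 15.0),   # control, after
    (1, 0, 20.0), (1, 0, 22.0),   # treatment, before
    (1, 1, 28.0), (1, 1, 30.0),   # treatment, after
]


def cell_mean(x, t):
    """Mean outcome for one group/period cell."""
    ys = [y for (xi, ti, y) in rows if xi == x and ti == t]
    return sum(ys) / len(ys)


# DD: change in the treatment group minus change in the control group.
dd = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Coefficients of the saturated regression, expressed through the cell means.
alpha = cell_mean(0, 0)                    # control group, before
gamma = cell_mean(1, 0) - cell_mean(0, 0)  # pre-existing level difference
lam = cell_mean(0, 1) - cell_mean(0, 0)    # common time change
delta = dd                                 # interaction term = DD estimate

print(dd)  # 5.0
```

Note that the treatment group starts 10 units above the control group (gamma); the DD does not require equal levels, only that the change over time would have been the same without treatment.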

14
Q

Which control variables should be included in a DD?

A

Things that vary at the group and time level.

Only controls at the group level, Wgt, are relevant for identification unless there is compositional bias, i.e., the samples of individuals before and after the treatment are not drawn from the same population (e.g., people move). Individual-level covariates may then control for compositional changes. On the other hand, panel data without any attrition avoids compositional bias altogether, since it follows the same individuals both before and after the treatment.

Control variables at the individual level cannot prevent OVB, but they can make the OLS estimator more precise.

15
Q

What is the key identifying assumption for DD?

A

The key identifying assumption in a DD is that there is no interaction between time and groups except for the treatment under study, i.e., the treatment groups have similar trends to the control groups in the absence of treatment.

This is called the parallel trend assumption.

Same growth rate in the outcome before treatment.

”The model has strict exogeneity conditional on the unobserved fixed effect”

”Treatment and control groups have parallel trends (future exogeneity)”

No lagged dependent variable and no feedback.

16
Q

What is meant by normalization in an event study?

A

Which year serves as the reference year. Generally the year before treatment.

17
Q

What is meant by binning of end points?

A

How many leads and lags to include.

I think the term is used, and reported, when you exclude periods of data that you actually have? Otherwise you could have a saturated model with everything?

18
Q

What do you do when you include a group-specific time trend in an event study?

A

You account for differences in trends before treatment.

This is only possible with more than two time periods.

With this particular setup it is possible to relax the common trend assumption of a standard difference-in-differences design: in a model that includes a region-specific time trend, identification of the causal effect is based on sharp deviations from otherwise smooth trends, even where trends are not common. This type of empirical strategy is essentially a regression discontinuity design (RD) with time as the forcing variable.

19
Q

What is one way to test the validity of a DD?

A

Another way of testing the validity of the DD is to use an additional outcome measure. Replace Y by another outcome that is not supposed to be affected by the treatment. If the DD using the other outcome is non-zero, then it is likely that the DD for the original outcome is biased as well.

20
Q

What is a group-year shock? How do you get around it?

A

Yigt= γg + λ t + θXgt + σgt + vigt
where σgt is a group-year shock, E[σgt]=0. Group-year shocks are bad news for differences-in-differences models. With only two groups and years, we have no way to distinguish the differences-in-differences generated by a policy change from the differences-in-differences due to the group-year shock. We can think of the presence of σgt as a failure of the common trends assumption.

In other words, something else is happening in addition to the manipulation.

The solution to the inconsistency of the estimate of treatment effect θ induced by random shocks in differences in differences models is to have either multiple time periods or many groups (or both). With multiple groups and/or periods, we can hope that the σgt average out to zero.

21
Q

What happens if there are very large differences in levels between the treatment and control groups pre-treatment?

A

You can get different results depending on whether the outcome is measured in logs or in levels.

When average levels of the outcome y are very different for control and treatment groups before the treatment, the magnitude or even sign of the DD effect is very sensitive to the functional form posited. Suppose you look at the effect of a training program targeted to the young:

  • The unemployment level for the young decreases from 30% to 20%.
  • The unemployment level for the old decreases from 10% to 5%.

Because of the dramatic difference in pre-program unemployment levels (30% vs. 10%), it is difficult to assess whether the program was effective.

  • The DD in levels would be (30 - 20) - (10 - 5) = 10 - 5 = 5 percentage points, which suggests a positive effect of training on employment.
  • The DD in logs would be [log(30) - log(20)] - [log(10) - log(5)] < 0.
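
The two calculations can be reproduced directly. This is a small sketch using the unemployment figures from the example and the same sign convention (reduction for the young minus reduction for the old); the variable names are chosen for illustration.

```python
import math

# Unemployment rates (percent) from the example: before and after the program.
young_before, young_after = 30.0, 20.0   # targeted group
old_before, old_after = 10.0, 5.0        # comparison group

# DD in levels: difference in the percentage-point reductions.
dd_levels = (young_before - young_after) - (old_before - old_after)

# DD in logs: difference in the (approximate) proportional reductions.
dd_logs = (math.log(young_before) - math.log(young_after)) - (
    math.log(old_before) - math.log(old_after)
)

print(dd_levels)          # 5.0
print(round(dd_logs, 3))  # -0.288
```

The sign flips because unemployment among the young fell by a third, while among the old it fell by half: the young had the larger reduction in percentage points but the smaller reduction in proportional terms.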

22
Q

What is meant by the long-term response vs. reliability trade-off?

A

DD estimates are more reliable when you compare outcomes just before and just after the policy change because the identifying assumption (parallel trends) is more likely to hold over a short time-window. With a long time window, many other things are likely to happen and confound the treatment effect.

However, for policy purposes, it is often more interesting to know the medium or long-term effect of a policy change. In any case, one must be very cautious about extrapolating short-term responses to long-term responses.

So this is a problem with dynamic models when we look at long trends.

23
Q

What is meant by targeting based on differences?

A

A pre-condition of the validity of the DD assumption is that the program is not implemented based on the pre-existing differences in outcomes. Example: “Ashenfelter dip”

It was common to compare wage gains among participants and nonparticipants in training programs to evaluate the effect of training on earnings. Ashenfelter and Card (1985) note that training participants often experience a dip in earnings just before they enter the program (which is presumably why they did enter the program in the first place).

Since wages have a natural tendency to mean reversion, this leads to an upward bias of the DD estimate of the treatment effect.

24
Q

What is meant by political economy in the case of DD?

A

Endogenous change in policy due to governmental response to variables associated with past or expected future outcome. For example, a few high years of crime due to unusual circumstances may stimulate a crackdown. A subsequent reduction in crime after unusual years should not be taken to indicate an effective crackdown if a drop would have been expected anyway. The way to avoid the problems of endogenous change in policy is to know the circumstances surrounding the change.

25
Q

What is meant by the stable unit treatment value assumption?

A

There are no spillover effects between units, e.g. between one city and another. That is, the control group is not affected by the treatment group receiving treatment.

My treatment does not affect others' outcomes.

This assumption is usually violated.

26
Q

What happens if there are heterogeneous effects in a DD?

A

OLS is problematic, and one should use a ”latence” (?) estimator?

27
Q

Can parallel trends be proven?

A

No, it can only be rejected.

This is because the parallel trends assumption applies both before and after treatment. We can only test it before.

In the same way, in an RCT we do not know whether everything we do not observe is balanced. We can only test and show that what we observe is balanced.

We can NEVER show that identification is valid.