L5: IV regression Flashcards

1
Q

When might we use instrumental variables? (3 and what these issues have in common)

A

1) Omitted variable bias (OVB) from a variable that is correlated with X but is unobserved (tf cannot be included in the regression eqn.)
2) Simultaneous causality bias (ie. X causes Y AND Y causes X)
3) Errors-in-variables bias (X is measured with error)

All 3 problems -> E(u|X) not equal to zero

2
Q

What does IV regression do?

A

Eliminates bias when E(u|X) not equal to zero, using an instrumental variable, Z

3
Q

What are endogenous and exogenous variables?

A

Endogenous - a variable correlated with u

Exogenous - a variable not correlated with u

4
Q

What are the two conditions for a VALID INSTRUMENT?

A

1) Instrument relevance: corr(Zi, Xi) ≠ 0

2) Instrument exogeneity: corr(Zi, ui) = 0

5
Q

Explain carefully how to estimate when using an IV?

A

2 stage least squares:
1) ISOLATE part of X that is uncorrelated with u by regressing X on Z using OLS:
EQN: Xi= π0+ π1Zi+ vi
Because Zi is uncorrelated with ui, π0+ π1Zi is also uncorrelated with ui (the part of Xi that may be correlated with ui is vi). π0 and π1 are unknown, so we estimate them and compute predicted values of Xi, where: Xi(hat)=π0(hat)+ π1(hat)Zi

2) Replace Xi by Xi(hat) in the regression of interest, and regress Y on Xi(hat) using OLS:
ie. Yi=B0+B1Xi(hat)+ui

Since Xi(hat) is uncorrelated with ui (in large samples), E(u|X(hat))=0 tf OLS works in the second stage! (The resulting slope estimate is B1(hat)(TSLS))
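
The two stages above can be sketched in Python with NumPy. This is only an illustration on simulated data: the data-generating process, the true coefficient values and the helper name `ols` are all assumptions made for the example, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data (assumed for illustration only).
# X is endogenous because it depends on the error u; Z is a valid instrument.
z = rng.normal(size=n)                             # instrument
u = rng.normal(size=n)                             # structural error
x = 1.0 + 0.8 * z + 0.5 * u + rng.normal(size=n)   # corr(x, u) != 0
y = 2.0 + 3.0 * x + u                              # true B1 = 3

def ols(y, X):
    """OLS coefficients of y on the columns of X (X must already contain a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Stage 1: regress X on Z, then form the predicted values X(hat)
pi_hat = ols(x, np.column_stack([const, z]))
x_hat = pi_hat[0] + pi_hat[1] * z

# Stage 2: regress Y on X(hat)
b_tsls = ols(y, np.column_stack([const, x_hat]))

# For comparison: OLS of Y on the endogenous X
b_ols = ols(y, np.column_stack([const, x]))

print("TSLS slope:", b_tsls[1])
print("OLS slope: ", b_ols[1])
```

In this simulated setup the TSLS slope should land close to the assumed true value of 3, while the OLS slope is pushed away from it by the correlation between X and u.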

6
Q

What does 2SLS require?

A

n to be large so π0 and π1 are estimated precisely

7
Q

Show that the 2SLS estimator is equal to the ratio of the covariances: S(YZ)/S(XZ)

A

see notes bottom page 1 side 1
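
One standard route to this result, for the single-regressor, single-instrument case, is sketched below; it may differ in detail from the derivation in the notes. Here s_{YZ} and s_{XZ} stand for S(YZ) and S(XZ), and s_Z^2 is the sample variance of Z.

```latex
\begin{align*}
% First stage: OLS of X on Z
\hat{\pi}_1 &= \frac{s_{XZ}}{s_Z^2}, \qquad \hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i \\
% Second stage: OLS of Y on \hat{X}
\hat{\beta}_1^{TSLS} &= \frac{s_{Y\hat{X}}}{s_{\hat{X}}^2} \\
% \hat{X}_i is a linear function of Z_i, so
s_{Y\hat{X}} &= \hat{\pi}_1 s_{YZ}, \qquad s_{\hat{X}}^2 = \hat{\pi}_1^2 s_Z^2 \\
% Substituting and using \hat{\pi}_1 s_Z^2 = s_{XZ}
\hat{\beta}_1^{TSLS} &= \frac{\hat{\pi}_1 s_{YZ}}{\hat{\pi}_1^2 s_Z^2}
  = \frac{s_{YZ}}{\hat{\pi}_1 s_Z^2}
  = \frac{s_{YZ}}{s_{XZ}}
\end{align*}
```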

8
Q

Is the 2SLS estimator consistent?

A

YES, see notes for why (ie. both sample covariances are consistent for their population counterparts, tf the estimator converges in probability to the true value of B1)
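
A short sketch of the argument, for the single-regressor, single-instrument case and assuming the instrument is valid (cov(Z,u) = 0 and cov(Z,X) ≠ 0); the notes may present it differently:

```latex
\begin{align*}
\mathrm{cov}(Y_i, Z_i) &= \mathrm{cov}(\beta_0 + \beta_1 X_i + u_i,\; Z_i) \\
  &= \beta_1\,\mathrm{cov}(X_i, Z_i) + \mathrm{cov}(u_i, Z_i) \\
  &= \beta_1\,\mathrm{cov}(X_i, Z_i)
  \quad \text{(instrument exogeneity)} \\
\Rightarrow\; \beta_1 &= \frac{\mathrm{cov}(Y_i, Z_i)}{\mathrm{cov}(X_i, Z_i)}
  \quad \text{(instrument relevance: denominator } \neq 0) \\
\hat{\beta}_1^{TSLS} &= \frac{s_{YZ}}{s_{XZ}}
  \;\xrightarrow{\;p\;}\; \frac{\mathrm{cov}(Y_i, Z_i)}{\mathrm{cov}(X_i, Z_i)} = \beta_1
\end{align*}
```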

9
Q

What is inference like using TSLS?

A

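Same as usual in large samples: the TSLS estimator is approximately normally distributed, tf t-tests and confidence intervals are constructed in the usual way (using the correct TSLS SEs)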

10
Q

Why are OLS standard errors from the 2nd stage regression wrong?

A

They do not take into account that Xi(hat) is itself estimated in the first stage (Stata can solve this with a command that computes TSLS with the correct SEs) (use heteroskedasticity-robust SEs)
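
A minimal sketch of where the problem comes from, on simulated data. The data-generating process and the helper name `slope_se` are assumptions for illustration, and the SE formula used is the simple homoskedasticity-only one rather than the robust version mentioned above. The point is only that a plain second-stage OLS builds its residuals from Xi(hat), whereas the TSLS residuals are built from the actual Xi.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical data (assumed for illustration): true B1 = 3, var(u) = 1
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 1.0 + 0.8 * z + 0.5 * u + rng.normal(size=n)
y = 2.0 + 3.0 * x + u

const = np.ones(n)
Z = np.column_stack([const, z])

# Stage 1 and stage 2 done by hand
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
X_hat = np.column_stack([const, x_hat])
b0, b1 = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Residuals a plain second-stage OLS printout would use (built from x_hat)
resid_naive = y - b0 - b1 * x_hat
# TSLS residuals (built from the actual x)
resid_tsls = y - b0 - b1 * x

def slope_se(resid):
    """Homoskedasticity-only SE of the slope, given a residual vector."""
    sigma2 = resid @ resid / (n - 2)
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return np.sqrt(cov[1, 1])

se_naive = slope_se(resid_naive)
se_tsls = slope_se(resid_tsls)

print("SE from second-stage OLS residuals:", se_naive)
print("SE from TSLS residuals:            ", se_tsls)
```

The two numbers differ because the two residual vectors differ; in practice a dedicated TSLS routine handles this automatically.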

11
Q

Why would a regression that relates quantity (Y) to price (X) likely suffer from bias? What type of bias would this be?

A

Market data only record price and quantity at equilibrium, ie. at the crossover of S and D, tf the regression traces out neither the D function nor the S function. Price and quantity are determined simultaneously (price affects quantity AND quantity affects price) tf this gives rise to simultaneous causality (simultaneity) bias

12
Q

See

A

cigarette demand example in notes

13
Q

See

A

General IV regression model notes

14
Q

What is the problem in the generalised IV regression model with adding more IVs?

A

see notes

15
Q

Explain the three cases of identification relevant to 2SLS? When can 2SLS be done?

A

Exact identification if m=k
Underidentified if m less than k
Overidentified if m>k

Can only be done with exact/overidentification - where m is number of IVs and k is number of ENDOgenous regressors

16
Q

See notes

A

Bottom of side 2: check I understand how to do TSLS with a single endogenous regressor (X) and multiple exogenous regressors (W1…Wr) (go over cig example too!)

17
Q

If you have 2 suitable IVs, Z1 and Z2, that are both correlated with the endogenous variable and uncorrelated with the error, which should you use and why?

A

BOTH!
regress the endogenous variable on both Z1 and Z2 - this is a case of overidentification and therefore will reduce the SE of the results (so long as additional IVs are appropriate): more information -> BETTER ESTIMATES!

18
Q

Under what assumptions is TSLS valid, with a t-statistic that is (approximately) normally distributed in large samples?

A
  1. E(ui|W1i,…,Wri) = 0: the included exogenous regressors are exogenous.
  2. (Yi,X1i,…,Xki,W1i,…,Wri,Z1i,…,Zmi) are i.i.d.
  3. The X’s, W’s, Z’s, and Y have nonzero, finite 4th moments
  4. The instruments (Z1i,…,Zmi) are valid (ie. Corr(Zji,ui)=0 and Corr(Zji,Xi)≠0 for j=1 to m)
19
Q

In MRM generalised IVs, when are instruments said to be relevant? And when are they said to be weak?

A

In the first stage, if at least one of the coefficients (π) on the instruments is not equal to zero then the instruments are relevant
If they are all equal to zero (or v. close to zero) the instruments are weak

20
Q

What do weak instruments do?

A

They explain very little of the variation in X BEYOND what is explained by the W’s

21
Q

What is a consequence of IVs being weak?

A

TSLS sampling distribution and t-stat are not at all normal, even when n is large!

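(Why? Because weak instruments make S(XZ) v small, and since beta1(hat)TSLS = S(YZ)/S(XZ), dividing by a near-zero denominator can make the estimate very large and unstable!) (ie. with almost no correlation between X and Z, Z explains very little of X tf Z tells us very little about the effect of X on Y) (see notes bottom of S2P2 and top of S1P3)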

22
Q

How do you test instrument strength?

A

F-test of the null hypothesis that the coefficients on Z1,…,Zm are all equal to zero in the first stage regression (ie. that the instruments DO NOT ENTER the first stage)
Rule of thumb: if the first-stage F-stat is less than 10 then the set of instruments is weak! (tf -> biased 2SLS)
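
A minimal sketch of the first-stage F-statistic, comparing the first stage with and without the instruments. The function name `first_stage_F` and the simulated data are assumptions for illustration, and the formula is the homoskedasticity-only F; a heteroskedasticity-robust version would be computed differently.

```python
import numpy as np

def first_stage_F(x, Z, W):
    """Homoskedasticity-only F-stat for H0: all coefficients on Z are zero
    in the regression of x on a constant, W and Z."""
    n, m = Z.shape
    const = np.ones((n, 1))
    X_r = np.hstack([const, W])        # restricted: instruments excluded
    X_u = np.hstack([const, W, Z])     # unrestricted: instruments included

    def ssr(y, X):
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta
        return resid @ resid

    ssr_r, ssr_u = ssr(x, X_r), ssr(x, X_u)
    return ((ssr_r - ssr_u) / m) / (ssr_u / (n - X_u.shape[1]))

rng = np.random.default_rng(2)
n = 1_000
Z = rng.normal(size=(n, 2))            # two instruments (hypothetical)
W = rng.normal(size=(n, 1))            # one included exogenous regressor
v = rng.normal(size=n)

# Strong case: the instruments enter the first stage
x_strong = 0.5 * Z[:, 0] + 0.3 * Z[:, 1] + 0.4 * W[:, 0] + v
# Irrelevant case: the instruments do not enter the first stage at all
x_weak = 0.4 * W[:, 0] + v

f_strong = first_stage_F(x_strong, Z, W)
f_weak = first_stage_F(x_weak, Z, W)
print("F (strong instruments):    ", f_strong)
print("F (irrelevant instruments):", f_weak)
```

Under these assumptions the first F-statistic should be well above the rule-of-thumb threshold of 10 and the second well below it.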

23
Q

What does comparing to F=10 actually allow us to do?

A

Check whether the bias of TSLS (relative to the bias of OLS) is greater or less than 10% (IF F is less than 10, the relative bias is more than 10% and vice versa!!!)

24
Q

2 solutions to weak instruments?

A

1) Find better instruments/drop ones you think may be weak

2) Use other estimators (can be very complicated though)

25
Q

What criteria must be fulfilled to test for instrument exogeneity? What is the consequence for TSLS if this assumption does not hold?

A

Criteria: the model must be overidentified to do this test!

If the assumption of instrument exogeneity fails, then TSLS is INCONSISTENT!

26
Q

When to use J-test of overidentifying restrictions?

A

If given say 2 IVs, Z1 and Z2, and we compute TSLS separately with each and the estimates for beta are very different, then we know that at least one of Z1 or Z2 must be invalid

27
Q

See

A

bottom of p2s2 on how to conduct a J-test

28
Q

What are the hypotheses for a J-test?

A

H0: All instruments are exogenous
H1: At least one instrument is not exogenous

29
Q

J-statistic distribution? How many DofF in a J-test?

A

Chi-squared, with m-k DofF
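
A minimal sketch of one common way to compute the J-statistic with one endogenous regressor (k=1), two instruments (m=2) and no W's: take the TSLS residuals (built from the actual X), regress them on the instruments, compute the homoskedasticity-only F-stat for all instrument coefficients being zero, and set J = mF. The function name `j_statistic` and the simulated data are assumptions for illustration; the procedure in the notes may differ in detail.

```python
import numpy as np

def j_statistic(y, x, Z):
    """J = m * F, where F tests that all coefficients on Z are zero in a
    regression of the TSLS residuals on a constant and Z."""
    n, m = Z.shape
    ZC = np.hstack([np.ones((n, 1)), Z])

    # TSLS: first stage, then second stage on the fitted values
    x_hat = ZC @ np.linalg.lstsq(ZC, x, rcond=None)[0]
    b = np.linalg.lstsq(np.column_stack([np.ones(n), x_hat]), y, rcond=None)[0]

    # TSLS residuals use the actual x, not x_hat
    u_hat = y - b[0] - b[1] * x

    # Regress the residuals on the instruments
    g = np.linalg.lstsq(ZC, u_hat, rcond=None)[0]
    e = u_hat - ZC @ g
    ssr_u = e @ e
    ssr_r = np.sum((u_hat - u_hat.mean()) ** 2)
    F = ((ssr_r - ssr_u) / m) / (ssr_u / (n - m - 1))
    return m * F

rng = np.random.default_rng(3)
n = 2_000
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = Z[:, 0] + Z[:, 1] + 0.5 * u + rng.normal(size=n)

y_valid = 1.0 + 2.0 * x + u             # both instruments exogenous
y_invalid = y_valid + 1.0 * Z[:, 1]     # Z2 affects Y directly: not exogenous

CRIT_5PCT_1DF = 3.84                    # chi-squared 5% critical value, m-k = 1
j_valid = j_statistic(y_valid, x, Z)
j_invalid = j_statistic(y_invalid, x, Z)
print("J (valid instruments):  ", j_valid)
print("J (one invalid):        ", j_invalid, "reject H0:", j_invalid > CRIT_5PCT_1DF)
```

With the invalid instrument, J should be far inside the critical region; with valid instruments it behaves like a draw from a chi-squared distribution with m-k = 1 DofF.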

30
Q

Why must the model be overidentified to do a J-test?

A

Because otherwise the DofF, m-k, will equal 0!

31
Q

What does it mean if the actual J statistic is in the critical region?

A

Means that H0 is rejected: there is evidence that at least one IV is endogenous

32
Q

Summary?

A

Slides 38 and 39 if needed

33
Q

See

A

S3P3 in notes on cig demand bit

34
Q

How can we interpret the J-test rejection?

A

The J-test does not tell us which instrument is endogenous, but can use intuition to try to work out which variable(s) is/are endogenous, then redo the model and try again

35
Q

What points need to be considered when assessing the validity of a study?

A

1) OVB
2) Functional form misspecification
3) Simultaneous causality bias
4) Errors-in-variables bias
5) Selection bias (ie. have all states been used or just some???)
6) Are IVs truly relevant and exogenous
7) Old data: if using old data need to consider if it is externally valid to apply it to today’s problems