L5: IV regression Flashcards

Question 1

Q

When might we use instrumental variables? (Give 3 cases and say what they have in common)

Answer

A

1) OVB from a variable that is correlated with X but is unobserved (so it cannot be included in the regression equation)
2) Simultaneous causality bias (i.e. X causes Y AND Y causes X)
3) Errors-in-variables bias (X is measured with error)

All 3 problems -> E(u|X) ≠ 0

Question 2

Q

What does IV regression do?

Answer

A

Eliminates the bias (in large samples) when E(u|X) is not equal to zero, by using an instrumental variable, Z

Question 3

Q

What are endogenous and exogenous variables?

Answer

A

Endogenous - a variable correlated with u

Exogenous - a variable not correlated with u

Question 4

Q

What are the two conditions for a VALID INSTRUMENT?

Answer

A

1) Instrument relevance: corr(Zi, Xi) ≠ 0

2) Instrument exogeneity: corr(Zi, ui) = 0

Question 5

Q

Explain carefully how to estimate B1 when using an IV.

Answer

A

2 stage least squares:
1) ISOLATE the part of X that is uncorrelated with u by regressing X on Z using OLS:
EQN: Xi = π0 + π1Zi + vi
Because Zi is uncorrelated with ui, π0 + π1Zi is also uncorrelated with ui (vi is the part of Xi that may be correlated with ui). π0 and π1 are unknown, so estimate them by OLS and compute the predicted values of Xi: Xi(hat) = π0(hat) + π1(hat)Zi

2) Replace Xi by Xi(hat) in the regression of interest, and regress Y on Xi(hat) using OLS:
ie. Yi=B0+B1Xi(hat)+ui

Since Xi(hat) is uncorrelated with ui (in large samples), E(u|X(hat))=0 holds in the second stage, so OLS works there! The slope from this second regression is the TSLS estimator, B1(hat)(TSLS)
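The two stages above can be sketched in a few lines of NumPy. This is only an illustration on simulated data: the data-generating process (true B0 = 1, B1 = 2, and an error that feeds into X) and the helper names `ols` and `tsls_single` are assumptions made for the example, not part of the notes.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def tsls_single(y, x, z):
    """Two stage least squares with one endogenous regressor x and one instrument z."""
    const = np.ones(len(y))
    # Stage 1: regress X on Z and keep the fitted values X(hat)
    Z = np.column_stack([const, z])
    x_hat = Z @ ols(x, Z)
    # Stage 2: regress Y on X(hat)
    return ols(y, np.column_stack([const, x_hat]))

# Simulated data (assumed for illustration): X is endogenous because v depends on u
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
v = 0.8 * u + rng.normal(size=n)
x = 0.5 + 1.0 * z + v
y = 1.0 + 2.0 * x + u

b_ols = ols(y, np.column_stack([np.ones(n), x]))   # slope is pushed away from 2
b_tsls = tsls_single(y, x, z)                      # slope should be close to 2
print("OLS slope:", b_ols[1], "TSLS slope:", b_tsls[1])
```

In this setup the OLS slope is biased upwards because X and u move together, while the TSLS slope uses only the variation in X that comes from Z.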

Question 6

Q

What does 2SLS require?

Answer

A

n to be large so π0 and π1 are estimated precisely

Question 7

Q

Show that the 2SLS estimator is equal to the ratio of the covariances: S(YZ)/S(XZ)

Answer

A

see notes bottom page 1 side 1
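The notes themselves are not reproduced here. A standard derivation for the case of one X and one Z runs roughly as follows, writing s for sample covariances and variances (lower-case notation assumed for the S(.) terms in the question):

```latex
\begin{aligned}
\text{Stage 1:}\quad \hat\pi_1 &= \frac{s_{XZ}}{s_Z^2},
\qquad \hat X_i = \hat\pi_0 + \hat\pi_1 Z_i \\[4pt]
\text{Stage 2:}\quad \hat\beta_1^{TSLS} &= \frac{s_{Y\hat X}}{s_{\hat X}^2}
 = \frac{\hat\pi_1\, s_{YZ}}{\hat\pi_1^{2}\, s_Z^2}
 = \frac{s_{YZ}}{\hat\pi_1\, s_Z^2}
 = \frac{s_{YZ}}{s_{XZ}}
\end{aligned}
```

The middle step uses the fact that X(hat) is a linear function of Z, so its covariance with Y is π1(hat) times s_YZ and its variance is π1(hat) squared times the variance of Z.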

Question 8

Q

Is the 2SLS estimator consistent?

Answer

A

YES, see notes for why (i.e. both sample covariances are consistent, so the estimator converges in probability to the true value of B1)
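A sketch of the argument, assuming the single-instrument model Yi = B0 + B1Xi + ui with a valid instrument Z (the notes may present it differently):

```latex
\begin{aligned}
\operatorname{cov}(Y_i, Z_i) &= \operatorname{cov}(\beta_0 + \beta_1 X_i + u_i,\; Z_i)
 = \beta_1 \operatorname{cov}(X_i, Z_i) + \operatorname{cov}(u_i, Z_i)
 = \beta_1 \operatorname{cov}(X_i, Z_i) \\[4pt]
\hat\beta_1^{TSLS} &= \frac{s_{YZ}}{s_{XZ}}
 \;\xrightarrow{\;p\;}\; \frac{\operatorname{cov}(Y_i, Z_i)}{\operatorname{cov}(X_i, Z_i)} = \beta_1
\end{aligned}
```

Instrument exogeneity is what removes cov(u, Z) in the first line, and instrument relevance is what keeps the denominator cov(X, Z) away from zero in the second.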

Question 9

Q

What is inference like using TSLS?

Answer

A

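Same as usual (in large samples the TSLS estimator is approximately normal, so t-tests and confidence intervals work as normal, provided the SEs are computed correctly)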

Question 10

Q

Why are OLS standard errors from the 2nd stage regression wrong?

Answer

A

They do not take into account that Xi(hat) is itself estimated in the first stage. (Stata can solve this with a command that computes TSLS with correct SEs; use heteroskedasticity-robust SEs.)
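To see the difference, here is a minimal sketch for one X and one Z. The key point is that the corrected residuals are built from the actual X, not from X(hat). The function name `tsls_with_se`, the simulated data, and the choice of the simplest heteroskedasticity-robust formula (no small-sample adjustment) are assumptions for illustration; statistical software may apply a slightly different adjustment.

```python
import numpy as np

def tsls_with_se(y, x, z):
    """TSLS slope/intercept plus naive second-stage SEs and corrected robust SEs."""
    n = len(y)
    const = np.ones(n)
    Z = np.column_stack([const, z])
    X = np.column_stack([const, x])
    # Stage 1: fitted values of X
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    X_hat = np.column_stack([const, Z @ pi_hat])
    # Stage 2: coefficients from regressing Y on X(hat)
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    bread = np.linalg.inv(X_hat.T @ X_hat)
    # Naive SEs: what OLS reports in stage 2 (residuals use X(hat))
    e_naive = y - X_hat @ beta
    se_naive = np.sqrt(np.diag(bread) * (e_naive @ e_naive) / (n - 2))
    # Corrected SEs: residuals use the ACTUAL X, with a robust sandwich formula
    u_hat = y - X @ beta
    meat = (X_hat * u_hat[:, None] ** 2).T @ X_hat
    se_robust = np.sqrt(np.diag(bread @ meat @ bread))
    return beta, se_naive, se_robust

# Simulated data (assumed): true slope 2, X endogenous through v
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
v = 0.8 * u + rng.normal(size=n)
x = 0.5 + 1.0 * z + v
y = 1.0 + 2.0 * x + u

beta, se_naive, se_robust = tsls_with_se(y, x, z)
print("slope:", beta[1], "naive SE:", se_naive[1], "corrected SE:", se_robust[1])
```

In this particular simulation the naive SE comes out much larger than the corrected one; in other data it could be too small instead. Either way it is not the right SE.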

Question 11

Q

Why would a regression that relates quantity (Y) to price (X) likely suffer from bias? What type of bias would this be?

Answer

A

Data collected in a market only give price and quantity at equilibrium, i.e. at the points where S and D cross, so the regression traces out neither the demand function nor the supply function. Price and quantity are determined together by the interaction of demand and supply (price affects quantity AND quantity affects price), which gives rise to simultaneity (simultaneous causality) bias

Question 12

Q

See

Answer

A

cigarette demand example in notes

Question 13

Q

See

Answer

A

General IV regression model notes

Question 14

Q

What is the problem in the generalised IV regression model with adding more IVs?

Answer

A

see notes

Question 15

Q

Explain the three cases of identification relevant to 2SLS? When can 2SLS be done?

Answer

A

Exactly identified if m=k
Underidentified if m<k
Overidentified if m>k

2SLS can only be done with exact identification or overidentification (m is the number of IVs and k is the number of ENDOgenous regressors)
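The counting rule is simple enough to write down as a check. The function names below are made up for illustration.

```python
def identification_status(m: int, k: int) -> str:
    """Classify a model with m instruments and k endogenous regressors."""
    if m < k:
        return "underidentified"
    if m == k:
        return "exactly identified"
    return "overidentified"

def can_run_2sls(m: int, k: int) -> bool:
    """2SLS needs at least as many instruments as endogenous regressors."""
    return m >= k

# One endogenous regressor with zero, one, or two instruments
for m in (0, 1, 2):
    print(m, identification_status(m, 1), can_run_2sls(m, 1))
```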

Question 16

Q

See notes

Answer

A

Bottom of side 2: check I understand how to do TSLS with a single endogenous regressor (X) and multiple exogenous regressors (W1…Wr) (go over cig example too!)

Question 17

Q

If you have 2 suitable IVs, Z1 and Z2, that are both correlated with the endogenous variable and uncorrelated with the error, which should you use and why?

Answer

A

BOTH!
regress the endogenous variable on both Z1 and Z2 - this is a case of overidentification and therefore will reduce the SE of the results (so long as additional IVs are appropriate): more information -> BETTER ESTIMATES!

Question 18

Q

Under what assumptions is TSLS valid, with a t-statistic that is normally distributed in large samples?

Answer

A

1) E(ui|W1i,…,Wri) = 0, i.e. the included exogenous regressors are exogenous
2) (Yi,X1i,…,Xki,W1i,…,Wri,Z1i,…,Zmi) are i.i.d.
3) The X’s, W’s, Z’s, and Y have nonzero, finite 4th moments
4) The instruments (Z1i,…,Zmi) are valid (i.e. Corr(Zji,ui)=0 and Corr(Zji,Xi)≠0 for j=1 to m)

Question 19

Q

In MRM generalised IVs, when are instruments said to be relevant? And when are they said to be weak?

Answer

A

In the first stage, if at least one π on the Z's is not equal to zero then the instruments are relevant
If the π's on the Z's are all equal to zero the instruments are irrelevant; if they are all zero or v. close to zero the instruments are weak

Question 20

Q

What do weak instruments do?

Answer

A

They explain very little of the variation in X BEYOND what is explained by the W’s

Question 21

Q

What is a consequence of IVs being weak?

Answer

A

TSLS sampling distribution and t-stat are not at all normal, even when n is large!

(Why? B1(hat)TSLS = S(YZ)/S(XZ). With weak instruments cov(X,Z) is close to zero, so S(XZ) is v. small and B1(hat)TSLS can become very large and erratic. Intuition: if Z barely explains X, it tells us almost nothing about the effect of X on Y.) (see notes bottom of S2P2 and top of S1P3)

Question 22

Q

How do you test instrument strength?

Answer

A

Use the first-stage F-statistic: the F-test of the null hypothesis that the coefficients on Z1,…,Zm in the first stage regression are all equal to zero (i.e. the instruments DO NOT ENTER the first stage)
Rule of thumb: if the F-stat is less than 10 then the set of instruments is weak! (so 2SLS is biased)
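A minimal sketch of the first-stage F-statistic, computed from the restricted regression (X on a constant and the W's only) and the unrestricted one (adding the Z's). It uses the homoskedasticity-only F formula for simplicity, which is an assumption of this example; a robust F could be used instead. The function names and the simulated data are illustrative only.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from regressing y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid

def first_stage_F(x, Z, W=None):
    """F-statistic for H0: all coefficients on the instruments Z are zero."""
    n = len(x)
    const = np.ones((n, 1))
    Z = np.asarray(Z).reshape(n, -1)
    restricted = const if W is None else np.column_stack([const, W])
    unrestricted = np.column_stack([restricted, Z])
    m = Z.shape[1]                       # number of instruments being tested
    dof = n - unrestricted.shape[1]      # residual degrees of freedom
    ssr_r, ssr_u = ssr(x, restricted), ssr(x, unrestricted)
    return ((ssr_r - ssr_u) / m) / (ssr_u / dof)

# Simulated data (assumed): one strong and one nearly useless instrument setup
rng = np.random.default_rng(1)
n = 1000
z = rng.normal(size=n)
v = rng.normal(size=n)
x_strong = 1.0 * z + v     # Z explains a lot of X
x_weak = 0.02 * z + v      # Z explains almost none of X

F_strong = first_stage_F(x_strong, z)
F_weak = first_stage_F(x_weak, z)
print("strong:", F_strong, "weak:", F_weak, "rule of thumb: compare with 10")
```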

Question 23

Q

What does comparing to F=10 actually allow us to do?

Answer

A

Check whether the bias of TSLS, relative to the bias of OLS, is greater or less than roughly 10% (if F is less than 10 the relative bias is more than 10%, and vice versa!)
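The usual textbook approximation behind this rule of thumb, stated here from the standard treatment rather than from the notes (so check it against them), is:

```latex
E\!\left(\hat\beta_1^{TSLS}\right) - \beta_1 \;\approx\; \frac{\beta_1^{OLS} - \beta_1}{E(F) - 1},
\qquad
E(F) = 10 \;\Rightarrow\; \frac{1}{10 - 1} = \frac{1}{9} \approx 0.11
```

Here β1^OLS stands for the value the OLS estimator converges to and F is the first-stage F-statistic, so at F = 10 the TSLS bias is a little over 10% of the OLS bias, and it shrinks as F grows.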

Question 24

Q

2 solutions to weak instruments?

Answer

A

1) Find better instruments/drop ones you think may be weak

2) Use other estimators (can be very complicated though)

Question 25

Q

What criterion must be fulfilled to test for instrument exogeneity? What is the consequence for TSLS if this assumption does not hold?

Answer

A

Criterion: the model must be overidentified to do this test!

If the assumption of instrument exogeneity fails, then TSLS is INCONSISTENT!

Question 26

Q

When to use J-test of overidentifying restrictions?

Answer

A

If given, say, 2 IVs, Z1 and Z2, and you compute TSLS with each one separately and the estimates of beta are very different, then you know that at least one of Z1 or Z2 must be invalid

Question 27

Q

See

Answer

A

bottom of p2s2 on how to conduct a J-test

Question 28

Q

What are the hypotheses for a J-test?

Answer

A

H0: All instruments are exogenous
H1: At least one instrument is not exogenous

Question 29

Q

J-statistic distribution? How many DofF in a J-test?

Answer

A

Chi-squared, with m-k DofF
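A sketch of one common way to compute the J-statistic, for one endogenous X and m instruments with no W's: take the TSLS residuals (built from the actual X), regress them on the instruments, compute the homoskedasticity-only F for the Z coefficients, and set J = mF. The function names and the simulated data are assumptions for illustration, and the procedure in the notes may differ in detail.

```python
import numpy as np

def fit(y, X):
    """Least squares coefficients of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def j_statistic(y, x, Z):
    """J = m * F, compared with chi-squared on m - k degrees of freedom (k = 1 here)."""
    n, m = Z.shape
    const = np.ones(n)
    Zc = np.column_stack([const, Z])
    # TSLS: stage 1 fitted values, then stage 2 coefficients
    x_hat = Zc @ fit(x, Zc)
    beta = fit(y, np.column_stack([const, x_hat]))
    # TSLS residuals use the actual X, not X(hat)
    u_hat = y - beta[0] - beta[1] * x
    # Regress residuals on the instruments; F-test that all Z coefficients are zero
    resid = u_hat - Zc @ fit(u_hat, Zc)
    ssr_u = resid @ resid
    ssr_r = np.sum((u_hat - u_hat.mean()) ** 2)
    F = ((ssr_r - ssr_u) / m) / (ssr_u / (n - m - 1))
    return m * F

# Simulated data (assumed): two instruments, one endogenous regressor, so m - k = 1
rng = np.random.default_rng(2)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
Z = np.column_stack([z1, z2])
e = rng.normal(size=n)
x = z1 + z2 + 0.5 * e + rng.normal(size=n)
y_valid = 1.0 + 2.0 * x + e                 # both instruments exogenous
y_invalid = 1.0 + 2.0 * x + e + 0.5 * z2    # z2 enters the error: not exogenous

J_valid = j_statistic(y_valid, x, Z)
J_invalid = j_statistic(y_invalid, x, Z)
print("J (valid):", J_valid, "J (invalid):", J_invalid, "5% critical value, 1 DofF: 3.84")
```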

Question 30

Q

Why must the model be overidentified to do a J-test?

Answer

A

Because otherwise the DofF, m-k, will equal 0!

Question 31

Q

What does it mean if the actual J statistic is in the critical region?

Answer

A

Means that H0 is rejected: there is evidence that at least one IV is endogenous

Question 32

Q

Summary?

Answer

A

Slides 38 and 39 if needed

Question 33

Q

See

Answer

A

S3P3 in notes on cig demand bit

Question 34

Q

How can we interpret the J-test rejection?

Answer

A

The J-test does not say which instrument is the problem. Can use intuition to try to work out which instrument(s) is/are endogenous, then redo the model and try again

Question 35

Q

What points need to be considered when assessing the validity of a study?

Answer

A

1) OVB
2) Functional form misspecification
3) Simultaneous causality bias
4) Errors-in-variables bias
5) Selection bias (ie. have all states been used or just some???)
6) Are IVs truly relevant and exogenous
7) Old data: if using old data need to consider if it is externally valid to apply it to today’s problems