Endogeneity & Instrumental Variables Flashcards

Question 1

Q

A) In general, when is Xi endogenous (which assumption is violated)?

B) What does this mean for our estimate of β₁ (β^₁)?

Answer

A

A) Usually the zero conditional mean assumption holds:
E(εi|Xi) = 0

This is violated if Xi and εi covary, i.e.
Cov(Xi,εi) ≠ 0
(E.g. high X is associated with high ε)

B) Then Xi is endogenous!
So E(β^₁) ≠ β₁
So OLS is biased!

Question 2

Q

Omitted variables are a source of endogeneity

E.g. we estimate
Yi = β₀ + β₁Xi + εi

But omit a relevant variable Zi

When would endogeneity occur?

Answer

A

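When Cov(Xi,Zi) ≠ 0 and β₂ ≠ 0, where β₂ is the coefficient on Zi in the true model
(The usual conditions for omitted variable bias)

The omitted β₂Zi ends up in the error term, so Cov(Xi,εi) ≠ 0, so Xi is endogenous!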
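
A small simulation can make the omitted variable case concrete. This is only an illustrative sketch: the coefficient values, the sample size, and the helper names `cov` and `ols_slope` are assumptions made for the example, not part of the flashcards. The true model includes Zi with β₂ ≠ 0, Zi is correlated with Xi, and Y is regressed on X alone.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """Slope from a simple regression of y on x: Cov(x, y) / Var(x)."""
    return cov(x, y) / cov(x, x)

rng = random.Random(0)                  # fixed seed so the run is repeatable
n = 20000
beta0, beta1, beta2 = 1.0, 2.0, 3.0     # assumed "true" coefficients

z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]          # Cov(X, Z) != 0
y = [beta0 + beta1 * xi + beta2 * zi + rng.gauss(0, 1)
     for xi, zi in zip(x, z)]

short_slope = ols_slope(x, y)                         # regression that omits Z
implied = beta1 + beta2 * cov(x, z) / cov(x, x)       # beta1 plus the omitted variable bias

print(f"true beta1 = {beta1}")
print(f"OLS slope with Z omitted = {short_slope:.3f}")
print(f"beta1 + beta2*Cov(X,Z)/Var(X) = {implied:.3f}")
```

The slope from the short regression should sit well away from the true β₁ and close to β₁ + β₂·Cov(X,Z)/Var(X), which is the usual omitted variable bias expression.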

Question 3

Q

Measurement errors are also a source of endogeneity:

Explain (assume the independent variable is measured with error!)

Answer

A

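Suppose we observe X*i = Xi + ui, where Xi is the true value and ui is the measurement error. Substituting Xi = X*i − ui into the model gives
Yi = β₀ + β₁X*i + εi − β₁ui
where μi = εi − β₁ui is the new error (normal error plus measurement error)

Since Cov(X*i,ui) ≠ 0, Cov(X*i,μi) ≠ 0
So endogeneity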
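
The effect of measurement error can be checked numerically. The sketch below is an assumption-laden illustration (made-up coefficients, unit variances, and a hypothetical `cov` helper), treating X* as the observed, mismeasured regressor equal to the true value plus noise.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

rng = random.Random(1)
n = 20000
beta0, beta1 = 1.0, 2.0                             # assumed "true" coefficients

x_true = [rng.gauss(0, 1) for _ in range(n)]        # true regressor (unobserved)
u = [rng.gauss(0, 1) for _ in range(n)]             # measurement error
x_obs = [xt + ui for xt, ui in zip(x_true, u)]      # observed X* = X + u
y = [beta0 + beta1 * xt + rng.gauss(0, 1) for xt in x_true]

slope_true = cov(x_true, y) / cov(x_true, x_true)   # infeasible: uses the true X
slope_obs = cov(x_obs, y) / cov(x_obs, x_obs)       # feasible: uses the mismeasured X*

print(f"slope using true X     = {slope_true:.3f}")
print(f"slope using observed X* = {slope_obs:.3f}")
```

With equal variances for the true regressor and the measurement error, the slope on the observed variable should come out at roughly half the true β₁, i.e. the estimate is pulled toward zero.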

Question 4

Q

3rd source of endogeneity - autocorrelation

Consider a time series model with a RHS lagged dependent variable: Yt =β₀ +β₁Yt−1 +εt

With error term
εt = ρεt−1 + νt

How can we establish endogeneity?

Answer

A

It must be the case that
Cov(Yt−1,εt−1) ≠ 0 (the past value of Y depends on the past error)
Cov(εt,εt−1) ≠ 0 (the error depends on the past error, when ρ ≠ 0)

Therefore
Cov(Yt−1,εt) ≠ 0
(The independent variable is correlated with the error!)
So endogeneity!
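
The chain of covariances can be checked with a simulated series. This is a minimal sketch under assumed values (β₀ = 0, β₁ = 0.5, standard normal νt); the function name `lag_error_cov` is hypothetical. It compares the sample Cov(Yt−1, εt) with and without autocorrelation in the error.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def lag_error_cov(rho, beta1=0.5, n=20000, seed=2):
    """Simulate Y_t = beta1 * Y_{t-1} + eps_t with eps_t = rho * eps_{t-1} + nu_t
    and return the sample covariance between Y_{t-1} and eps_t."""
    rng = random.Random(seed)
    y_prev, eps_prev = 0.0, 0.0
    y_lag, eps = [], []
    for _ in range(n):
        e = rho * eps_prev + rng.gauss(0, 1)   # autocorrelated error
        y = beta1 * y_prev + e                 # beta0 is set to 0 for simplicity
        y_lag.append(y_prev)                   # the regressor Y_{t-1}
        eps.append(e)                          # the error eps_t
        y_prev, eps_prev = y, e
    return cov(y_lag, eps)

cov_no_ac = lag_error_cov(rho=0.0)   # no autocorrelation
cov_ac = lag_error_cov(rho=0.6)      # autocorrelated errors

print(f"Cov(Y_t-1, eps_t) with rho = 0.0: {cov_no_ac:.3f}")
print(f"Cov(Y_t-1, eps_t) with rho = 0.6: {cov_ac:.3f}")
```

With ρ = 0 the covariance should be close to zero, so the lagged dependent variable alone is not the problem; it is the combination with an autocorrelated error that creates endogeneity.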

Question 5

Q

4th source of endogeneity - Simultaneity (reverse causality)

What is meant by this? Give an example.

Answer

A

Y is a function of X, and X is a function of Y

E.g. supply and demand (price depends on quantity, and quantity depends on price)

Question 6

Q

So 4 sources of endogeneity

Answer

A

Omitted variable
Measurement error
Autocorrelation
Simultaneity

Question 7

Q

Main reason endogeneity is bad

Answer

A

OLS estimates are biased (and inconsistent)

Question 8

Q

How to solve endogeneity

Answer

A

Break the correlation between Xi and εi - isolate the part of the variation in Xi that is uncorrelated with the error!

Do this using an instrumental variable!

Question 9

Q

2 assumptions we make for instrumental variables

Answer

A

Instrument relevance - the instrument is correlated with X, i.e. Cov(Zi,Xi) ≠ 0 (ideally strongly, so it explains variation in X)

Instrument exogeneity - the instrument is uncorrelated with εi, i.e. Cov(Zi,εi) = 0

Question 10

Q

Simple model Yi =β₀ +β₁ Xi +εi

But Cov(Xi,εi) ≠ 0, so Xi is endogenous and the OLS estimate of β₁ is biased

Now add instrumental variable Zi

How can we use instrument to identify β₁?

Answer

A

Take the covariance with Zi on both sides

Yi = β₀ + β₁Xi + εi turns into
Cov(Zi,Yi) = Cov(Zi,β₀) + Cov(Zi,β₁Xi) + Cov(Zi,εi)

Then, since β₀ is a constant (Cov(Zi,β₀) = 0) and the instrument is uncorrelated with the error (Cov(Zi,εi) = 0):
Cov(Zi,Yi) = β₁Cov(Zi,Xi) (taking β₁ outside the covariance)

Finally solve for β₁ (possible because relevance gives Cov(Zi,Xi) ≠ 0)
β₁ = Cov(Zi,Yi)/Cov(Zi,Xi)

Question 11

Q

What does this suggest

Answer

A

A sample analogue

β^₁IV: instrumental variable estimator of β₁

β^₁IV = Σ(Zi − Z̄)(Yi − Ȳ) / Σ(Zi − Z̄)(Xi − X̄)
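
The sample analogue is short enough to code directly. The sketch below is illustrative only: the data-generating values and the names `cov` and `iv_estimate` are assumptions. X is made endogenous by letting it depend on ε, while Z affects X but is independent of ε. The ratio of sample covariances equals the ratio of sums in the formula because the 1/(n−1) factors cancel.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def iv_estimate(z, x, y):
    """Sample analogue of Cov(Z, Y) / Cov(Z, X)."""
    return cov(z, y) / cov(z, x)

rng = random.Random(3)
n = 20000
beta0, beta1 = 1.0, 2.0                              # assumed "true" coefficients

z = [rng.gauss(0, 1) for _ in range(n)]              # instrument
eps = [rng.gauss(0, 1) for _ in range(n)]            # structural error
x = [zi + 0.8 * ei + rng.gauss(0, 1)                 # X depends on eps: endogenous
     for zi, ei in zip(z, eps)]
y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, eps)]

b_ols = cov(x, y) / cov(x, x)     # biased because Cov(X, eps) != 0
b_iv = iv_estimate(z, x, y)       # uses only the variation in X driven by Z

print(f"true beta1 = {beta1}, OLS = {b_ols:.3f}, IV = {b_iv:.3f}")
```

In this setup OLS should overshoot the true slope, while the IV ratio should land close to it.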

Question 12

Q

β^₁IV : properties

Answer

A

Consistent, but may be biased in small samples (the bias shrinks as the sample gets large)

Question 13

Q

Standard errors of IV and OLS

Answer

A

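SE of IV is greater than SE of OLS, i.e. IV is less efficient (however still worth it, since it turns an inconsistent estimate into a consistent one!)

OLS: √(σ^²/TSSx)

IV: √(σ^²/(TSSx · R²zx))

R²zx is the R² from regressing X on Z, and R²zx < 1, so the IV standard error is larger!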
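
A quick bit of arithmetic shows how much the standard error grows as R²zx falls. The numbers below (σ^² = 4, TSSx = 100) are made up for illustration, and the function names `se_ols` and `se_iv` are hypothetical.

```python
import math

def se_ols(sigma2, tss_x):
    """OLS standard error of the slope: sqrt(sigma^2 / TSS_x)."""
    return math.sqrt(sigma2 / tss_x)

def se_iv(sigma2, tss_x, r2_zx):
    """IV standard error of the slope: sqrt(sigma^2 / (TSS_x * R^2_zx))."""
    return math.sqrt(sigma2 / (tss_x * r2_zx))

sigma2, tss_x = 4.0, 100.0            # made-up values for illustration
print(f"OLS SE = {se_ols(sigma2, tss_x):.3f}")
for r2 in (0.9, 0.25, 0.01):          # strong, moderate, weak first stage
    ratio = se_iv(sigma2, tss_x, r2) / se_ols(sigma2, tss_x)
    print(f"R2_zx = {r2}: IV SE = {se_iv(sigma2, tss_x, r2):.3f} ({ratio:.1f}x OLS)")
```

The ratio of the two standard errors is 1/√R²zx, so the penalty for using IV depends entirely on how well Z explains X.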

Question 14

Q

So instruments are useful for turning biased estimates (from an endogenous regressor) into consistent ones, i.e. β^₁IV

How can we obtain β^₁IV in practice? (2)

Answer

A

Sample analogue - as shown (for 1 instrument and one endogenous variable)

Two stage least squares (2SLS) for multiple variables or instruments

Question 15

Q

The 4 equations we use:
Basic model: Yi = β₀ + β₁Xi + εi
First stage: Xi = γ₀ + γ₁Zi + vi (the first thing we estimate)
Reduced form: Yi = λ₀ + λ₁Zi + ui
Second stage: Yi = β₀ + β₁X^i + vi (X^i is the fitted value from the first stage)

Steps to 2SLS - first stage

Answer

A

Estimate the “first stage” equation by OLS and get the fitted values
X^i = γ^₀ + γ^₁Zi

Hypothesis test
H₀: γ₁ = 0 (Z is not relevant)
H₁: γ₁ ≠ 0 (Z is relevant!)

Note: the critical region is |t| > 3.16, higher than the usual 1.96 (we need strong instruments that clearly reject H₀; with one instrument this matches the rule of thumb that the first-stage F = t² > 10)
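
The first stage and its relevance test can be sketched as follows. This is a hedged illustration: the data are simulated with an assumed γ₁ = 0.5, the helper name `first_stage` is hypothetical, and the standard error is the usual homoskedastic OLS one.

```python
import math
import random

def first_stage(z, x):
    """OLS of x on a constant and z.

    Returns (gamma0_hat, gamma1_hat, t statistic for H0: gamma1 = 0).
    """
    n = len(z)
    z_bar, x_bar = sum(z) / n, sum(x) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    g1 = s_zx / s_zz
    g0 = x_bar - g1 * z_bar
    rss = sum((xi - g0 - g1 * zi) ** 2 for zi, xi in zip(z, x))
    se_g1 = math.sqrt(rss / (n - 2) / s_zz)     # homoskedastic OLS standard error
    return g0, g1, g1 / se_g1

rng = random.Random(4)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
x = [1.0 + 0.5 * zi + rng.gauss(0, 1) for zi in z]   # assumed first stage, gamma1 = 0.5

g0, g1, t_stat = first_stage(z, x)
x_hat = [g0 + g1 * zi for zi in z]                   # fitted values for the second stage

print(f"gamma1_hat = {g1:.3f}, t = {t_stat:.2f}")
print("instrument passes |t| > 3.16:", abs(t_stat) > 3.16)
```

The fitted values `x_hat` are what the second stage would use in place of X.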

Question 16

Q

Second stage of 2SLS

Answer

A

Estimate the second stage regression
Yi = β₀ + β₁X^i + vi (uses X^i, the fitted value from the first stage)

This gives β^₁2SLS, now a consistent estimate!

And with one endogenous regressor (e.g. X) and one instrument:
β^₁2SLS = β^₁IV (same as the sample analogue)
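
The equivalence between 2SLS and the covariance ratio in the one-instrument case can be verified numerically. The sketch below uses simulated data with assumed coefficients and a hypothetical helper `simple_ols`; it runs both stages by hand and compares the result with Σ(Zi − Z̄)(Yi − Ȳ) / Σ(Zi − Z̄)(Xi − X̄).

```python
import random

def simple_ols(x, y):
    """Intercept and slope from OLS of y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

rng = random.Random(5)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
eps = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.8 * ei + rng.gauss(0, 1) for zi, ei in zip(z, eps)]   # endogenous X
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, eps)]               # assumed beta1 = 2

# First stage: regress X on Z and keep the fitted values
g0, g1 = simple_ols(z, x)
x_hat = [g0 + g1 * zi for zi in z]

# Second stage: regress Y on the fitted values
b0_2sls, b1_2sls = simple_ols(x_hat, y)

# Sample-analogue IV estimator for comparison
z_bar, x_bar, y_bar = sum(z) / n, sum(x) / n, sum(y) / n
b1_iv = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
         / sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x)))

print(f"2SLS slope = {b1_2sls:.6f}, IV ratio = {b1_iv:.6f}")
```

One caveat worth keeping in mind: running the second stage by hand gives the right coefficient, but the standard errors reported by that second OLS regression are not the correct 2SLS standard errors.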

Question 17

Q

2SLS with multiple variables and instruments, e.g. suppose a model

Yi = β₀ + β₁X₁i + β₂X₂i + … + βkXki + εi

With m instruments Z₁i, …, Zmi,
where X₁ is endogenous and the rest are exogenous.
What would the first stage and second stage equations be?

Answer

A

X₁ is the problem, as it is endogenous, i.e. correlated with the error, Cov(X₁i,εi) ≠ 0, so focus on that

First stage equation
X₁i = γ₀ + γ₁Z₁i + … + γmZmi + ø₂X₂i + … + økXki + νi

(So ø for the well behaved exogenous variables, and γ for the instruments)

Second stage equation
Yi = β₀ + β₁X^₁i + β₂X₂i + … + βkXki + εi

So the same as the basic model but with the fitted value X^₁i in place of X₁i
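
A compact way to see both stages with several regressors is to run them by hand. The sketch below is an illustration under stated assumptions: two instruments (m = 2), one exogenous control X₂, made-up coefficients, and a hypothetical `ols` helper that solves the normal equations with plain Gaussian elimination so the example needs no external libraries.

```python
import random

def ols(columns, y):
    """Coefficients from OLS of y on the given columns
    (pass a column of ones for the intercept)."""
    k = len(columns)
    # Normal equations (X'X) b = X'y stored as an augmented matrix
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))]
         for i in range(k)]
    for col in range(k):                       # Gauss-Jordan with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]
        for r in range(k):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[k] for row in a]

rng = random.Random(6)
n = 5000
ones = [1.0] * n
z1 = [rng.gauss(0, 1) for _ in range(n)]       # instrument 1
z2 = [rng.gauss(0, 1) for _ in range(n)]       # instrument 2
x2 = [rng.gauss(0, 1) for _ in range(n)]       # exogenous regressor
eps = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.7 * a + 0.5 * b + 0.3 * c + 0.8 * e + rng.gauss(0, 1)   # endogenous: depends on eps
      for a, b, c, e in zip(z1, z2, x2, eps)]
y = [1.0 + 2.0 * p - 1.0 * q + e for p, q, e in zip(x1, x2, eps)]  # assumed beta1 = 2, beta2 = -1

# First stage: X1 on a constant, all instruments, and the exogenous regressor
g = ols([ones, z1, z2, x2], x1)
x1_hat = [g[0] + g[1] * a + g[2] * b + g[3] * c for a, b, c in zip(z1, z2, x2)]

# Second stage: Y on a constant, the fitted X1, and the exogenous regressor
b_2sls = ols([ones, x1_hat, x2], y)
b_ols = ols([ones, x1, x2], y)                 # naive OLS for comparison

print("2SLS:", [round(v, 3) for v in b_2sls])
print("OLS: ", [round(v, 3) for v in b_ols])
```

Note that the exogenous regressor X₂ appears in both stages; leaving it out of the first stage would make the fitted values invalid for the second.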

Question 18

Q

What if R²zx is small?

Answer

A

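The variance of the IV estimator is much bigger than that of the OLS estimator

Recall the standard errors: OLS √(σ^²/TSSx) and IV √(σ^²/(TSSx · R²zx))

So IV always has a disadvantage in precision, but makes up for it by removing the bias from endogeneity. If the instrument is poor (small R²zx), the loss of precision is much worse!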

Question 19

Q

So if instrument poor, large issues. Why?

Answer

A

Use the probability limit of the IV estimator to show:
plim β^₁IV = β₁ + [Corr(Z,ε)/Corr(Z,X)] · (σε/σX)

If the instrument is weak, i.e. Cov(Z,X) from the 1st assumption is small, even small violations of the 2nd assumption Cov(Z,ε) = 0 create large inconsistencies!

(So if Z has low instrument relevance (not highly correlated with X) and is even slightly correlated with the error, breaking the instrument exogeneity assumption, we get large inconsistencies)
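
Plugging numbers into the probability limit shows how a weak instrument magnifies a small exogeneity violation. The values below are made up for illustration (β₁ = 1, σε = σX = 1, Corr(Z,ε) = 0.02), and the function name `plim_iv` is hypothetical.

```python
def plim_iv(beta1, corr_z_eps, corr_z_x, sd_eps, sd_x):
    """Probability limit of the IV estimator:
    beta1 + [Corr(Z, eps) / Corr(Z, X)] * (sd_eps / sd_x)."""
    return beta1 + (corr_z_eps / corr_z_x) * (sd_eps / sd_x)

beta1 = 1.0
small_violation = 0.02        # Z is very slightly correlated with the error

strong = plim_iv(beta1, small_violation, corr_z_x=0.8, sd_eps=1.0, sd_x=1.0)
weak = plim_iv(beta1, small_violation, corr_z_x=0.05, sd_eps=1.0, sd_x=1.0)

print(f"strong instrument (Corr(Z,X) = 0.80): plim = {strong:.3f}")
print(f"weak instrument   (Corr(Z,X) = 0.05): plim = {weak:.3f}")
```

The same tiny correlation between Z and ε is divided by a much smaller number when the instrument is weak, so the inconsistency is many times larger.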

Question 20

Q

How to test for endogeneity (so far we have just assumed the variable is endogenous)

Answer

A

Durbin-Wu-Hausman

Question 21

Q

Durbin-Wu-Hausman test for endogeneity - how is it carried out?

Setup same as 2SLS: suppose a model

Yi = β₀ + β₁X₁i + β₂X₂i + … + βkXki + εi

With m instruments,
where X₁ is possibly endogenous and the rest are exogenous.

Answer

A

Obtain the residuals v^i from the first stage equation.
v^i = X₁i − [γ^₀ + γ^₁Z₁i + … + γ^mZmi + ø^₂X₂i + … + ø^kXki]

Then include them in our original model (add πv^i) and estimate by OLS
Yi = β₀ + β₁X₁i + β₂X₂i + … + βkXki + πv^i + εi

Hypothesis test (t test)
H₀: π = 0 (X₁ is exogenous)
H₁: π ≠ 0 (X₁ is endogenous, X₁i is correlated with ε)
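
The test can be sketched on simulated data. This is a simplified illustration, not a full implementation: it assumes one instrument and no extra exogenous regressors, uses made-up coefficients, and relies on a hypothetical `ols` helper (normal equations solved by Gauss-Jordan elimination, homoskedastic standard errors). The parameter `endog_strength` controls how strongly X₁ depends on ε.

```python
import math
import random

def ols(columns, y):
    """OLS of y on the given columns (include a column of ones for the intercept).
    Returns (coefficients, standard errors, residuals)."""
    k, n = len(columns), len(y)
    # Augmented matrix [X'X | X'y | I]; elimination turns it into [I | b | (X'X)^-1]
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))]
         + [1.0 if i == j else 0.0 for j in range(k)]
         for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]
        for r in range(k):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    beta = [row[k] for row in a]
    resid = [yi - sum(b * c[t] for b, c in zip(beta, columns))
             for t, yi in enumerate(y)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * a[i][k + 1 + i]) for i in range(k)]
    return beta, se, resid

def dwh_t_stat(endog_strength, n=5000, seed=7):
    """t statistic on the first-stage residual in the augmented regression."""
    rng = random.Random(seed)
    ones = [1.0] * n
    z = [rng.gauss(0, 1) for _ in range(n)]
    eps = [rng.gauss(0, 1) for _ in range(n)]
    x1 = [zi + endog_strength * ei + rng.gauss(0, 1) for zi, ei in zip(z, eps)]
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x1, eps)]
    _, _, v_hat = ols([ones, z], x1)             # first-stage residuals
    beta, se, _ = ols([ones, x1, v_hat], y)      # original model plus pi * v_hat
    return beta[2] / se[2]                       # t statistic for H0: pi = 0

t_exog = dwh_t_stat(endog_strength=0.0)   # X1 truly exogenous
t_endog = dwh_t_stat(endog_strength=0.8)  # X1 endogenous

print(f"t on v_hat when X1 is exogenous:  {t_exog:.2f}")
print(f"t on v_hat when X1 is endogenous: {t_endog:.2f}")
```

A large |t| on v^i leads to rejecting H₀ and treating X₁ as endogenous; a small one gives no evidence against using OLS.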