Lecture notes 14 Endogeneity Flashcards
What is endogeneity?
When X is correlated with the error term.
Factors impacting both X and Y that are not included in the model.
What are the cases in which endogeneity could arise?
Measurement error
Economic Theory
Omitted variable bias.
What is the implication of endogeneity?
It causes the OLS estimates to be biased.
How does omitted variable bias cause endogeneity?
Two cases:
Misspecification
Not observing an important variable.
E.g. the true model is
Y = B0 + B1X + B2X^2 + epsilon
but we run Y = B0 + B1X + epsilon.
The error term of the estimated model then includes B2X^2, which is correlated with X.
So the expectation of epsilon given X is not zero.
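The misspecification case above can be checked with a small simulation. This is only a sketch: the coefficient values, the uniform distribution for X and the helper name `ols` are assumptions chosen for illustration, not part of the lecture.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for a design matrix X whose first column is the intercept."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(0, 2, n)        # assumed: X > 0, so X and X^2 are strongly correlated
eps = rng.normal(0, 1, n)
b0, b1, b2 = 1.0, 2.0, 1.5      # hypothetical true coefficients
y = b0 + b1 * x + b2 * x**2 + eps

ones = np.ones(n)
short = ols(np.column_stack([ones, x]), y)         # X^2 omitted, so it sits in the error
full = ols(np.column_stack([ones, x, x**2]), y)    # correctly specified model

print("slope on X with X^2 omitted:", short[1])
print("slope on X in the full model:", full[1])
```

With X^2 left out, the slope on X absorbs part of the effect of X^2 and moves well away from the assumed value of 2, while the full model recovers it.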
How does not including important variables cause endogeneity?
If a variable that helps explain Y is not included, it goes into the error term; if it is also correlated with X, the OLS estimate is biased.
How does measurement error cause endogeneity?
We observe X (e.g. in a survey) instead of the true X*, with measurement error u = X* - X.
Substituting X* = X + u into the model moves the extra term B1u into the error term, and this term is correlated with the observed X.
So E(epsilon | X) is not equal to 0.
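A minimal simulation of the measurement error case, assuming classical measurement error (u independent of the true X*) and hypothetical parameter values. With equal variances for X* and u, the slope on the mismeasured regressor should shrink to roughly half of the true slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta1 = 2.0                        # hypothetical true slope
x_true = rng.normal(0, 1, n)       # X*, the true regressor
u = rng.normal(0, 1, n)            # measurement error, assumed independent of X*
x_obs = x_true - u                 # observed X, so that u = X* - X
y = 1.0 + beta1 * x_true + rng.normal(0, 1, n)

def slope(x, y):
    """Simple-regression slope: sample cov(x, y) / sample var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_true = slope(x_true, y)      # regression on the true X*
slope_obs = slope(x_obs, y)        # regression on the mismeasured X

print("slope using X*:", slope_true)
print("slope using observed X:", slope_obs)
```

The second slope is pulled towards zero, which is the bias the card describes.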
How can economic theory cause endogeneity and what is an example of this?
Cobb-Douglas production function: the inputs in F(K, L) are correlated with epsilon, as firms with high A (productivity) will use more K and L.
What is a second example?
Assume a demand function with price as a regressor.
When demand goes up, epsilon goes up, which also causes price to go up.
What is endogeneity via simultaneity?
When variables have a two-way relationship at the same time.
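The demand example can be sketched as a simulation. The linear demand and supply curves and all parameter values below are assumptions for illustration. Price is set where demand equals supply, so a demand shock moves price, and OLS of quantity on price no longer recovers the demand slope.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
a, b = 10.0, 1.0      # assumed demand curve: Q = a - b*P + eps_d
c, d = 2.0, 1.0       # assumed supply curve: Q = c + d*P + eps_s
eps_d = rng.normal(0, 1, n)
eps_s = rng.normal(0, 1, n)

# Equilibrium price and quantity solve demand = supply in every market.
p = (a - c + eps_d - eps_s) / (b + d)
q = a - b * p + eps_d

corr_p_eps = np.corrcoef(p, eps_d)[0, 1]              # price is correlated with the demand error
ols_slope = np.cov(p, q)[0, 1] / np.var(p, ddof=1)    # OLS of Q on P

print("corr(P, demand error):", corr_p_eps)
print("OLS slope:", ols_slope, "| assumed true demand slope:", -b)
```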
How can one solve endogeneity?
Instrumental variable
What is an instrumental variable usually denoted by?
z
What are the two conditions for a valid instrument?
Instrument relevance: Z and X are correlated.
Instrument exogeneity: Z and epsilon are not correlated.
What is an example of an Instrumental variable
How many instrumental variables do you need for endogenous variables?
At least as many IVs as endogenous variables.
What happens to the use of IV if the instrument relevance is very weak?
It means the IV estimator is very imprecise.
What is the issue if the instrument exogeneity condition does not hold?
It means that the IV estimator is biased.
How do you do an IV regression and why is this?
Once you have found an instrument that is relevant and exogenous:
First stage: regress X = gamma0 + gamma1 Z + v.
Then save X hat, the fitted values from this regression. X hat is the part of X that is not correlated with epsilon, as the instrument is not correlated with epsilon.
Second stage: Y = beta0 + beta1 X hat + error.
Thus we regress Y on X hat to get beta0 and beta1, and these are called the IV estimators.
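The two steps above can be sketched by hand with numpy. The data-generating process, the parameter values and the helper name `ols` are assumptions made for illustration. X is made endogenous by letting it depend on epsilon, and Z is constructed to be relevant and exogenous.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for a design matrix X whose first column is the intercept."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 100_000
z = rng.normal(0, 1, n)                  # instrument: drives X, independent of eps
eps = rng.normal(0, 1, n)
v = 0.8 * eps + rng.normal(0, 1, n)      # first-stage error, correlated with eps
x = 0.5 + 1.0 * z + v                    # endogenous regressor
y = 1.0 + 2.0 * x + eps                  # hypothetical true beta1 = 2

ones = np.ones(n)
Z = np.column_stack([ones, z])

# First stage: regress X on Z and keep the fitted values X hat.
x_hat = Z @ ols(Z, x)

# Second stage: regress Y on X hat.
beta_iv = ols(np.column_stack([ones, x_hat]), y)
beta_ols = ols(np.column_stack([ones, x]), y)

print("OLS slope:", beta_ols[1])
print("IV (2SLS) slope:", beta_iv[1])
```

The point estimates from this manual second stage are the IV estimates, but the standard errors an OLS routine would report for the second stage are not the correct IV standard errors, so a dedicated IV routine is normally used for inference.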
What is IV estimator also called?
Two-stage least squares (2SLS) estimator.
How do you deal with IV if you have a mix of exogenous and endogenous variables?
You include the other exogenous variables in the first-stage equation with the instrument, and again in the second stage alongside X hat.
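A sketch of the same two-stage procedure with one extra exogenous regressor, here called W. The variable W, the data-generating process and all coefficients are hypothetical. W enters the first stage together with the instrument and enters the second stage together with X hat.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for a design matrix X whose first column is the intercept."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(5)
n = 100_000
w = rng.normal(size=n)                    # exogenous control
z = 0.5 * w + rng.normal(size=n)          # instrument, allowed to correlate with W
eps = rng.normal(size=n)
x = 1.0 * z + 0.7 * w + 0.8 * eps + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x - 1.0 * w + eps         # hypothetical true coefficients: 2 on X, -1 on W

ones = np.ones(n)

# First stage: X on the instrument AND the exogenous control.
first = np.column_stack([ones, z, w])
x_hat = first @ ols(first, x)

# Second stage: Y on X hat and the same exogenous control.
beta = ols(np.column_stack([ones, x_hat, w]), y)

print("coefficient on X:", beta[1])
print("coefficient on W:", beta[2])
```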
What is the difference between the OLS variance and the IV variance?
The variance of X hat is never greater than the variance of X, as X is made up of an intercept, the instrument and an error term whilst X hat is only made up of the intercept and the instrument.
Variance of OLS = sigma squared / sum of (X - X bar)^2
Variance of IV = sigma squared / sum of (X hat - X hat bar)^2
As OLS divides by something bigger, the variance of the OLS estimator is smaller.
What is the difference between OLS variance and IV variance
Therefore, variance of IV estimator is larger than the variance of OLS estimator
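The comparison above can be written out as a short derivation. It is a sketch under the same simplified variance formulas used on the card (one regressor, one instrument, common sigma squared); the first-stage residual \hat{v}_i and the first-stage R^2 are notation introduced here.

```latex
\begin{align*}
X_i &= \hat{X}_i + \hat{v}_i, \qquad \hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i \\
\sum_i (X_i - \bar{X})^2 &= \sum_i (\hat{X}_i - \bar{\hat{X}})^2 + \sum_i \hat{v}_i^2 \\
\frac{\operatorname{Var}(\hat{\beta}_1^{IV})}{\operatorname{Var}(\hat{\beta}_1^{OLS})}
  &= \frac{\sum_i (X_i - \bar{X})^2}{\sum_i (\hat{X}_i - \bar{\hat{X}})^2}
   = \frac{1}{R^2_{\text{first stage}}} \;\ge\; 1
\end{align*}
```

So the weaker the first stage (the smaller its R^2), the larger the IV variance is relative to the OLS variance.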
What happens to the variance if instrument relevance fails?
This means sum of (X hat - X hat bar)^2 = 0, so the denominator of the IV variance is zero and the variance is not defined,
so the IV estimator fails.
What is the statistical test for instrument relevance?
Do the first-stage regression
X = gamma0 + gamma1 Z1 + u
H0: gamma1 = 0
H1: gamma1 is not equal to 0
If |t value| > CV, reject H0 and the instrument is relevant.
How would you do it for more than one IV?
You would do an F-test that the coefficients on all the instruments in the first stage are jointly zero.
The usual rule of thumb is that if the first-stage F-statistic is greater than 10, the instruments are relevant.
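The first-stage F-test can be computed by hand from restricted and unrestricted sums of squared residuals. The function name `first_stage_f` and the simulated strong and weak instruments are assumptions for illustration. The sketch covers the case where the first stage contains only an intercept and the instruments.

```python
import numpy as np

def first_stage_f(x, instruments):
    """F-statistic for H0: all instrument coefficients are zero in the first stage.

    Assumes the first stage contains only an intercept and the instruments.
    """
    n = len(x)
    Z = np.column_stack([np.ones(n), instruments])
    q = Z.shape[1] - 1                             # number of instruments tested
    coef = np.linalg.lstsq(Z, x, rcond=None)[0]
    ssr_u = np.sum((x - Z @ coef) ** 2)            # unrestricted: intercept + instruments
    ssr_r = np.sum((x - x.mean()) ** 2)            # restricted: intercept only
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - q - 1))

rng = np.random.default_rng(4)
n = 1_000
z = rng.normal(size=(n, 2))                        # two candidate instruments
v = rng.normal(size=n)
x_strong = 0.5 * z[:, 0] + 0.5 * z[:, 1] + v       # instruments clearly move X
x_weak = 0.01 * z[:, 0] + 0.01 * z[:, 1] + v       # instruments barely move X

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
print("first-stage F, strong instruments:", f_strong)
print("first-stage F, weak instruments:", f_weak)
```

Comparing each statistic with the threshold of 10 gives the relevance check described on the card.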
How do we test the exogeneity condition?
How do we test for the presence of endogeneity?
How do we know if an instrument is relevant?
Use the t ratio on the instrument in the first stage, or conduct an F-test; if the first-stage F-statistic is greater than 10, the instrument is relevant.