Lecture 12 Flashcards
When does endogeneity occur?
When the x term is correlated with the error term
When could cov(x,u) ≠ 0 arise?
- omitted variables
- simultaneity: the dependent and independent variables are determined jointly, so there is a feedback loop, like price and quantity in supply and demand diagrams
- measurement error: the observed x deviates from the true x, and the deviation ends up in u, which can make the observed x correlated with u
Example of simultaneity
Let’s say we want to see if gov spending causes less unemployment
- but gov often spends more in areas with higher unemployment
- therefore unemployment itself influences gov spending
- thus, if we ignore reverse causality, we could misinterpret the correlation: seeing higher spending alongside higher unemployment suggests a positive relationship, which is misleading
Unbiasedness does not mean consistency
Unbiasedness is a finite-sample property: an estimator is unbiased if, on average across repeated samples, it equals the true value of the parameter. Requires E[u|x] = 0
Consistency is a large-sample property: an estimator is consistent if, as the sample size grows without bound, the estimates converge in probability to the true value. Requires only cov(x,u) = 0
SLR.4 (zero conditional mean, E[u|x] = 0) implies cov(x,u) = 0, but not vice versa
What's the basic idea of instrumental variables?
Introduce a third variable z that affects x but is uncorrelated with u; it helps isolate the variation in x that is exogenous
IV assumptions
- consider yi = B0 + B1xi + ui, where cov(xi,ui) is not 0
- exogeneity: cov(zi,ui) = 0. This condition is theoretical and can't be tested, as it depends on ui, which is unobservable
- relevance: cov(zi,xi) is not 0
Basically, z is unrelated to u, so z affects yi only through xi
How to test for instrument relevance
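First stage: xi = pi0 + pi1zi + vi
Since pi1 = cov(zi,xi)/var(zi), relevance holds if and only if pi1 is not 0. Unlike exogeneity, this can be tested, so we MUST test it
Perform a t test on pi1^: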
- H0: pi1 = 0
- H1: pi1 is not 0
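A minimal sketch of this relevance check on simulated data. The data-generating values (pi0 = 0.5, pi1 = 0.3, n = 5000) are illustrative assumptions, and the standard error is the usual homoskedastic OLS one.

```python
import math
import random

random.seed(1)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
# Assumed first stage for illustration: x = 0.5 + 0.3*z + v
x = [0.5 + 0.3 * zi + random.gauss(0, 1) for zi in z]

z_bar, x_bar = sum(z) / n, sum(x) / n
sst_z = sum((zi - z_bar) ** 2 for zi in z)

# OLS slope and intercept of the first-stage regression of x on z
pi1_hat = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x)) / sst_z
pi0_hat = x_bar - pi1_hat * z_bar

# Homoskedastic standard error of pi1_hat and the t statistic for H0: pi1 = 0
resid = [xi - pi0_hat - pi1_hat * zi for zi, xi in zip(z, x)]
sigma2_hat = sum(v ** 2 for v in resid) / (n - 2)
se_pi1 = math.sqrt(sigma2_hat / sst_z)
t_stat = pi1_hat / se_pi1

print(f"pi1_hat = {pi1_hat:.3f}, t = {t_stat:.1f}")
```

A large |t| rejects H0: pi1 = 0, which supports relevance. It says nothing about exogeneity.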
IV estimator, B1
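B1 = cov(zi,yi)/cov(zi,xi); divide top and bottom by var(zi):
- this gives the slope coefficient estimator from the reduced form (yi on zi) divided by the slope coefficient estimator from the first stage (xi on zi)
B1^ = SUM((zi - zbar)(yi - ybar)) / SUM((zi - zbar)(xi - xbar)), the OLS slope formula with z replacing x in the deviations; when z = x it is exactly the OLS estimator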
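A small simulation can make the contrast with OLS concrete. This is only a sketch: the data-generating process (true B1 = 2, x loading on u with coefficient 0.6) is an assumption chosen so that x is endogenous and z is a valid instrument.

```python
import random

random.seed(0)
n = 20000
z, x, y = [], [], []
for _ in range(n):
    zi = random.gauss(0, 1)                         # instrument, independent of u
    ui = random.gauss(0, 1)                         # structural error
    xi = 0.8 * zi + 0.6 * ui + random.gauss(0, 1)   # x depends on u: endogenous
    z.append(zi)
    x.append(xi)
    y.append(1.0 + 2.0 * xi + ui)                   # assumed true B1 = 2

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

b1_ols = cov(x, y) / cov(x, x)   # inconsistent here because cov(x,u) is not 0
b1_iv = cov(z, y) / cov(z, x)    # IV estimator: cov(z,y)/cov(z,x)

print(f"OLS: {b1_ols:.3f}   IV: {b1_iv:.3f}   (true value 2.0)")
```

Under these assumptions OLS converges to 2 + 0.6/2 = 2.3 rather than 2, while the IV estimate stays close to the true value.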
Special case of IV: Wald Estimator
When the instrument z is binary:
- E[yi|zi=1] = B0 + B1E[xi|zi=1]
- E[yi|zi=0] = B0 + B1E[xi|zi=0]
E[yi|zi=1] - E[yi|zi=0] = B1(E[xi|zi=1] - E[xi|zi=0])
Rearrange for B1 to get the Wald estimator: B1 = (E[yi|zi=1] - E[yi|zi=0]) / (E[xi|zi=1] - E[xi|zi=0])
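The sample version replaces each conditional expectation with a group mean. A minimal sketch on simulated data, where the binary instrument, the true B1 = 1.5 and the other coefficients are all illustrative assumptions:

```python
import random

random.seed(2)
n = 20000
groups = {0: {"x": [], "y": []}, 1: {"x": [], "y": []}}
for _ in range(n):
    zi = 1 if random.random() < 0.5 else 0          # binary instrument
    ui = random.gauss(0, 1)
    xi = 1.0 * zi + 0.5 * ui + random.gauss(0, 1)   # endogenous regressor
    yi = 2.0 + 1.5 * xi + ui                        # assumed true B1 = 1.5
    groups[zi]["x"].append(xi)
    groups[zi]["y"].append(yi)

def mean(values):
    return sum(values) / len(values)

# Difference in mean outcomes over difference in mean treatment, by z
diff_y = mean(groups[1]["y"]) - mean(groups[0]["y"])
diff_x = mean(groups[1]["x"]) - mean(groups[0]["x"])
wald = diff_y / diff_x

print(f"Wald estimate of B1: {wald:.3f}")
```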
Example of where the Wald estimator has been used
ln(wagei) = B0 + B1educi + ui
- quarter of birth affects years of education and is assumed to be uncorrelated with ui, so it can serve as an instrument (Angrist and Krueger, 1991)
Variance of IV estimator:
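Var^(B1^IV) = σ^2 / (SSTx * R^2), where
- SSTx = SUM((xi - xbar)^2) is the total sum of squares of xi
- R^2 is from a regression of xi on zi and an intercept
- σ^2 is estimated by (1/(n-2)) * SUM((ui^)^2), where ui^ are the IV residuals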
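The pieces of the variance formula can be computed directly. A sketch under an assumed data-generating process (true B1 = 2, homoskedastic errors); the helper names `mean` and `sxy` are introduced here for illustration.

```python
import math
import random

random.seed(3)
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + 0.6 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

def mean(a):
    return sum(a) / len(a)

def sxy(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

# IV point estimates
b1_iv = sxy(z, y) / sxy(z, x)
b0_iv = mean(y) - b1_iv * mean(x)

# Ingredients of the variance formula
resid = [yi - b0_iv - b1_iv * xi for xi, yi in zip(x, y)]   # IV residuals
sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)
sst_x = sxy(x, x)
r2_xz = sxy(z, x) ** 2 / (sxy(z, z) * sst_x)   # R^2 of x on z and an intercept

se_iv = math.sqrt(sigma2_hat / (sst_x * r2_xz))
print(f"B1_iv = {b1_iv:.3f}, se = {se_iv:.4f}, first-stage R^2 = {r2_xz:.3f}")
```

Because R^2 sits in the denominator, halving the first-stage R^2 doubles the estimated variance.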
IV vs OLS
Advantage of IV: consistent even if u and x are correlated, in which case, the OLS estimator is biased and inconsistent
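Disadvantage of IV estimator: less efficient than OLS if u and x are uncorrelated
Because R^2 <= 1, the variance of the IV estimator is never smaller than the variance of the OLS estimator (strictly larger whenever R^2 < 1), and it depends crucially on the correlation between z and x: the weaker the correlation, the larger the variance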
Weak instruments and Bias:
- a weak instrument means that z and x are only weakly correlated, which leads to imprecise IV estimates and can also give large bias
- mathematically, plim B1^IV = B1 + [corr(z,u)/corr(z,x)] * (sd(u)/sd(x)): if the denominator corr(z,x) is small (a weak instrument), even a slight correlation between z and u makes the second term very large, representing a lot of bias
What is the rule of thumb with weak instruments?
- if instruments are weak, the sampling distribution of the IV estimator is not well approximated by a normal distribution, even in large samples
Rule of thumb: a first-stage F statistic above 10 (with a single instrument, the same as |t| above root 10, about 3.16) means the instrument is roughly strong enough
IV in the MLR model
To consistently estimate all of the Bs in yi = B0 + B1xi1 + B2xi2 + ui, we use the sample analogs of the moment conditions:
- E[ui] = 0
- cov(ui,zi) = 0
- cov(ui,xi2) = 0
Where xi2 is the exogenous explanatory variable and xi1 is the endogenous one, instrumented by zi
Solve 3 equations in 3 unknowns (B0, B1, B2).
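The three sample moment conditions form a linear system in (B0, B1, B2). A sketch of solving it on simulated data; the data-generating values (B1 = 2, B2 = -1) and the small `solve` helper are assumptions made for illustration.

```python
import random

random.seed(4)
n = 5000
rows = []  # each row: (z, x1, x2, y)
for _ in range(n):
    zi, x2i, ui = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    x1i = 0.8 * zi + 0.5 * x2i + 0.6 * ui + random.gauss(0, 1)  # endogenous
    rows.append((zi, x1i, x2i, 1.0 + 2.0 * x1i - 1.0 * x2i + ui))

def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    k = len(b)
    m = [list(a[r]) + [b[r]] for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(m[r][p]))  # partial pivoting
        m[p], m[best] = m[best], m[p]
        pivot = m[p][p]
        m[p] = [v / pivot for v in m[p]]
        for r in range(k):
            if r != p:
                f = m[r][p]
                m[r] = [vr - f * vp for vr, vp in zip(m[r], m[p])]
    return [m[r][k] for r in range(k)]

# Sample moments: sum(w * (y - B0 - B1*x1 - B2*x2)) = 0 for w in (1, z, x2)
W = [(1.0, zi, x2i) for zi, _, x2i, _ in rows]    # constant, instrument, x2
X = [(1.0, x1i, x2i) for _, x1i, x2i, _ in rows]  # constant, x1, x2
y = [yi for *_, yi in rows]

A = [[sum(w[r] * x[c] for w, x in zip(W, X)) for c in range(3)] for r in range(3)]
b = [sum(w[r] * yi for w, yi in zip(W, y)) for r in range(3)]
b0, b1, b2 = solve(A, b)

print(f"B0 = {b0:.3f}, B1 = {b1:.3f}, B2 = {b2:.3f}")
```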
IV in the MLR, what happens between the exogenous and the endogenous explanatory variables?
z must be correlated with x1, and the correlation must hold even after controlling for x2
- to verify the relevance of z as an instrument, regress xi1 on zi and xi2 and perform a t test of whether the coefficient on zi (pi1) is 0
Exogeneity condition is now: cov(zi,ui|xi2) = 0, meaning after controlling for xi2, zi should have no correlation with ui
What about the 2SLS model?
- how is it different to what we have so far?
- why use it?
- what is the test for instrument relevance?
- multiple instruments Z1,…,Zn for the endogenous variable x1, so the first stage regression of x1 includes all of them (plus the exogenous explanatory variables)
- multiple instruments can improve the precision of estimates and allow for overidentification tests
- H0: pi1 = pi2 = … = pin = 0, tested with a joint F test across all the instruments
2SLS, step-by-step model
- Estimate the first stage regression, regressing the endogenous explanatory variable on the instruments and all the other exogenous explanatory variables
- Compute the predicted value of x1, xi1^
- Estimate the second stage, yi = B0 + B1xi1^ + B2xi2 + ei, regressing the outcome variable on xi1^ and all the other exogenous explanatory variables
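The steps above can be sketched in plain Python. The `ols` helper and the simulated data (one instrument z, one exogenous regressor x2, true B1 = 2 and B2 = -1) are assumptions made for illustration, not part of the lecture.

```python
import random

def ols(columns, y):
    """Coefficients from OLS of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination
    m = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(m[r][p]))
        m[p], m[best] = m[best], m[p]
        pivot = m[p][p]
        m[p] = [v / pivot for v in m[p]]
        for r in range(k):
            if r != p:
                f = m[r][p]
                m[r] = [vr - f * vp for vr, vp in zip(m[r], m[p])]
    return [m[r][k] for r in range(k)]

random.seed(5)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * zi + 0.5 * x2i + 0.6 * ui + random.gauss(0, 1)
      for zi, x2i, ui in zip(z, x2, u)]
y = [1.0 + 2.0 * a - 1.0 * b + ui for a, b, ui in zip(x1, x2, u)]

# Stage 1: regress x1 on the instrument and the exogenous regressor
p0, p1, p2 = ols([z, x2], x1)
# Predicted values of x1
x1_hat = [p0 + p1 * zi + p2 * x2i for zi, x2i in zip(z, x2)]
# Stage 2: regress y on x1_hat and x2
b0, b1, b2 = ols([x1_hat, x2], y)
# For comparison: OLS using the endogenous x1 directly
_, b1_ols, _ = ols([x1, x2], y)

print(f"2SLS B1 = {b1:.3f}, OLS B1 = {b1_ols:.3f} (true value 2.0)")
```

Running the two stages by hand gives the right point estimates, but standard errors read off the second regression would be wrong, because its residuals are built from xi1^ rather than xi1. Dedicated IV routines correct for this.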
Potential issues with adding instruments
- adding instruments with low predictive power in the 1st stage lowers the F statistic and exacerbates the bias in the 2SLS estimator
Testing for endogeneity: Hausman test
H0: cov(xi1,ui) = 0; H1: cov(xi1,ui) is not 0
- under the null, both OLS and IV are consistent (and OLS is more efficient); under the alternative, only IV is consistent
0. Run the 1st stage regression of xi1 on Zi and the exogenous explanatory variables; its error vi captures the part of xi1 not explained by Zi
1. Calculate the 1st stage residual, vi^ = xi1 - xi1^
2. Add vi^ to the original regression model with coefficient theta, and estimate by OLS
3. If xi1 is exogenous, vi^ should not be correlated with ui, so theta should be 0
4. Test H0: theta = 0 using a t test
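A sketch of this regression-based test on simulated data in which xi1 is endogenous by construction, so the test should reject. The `ols` helper, the data-generating values and the homoskedastic standard errors are all assumptions for illustration.

```python
import math
import random

def ols(columns, y):
    """OLS of y on an intercept plus columns; returns (coefs, std_errors, residuals)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xty = [sum(row[r] * yi for row, yi in zip(X, y)) for r in range(k)]
    # [X'X | I], reduced by Gauss-Jordan elimination to [I | inverse of X'X]
    m = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [float(r == c) for c in range(k)] for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(m[r][p]))
        m[p], m[best] = m[best], m[p]
        pivot = m[p][p]
        m[p] = [v / pivot for v in m[p]]
        for r in range(k):
            if r != p:
                f = m[r][p]
                m[r] = [vr - f * vp for vr, vp in zip(m[r], m[p])]
    inv = [row[k:] for row in m]
    coefs = [sum(inv[r][c] * xty[c] for c in range(k)) for r in range(k)]
    resid = [yi - sum(b * v for b, v in zip(coefs, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    ses = [math.sqrt(sigma2 * inv[r][r]) for r in range(k)]
    return coefs, ses, resid

random.seed(6)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
# x1 loads on u with coefficient 0.6, so it is endogenous by construction
x1 = [0.8 * zi + 0.5 * x2i + 0.6 * ui + random.gauss(0, 1)
      for zi, x2i, ui in zip(z, x2, u)]
y = [1.0 + 2.0 * a - 1.0 * b + ui for a, b, ui in zip(x1, x2, u)]

# First stage: regress x1 on z and x2, keep the residuals v_hat
_, _, v_hat = ols([z, x2], x1)
# Augmented regression: y on x1, x2 and v_hat; theta is the coefficient on v_hat
coefs, ses, _ = ols([x1, x2, v_hat], y)
theta_hat, se_theta = coefs[3], ses[3]
t_theta = theta_hat / se_theta

print(f"theta_hat = {theta_hat:.3f}, t = {t_theta:.1f}")
```

A |t| well above conventional critical values rejects exogeneity of xi1. As a side effect, the coefficient on xi1 in the augmented regression coincides with the 2SLS estimate.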
Difference between over-identification and just identified.
- why does this matter
- if we have exactly as many instruments as endogenous variables, model is just identified
- if we have more instruments than endogenous variables, the model is over-identified.
- overidentification allows for validity testing - we can check whether instruments satisfy the exogeneity condition
Testing overidentification
- Estimate the 2SLS regression and obtain residuals
- Regress residuals on all excluded instruments, and any other exogenous variables in the model - record the R^2 from this regression
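- Null is that all IVs are exogenous; under the null the statistic n*R^2 follows a chi squared distribution with M-1 degrees of freedom, where M is the number of instruments and there is one endogenous variable (in general, the number of instruments minus the number of endogenous variables)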
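A sketch of the procedure with two instruments and one endogenous variable, so the statistic has 1 degree of freedom. To keep it short the model has no other exogenous regressors. The `ols` and `overid_stat` helpers and the simulated data are assumptions for illustration; the `direct_effect_z2` argument lets z2 enter the outcome equation directly, which makes it an invalid instrument.

```python
import random

def ols(columns, y):
    """OLS of y on an intercept plus columns; returns (coefs, residuals)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    m = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(m[r][p]))
        m[p], m[best] = m[best], m[p]
        pivot = m[p][p]
        m[p] = [v / pivot for v in m[p]]
        for r in range(k):
            if r != p:
                f = m[r][p]
                m[r] = [vr - f * vp for vr, vp in zip(m[r], m[p])]
    coefs = [m[r][k] for r in range(k)]
    resid = [yi - sum(b * v for b, v in zip(coefs, row)) for row, yi in zip(X, y)]
    return coefs, resid

def overid_stat(direct_effect_z2, seed, n=5000):
    """n*R^2 overidentification statistic for one endogenous x, instruments z1, z2."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.6 * a + 0.6 * b + 0.6 * ui + rng.gauss(0, 1)
         for a, b, ui in zip(z1, z2, u)]
    y = [1.0 + 2.0 * xi + direct_effect_z2 * b + ui
         for xi, b, ui in zip(x, z2, u)]
    # 2SLS: first stage, fitted values, second stage
    (p0, p1, p2), _ = ols([z1, z2], x)
    x_hat = [p0 + p1 * a + p2 * b for a, b in zip(z1, z2)]
    (b0, b1), _ = ols([x_hat], y)
    # 2SLS residuals are built from the actual x, not x_hat
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Auxiliary regression of the residuals on the instruments, then n*R^2
    _, aux_resid = ols([z1, z2], u_hat)
    u_bar = sum(u_hat) / n
    r2 = 1 - sum(e * e for e in aux_resid) / sum((e - u_bar) ** 2 for e in u_hat)
    return n * r2

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi squared with 1 df
stat_valid = overid_stat(0.0, seed=7)     # both instruments exogenous
stat_invalid = overid_stat(0.5, seed=7)   # z2 violates the exclusion restriction

print(f"valid: {stat_valid:.2f}, invalid: {stat_invalid:.2f}, "
      f"critical value: {CHI2_1_CRIT_5PCT}")
```

A statistic above the critical value rejects the null that all instruments are exogenous, but it cannot say which instrument is the problem.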
Difference between LATE and ATE:
- LATE is the effect of treatment on outcome for subgroup of individuals whose treatment status is affected by the instrument
- ATE is the effect of treatment on outcome averaged across entire population
When can LATE = ATE?
Biv = E[B1i*pi1i]/E[pi1i], which equals E[B1i] (the ATE) in any of these cases:
- when causal effects are homogeneous, so all individuals have the same treatment effect: B1i = B1
- when the instrument affects all individuals equally, so there is no subgroup variation in how Z influences X: pi1i = pi1
- when the heterogeneity in the treatment effect and in the effect of the instrument are uncorrelated: E[B1i*pi1i] = E[B1i]*E[pi1i]
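One way to see why these three cases work is to expand the numerator of Biv. The following is a sketch using the same symbols as above, assuming E[pi1i] is not 0:

```latex
\beta_{IV}
  = \frac{E[\beta_{1i}\,\pi_{1i}]}{E[\pi_{1i}]}
  = \frac{E[\beta_{1i}]\,E[\pi_{1i}] + \operatorname{Cov}(\beta_{1i},\pi_{1i})}{E[\pi_{1i}]}
  = E[\beta_{1i}] + \frac{\operatorname{Cov}(\beta_{1i},\pi_{1i})}{E[\pi_{1i}]}
```

So Biv equals the ATE exactly when Cov(B1i, pi1i) = 0, which holds if B1i is constant, if pi1i is constant, or if the two are simply uncorrelated.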