L11 - (PL) Endogeneity and IV Estimator Flashcards
What does a basic linear regression look like?
- y is the dependent variable,
- x is an explanatory variable or regressor,
- u is the error term or disturbance –> assumed to be normally distributed
- β’s are unknown parameters of interest to be estimated.
The OLS estimator becomes inconsistent if the explanatory variable and the error term are not independent of each other (i.e. x is correlated with u) –> the OLS estimate can no longer be given a causal interpretation (the marginal effect on y of an exogenous change in x)
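For reference, the model this card describes can be written out as below. The subscripts on the β's and the single-regressor form are assumptions made for illustration; the card itself only names y, x, u and the β's.

```latex
y = \beta_0 + \beta_1 x + u,
\qquad
\mathrm{E}(u \mid x) = 0 \;\Longrightarrow\; \mathrm{Cov}(x, u) = 0
```

When the condition on the right fails, x is endogenous and OLS no longer recovers β1 consistently.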
Difference between Exogenous and Endogenous variables?
- An exogenous variable is determined outside the model and is imposed on the model ==> not correlated with the error term.
- An endogenous variable is determined within the model ==> correlated with the error term.
Sources of Endogeneity?
- Omitted Variables
- E(y|x,q) –> conditional expectation of interest (can be written as a function that is linear in x and q)
- If q is unobserved (so it becomes part of the error term) and correlated with x –> x is correlated with the error term and is therefore endogenous
- Measurement Error
- Simultaneity
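The omitted-variable case above can be made concrete with a short derivation. This is a sketch under assumptions not stated on the card: a single regressor x, a coefficient γ on q, and a remaining error v with E(v | x, q) = 0.

```latex
\mathrm{E}(y \mid x, q) = \beta_0 + \beta_1 x + \gamma q
\quad\Longrightarrow\quad
y = \beta_0 + \beta_1 x + u, \qquad u = \gamma q + v

\operatorname{plim}\, \hat{\beta}_1^{\mathrm{OLS}}
  = \beta_1 + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)}
  = \beta_1 + \gamma \, \frac{\mathrm{Cov}(x, q)}{\mathrm{Var}(x)}
```

The second term is the bias: it vanishes only if q has no effect on y (γ = 0) or q is uncorrelated with x.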
How does endogeneity affect our coefficient estimates?
- Example: we want to estimate the returns to exogenous changes in schooling with a standard linear regression
- u is thought to be correlated with education because of other factors, for example omitted ability, quality of education and family background.
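The schooling example can be illustrated with a small simulation. This is a minimal sketch, not real data: the variable names and every coefficient (a true return of 0.10, an ability effect of 0.30, and so on) are invented for illustration. With these particular numbers, the omitted-variable formula puts the OLS limit at 0.10 + 0.30 × 2 / 5 = 0.22.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are assumptions).
ability = rng.normal(size=n)                    # unobserved, ends up in u
educ = 12 + 2.0 * ability + rng.normal(size=n)  # schooling depends on ability
log_wage = 1.0 + 0.10 * educ + 0.30 * ability + rng.normal(scale=0.5, size=n)

# OLS of log_wage on a constant and educ, with ability omitted.
X = np.column_stack([np.ones(n), educ])
beta_ols = np.linalg.lstsq(X, log_wage, rcond=None)[0]

true_return = 0.10
print("OLS estimate of the return to schooling:", beta_ols[1])
print("True return used in the simulation:     ", true_return)
```

Because ability raises both schooling and wages here, the OLS slope overstates the true return.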
What is the Instrumental Variable (IV) approach to dealing with endogeneity?
- General solution: use an instrument to isolate only the exogenous variation in x
What are the properties of instrument z?
- z does not directly affect y, only indirectly through x
- We must substitute the xk formula, which includes the instrument, into our equation of interest (y = …)
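The two bullets above can be written out for the simplest case. This is a sketch assuming one endogenous regressor x (the card's xk), one instrument z, and first-stage symbols δ0, θ1 and r that are not on the card.

```latex
\text{Exogeneity: } \mathrm{Cov}(z, u) = 0
\qquad
\text{Relevance: } x = \delta_0 + \theta_1 z + r, \quad \theta_1 \neq 0

y = \beta_0 + \beta_1 x + u
  = (\beta_0 + \beta_1 \delta_0) + \beta_1 \theta_1 z + (u + \beta_1 r)
```

After the substitution, z affects y only through the product β1 θ1, which is what "only indirectly through x" means.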
What is the IV Estimator?
- Both the indirect and the direct effect are included
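For a single regressor and a single instrument, the IV estimator is commonly written as the ratio Cov(z, y) / Cov(z, x). The sketch below compares it with OLS on simulated data; the data-generating process, the helper names `slope_ols` and `slope_iv`, and all coefficients are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

z = rng.normal(size=n)                # instrument: shifts x, unrelated to u
q = rng.normal(size=n)                # unobserved confounder
x = 0.8 * z + q + rng.normal(size=n)  # x is endogenous because it contains q
u = 1.5 * q + rng.normal(size=n)      # error term also contains q
y = 2.0 + 0.5 * x + u                 # true slope on x is 0.5


def slope_ols(x, y):
    """OLS slope: Cov(x, y) / Var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def slope_iv(z, x, y):
    """Simple IV slope: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


b_ols = slope_ols(x, y)
b_iv = slope_iv(z, x, y)
print("OLS slope:", b_ols)  # pulled away from 0.5 by the confounder
print("IV slope: ", b_iv)   # close to the true 0.5 in large samples
```

The IV ratio uses only the part of the variation in x that is driven by z, which is why it is not contaminated by q.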
What is the efficiency of the IV Estimator?
What is the Order Condition for the IV estimator?
- The number of instruments must be at least as large as the number of endogenous explanatory variables
- If equal –> the model is said to be just identified
- If greater –> the model is said to be overidentified
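The order condition amounts to a simple count comparison. The helper below is a hypothetical illustration (the function name and the "underidentified" label for the failing case are not from the card).

```python
def identification_status(n_instruments: int, n_endogenous: int) -> str:
    """Classify a model by the order condition for IV estimation."""
    if n_instruments < n_endogenous:
        return "underidentified"   # order condition fails
    if n_instruments == n_endogenous:
        return "just identified"
    return "overidentified"


print(identification_status(1, 1))
print(identification_status(3, 1))
```

The order condition is necessary but not sufficient: the instruments must also be relevant for the endogenous variables.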
When do you use Two-Stage Least Squares?
- It is the most efficient IV estimator when you have more than one instrumental variable
- Omitted variables can also be time-variant
What does 2SLS estimator help you decide?
What are the stages of the 2SLS?
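The two stages are conventionally described as: (1) regress the endogenous variable on the instruments and any exogenous regressors, and keep the fitted values; (2) regress y on those fitted values. A minimal NumPy sketch on simulated data follows; the two instruments, the coefficients and the variable names are all assumptions for illustration, not part of the card.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

z1 = rng.normal(size=n)               # first instrument
z2 = rng.normal(size=n)               # second instrument
q = rng.normal(size=n)                # unobserved confounder
x = 0.6 * z1 + 0.4 * z2 + q + rng.normal(size=n)
y = 1.0 + 0.5 * x + 1.5 * q + rng.normal(size=n)  # true slope is 0.5

const = np.ones(n)
Z = np.column_stack([const, z1, z2])

# Stage 1: regress the endogenous x on the instruments, keep fitted values.
pi_hat = np.linalg.lstsq(Z, x, rcond=None)[0]
x_hat = Z @ pi_hat

# Stage 2: regress y on a constant and the fitted values from stage 1.
X_hat = np.column_stack([const, x_hat])
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

print("2SLS intercept and slope:", beta_2sls)
# Note: standard errors taken from a plain stage-2 OLS would be incorrect,
# because they ignore that x_hat is itself estimated.
```

In practice a dedicated IV routine is usually preferred over running the two regressions by hand, precisely because of the standard-error issue noted in the comment.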