Key Things to Memorize Flashcards
Actual definition of selection bias
The difference between the estimate of β1 from the (short) regression and the true causal effect of the variable on the outcome.
actual definition of OVB
OVB is the mathematical difference between the regression coefficients from the short regression and the long regression.
The long regression estimates the association of subways with pollution, holding fixed land area, regulation, and population. If regulation and population capture the effects of confounders, then the estimate from this long regression will be closer to the true causal effect of interest.
What does the OVB explain intuitively
Intuitively, in context, the OVB formula tells us the short regression coefficient bundles 3 things (or 2 if only 2 variables):
(1) the true causal effect of interest,
(2) the effect of Regulation on y and the variation in regulation that is correlated with subways, and
(3) the effect of Population on y and the variation in population that is correlated with subways
OVB formula
coef short = coef long + (long-regression coef on the omitted regressor × auxiliary-regression coef from regressing the omitted regressor on the regressor of interest)
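A minimal numerical check of the OVB formula above, assuming simulated data with one omitted regressor. The variable names (`x`, `w`, `ols`) and the data-generating numbers are hypothetical and chosen only for illustration. The identity holds exactly in any sample, not just on average.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical simulated data: x is the regressor of interest,
# w is the regressor that the short regression omits.
w = rng.normal(size=n)
x = 0.6 * w + rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * w + rng.normal(size=n)

def ols(y, *regressors):
    """Return OLS coefficients as [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

coef_short = ols(y, x)[1]                 # short regression: y on x
b_long = ols(y, x, w)                     # long regression: y on x and w
coef_long, coef_omitted = b_long[1], b_long[2]
coef_aux = ols(w, x)[1]                   # auxiliary regression: w on x

ovb = coef_omitted * coef_aux             # the bias term in the formula
print("short:", coef_short)
print("long + OVB:", coef_long + ovb)
```

The two printed numbers should agree up to floating-point error, while `coef_long` alone sits near the simulated effect of 2.0.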
What is the residual and when does it coincide with the error
The residual is the estimate of the error. They coincide if we have full population data, in which case there is no sampling variability in the estimate of the error.
How does having full population data influence OLS
OLS is only approximating the conditional expectation function, whether you have population data or a sample. If you have population data, you perfectly learn the coefficients that best approximate the conditional expectation function. If you have a sample, you have estimates of those.
Even if you have population-level data, β0, β1, and u_i might not represent the parameters and values of interest (not the true causal effect)
What is the mean of the residuals in OLS
0 (always 0 when the regression includes an intercept; this is a property of OLS)
What is Cov(x_i, û_i) in OLS?
0 (it is a property of OLS that residuals and regressors are uncorrelated)
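The two mechanical properties of OLS residuals stated above can be checked with a short sketch. The data are simulated and the names (`x`, `y`, `resid`) are hypothetical; the regression includes an intercept, which is what makes the residual mean exactly zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000

# Hypothetical simulated data for a one-regressor OLS fit with an intercept.
x = rng.normal(size=n)
y = 3.0 - 0.5 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])          # intercept column plus regressor
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat                      # residuals: estimates of the errors

mean_resid = resid.mean()                     # zero by construction
cov_x_resid = np.cov(x, resid)[0, 1]          # zero by construction
print(mean_resid, cov_x_resid)
```

Both values come out as zero up to floating-point error regardless of how the data were generated, which is why they say nothing about whether the true error is correlated with the regressor.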
How does using an instrument remove the bias in OLS when you have simultaneity (reverse causality causing bias)? Explain the intuition
When we use an instrument, we use only the variation in the regressor that is created by the instrument. The intuition is that the instrument creates variation in the regressor that is unrelated to the reverse causality (e.g. demand for food options).
Isolate one channel -> removes reverse causality
What are the two assumptions for IV to be valid (in simple terms)
relevance i.e. cov(instrument, regressor) not equal to 0
exogeneity i.e. cov(instrument, ui) = 0
what are the two parts of exogeneity
exclusion = instrument only affects the outcome through the regressor
as good as randomly assigned = the instrument is uncorrelated with all unobserved determinants of the outcome (the error term)
How do you estimate the coefficient on the regressor of interest using 2SLS
form first stage, second stage and reduced form equations using your instrument
run regression on the first stage and reduced form equations to get the coefficients
coefficient of interest = reduced form coefficient / first stage coefficient
what are the first stage, second stage and reduced form equations i.e. how do you form them
second stage = the classic regression that would be biased using OLS
for example audit rate = a + b(cheating) + e
first stage = regressor of interest = a + b(instrument for the regressor of interest) + e
for example cheating = a + b(average education) + e
reduced form = plug first stage into second stage and group terms
general form -> outcome of interest = a + b(instrument) + e
for example audit rate = a + b(average education) + e
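The ratio "reduced form coefficient / first stage coefficient" can be sketched on simulated data. Everything here is an assumption for illustration: the names `z`, `x`, `y`, `v`, the coefficients, and the use of a shared unobserved shock `v` as a stand-in for the source of bias. With one instrument and one biased regressor, this ratio is the same number 2SLS would give.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
true_effect = 0.8                              # assumed causal effect of x on y

z = rng.normal(size=n)                         # instrument (e.g. average education)
v = rng.normal(size=n)                         # unobserved shock that biases OLS
x = 0.5 * z + v + rng.normal(size=n)           # regressor of interest (e.g. cheating)
y = 1.0 + true_effect * x + 2.0 * v + rng.normal(size=n)  # outcome (e.g. audit rate)

def slope(outcome, regressor):
    """Slope from a one-regressor OLS with an intercept: Cov / Var."""
    c = np.cov(regressor, outcome)
    return c[0, 1] / c[0, 0]

first_stage = slope(x, z)                      # x on z
reduced_form = slope(y, z)                     # y on z
iv_estimate = reduced_form / first_stage       # coefficient of interest
ols_estimate = slope(y, x)                     # biased second stage run by OLS

print("OLS:", ols_estimate)
print("IV :", iv_estimate)
```

In this simulation the OLS slope lands well above 0.8 because `v` moves both `x` and `y`, while the ratio stays close to 0.8.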
What does it mean for a parameter to be overidentified in 2SLS
If there are more instruments than biased regressors (e.g. 2 instruments for 1 biased regressor), the parameter is said to be overidentified
What does it mean for a parameter to be just identified in 2SLS
If there are the same number of instruments as biased regressors (e.g. 1 instrument for 1 biased regressor), the parameter is said to be just identified
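A rough sketch of the overidentified case, assuming simulated data with two instruments (`z1`, `z2`) for one biased regressor. All names and numbers are hypothetical. The first stage uses both instruments at once, and the second stage regresses the outcome on the first-stage fitted values; each instrument used alone gives its own just-identified estimate.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
true_effect = 0.8                              # assumed causal effect of x on y

z1 = rng.normal(size=n)                        # first instrument
z2 = rng.normal(size=n)                        # second instrument
v = rng.normal(size=n)                         # unobserved shock that biases OLS
x = 0.5 * z1 + 0.3 * z2 + v + rng.normal(size=n)
y = 1.0 + true_effect * x + 2.0 * v + rng.normal(size=n)

def ols(y, *regressors):
    """Return OLS coefficients as [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Overidentified: first stage on both instruments, then second stage on fitted x.
fs = ols(x, z1, z2)
x_hat = fs[0] + fs[1] * z1 + fs[2] * z2
tsls_estimate = ols(y, x_hat)[1]

# Just identified: one instrument at a time, reduced form / first stage.
iv_z1 = ols(y, z1)[1] / ols(x, z1)[1]
iv_z2 = ols(y, z2)[1] / ols(x, z2)[1]

print(tsls_estimate, iv_z1, iv_z2)
```

All three estimates should sit near 0.8 here. Running the second stage by hand like this gives the right coefficient but not the right standard errors, so it is only useful for seeing the mechanics.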