Lecture 1 (Introduction and causal effects) Flashcards
Explain the Heckman selection model
In the 1970s, Heckman introduced his "selection model", which explicitly recognizes selection into treatment. Heckman said that we could think about treatment effects like this:
y_i = \beta D_i + \delta X_i + \epsilon_i
where treatment status is determined by a latent variable, an individual's propensity to be treated,
D_i^* = \gamma Z_i + \eta_i
which we cannot observe directly; we only observe the resulting treatment status $D_i$.
The source of selection bias is a correlation between $\epsilon_i$ and $\eta_i$: if they are correlated, treated individuals are systematically different from untreated ones.
The estimation of the model is thus as follows: we estimate an equation for the outcome conditional on the probability of observing that outcome. E.g., we estimate a wage equation conditional on the probability that an individual participates in the labor market.
We thus estimate the model in two steps:
1. First, estimate a Probit model for treatment status and use the fitted index to compute the inverse Mills ratio, which serves as a proxy for the omitted selection term.
2. Then, plug the inverse Mills ratio into the outcome equation (e.g., the wage equation) as an additional regressor, and estimate the wage equation controlling for the propensity to be treated.
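A minimal sketch of the two-step estimator on a simulated wage/participation example (all variable names and parameter values are illustrative, not from the lecture):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 20_000

# Simulated example: wages are observed only for labor-market participants.
x = rng.normal(size=n)                       # wage determinant
z = rng.normal(size=n)                       # shifts participation only (exclusion)
eps, eta = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=n).T
d = (x + z + eta > 0).astype(float)          # latent D* = x + z + eta
y = 1.0 + 2.0 * x + eps                      # wage equation, true slope = 2

# Step 1: Probit of participation on (1, x, z); inverse Mills ratio.
W = np.column_stack([np.ones(n), x, z])
def neg_loglik(g):
    q = 2 * d - 1                            # +1 if participant, -1 otherwise
    return -norm.logcdf(q * (W @ g)).sum()
gamma_hat = minimize(neg_loglik, np.zeros(3), method="BFGS").x
mills = norm.pdf(W @ gamma_hat) / norm.cdf(W @ gamma_hat)

# Step 2: OLS of observed wages on (1, x, mills) among participants.
sel = d == 1
X2 = np.column_stack([np.ones(sel.sum()), x[sel], mills[sel]])
b2 = np.linalg.lstsq(X2, y[sel], rcond=None)[0]

# Naive OLS on the selected sample ignores the selection term.
X1 = np.column_stack([np.ones(sel.sum()), x[sel]])
b1 = np.linalg.lstsq(X1, y[sel], rcond=None)[0]

print("naive slope:    ", round(b1[1], 3))   # biased downward from 2
print("two-step slope: ", round(b2[1], 3))   # close to 2
```

Because $\epsilon$ and $\eta$ are positively correlated, the naive slope is biased; adding the inverse Mills ratio absorbs the selection term and recovers the true coefficient.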
Qualitatively explain what the LaLonde (1986) paper showed.
The LaLonde (1986) paper is a seminal paper in econometrics on the evaluation of government programs. It demonstrates that traditional non-experimental methods for evaluating such programs can produce biased and misleading results, using experimental estimates as the benchmark.
LaLonde (1986) used data from a real labor-market experiment and compared the experimental results with estimates from OLS with controls and from Heckman's selection model. These non-experimental approaches did a poor job of reproducing the true effects. This opened the door to the "credibility revolution" in econometrics.
Use the outcome framework and derive the observed treatment effect in terms of the ATT + Selection bias.
Deriving the ATT + Selection bias using Potential outcome framework
- Starting with the observed outcome $Y_i$ given $D$
$E[Y_i|D_i = 1] - E[Y_i|D_i = 0]$
- Rewriting it in terms of the potential outcomes $Y_i(0)$ and $Y_i(1)$
$E[Y_i(1)|D_i = 1] - E[Y_i(0)|D_i = 0]$
- Add and subtract $E[Y_i(0)|D_i = 1]$
$E[Y_i(1)|D_i = 1] - E[Y_i(0)|D_i = 0] + E[Y_i(0)|D_i = 1] - E[Y_i(0)|D_i = 1]$
- Rearrange to get
$\underbrace{E[Y_i(1)|D_i = 1] - E[Y_i(0)|D_i = 1]}_{\text{ATT}} + \underbrace{E[Y_i(0)|D_i = 1] - E[Y_i(0)|D_i = 0]}_{\text{Selection bias}}$
If $E[Y_i(0)|D_i = 1] = E[Y_i(0)|D_i = 0]$, the treated and untreated individuals have the same average untreated outcome, so the selection-bias term vanishes and we are left with only the ATT.
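A small simulation, with illustrative numbers of my own choosing, that checks the decomposition numerically: the naive treated-vs-untreated difference equals the ATT plus the selection bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Potential outcomes: ability raises Y(0); treatment adds a constant 1.
ability = rng.normal(size=n)
y0 = ability + rng.normal(size=n)
y1 = y0 + 1.0

# Selection: high-ability individuals are more likely to be treated.
d = ability + rng.normal(size=n) > 0

naive = y1[d].mean() - y0[~d].mean()         # E[Y|D=1] - E[Y|D=0]
att = (y1[d] - y0[d]).mean()                 # ATT (= 1 by construction)
sel_bias = y0[d].mean() - y0[~d].mean()      # E[Y(0)|D=1] - E[Y(0)|D=0]

print(f"naive = {naive:.3f}, ATT = {att:.3f}, selection bias = {sel_bias:.3f}")
```

The naive difference overstates the treatment effect here because the treated group would have had higher outcomes even without treatment.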
Using potential outcome framework, define the ATE and ATT
\beta^{ATE} = E[Y_i(1)-Y_i(0)]
This refers to the treatment effect for the whole population, not just those that are in the experiment.
\beta^{ATT} = E[Y_i(1)-Y_i(0)|D_i = 1]
“The effect of going to college for the individuals that actually go to college.” This can be more interesting when, e.g., the treatment is applied to a well-defined population. The ATT is the main object of interest for cost-benefit analysis.
What is the experimental ideal condition in the potential outcome framework?
Y_i(1), Y_i(0) \perp D_i
if so then
E[Y_i(0)|D_i = 1] = E[Y_i(0)|D_i = 0] = E[Y_i(0)]
and
E[Y_i(1)|D_i = 1] = E[Y_i(1)|D_i = 0] = E[Y_i(1)]
Random assignment implies that treatment status is mean independent of everything in the error term; the “treatment” and the “control” groups are on average the same.
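A quick simulation (with illustrative numbers) of why the experimental ideal works: under random assignment the selection-bias term disappears, so the naive difference in means recovers the treatment effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

ability = rng.normal(size=n)
y0 = ability + rng.normal(size=n)
y1 = y0 + 1.0                      # true ATE = ATT = 1

d = rng.random(n) < 0.5            # random assignment, independent of (Y(0), Y(1))

# E[Y(0)|D=1] ≈ E[Y(0)|D=0], so there is no selection bias.
balance = y0[d].mean() - y0[~d].mean()
naive = y1[d].mean() - y0[~d].mean()
print(f"Y(0) balance = {balance:.3f}, naive difference = {naive:.3f}")
```

In contrast to the selection-on-ability case, the two groups here are on average the same, and the naive comparison is close to 1.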
Explain the problem with multiple testing. What is the chance of observing at least one false significant effect in a paper with 20 tests and a 5% significance level?
What is the chance of observing at least one significant effect even if there is no effect? This is given by
1-(1-\alpha)^n
where $\alpha$ is the significance level (e.g., 0.05) and $n$ is the number of tests.
With the numbers in the question we thus have
1-(1-0.05)^{20} \approx 64\%
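The formula can be checked directly, and also with a quick Monte Carlo of 20 independent null tests (under the null, each p-value is uniform on [0, 1]; the setup is illustrative):

```python
import numpy as np

alpha, n_tests = 0.05, 20
analytic = 1 - (1 - alpha) ** n_tests
print(f"analytic: {analytic:.4f}")          # about 0.64

# Monte Carlo: 100k "papers", each running 20 tests with no true effect.
rng = np.random.default_rng(3)
pvals = rng.random((100_000, n_tests))
frac = (pvals < alpha).any(axis=1).mean()   # share with >= 1 false positive
print(f"simulated: {frac:.4f}")
```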
What is Bonferroni correction?
What is the significance level one needs to use if running 20 tests at a 5% significance level?
A very conservative way of reducing the chance of finding false significant effects is the Bonferroni correction, which sets the per-test significance cut-off to $\alpha/n$:
0.05/20 = 0.0025 = 0.25\%
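With the Bonferroni cut-off, the chance of at least one false positive across all 20 independent null tests (the family-wise error rate) stays at or below the original 5%; a quick arithmetic check:

```python
alpha, n_tests = 0.05, 20
cutoff = alpha / n_tests                         # Bonferroni: 0.05 / 20 = 0.0025

# Family-wise error rate when every test uses the corrected cut-off.
fwer = 1 - (1 - cutoff) ** n_tests
print(f"cutoff = {cutoff}, FWER = {fwer:.4f}")   # FWER just under 0.05
```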