Week 4: Selection and Omitted Variable Bias Flashcards
How do we know if experiments are ideal?
If both internal and external validity are high, i.e. X is causing Y and the results can be generalised.
Why are experiments rarely ever ideal in practice?
Many experiments do not select at random:
- It is not always possible to draw a random sample (e.g. it is too expensive)
- Sometimes a random sample is not what we want, for example when studying a specific population
What happens if we select the wrong cases to study?
We introduce selection bias.
Give an example of how to avoid selecting the wrong cases to study.
Do not choose cases because they fit the expected result. The mistake to avoid: we formulate a specific hypothesis and, knowing what we want the outcome of our research to be, select only observations that support it.
Explain the rules for selection on Y.
Selection should allow for the possibility of at least some variation on the dependent variable Y.
Give an example of allowing for some variation on Y.
To study whether smoking causes cancer, we cannot select only people with cancer, because Y would then not vary. The sample must also include people without cancer.
What happens to the estimated causal effect when the selection rule is correlated with Y?
On average, the estimate is smaller than the true causal effect. The estimate can therefore be read as a lower bound on the true causal effect.
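The lower-bound claim can be checked with a small simulation. This is only a sketch under assumed conditions: a linear model with a true effect of 2.0, normally distributed noise, and a hypothetical selection rule that keeps only cases with Y above 1.0. None of these numbers come from the course material.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x

rng = random.Random(0)
TRUE_EFFECT = 2.0  # assumed true causal effect of X on Y
CUTOFF = 1.0       # hypothetical selection rule: keep only cases with Y > CUTOFF

xs = [rng.gauss(0, 1) for _ in range(20000)]
ys = [TRUE_EFFECT * x + rng.gauss(0, 1) for x in xs]

# Estimate from the full sample (no selection)
full_slope = ols_slope(xs, ys)

# Estimate after selecting on the dependent variable
kept = [(x, y) for x, y in zip(xs, ys) if y > CUTOFF]
selected_slope = ols_slope([x for x, _ in kept], [y for _, y in kept])

print(f"full sample slope:   {full_slope:.2f}")
print(f"selected-on-Y slope: {selected_slope:.2f}")
```

Under these assumptions the full-sample slope should land close to 2.0, while the slope in the selected sample should come out noticeably smaller, which is the attenuation the card describes.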
How does overestimating a causal effect come about?
If the causal effect of X on Y varies across observations, e.g. when the relationship is non-linear.
What are the effects of selection on X?
Selecting based on the values of X doesn’t restrict the variation in Y but it may limit the generality of our conclusions.
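A quick way to see this is to simulate selection on X. The sketch below assumes a linear model with a constant true effect of 2.0 and a hypothetical rule that keeps only cases with X above 0. Both are illustrative choices, not part of the course material.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x

rng = random.Random(1)
TRUE_EFFECT = 2.0  # assumed constant causal effect of X on Y

xs = [rng.gauss(0, 1) for _ in range(20000)]
ys = [TRUE_EFFECT * x + rng.gauss(0, 1) for x in xs]

# Hypothetical selection rule on the explanatory variable: keep only X > 0
kept = [(x, y) for x, y in zip(xs, ys) if x > 0.0]
slope_x_selected = ols_slope([x for x, _ in kept], [y for _, y in kept])

print(f"slope after selecting on X: {slope_x_selected:.2f}")
```

With a constant effect, the slope in the selected sample should stay close to 2.0. The limit on generality remains: the sample says nothing about cases with X at or below 0, so the conclusion only carries over if the effect is the same there.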
What is the issue with self-selection?
Individuals select themselves into a group, causing a biased sample.
If we fail to take Z into account as a causal variable, will our estimates always be biased?
No. The estimates are unbiased if Z has no effect on Y (i.e. Z is irrelevant) or if Z is not correlated with X.
What should we do if we don’t have data on Z?
We should determine the direction of the bias.
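Write the equation for the estimated causal effect when Z is omitted.
estimated causal effect = true causal effect + bias
estimated causal effect = a + b ∗ c
where a is the true causal effect of X on Y, b is the effect of Z on Y, and c is the association between X and Z.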
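The expression a + b ∗ c can be derived in a few lines. This sketch assumes a linear model and reads a as the effect of X on Y, b as the effect of Z on Y, and c as the slope of Z on X, which matches the cards on b = 0 and c = 0. The intercepts \alpha and \gamma and the error terms \varepsilon and \nu are notation introduced here for illustration only.

```latex
\begin{align}
Y &= \alpha + aX + bZ + \varepsilon  && \text{(true model, Z included)} \\
Z &= \gamma + cX + \nu               && \text{(how Z moves with X)} \\
Y &= (\alpha + b\gamma) + (a + bc)\,X + (b\nu + \varepsilon)
                                     && \text{(substitute Z, i.e. Z omitted)}
\end{align}
```

Regressing Y on X alone therefore recovers a + bc instead of a, so the bias term is bc.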
If Z has no effect on Y then what are the values of the variables?
b = 0, so the bias b ∗ c is zero and the estimated causal effect = a
If Z is not correlated to X then what are the values of the variables?
c = 0, so the bias b ∗ c is zero and the estimated causal effect = a
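The general formula and both special cases can be checked by simulation. This is a sketch under assumptions: a linear model, normal noise, and a hypothetical helper `short_regression` whose `link` parameter controls how strongly X depends on Z. The values a = 1.0 and b = 2.0 are illustrative.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x

def short_regression(a, b, link, n=20000, seed=0):
    """Simulate Y = a*X + b*Z + noise with X = link*Z + noise.

    Returns (slope of Y on X with Z omitted, slope c of Z on X).
    """
    rng = random.Random(seed)
    zs = [rng.gauss(0, 1) for _ in range(n)]
    xs = [link * z + rng.gauss(0, 1) for z in zs]
    ys = [a * x + b * z + rng.gauss(0, 1) for x, z in zip(xs, zs)]
    return ols_slope(xs, ys), ols_slope(xs, zs)

A = 1.0  # assumed true causal effect of X on Y

# General case: Z affects Y and is correlated with X
biased, c_hat = short_regression(a=A, b=2.0, link=0.5)
# Special case 1: Z has no effect on Y (b = 0)
no_effect, _ = short_regression(a=A, b=0.0, link=0.5)
# Special case 2: Z is not correlated with X (c = 0)
uncorrelated, c_zero = short_regression(a=A, b=2.0, link=0.0)

print(f"Z omitted, b and c nonzero: {biased:.2f} vs a + b*c = {A + 2.0 * c_hat:.2f}")
print(f"b = 0: {no_effect:.2f}")
print(f"c = 0: {uncorrelated:.2f} (estimated c = {c_zero:.2f})")
```

In the general case the estimate should sit near a + b ∗ c rather than a, and in both special cases it should fall back to roughly a. The sign of b ∗ c gives the direction of the bias: here both are positive, so the effect is overestimated.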