Week 2 - Selection Problem & Potential Outcomes Framework Flashcards
Causal Q: Does going to the hospital make us healthier?
Intuitively, yes. The treatment (going to hospital) should have a positive impact on the outcome (health status).
Cross-sectional data show that the mean health status of individuals who have gone to hospital is in fact worse than that of those who have not. How can we interpret this?
The only interpretation that can be made: there is a negative correlation between going to hospital and health status. Those who go to hospital have worse health than those who do not.
Why do people who go to hospital have worse health status than those who do not?
Selection problem! Those who go to hospital are fundamentally different from those who do not. Who goes to hospital is not random, which leads to the selection problem.
Selection Problem
- A result of non-random selection into treatment.
- Those who receive treatment are fundamentally different from those who do not.
- Therefore we must account for these differences as well as the effect of treatment in order to estimate the causal effect.
What is needed to estimate causal effect?
- Solving the selection problem
- Creating a framework which accounts for differences in health status between those treated and those not treated.
- This is called the Potential Outcomes Framework.
Potential Outcomes framework
Shows us, for each individual, the potential health outcome if they receive treatment and the potential health outcome if they do not.
Problem with Potential Outcomes Framework
- The counterfactual is never observed, so direct counterfactual comparison is impossible.
- We cannot observe an individual who has simultaneously been treated and not been treated.
- Therefore it is impossible to observe an individual's causal effect directly; the full set of potential outcomes is hypothetical data.
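The missing-counterfactual problem can be sketched in a few lines of Python. The health scores below are invented purely for illustration, and the names `people`, `observed` and `individual_effect` are hypothetical. The point is only that once each person reveals one potential outcome, Y1i - Y0i cannot be computed for anyone.

```python
# Hypothetical illustration: made-up health scores for four people.
# y1 = health if they go to hospital, y0 = health if they do not.
people = [
    {"name": "A", "treated": True,  "y1": 4, "y0": 3},
    {"name": "B", "treated": True,  "y1": 3, "y0": 2},
    {"name": "C", "treated": False, "y1": 5, "y0": 5},
    {"name": "D", "treated": False, "y1": 5, "y0": 4},
]


def observed(person):
    """Return what the data actually reveal: only one potential outcome."""
    if person["treated"]:
        return {"name": person["name"], "y1": person["y1"], "y0": None}
    return {"name": person["name"], "y1": None, "y0": person["y0"]}


def individual_effect(row):
    """Y1i - Y0i, computable only when both outcomes are observed."""
    if row["y1"] is None or row["y0"] is None:
        return None  # the counterfactual is missing
    return row["y1"] - row["y0"]


observed_data = [observed(p) for p in people]
effects = [individual_effect(r) for r in observed_data]
print(effects)  # every entry is None: no individual effect is observable
```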
Selection Bias
When participants in a program (treatment group) are systematically different from non-participants (control group).
How can we solve for selection problem?
- Using random assignment,
- which eliminates selection bias.
- e.g. randomise who is assigned treatment.
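A rough simulation can show why random assignment removes selection bias. Everything here is an assumption made for illustration: the health scale, the rule that people with baseline health below 45 choose hospital, and the sample size. It compares the baseline health (Y0i) of the two groups under self-selection and under a coin flip.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical baseline health (Y0i) for 10,000 people on a made-up scale.
baseline = [rng.gauss(50, 10) for _ in range(10_000)]


def mean(xs):
    return sum(xs) / len(xs)


# Self-selection (assumed rule): people with low baseline health go to hospital.
self_treated = [y0 for y0 in baseline if y0 < 45]
self_untreated = [y0 for y0 in baseline if y0 >= 45]
bias_self_selected = mean(self_treated) - mean(self_untreated)

# Random assignment: a coin flip decides who is treated.
flips = [rng.random() < 0.5 for _ in baseline]
rand_treated = [y0 for y0, t in zip(baseline, flips) if t]
rand_untreated = [y0 for y0, t in zip(baseline, flips) if not t]
bias_randomised = mean(rand_treated) - mean(rand_untreated)

print("baseline gap, self-selected:", round(bias_self_selected, 2))
print("baseline gap, randomised:   ", round(bias_randomised, 2))
```

Under self-selection the treated group starts out much less healthy, so the baseline gap is large and negative. Under the coin flip the two groups have nearly the same baseline health, so the gap is close to zero.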
Mean causal effect of treatment on the treated (ideal)
- Y1i - Y0i represents the difference between an individual's health status if they go to hospital and their health status if they do not.
- However, we know one of these is counterfactual: we never observe both for the same individual.
Difference in mean health status between treated and untreated.
- This is what we actually observe.
- However, this value is negative and unlikely to represent the actual causal effect of going to hospital.
Average Causal effect of going to hospital and difference in outcomes between treated and not treated.
Causal effect (in reality)
- Causal effect = observed difference in mean health minus the selection bias.
- Selection bias appears in this equation because there is a systematic difference between individuals who are treated and those who are not, meaning that the observed difference measures not only the effect of treatment but also these innate differences.
- Therefore, rearranged, observed difference = mean causal effect + selection bias.
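The relation observed difference = mean causal effect + selection bias can be checked on a toy dataset. The numbers below are invented, and the sketch assumes we can see both potential outcomes, which is never true of real data. It assumes treatment raises health by 1 for everyone treated, while the treated group starts out less healthy.

```python
def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical potential outcomes (made-up health scores).
treated_y0 = [2, 3, 1, 2]                # treated group's health WITHOUT hospital
treated_y1 = [y + 1 for y in treated_y0]  # assumed treatment effect of +1
untreated_y0 = [5, 4, 5, 4]              # untreated group's health without hospital

# What we actually observe: treated with treatment vs untreated without.
observed_difference = mean(treated_y1) - mean(untreated_y0)   # 3.0 - 4.5 = -1.5

# Mean causal effect on the treated (needs the unobservable treated_y0).
causal_effect = mean(treated_y1) - mean(treated_y0)           # 3.0 - 2.0 = 1.0

# Selection bias: baseline gap between the two groups.
selection_bias = mean(treated_y0) - mean(untreated_y0)        # 2.0 - 4.5 = -2.5

print(observed_difference, causal_effect + selection_bias)    # both -1.5
```

Here the true effect is positive, yet the observed difference is negative because the selection bias is larger in magnitude, mirroring the hospital example.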
Observed difference in mean health (formulated)
How do we derive the observed difference in mean health equation? (Part 1)
Observed difference ≠ Causal effect.
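How do we derive the observed difference equation? (Part 2)
- Mathematically, adding and subtracting the same term leaves an expression unchanged: A - B = (A - C) + (C - B).
- So if we add and subtract the counterfactual term (the average untreated health of those who were treated) in the observed difference, it splits into the mean causal effect on the treated plus the selection bias.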
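The add-and-subtract step can be written out as below. The treatment indicator D_i (1 if individual i goes to hospital, 0 otherwise) and the conditional expectation notation are assumptions introduced here; the cards themselves only use Y1i and Y0i.

```latex
\begin{align*}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference}}
  &= E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] \\
  &= \underbrace{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]}_{\text{mean causal effect on the treated}}
   + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The term E[Y_{0i} | D_i = 1] is the counterfactual: it is subtracted in the first bracket and added back in the second, so the total is unchanged.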
Selection Bias
- Exists because there is a fundamental difference between those who go to hospital and those who do not go to hospital.
- Selection bias is not zero because the two groups are different.
- The selection bias term compares the health status the two groups would have had if neither had gone to hospital, that is, their initial health. If this difference is not zero, the groups differ in initial health.
When Selection Bias is 0?
When the selection bias is 0, the observed difference = causal effect.
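What happens when selection bias and causal effect share the same sign, and when they don't?
- When they have the same sign, the observed difference is larger in magnitude than the causal effect.
- When they have opposite signs, they offset each other, so the observed difference understates the causal effect and can even have the opposite sign.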
Negative selection bias?
- A negative selection bias may lead us to underestimate a positive causal effect.
- If the causal effect is 10 and the selection bias is -5, then the observed difference is 10 + (-5) = 5, which is lower than the causal effect.
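The arithmetic on this card can be sketched as a one-line function. The function name `observed_difference` is hypothetical, and the second and third cases use made-up values chosen only to show the same-sign and sign-flipping situations.

```python
def observed_difference(causal_effect, selection_bias):
    """Observed difference = causal effect + selection bias."""
    return causal_effect + selection_bias


# Case from the card: opposite signs, so the effect is underestimated.
card_example = observed_difference(10, -5)   # 5

# Hypothetical: same sign, so the observed difference overstates the effect.
same_sign = observed_difference(10, 5)       # 15

# Hypothetical: bias larger than the effect, so the observed sign flips.
sign_flip = observed_difference(10, -15)     # -5

print(card_example, same_sign, sign_flip)
```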
How to solve selection bias
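- Randomisation: randomly assign who receives treatment, so the treated and untreated groups are comparable on average.
- Non-random selection into treatment is what creates the bias in the first place.
- However, randomisation is often not realistic in practice.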