Controlling for observables Flashcards
What is the difference between prediction and estimating causal effects in a multiple regression?
Prediction: focus on the outcome (all independent variables equally important)
Estimating: focus on the effect of T (other variables are only control variables)
Endogeneity + reasons
Correlation between the variable of interest and the error term
- Reverse causality
- Measurement error
- Omitted variables: variables that are not in the model but are related to both T and Y
Derive the omitted variable bias (OVB) formula
!!!!!
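A sketch of the standard derivation for a single omitted variable. The symbols W (the omitted variable), tau, gamma and delta are notation assumed here for illustration; they do not come from the cards.

```latex
% Long (true) model, with W the omitted variable and Cov(T, \varepsilon) = 0:
\[ Y = \alpha + \tau T + \gamma W + \varepsilon \]
% The short regression of Y on T alone has slope
\[ \tilde{\tau}
   = \frac{\operatorname{Cov}(T, Y)}{\operatorname{Var}(T)}
   = \frac{\operatorname{Cov}(T,\ \alpha + \tau T + \gamma W + \varepsilon)}{\operatorname{Var}(T)}
   = \tau + \gamma \, \frac{\operatorname{Cov}(T, W)}{\operatorname{Var}(T)} \]
% Writing \delta for the slope from regressing W on T:
\[ \tilde{\tau} = \tau + \gamma \delta,
   \qquad \text{OVB} = \gamma \delta,
   \qquad \delta = \frac{\operatorname{Cov}(T, W)}{\operatorname{Var}(T)} \]
```

The bias is zero if gamma = 0 (W does not influence Y) or delta = 0 (W is unrelated to T), which is why only variables related to both T and Y cause omitted variable bias.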
Omitted variable + which control variables should not be included
An omitted variable influences both T and Y
No need to control for variables that only influence T (because they are not part of the error term) or variables that only influence Y (because they are not correlated with T)
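The point above can be checked with a small simulation. This is a minimal sketch on made-up data: the variable names (w, z, v), the coefficients and the helper `ols` are all hypothetical, chosen only for illustration.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 100_000
w = rng.normal(size=n)   # confounder: influences both T and Y
z = rng.normal(size=n)   # influences only T
v = rng.normal(size=n)   # influences only Y
t = 0.8 * w + 0.5 * z + rng.normal(size=n)
y = 2.0 * t + 1.5 * w + 1.0 * v + rng.normal(size=n)  # true effect of T is 2.0

b_short = ols(y, t)[1]          # omits the confounder w: biased upward
coef_long = ols(y, t, w)        # controls for w only
b_long = coef_long[1]           # close to the true effect 2.0
b_all = ols(y, t, w, z, v)[1]   # also controlling for z and v: still close to 2.0
delta = ols(w, t)[1]            # slope from regressing w on t

# In-sample OVB identity: short slope = long slope + (coef on w) * delta
ovb_check = coef_long[1] + coef_long[2] * delta

print(f"short: {b_short:.3f}  long: {b_long:.3f}  all controls: {b_all:.3f}")
```

Controlling for w alone is enough to remove the bias in this setup; adding z and v leaves the estimate of the effect of T essentially unchanged.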
Reasons why including mechanisms biases the treatment effect
- Takes away part of the causal effect of T
- Reintroduces ‘composition’ selection bias
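The first reason can be illustrated with simulated data. This is a rough sketch under assumed numbers: the mechanism m, the coefficients and the helper `slope_on_first` are hypothetical, and the example does not cover the composition bias.

```python
import numpy as np

def slope_on_first(y, *regressors):
    """OLS with intercept; return the coefficient on the first regressor."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

rng = np.random.default_rng(1)
n = 100_000
t = rng.normal(size=n)                      # treatment, randomly assigned here
m = 0.7 * t + rng.normal(size=n)            # mechanism: influenced by T
y = 1.0 * t + 2.0 * m + rng.normal(size=n)  # total effect of T: 1.0 + 2.0 * 0.7 = 2.4

total_effect = slope_on_first(y, t)     # without the mechanism: close to 2.4
direct_only = slope_on_first(y, t, m)   # with the mechanism: close to 1.0

print(f"without m: {total_effect:.3f}  with m: {direct_only:.3f}")
```

Including m removes the part of the effect that runs through m, so the coefficient on T no longer measures the full causal effect.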
Confounders
Non-confounders
Mechanisms
Colliders
Confounders = influence both T and Y (include as control variables)
Non-confounders = influence only Y
Mechanisms = influenced by T, influence Y
Colliders = influenced by both T and Y
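Why a collider should not be controlled for can be seen in a small simulation. This is a sketch with assumed, hypothetical coefficients; T is randomly assigned, so the regression without controls is already unbiased.

```python
import numpy as np

def slope_on_first(y, *regressors):
    """OLS with intercept; return the coefficient on the first regressor."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

rng = np.random.default_rng(3)
n = 100_000
t = rng.normal(size=n)             # treatment, randomly assigned
y = 1.0 * t + rng.normal(size=n)   # true effect of T is 1.0
c = t + y + rng.normal(size=n)     # collider: influenced by both T and Y

b_without = slope_on_first(y, t)          # close to 1.0
b_with_collider = slope_on_first(y, t, c) # far from 1.0 (near 0 in this setup)

print(f"without C: {b_without:.3f}  with C: {b_with_collider:.3f}")
```

Conditioning on C creates an artificial association between T and Y, so the estimate moves away from the true effect even though there was no bias to begin with.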
Conditional Independence Assumption (CIA)
Controlling for observables aims to solve selection bias: the assumption is that differences between the treated and control groups are due only to the observable characteristics X
E[Y(0) | X, T = 1] = E[Y(0) | X, T = 0] = α + βX
Plausibility of CIA
CIA does not hold:
- If there are omitted variables other than X related to T and Y (treatment and control groups differ in unobserved ways)
- If X is influenced by treatment (X should be measured before treatment takes place)
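For contrast, a case where the CIA holds by construction: selection into treatment depends only on an observed binary X that is fixed before treatment. This is a minimal sketch on simulated data; the probabilities and coefficients are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.integers(0, 2, size=n)                   # observed, pre-treatment covariate
t = (rng.random(n) < 0.2 + 0.6 * x).astype(int)  # selection depends on X only
y0 = 1.0 + 2.0 * x + rng.normal(size=n)          # E[Y(0) | X] = alpha + beta * X
y = y0 + 1.0 * t                                 # true treatment effect is 1.0

# Naive comparison: mixes the treatment effect with differences in X
naive = y[t == 1].mean() - y[t == 0].mean()

# Compare treated and control units within each value of X,
# then average the differences using the share of each X cell
effects, weights = [], []
for value in (0, 1):
    in_cell = x == value
    diff = y[in_cell & (t == 1)].mean() - y[in_cell & (t == 0)].mean()
    effects.append(diff)
    weights.append(in_cell.mean())
adjusted = float(np.average(effects, weights=weights))

print(f"naive: {naive:.3f}  adjusted for X: {adjusted:.3f}")
```

The naive difference overstates the effect because treated units more often have X = 1; comparing within values of X recovers an estimate close to 1.0. If selection also depended on something unobserved, the within-X comparison would remain biased.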