Regression Flashcards
What is the equation for a linear regression?
Yi = a + b*Di + ei
What are the two potential outcomes?
Y1i = a + b + ei (outcome if treated, Di = 1)
and
Y0i = a + ei (outcome if untreated, Di = 0)
b (beta) is the parameter of interest and is the causal effect of D on Y
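A minimal Python sketch of the potential outcomes setup, for intuition only. The values of a and b, the sample size and the coin-flip assignment are made-up assumptions, not from these notes. With random assignment, the difference in mean outcomes between treated and control (which equals the OLS slope when the only regressor is a binary D) should land close to b.

```python
import random
from statistics import mean

rng = random.Random(0)
a, b = 2.0, 1.5                  # hypothetical intercept and treatment effect
n = 20_000

observed = []                    # (Di, Yi) pairs we actually get to see
effects = []                     # Y1i - Y0i, never observable in real data
for _ in range(n):
    e = rng.gauss(0, 1)          # error term ei
    y0 = a + e                   # potential outcome without treatment
    y1 = a + b + e               # potential outcome with treatment
    d = rng.random() < 0.5       # random assignment: Di is independent of ei
    observed.append((d, y1 if d else y0))   # only one outcome is observed
    effects.append(y1 - y0)

treated = [y for d, y in observed if d]
control = [y for d, y in observed if not d]
b_hat = mean(treated) - mean(control)       # difference in means
print(round(b_hat, 2))
```

Every individual effect Y1i - Y0i equals b here by construction; the estimate only approximates it because each unit reveals just one of its two potential outcomes.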
What is the conditional independence assumption?
Conditional on observable characteristics X, potential outcomes are independent of treatment status D
The idea: if a core set of variables X is accounted for and is similar in the two groups, the two potential outcomes are independent of treatment D
We assume that, conditional on X, the error term is unrelated to treatment D (its expected value does not depend on D) - this makes us more comfortable identifying beta and saying it is causal
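A small simulation of what the CIA buys us, sketched in Python. Everything here is an illustrative assumption: a single binary X, the parameter values a, b and g, and the treatment probabilities. Treatment depends on X, so the raw comparison is biased, but within each value of X treatment is as good as random, so comparing within X and averaging should recover b.

```python
import random
from statistics import mean

rng = random.Random(1)
a, b, g = 1.0, 2.0, 3.0          # hypothetical intercept, effect of D, effect of X
n = 40_000

data = []
for _ in range(n):
    x = rng.random() < 0.5                   # observable binary characteristic
    p = 0.8 if x else 0.2                    # treatment is more likely when x is True
    d = rng.random() < p                     # given x, assignment is random (CIA holds)
    y = a + b * d + g * x + rng.gauss(0, 1)
    data.append((x, d, y))

def diff_in_means(rows):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    t = [y for _, d, y in rows if d]
    c = [y for _, d, y in rows if not d]
    return mean(t) - mean(c)

naive = diff_in_means(data)                  # ignores X, so it also picks up g

# Compare treated and control within each value of X, then average the
# within-group gaps using each group's share of the sample.
adjusted = 0.0
for value in (False, True):
    group = [row for row in data if row[0] == value]
    adjusted += len(group) / n * diff_in_means(group)

print(round(naive, 2), round(adjusted, 2))
```

The naive gap overstates b because treated units are more likely to have X = 1; the adjusted gap removes that difference.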
What are the four key questions for non-experimental causal inference?
- Is beta identified?
- What assumptions on (X, ei) allow for causal inference, and are those assumptions credible? i.e. conditional on X, nothing in the error term is correlated with treatment D
- What economic, scientific and institutional knowledge informs this credibility?
- It turns on the comparability of treatment and control groups: how balanced is X between them? If the groups are similar in the observables, it is plausible that they are similar in the unobservables
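One common way to check how balanced X is between groups is a standardised difference in means. A rough Python sketch follows; the function name and the age data are hypothetical, made up purely for illustration.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical ages (years) in each group, invented for this example.
age_treated = [34, 41, 29, 38, 45, 33, 40, 36]
age_control = [36, 39, 31, 37, 44, 34, 42, 37]

smd = standardized_difference(age_treated, age_control)
print(round(smd, 3))
```

Values near zero suggest the groups look alike on that observable; a large value is a warning that treated and control differ before treatment is even considered.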
What proves causation?
- The most suggestive evidence comes from a randomised controlled experiment
- Regression with a clever control strategy can be suggestive, but less so than an experiment
What are the implications of adding too many controls?
More controls are not always better for identifying a causal effect
Adding regression controls can increase bias - for example when the control is itself an outcome of D (a "bad control")
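A simulated example of a bad control, as a hedged Python sketch. All of it is assumed for illustration: the true effect b, the unobserved trait u, and the variable m, which is built to be an outcome of D. Treatment is randomised, so the raw comparison is fine, yet comparing within values of m pulls the estimate away from b.

```python
import random
from statistics import mean

rng = random.Random(2)
b = 1.0                                   # hypothetical true effect of D on Y
n = 40_000

data = []
for _ in range(n):
    u = rng.gauss(0, 1)                   # unobserved trait that also drives Y
    d = rng.random() < 0.5                # randomly assigned treatment
    m = (d + u) > 0.5                     # control that is itself an outcome of D
    y = b * d + u + rng.gauss(0, 1)
    data.append((d, m, y))

def gap(rows):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    t = [y for d, _, y in rows if d]
    c = [y for d, _, y in rows if not d]
    return mean(t) - mean(c)

without_control = gap(data)               # close to b because D is randomised

# "Controlling" for m: compare treated and control within each value of m,
# then average the gaps by each group's share of the sample.
with_control = 0.0
for value in (False, True):
    group = [row for row in data if row[1] == value]
    with_control += len(group) / n * gap(group)

print(round(without_control, 2), round(with_control, 2))
```

Within a given value of m, treated and control units differ systematically in u, so the comparison is no longer like-for-like even though D was randomised.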
How do you deal with having many controls?
Report the regression with and without controls and discuss the limitations of the analysis
Alternative research designs, e.g. instrumental variables, may be required
If the CIA holds, what is the impact on the error term?
If the CIA holds, then conditional on X, treatment D is uncorrelated with the error term and is as good as randomly assigned
Beta is then causal
What is the biggest challenge associated with regression?
The CIA is the strongest assumption across all methods
It is a challenge when there are unobservables in the error term, e.g. individual ambition, which is hard to measure
We can never get rid of the error term; we can only ever minimise it
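To see why an unobservable such as ambition matters, here is a rough Python simulation of omitted variable bias. The parameter values and the way ambition drives selection into treatment are assumptions made up for this sketch. The regression that leaves ambition in the error term returns roughly b plus g times the slope of ambition on D, not b itself.

```python
import random
from statistics import mean

def slope(x, y):
    """OLS slope from regressing y on x with an intercept: cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

rng = random.Random(3)
a, b, g = 0.5, 1.0, 2.0      # hypothetical intercept, effect of D, effect of ambition
n = 40_000

# Ambition is unobserved in practice; the simulation can see it.
ambition = [rng.gauss(0, 1) for _ in range(n)]
# More ambitious people are more likely to select into treatment.
d = [1.0 if amb + rng.gauss(0, 1) > 0 else 0.0 for amb in ambition]
y = [a + b * di + g * amb + rng.gauss(0, 1) for di, amb in zip(d, ambition)]

b_short = slope(d, y)                     # regression that omits ambition
predicted_bias = g * slope(d, ambition)   # g times the slope of ambition on D
print(round(b_short, 2), round(b + predicted_bias, 2))
```

In real data the bias term cannot be computed, because ambition is not recorded; the simulation can only show its size because the unobservable is generated by hand.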