term 1 - multiple regressors - revise this for week 2 Flashcards
what is omitted variable bias?
when an omitted variable Z is a determinant of Y and is correlated with the regressor X, the OLS estimator of the effect of X on Y is biased
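as a hedged aside, the size and direction of the bias can be written down for the single-regressor case. the symbols rho_Xu (the correlation between X and the error u), sigma_u and sigma_X (their standard deviations) are not in the original card and are introduced here only for illustration:

```latex
\hat{\beta}_1 \;\xrightarrow{p}\; \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X},
\qquad \rho_{Xu} = \operatorname{corr}(X_i, u_i)
```

if Z is omitted it sits inside u, so when Z is correlated with X we have rho_Xu not equal to 0 and the bias does not vanish even in large samples.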
are all omitted variables equal?
no. we cannot include every omitted variable and we don't need to. we only need to include those that are both a determinant of Y and correlated with X, since only those cause omitted variable bias
what are the two uses of regression?
for prediction and to estimate causal effects
what does randomisation imply?
that any differences between the treatment and control groups are random and not systematically related to the treatment
what are the three ways to overcome OVB?
1) run a randomised controlled experiment in which treatment is randomly assigned
2) adopt the cross tabulation approach with finer gradations of X and the omitted variable Z, so that Z is held roughly constant within each group
3) use a regression in which the omitted variable is no longer omitted
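a minimal simulation sketch of option 3, assuming made-up data: the coefficients (1, 2, 3), the 0.8 link between X and Z, the seed and the helper name `ols` are all hypothetical choices for illustration, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Z determines Y and is correlated with X, so omitting it causes OVB.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 + 2.0 * x + 3.0 * z + u  # true effect of X on Y is 2.0


def ols(regressors, y):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


short_fit = ols([x], y)     # Z omitted: slope on X is biased upwards
long_fit = ols([x, z], y)   # Z included: slope on X is close to 2.0

print("slope on X, Z omitted: ", short_fit[1])
print("slope on X, Z included:", long_fit[1])
```

the short regression attributes part of the effect of Z to X, while the long regression recovers a slope near the true value.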
what does the OLS estimator solve?
the OLS estimator minimises the average squared difference between the actual values of Y_i and the predicted values based on the estimated line. this yields the OLS estimators of β_0 and β_1
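a small sketch of the minimisation for one regressor, assuming five made-up data points. the function name `ssr` and the perturbation size 0.1 are illustrative choices only. it computes the closed-form estimates and checks that nudging either coefficient makes the sum of squared residuals larger.

```python
import numpy as np

# Hypothetical toy data, not from the course.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])


def ssr(b0, b1):
    """Sum of squared differences between Y_i and the line b0 + b1 * X_i."""
    return float(np.sum((y - b0 - b1 * x) ** 2))


# Closed-form OLS estimates for a single regressor.
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

best = ssr(b0_hat, b1_hat)

# Any nearby line should fit worse (larger SSR) than the OLS line.
nearby = [
    ssr(b0_hat + d0, b1_hat + d1)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]

print("b0_hat, b1_hat:", b0_hat, b1_hat)
print("OLS line has the smallest SSR:", all(s > best for s in nearby))
```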
what is R^2?
the fraction of the sample variance of Y that is explained by the regressors, R^2 = ESS/TSS = 1 - SSR/TSS
what is the adjusted R^2?
the adjusted R^2 corrects for the fact that R^2 never decreases when you add another regressor, by penalising you for each regressor you include
what is the equation of the adjusted R^2?
adjusted R^2 = 1 - [(n-1)/(n-k-1)] * SSR/TSS, where k is the number of regressors
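a quick numerical sketch of the formula. the function name `adjusted_r2` and the inputs (SSR = 40, TSS = 100, n = 21, k = 2) are made up for illustration.

```python
def adjusted_r2(ssr, tss, n, k):
    """Adjusted R^2 = 1 - [(n-1)/(n-k-1)] * SSR/TSS, with k regressors."""
    return 1.0 - (n - 1) / (n - k - 1) * ssr / tss


# Hypothetical inputs chosen only to show the arithmetic.
ssr_val, tss_val, n_obs, k_regs = 40.0, 100.0, 21, 2

r2 = 1.0 - ssr_val / tss_val
adj = adjusted_r2(ssr_val, tss_val, n_obs, k_regs)

print("R^2:         ", r2)
print("adjusted R^2:", adj)
```

here the penalty factor is 20/18, so the adjusted R^2 comes out slightly below the ordinary R^2 of 0.6.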
what happens to the difference between the adjusted R^2 and R^2 when n is large?
when n is large the difference between the adjusted R^2 and R^2 shrinks, because k becomes small relative to n and the factor (n-1)/(n-k-1) approaches 1. the adjusted R^2 is still smaller than R^2, however
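the gap can be worked out directly from the two definitions. writing the adjusted R^2 as \bar{R}^2 (a notation assumed here, not used in the cards):

```latex
R^2 - \bar{R}^2
  = \left(1 - \frac{SSR}{TSS}\right)
    - \left(1 - \frac{n-1}{n-k-1}\,\frac{SSR}{TSS}\right)
  = \left(\frac{n-1}{n-k-1} - 1\right)\frac{SSR}{TSS}
  = \frac{k}{n-k-1}\,\frac{SSR}{TSS}
```

for fixed k and a given SSR/TSS, the factor k/(n-k-1) goes to zero as n grows, but it stays positive whenever k is at least 1 and SSR is above 0.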
what are the least squares assumptions for causal inference in multiple regression?
1) the conditional distribution of u given the X's has mean zero, that is, E(u_i | X_1i = x_1, ..., X_ki = x_k) = 0
2) (X_1i, ..., X_ki, Y_i), i = 1, ..., n are i.i.d.
3) large outliers are unlikely: X_1, ..., X_k and Y have finite fourth moments
4) there is no perfect multicollinearity
failure of which least squares assumption leads to OVB?
the assumption that the conditional distribution of u given the X's has mean zero. if this fails, OVB occurs
what type of sampling makes the regressors and the dependent variable i.i.d.?
simple random sampling: each observation is drawn independently from the same population, so the variables are independently and identically distributed
why do outliers need to be rare?
the OLS estimator can be sensitive to large outliers, so you need to check your data for extreme or miscoded values
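a rough sketch of that sensitivity, assuming ten made-up points that lie exactly on the line Y = 2X and one hypothetical data-entry error (1000 instead of 18). the helper name `slope` is illustrative.

```python
import numpy as np


def slope(x, y):
    """OLS slope for a single regressor."""
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)


x = np.arange(10, dtype=float)
y_clean = 2.0 * x              # every point lies exactly on Y = 2X

y_outlier = y_clean.copy()
y_outlier[-1] = 1000.0         # one miscoded observation

slope_clean = slope(x, y_clean)
slope_outlier = slope(x, y_outlier)

print("slope without outlier:", slope_clean)
print("slope with outlier:   ", slope_outlier)
```

a single bad value out of ten is enough to move the estimated slope far away from 2.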
what is perfect multicollinearity?
when one of the regressors is an exact linear function of the other regressors
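one common illustration is the dummy variable trap, sketched below under the assumption of six made-up observations: including an intercept together with both a `male` and a `female` dummy makes the columns linearly dependent, because male + female equals the intercept column. the variable names are hypothetical.

```python
import numpy as np

# Hypothetical dummy data for six observations.
female = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
male = 1.0 - female            # exact linear function of the other columns
intercept = np.ones(6)

# Intercept + both dummies: three columns, but only two are independent.
X_trap = np.column_stack([intercept, male, female])
rank_trap = np.linalg.matrix_rank(X_trap)

# Dropping one dummy removes the perfect multicollinearity.
X_ok = np.column_stack([intercept, female])
rank_ok = np.linalg.matrix_rank(X_ok)

print("columns:", X_trap.shape[1], "rank:", rank_trap)
print("columns:", X_ok.shape[1], "rank:", rank_ok)
```

when the rank is below the number of columns, the OLS coefficients are not uniquely determined, which is why one of the redundant regressors has to be dropped.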