term 1 - multiple regressors - revise this for week 2 Flashcards
what is omitted variable bias?
when an omitted variable Z is a determinant of Y and is correlated with the regressor X, the OLS estimator will be biased
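this can be illustrated with a short simulation (a hypothetical sketch using numpy; the coefficients 0.8, 2 and 3 are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Z determines Y and is correlated with X (illustrative coefficients).
Z = rng.normal(size=n)
X = 0.8 * Z + rng.normal(size=n)             # X is correlated with Z
Y = 2.0 * X + 3.0 * Z + rng.normal(size=n)   # true causal effect of X is 2

# "Short" regression of Y on X alone: the omitted Z biases the slope.
A_short = np.column_stack([np.ones(n), X])
b_short, *_ = np.linalg.lstsq(A_short, Y, rcond=None)

# "Long" regression that includes Z removes the bias.
A_long = np.column_stack([np.ones(n), X, Z])
b_long, *_ = np.linalg.lstsq(A_long, Y, rcond=None)

print(b_short[1])  # well above 2: biased upward
print(b_long[1])   # close to the true value 2
```

here the short-regression slope absorbs part of Z's effect on Y, exactly as the definition above predicts.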
are all omitted variables equal?
no. we cannot include all omitted variables and we do not need to. we only need to include those which are both a determinant of Y and correlated with X, since those are the ones that cause omitted variable bias
what are the two uses of regression?
for prediction and to estimate causal effects
what does randomisation imply?
that any differences between the treatment and control groups are random and not systematically related to the treatment
what are the three ways to overcome OVB?
1) run a randomised controlled experiment in which treatment is randomly assigned
2) adopt the cross-tabulation approach with finer gradations of X and the omitted variable
3) use a regression in which the omitted variable is no longer omitted
what does the OLS estimator solve?
the OLS estimator minimises the average squared difference between the actual values of Y_i and the predictions based on the estimated line. this yields the OLS estimators of B_0 and B_1
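a minimal numpy sketch of this minimisation (illustrative data; the closed-form expressions below are the standard single-regressor OLS solution):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 1.5 + 2.0 * x + rng.normal(size=500)  # true B_0 = 1.5, B_1 = 2

# OLS picks (b0, b1) to minimise sum_i (y_i - b0 - b1*x_i)^2;
# the closed-form solution is:
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Sanity check: nudging either coefficient away from the OLS solution
# can only increase the sum of squared residuals.
def ssr(a0, a1):
    return np.sum((y - a0 - a1 * x) ** 2)

assert ssr(b0, b1) <= ssr(b0 + 0.01, b1)
assert ssr(b0, b1) <= ssr(b0, b1 + 0.01)
print(b0, b1)  # close to 1.5 and 2
```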
what is R^2?
the fraction of the sample variance of Y which is explained by the regression
what is the adjusted R^2?
the adjusted R^2 corrects for the fact that R^2 always increases when you add another regressor, by penalising you for including an additional regressor
what is the equation of the adjusted R^2?
adjusted R^2 = 1 - [(n-1)/(n-k-1)] * SSR/TSS, where k is the number of regressors
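the formula can be checked numerically (a sketch on simulated data; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

SSR = resid @ resid                # sum of squared residuals
TSS = np.sum((y - y.mean()) ** 2)  # total sum of squares

r2 = 1 - SSR / TSS
adj_r2 = 1 - (n - 1) / (n - k - 1) * SSR / TSS

print(r2, adj_r2)  # the adjusted R^2 is the smaller of the two
```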
what happens to the difference between the adjusted R^2 and R^2 when n is large?
when n is large the difference between the adjusted R^2 and R^2 shrinks, because the correction factor (n-1)/(n-k-1) approaches 1 as k becomes relatively small compared to n. the adjusted R^2 is still the smaller of the two, however
what are the least squares assumptions for causal inference in multiple regression?
1) the conditional distribution of u given the X's has mean zero, that is, E(u | X_1i = x_1, …, X_ki = x_k) = 0
2) (X_1i, …, X_ki, Y_i), i = 1, …, n are i.i.d.
3) large outliers are unlikely: X_1, …, X_k and Y have finite fourth moments
4) there is no perfect multicollinearity
which least squares assumption failure leads to OVB?
the conditional distribution of u given the X's has mean zero; if this assumption fails, OVB occurs
what type of sampling makes the regressors and the dependent variable i.i.d.?
simple random sampling yields independent and identically distributed draws of the variables
why do outliers need to be rare?
OLS can be sensitive to large outliers, so you need to check your data to make sure there are no extreme values
what is perfect multicollinearity?
when one of the regressors is an exact linear function of the other regressors
what is the dummy variable trap?
when there is a set of multiple binary variables which are mutually exclusive and exhaustive. if you include all of these variables and a constant, there is perfect multicollinearity
what are the solutions to the dummy variable trap?
omit one of the groups or omit the intercept
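a quick rank check makes the trap concrete (a hypothetical sketch; three groups with deterministic membership for simplicity):

```python
import numpy as np

n = 12
group = np.array([0, 1, 2] * (n // 3))  # mutually exclusive, exhaustive groups
dummies = np.eye(3)[group]              # one binary column per group

# Trap: all three dummies plus a constant. The dummy columns sum to the
# constant column, so the design matrix is rank deficient.
X_trap = np.column_stack([np.ones(n), dummies])
print(np.linalg.matrix_rank(X_trap))    # 3, not 4: perfect multicollinearity

# Fix: omit one group (it becomes the baseline absorbed by the intercept).
X_fixed = np.column_stack([np.ones(n), dummies[:, :2]])
print(np.linalg.matrix_rank(X_fixed))   # 3 = full column rank
```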
what is the solution to perfect multicollinearity?
The solution to perfect multicollinearity is to modify your list of regressors so that you no longer have perfect multicollinearity.
what is imperfect multicollinearity?
Imperfect multicollinearity occurs when two or more regressors are very highly correlated.
Imperfect multicollinearity implies that one or more of the regression coefficients will be imprecisely estimated (the estimators will have higher standard errors).
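the imprecision can be seen by comparing standard errors at low and high regressor correlation (an illustrative simulation; the covariance formula s^2 (X'X)^-1 used below assumes homoskedastic errors):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000

def se_of_x1(corr):
    # Two regressors with the given correlation.
    X1 = rng.normal(size=n)
    X2 = corr * X1 + np.sqrt(1 - corr**2) * rng.normal(size=n)
    y = X1 + X2 + rng.normal(size=n)
    A = np.column_stack([np.ones(n), X1, X2])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    s2 = resid @ resid / (n - 3)        # residual variance estimate
    cov = s2 * np.linalg.inv(A.T @ A)   # homoskedastic covariance matrix
    return np.sqrt(cov[1, 1])           # standard error of X1's coefficient

print(se_of_x1(0.0))   # small
print(se_of_x1(0.99))  # several times larger: imprecisely estimated
```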
what is a control variable?
A control variable W is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.
what are three interchangeable statements about what makes an effective control variable?
i. An effective control variable is one which, when included in the regression, makes the error term uncorrelated with the variable of interest.
ii. Holding constant the control variable(s), the variable of interest is "as if" randomly assigned.
iii. Among individuals (entities) with the same value of the control variable(s), the variable of interest is uncorrelated with the omitted determinants of Y.
do control variables need to be causal?
no. control variables need not be causal, and their coefficients generally do not have a causal interpretation
how do you test a single coefficient in multiple regression?
hypothesis tests and confidence intervals for a single coefficient in multiple regression follow the same logic and recipe as the slope in a single regressor model
what is a joint hypothesis?
a joint hypothesis specifies a value for two or more coefficients, that is, it imposes a restriction on two or more coefficients
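a joint hypothesis is tested with an F-statistic rather than separate t-tests. a sketch of the homoskedasticity-only F-statistic, computed from restricted and unrestricted sums of squared residuals (simulated data; names are mine):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
X1, X2 = rng.normal(size=(2, n))
y = 2.0 + 1.0 * X1 + 0.0 * X2 + rng.normal(size=n)

def ssr(A):
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return resid @ resid

# Joint null H0: coefficient on X1 = 0 AND coefficient on X2 = 0
# (q = 2 restrictions imposed at once).
ssr_u = ssr(np.column_stack([np.ones(n), X1, X2]))  # unrestricted model
ssr_r = ssr(np.ones((n, 1)))                        # restricted: intercept only

q, k = 2, 2
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
print(F)  # far above typical critical values (around 3), so H0 is rejected here
```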