term 1 - multiple regressors - revise this for week 2 Flashcards

1
Q

what is omitted variable bias?

A

when an omitted variable Z is both a determinant of Y and correlated with the regressor X, the OLS estimator of the coefficient on X will be biased
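
a minimal simulation sketch of this (the variable names and coefficient values are illustrative, not from the notes): Y depends on X and on a Z that is correlated with X, and regressing Y on X alone inflates the estimated effect of X.

```python
import numpy as np

# Hypothetical setup: Z is omitted, is a determinant of Y, and is correlated with X.
rng = np.random.default_rng(0)
n = 100_000
Z = rng.normal(size=n)
X = 0.8 * Z + rng.normal(size=n)            # X correlated with Z
Y = 1.0 * X + 2.0 * Z + rng.normal(size=n)  # true causal effect of X is 1.0

# OLS of Y on X alone (intercept included), omitting Z:
short = np.linalg.lstsq(np.column_stack([np.ones(n), X]), Y, rcond=None)[0]

# OLS including Z, so the omitted variable is no longer omitted:
full = np.linalg.lstsq(np.column_stack([np.ones(n), X, Z]), Y, rcond=None)[0]

print(short[1])  # well above 1.0: biased upward
print(full[1])   # close to 1.0: bias removed
```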

2
Q

are all omitted variables equal?

A

no. we cannot include all omitted variables, and we do not need to: omitted variable bias arises only from variables that are both a determinant of Y and correlated with X, so only those need to be included

3
Q

what are the two uses of regression?

A

for prediction and to estimate causal effects

4
Q

what does randomisation imply?

A

that any differences between the treatment and control groups are random and not systematically related to the treatment

5
Q

what are the three ways to overcome OVB?

A

1) run a randomised controlled experiment in which treatment is randomly assigned
2) adopt the cross-tabulation approach with finer gradations of Y and X
3) use a regression in which the omitted variable is no longer omitted

6
Q

what does the OLS estimator solve?

A

the OLS estimator minimises the average squared difference between the actual values of Y_i and the predictions based on the estimated line. this yields the OLS estimators of B_0 and B_1

7
Q

what is R^2?

A

the fraction of the variance of Y which is explained by the regression line

8
Q

what is the adjusted R^2?

A

the adjusted R^2 corrects for the fact that R^2 always increases when you add another regressor, by penalising the inclusion of an additional regressor

9
Q

what is the equation of the adjusted R^2?

A

adjusted R^2 = 1 - [(n-1)/(n-k-1)] * SSR/TSS, where k is the number of regressors
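
the formula above can be sketched as a small helper (an illustrative function, not from the notes; it assumes y_hat comes from a fit that includes an intercept):

```python
import numpy as np

def adjusted_r2(y, y_hat, k):
    """1 - [(n-1)/(n-k-1)] * SSR/TSS, where k is the number of regressors."""
    n = len(y)
    ssr = np.sum((y - y_hat) ** 2)          # sum of squared residuals
    tss = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return 1 - (n - 1) / (n - k - 1) * ssr / tss

# usage: fit y on one regressor, then compare plain and adjusted R^2
x = np.arange(10.0)
y = 2 * x + np.array([0.5, -0.3, 0.2, 0.1, -0.4, 0.3, -0.1, 0.2, -0.2, 0.1])
A = np.column_stack([np.ones(10), x])
y_hat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(adjusted_r2(y, y_hat, 1) < r2)  # True: the adjusted R^2 is smaller
```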

10
Q

what occurs to the difference between the adjusted R^2 and R^2 when n is large?

A

when n is large, the difference between the adjusted R^2 and R^2 decreases, because k becomes small relative to n. the adjusted R^2 remains smaller, however

11
Q

what are the least squares assumptions for causal inference in multiple regression?

A

1) the conditional distribution of u given the X's has mean zero, that is, E(u | X_1i = x_1, ..., X_ki = x_k) = 0
2) (X_1i, ..., X_ki, Y_i), i = 1, ..., n are i.i.d.
3) large outliers are unlikely: X_1, ..., X_k and Y have finite fourth moments
4) there is no perfect multicollinearity

12
Q

the failure of which least squares assumption leads to OVB?

A

the conditional distribution of u given the X's has mean zero; if this fails, OVB occurs

13
Q

what type of sampling leads to iid of the regressors and variables?

A

simple random sampling yields independent and identically distributed draws of the variables

14
Q

why do outliers need to be rare?

A

OLS can be sensitive to large outliers, so you need to check your data to make sure there are no implausible values

15
Q

what is perfect multicollinearity?

A

when one of the regressors is an exact linear function of the other regressors

16
Q

what is the dummy variable trap?

A

when there is a set of multiple binary variables which are mutually exclusive and exhaustive. if you include all of these variables and a constant, there is perfect multicollinearity
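
a small sketch of the trap (the 3-group setup is hypothetical): because the dummies are exhaustive, they sum to the constant column, so the regressor matrix loses a rank.

```python
import numpy as np

# Hypothetical 3-group example: one dummy per group, mutually exclusive and exhaustive.
g = np.array([0, 1, 2, 0, 1, 2])
D = np.eye(3)[g]                                      # full set of group dummies
X_trap = np.column_stack([np.ones(len(g)), D])        # constant + all dummies
X_ok = np.column_stack([np.ones(len(g)), D[:, 1:]])   # solution: drop one group

print(np.linalg.matrix_rank(X_trap))  # 3, although X_trap has 4 columns
print(np.linalg.matrix_rank(X_ok))    # 3, equal to its number of columns
```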

17
Q

what are the solutions to the dummy variable trap?

A

omit one of the groups or omit the intercept

18
Q

what is the solution to perfect multicollinearity?

A

The solution to perfect multicollinearity is to modify your list of regressors so that you no longer have perfect multicollinearity.

19
Q

what is imperfect multicollinearity?

A

Imperfect multicollinearity occurs when two or more regressors are very highly correlated.
Imperfect multicollinearity implies that one or more of the regression coefficients will be imprecisely estimated (the estimators will have higher standard errors)

20
Q

what is a control variable?

A

A control variable W is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.

21
Q

what are three interchangeable statements about what makes an effective control variable?

A

i. an effective control variable is one which, when included in the regression, makes the error term uncorrelated with the variable of interest
ii. holding constant the control variable(s), the variable of interest is "as if" randomly assigned
iii. among individuals (entities) with the same value of the control variable(s), the variable of interest is uncorrelated with the omitted determinants of Y

22
Q

do control variables need to be causal?

A

no. control variables need not be causal, and their coefficients generally do not have a causal interpretation

23
Q

how do we test a single coefficient in multiple regression?

A

hypothesis tests and confidence intervals for a single coefficient in multiple regression follow the same logic and recipe as the slope in a single regressor model

24
Q

what is a joint hypothesis?

A

a joint hypothesis specifies a value for two or more coefficients, that is, it imposes a restriction on two or more coefficients

25
Q

why can't we test the coefficients one at a time?

A

because the rejection rate under the null hypothesis isn't 5%: the one-at-a-time method rejects the null hypothesis too often
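
a Monte Carlo sketch of this point (the setup is illustrative): with two independent t-statistics under a true joint null, rejecting when either |t| > 1.96 fires at a rate near 1 - 0.95^2 ≈ 9.75%, not 5%.

```python
import numpy as np

# Under the null, each t-statistic is approximately N(0,1); take them independent.
rng = np.random.default_rng(1)
reps = 1_000_000
t1 = rng.normal(size=reps)
t2 = rng.normal(size=reps)

# One-at-a-time rule: reject the joint null if either individual test rejects.
reject = (np.abs(t1) > 1.96) | (np.abs(t2) > 1.96)
print(reject.mean())  # roughly 0.0975, well above 0.05
```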

26
Q

what are the two solutions for joint hypothesis testing?

A

1) use a different critical value in this procedure (not 1.96). this is the Bonferroni method, but it is rarely used in practice because of its low power
2) use a different test statistic designed to test both B1 and B2 at once, i.e. the F statistic

27
Q

what is the F statistic?

A

the F statistic tests all parts of a joint hypothesis at once.

28
Q

what is the formula for the F statistic in the special case of the joint hypothesis B1=B1,0 and B2=B2,0 in a regression with two regressors?

A

F = (1/2) * [(t_1^2 + t_2^2 - 2 p_t1,t2 t_1 t_2) / (1 - p_t1,t2^2)], where p_t1,t2 estimates the correlation between t_1 and t_2, and t_1, t_2 are the t-statistics for the individual tests of the two coefficients
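
the formula can be written as a small helper (an illustrative function, not a library routine):

```python
def f_stat_two_restrictions(t1, t2, rho):
    """F statistic for a joint hypothesis on two coefficients, computed from
    the two individual t-statistics and rho, their estimated correlation."""
    return 0.5 * (t1**2 + t2**2 - 2 * rho * t1 * t2) / (1 - rho**2)

# When the t-statistics are uncorrelated, F reduces to the average squared t:
print(f_stat_two_restrictions(2.0, 2.0, 0.0))  # 4.0
```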

29
Q

what is the chi squared distribution?

A

the chi squared distribution with q degrees of freedom is defined to be the distribution of the sum of q independent squared standard normal random variables

30
Q

in large samples how is F distributed?

A

in large samples, F is distributed as χ^2_q / q, where χ^2_q is the chi squared distribution with q degrees of freedom

31
Q

what is the 5% critical value for the F test?

A

the 5% chi squared critical value with q degrees of freedom, divided by q
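
a Monte Carlo sketch of this critical value for q = 2 (the choice of q and the draw count are illustrative): simulate χ^2_q / q and take its 95th percentile.

```python
import numpy as np

# chi-squared_q is the sum of q independent squared standard normals (see above).
rng = np.random.default_rng(2)
q = 2
f_draws = (rng.normal(size=(1_000_000, q)) ** 2).sum(axis=1) / q
print(np.quantile(f_draws, 0.95))  # close to 3.00, the textbook value for q = 2
```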

32
Q

what are the two methods for testing a single restriction on multiple coefficients?

A

1) rearrange the regression - rearrange the regressors so that the restriction becomes a restriction on a single coefficient in an equivalent regression
2) perform the test directly - e.g. in Stata

33
Q

what is the method to rearrange the regression?

A

Y = B0 + B1X1 + B2X2 + u
H0: B1 = B2 vs H1: B1 ≠ B2
add and subtract B2X1, then let y = B1 - B2 and W = X1 + X2:
Y = B0 + yX1 + B2W + u
the hypothesis test becomes:
H0: y = 0 vs H1: y ≠ 0

34
Q

does the rearranged transformation have the same R^2, predicted values and same residuals?

A

yes
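
a sketch on simulated data (the data-generating process is illustrative): regressing Y on (X1, X2) and on the rearranged regressors (X1, W = X1 + X2) spans the same column space, so fitted values, residuals, and hence R^2 are identical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1 + 2 * X1 + 3 * X2 + rng.normal(size=n)

A = np.column_stack([np.ones(n), X1, X2])       # original regressors
B = np.column_stack([np.ones(n), X1, X1 + X2])  # rearranged regressors
resid_A = Y - A @ np.linalg.lstsq(A, Y, rcond=None)[0]
resid_B = Y - B @ np.linalg.lstsq(B, Y, rcond=None)[0]

print(np.allclose(resid_A, resid_B))  # True: identical residuals and fit
```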

35
Q

what is the 95% joint confidence set?

A

a set-valued function of the data that contains the true coefficients in 95% of hypothetical repeated samples

36
Q

what does a high R^2 value mean?

A

it means that the regressors explain much of the variation in Y

37
Q

what does a high R^2 value not mean?

A

it does not mean that you have eliminated omitted variable bias
it does not mean that you have an unbiased estimator of a causal effect
it does not mean that the included variables are statistically significant - this must be established through hypothesis tests
