SU5 - Further Issues In Linear Regression: Modelling And Inference Flashcards

Question 1

Q

What is R squared?

Answer

A

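A statistic for evaluating how well the model fits the data: the fraction of the sample variation in Y that is explained by the regressors, R squared = SSE/SST = 1 - SSR/SST (SST = total, SSE = explained, SSR = residual sum of squares).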

Question 2

Q

What are four properties of R squared?

Answer

A

1) R squared cannot be negative.
2) R squared is bounded between 0 and 1 (for OLS with an intercept).
3) R squared = 0 if SSE = 0.
4) R squared is non-decreasing as more explanatory variables are added to the model.
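
The decomposition behind these properties can be checked numerically. The sketch below is only an illustration: the data are made up, and the names (`sst`, `sse`, `ssr`) are assumptions chosen to follow the convention SST = SSE + SSR, with SSE the explained and SSR the residual sum of squares.

```python
import numpy as np

# Hypothetical toy data, not taken from the flashcards.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix with an intercept column, then OLS via least squares.
X = np.column_stack([np.ones_like(x), x])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ theta

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum((y - y_hat) ** 2)         # residual sum of squares

# Two equivalent ways of writing R squared when an intercept is included.
r2_explained = sse / sst
r2_residual = 1.0 - ssr / sst
print(r2_explained, r2_residual)
```

With an intercept in the model the two expressions agree, and both lie between 0 and 1.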

Question 3

Q

Why can R squared not be negative?

Answer

A

Because R squared = SSE/SST, and SSE and SST are sums of squared terms, which cannot be negative.

Question 4

Q

Why is R squared = 1 if SSR = 0?

Answer

A

If SSR = 0, then SST = SSE, so R squared = SSE/SST = 1.

Variation in Y is completely accounted for by variation in Y(hat). In this case, Y(hat) fits the data perfectly.

Question 5

Q

Why is R squared = 0 if SSE = 0?

Answer

A

SSE = 0 means that Y(hat) has no variation. If Y(hat) has no variation, it does not explain Y at all, so R squared = SSE/SST = 0.

Question 6

Q

Does R squared ever decrease when a variable is added?

Answer

A

It never decreases, and it usually increases, when another independent variable is added to a regression. Thus it is a poor tool for deciding whether one variable or several variables should be added to a model.
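
A small simulation can make this concrete. The sketch below is a hedged illustration, not part of the original cards: the data-generating process, the seed, and the helper `r_squared` are all assumptions. It adds a regressor that is pure noise and shows that R squared still does not fall.

```python
import numpy as np

def r_squared(X, y):
    """R squared of an OLS fit of y on X (X must contain an intercept column)."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ theta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - ssr / sst

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 50
x1 = rng.normal(size=n)
noise_regressor = rng.normal(size=n)      # unrelated to y by construction
y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # hypothetical population model

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, noise_regressor])

r2_small = r_squared(X_small, y)
r2_big = r_squared(X_big, y)
print(r2_small, r2_big)
```

Because the larger model can always reproduce the smaller one by setting the extra coefficient to zero, its SSR cannot be larger, so `r2_big` is at least `r2_small` even though the added variable is irrelevant.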

Question 7

Q

What is a regression through the origin?

Answer

A

A regression without an intercept term

For example, ln(1 + tax) = θ1 ln(1 + income) + e, where 1 is added to tax and income to avoid taking the log of zero.

Question 8

Q

Why is regression through the origin a bad idea?

Answer

A

1) If the true intercept really is zero, including an intercept anyway costs little. If the true intercept is not zero, omitting it generally biases the slope estimates.
2) With regression through the origin, R squared computed as 1 - SSR/SST can be negative, even though R squared should lie between 0 and 1.
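
The second point is easy to reproduce. The following sketch uses made-up data (an assumption, chosen so that the true intercept is far from zero and the slope is negative) and fits a line through the origin; R squared is computed as 1 - SSR/SST.

```python
import numpy as np

# Hypothetical data lying exactly on y = 11 - x, so the true intercept is not zero.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.0, 9.0, 8.0, 7.0, 6.0])

# OLS slope for a regression through the origin: sum(x*y) / sum(x^2).
theta_origin = np.sum(x * y) / np.sum(x * x)
y_hat = theta_origin * x

ssr = np.sum((y - y_hat) ** 2)       # residual sum of squares
sst = np.sum((y - y.mean()) ** 2)    # total sum of squares around the mean of y
r2_origin = 1.0 - ssr / sst          # here 1 - 110/10 = -10

print(theta_origin, r2_origin)
```

The fitted line through the origin does worse than simply predicting the sample mean of y, which is what a negative value of 1 - SSR/SST means.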

Question 9

Q

What happens if we include an irrelevant variable?

Answer

A

The OLS estimators remain unbiased, but their variances can increase.

Question 10

Q

What happens if we omit a relevant variable?

Answer

A

It generally causes the OLS estimators to be biased (omitted variable bias) or, worse, inconsistent.
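
The direction and size of the bias can be seen in a simulation. The sketch below is an illustration under assumed values (coefficients, seed, and the helper `ols` are not from the cards). It also checks a standard in-sample identity: the slope on x1 in the short regression equals the full-model slope on x1 plus the full-model slope on x2 times the slope from regressing x2 on x1.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)  # arbitrary seed
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)                    # x2 is correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)    # hypothetical true model

ones = np.ones(n)
full = ols(np.column_stack([ones, x1, x2]), y)   # includes the relevant x2
short = ols(np.column_stack([ones, x1]), y)      # omits x2
delta = ols(np.column_stack([ones, x1]), x2)     # auxiliary regression of x2 on x1

# Short-regression slope = full slope on x1 + (full slope on x2) * (slope of x2 on x1).
implied_short_slope = full[1] + full[2] * delta[1]
print(full[1], short[1], implied_short_slope)
```

In this setup the short regression attributes part of the effect of x2 to x1, so its slope sits well above the full-model slope. If x2 were uncorrelated with x1, the auxiliary slope would be near zero and the bias would largely disappear.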

Question 11

Q

How do we deal with multicollinearity?

Answer

A

1) Increase the sample size

2) Drop some of the variables that are highly collinear

Question 12

Q

What does multicollinearity affect?

Answer

A

It does not affect the variances of all the OLS slope estimators. Only the variances of the estimators on the highly correlated regressors are inflated.

Question 13

Q

What happens if Xj is highly correlated with one or more regressors in the model?

Answer

A

Rj² (the R squared from regressing Xj on the other regressors) will be close to 1, so Var(θj(hat)) will be large and θj will be estimated imprecisely.
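
To see how the R squared from regressing Xj on the other regressors drives the variance, one can compute the factor 1/(1 - Rj²), commonly called the variance inflation factor. The sketch below uses simulated regressors; the data, seed, and helper name are assumptions for illustration only.

```python
import numpy as np

def r_squared(X, y):
    """R squared of an OLS fit of y on X (X must contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ coef) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - ssr / sst

rng = np.random.default_rng(2)  # arbitrary seed
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)              # unrelated to x1 and x2

ones = np.ones(n)
# R_j squared: regress each regressor on all the other regressors plus an intercept.
r2_x1 = r_squared(np.column_stack([ones, x2, x3]), x1)
r2_x3 = r_squared(np.column_stack([ones, x1, x2]), x3)

vif_x1 = 1.0 / (1.0 - r2_x1)   # large: x1 is almost explained by x2
vif_x3 = 1.0 / (1.0 - r2_x3)   # close to 1: x3 is not explained by the others
print(r2_x1, vif_x1, r2_x3, vif_x3)
```

Only the regressor that is highly correlated with another one gets a large inflation factor; the unrelated regressor is barely affected.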

Question 14

Q

What is the error normality assumption?

Answer

A

The population error e is independent of the explanatory variables and is normally distributed with zero mean and variance σ².

Question 15

Q

What is the linear model called when all six assumptions hold?

Answer

A

The classical linear model (CLM)

Assumptions of linear regression 1-5 + error normality

Question 16

Q

How is the population model expressed under the CLM assumptions?

Answer

A

Y | X ~ Normal(θ0 + θ1X1 + θ2X2 + ... + θkXk, σ²)

The OLS estimators then have an exact normal sampling distribution, conditional on the regressors.

Question 17

Q

In Var(θj(hat)) = σ² / [SSTj (1 - Rj²)], why is σ² unknown?

Answer

A

σ² is the variance of the error e, and we do not observe e.

Question 18

Q

Why do we use n-k-1 rather than n in the denominator of the estimator of σ²?

Answer

A

Dividing SSR by n-k-1 makes the estimated σ² an unbiased estimator of σ².
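
As a rough sketch of how this estimate is used, the code below computes SSR/(n-k-1) from the OLS residuals and plugs it into the variance formula σ² / [SSTj (1 - Rj²)] for one slope. The simulated data, seed, and variable names are assumptions; the final comparison against the matrix expression σ²(X'X)^(-1) is a standard identity used here only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
n, k = 100, 2                   # n observations, k slope parameters
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)  # hypothetical model

X = np.column_stack([np.ones(n), x1, x2])
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ theta_hat

ssr = np.sum(residuals ** 2)
sigma2_hat = ssr / (n - k - 1)   # degrees-of-freedom adjusted estimate of sigma squared

# Pieces of the variance formula for the slope on x1.
Z = np.column_stack([np.ones(n), x2])                   # the other regressors
gamma, *_ = np.linalg.lstsq(Z, x1, rcond=None)
sst_1 = np.sum((x1 - x1.mean()) ** 2)                   # total variation in x1
r2_1 = 1.0 - np.sum((x1 - Z @ gamma) ** 2) / sst_1      # R squared of x1 on the others
var_formula = sigma2_hat / (sst_1 * (1.0 - r2_1))

# Cross-check against the corresponding diagonal entry of sigma2_hat * (X'X)^(-1).
var_matrix = sigma2_hat * np.linalg.inv(X.T @ X)[1, 1]
print(sigma2_hat, var_formula, var_matrix)
```

Dividing by n instead of n-k-1 would give a smaller number, because the residuals are fitted to the sample and so understate the spread of the true errors on average.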

Question 19

Q

What is an unrestricted model?

Answer

A

It is the model containing all the regressors, before the restrictions in H0 are imposed.

Question 20

Q

What is a restricted model?

Answer

A

It is the model obtained by imposing the restrictions in H0, that is, the model that would hold if H0 were true.

Question 21

Q

How do we decide whether to reject H0 when testing multiple hypotheses jointly?

Answer

A

By comparing the sum of squared residuals (SSR) of the unrestricted model with the SSR of the restricted model.

Question 22

Q

If H0 is true, how would the SSRs of the two models compare?

Answer

A

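SSR (restricted) would be close to SSR (unrestricted), since imposing restrictions that are true barely changes the residuals. (SSR (restricted) can never be smaller than SSR (unrestricted).)

The F statistic would be small.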

Question 23

Q

If H0 is untrue, how would the SSRs of the two models compare?

Answer

A

SSR (restricted) would be much larger than SSR (unrestricted), so the F statistic would be large.
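
The comparison of the two SSRs is usually summarised by the standard F statistic, F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n-k-1)], where q is the number of restrictions. The sketch below is an illustration on simulated data in which H0 is false by construction; the model, seed, and helper name `ssr_of` are assumptions.

```python
import numpy as np

def ssr_of(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ coef) ** 2)

rng = np.random.default_rng(4)  # arbitrary seed
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
# Hypothetical true model: x2 and x3 both matter, so H0 below is false.
y = 1.0 + 2.0 * x1 + 1.5 * x2 + 1.5 * x3 + rng.normal(size=n)

ones = np.ones(n)
X_ur = np.column_stack([ones, x1, x2, x3])  # unrestricted model, k = 3 slopes
X_r = np.column_stack([ones, x1])           # restricted model under H0: theta2 = theta3 = 0
k, q = 3, 2                                 # q = number of restrictions

ssr_ur = ssr_of(X_ur, y)
ssr_r = ssr_of(X_r, y)

f_stat = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(ssr_r, ssr_ur, f_stat)
```

Here the restricted SSR is far above the unrestricted one, so the F statistic is large. Setting the coefficients on x2 and x3 to zero in the simulated model would instead give two similar SSRs and a small F statistic.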

(23 cards)