Reading 2: Multiple Regression Flashcards

1
Q

Adjusted R2

A

R2a = 1 - [(n - 1) / (n - k - 1)] × (1 - R2)

  • R2 never decreases as variables are added to the model, so R2a guards against overestimating the explanatory power of the regression
  • R2a will always be less than or equal to R2
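
A minimal Python sketch of this formula (function and variable names are illustrative):

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: penalizes R2 for the number of regressors k."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Example: R2 = 0.80 with n = 62 observations and k = 4 independent variables
print(adjusted_r2(0.80, 62, 4))  # ~0.786, slightly below the unadjusted 0.80
```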
2
Q

Dummy Variables

A
  • binary: takes a value of either 0 or 1 (on or off)
  • n classes require n - 1 dummy variables; the omitted class serves as the baseline
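
A quick illustration with pandas (hypothetical quarter labels): four classes need only three dummies, and the omitted class becomes the baseline:

```python
import pandas as pd

# hypothetical categorical variable with n = 4 classes
quarters = pd.Series(["Q1", "Q2", "Q3", "Q4", "Q1", "Q3"])

# drop_first=True keeps n - 1 = 3 dummies; the omitted class (Q1) is the baseline
dummies = pd.get_dummies(quarters, prefix="quarter", drop_first=True)
print(dummies)  # columns: quarter_Q2, quarter_Q3, quarter_Q4
```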

3
Q

Heteroskedasticity

A
  • Occurs when the variance of the residuals is not constant across all observations
  • Unconditional: not related to the level of the independent variables (causes no major problems)
  • Conditional: related to the level of the independent variables; this is the form that causes problems
4
Q

Effects of heteroskedasticity on regression analysis

A

1) Standard errors are unreliable
2) Slope coefficients are unaffected
3) t-stats will be too large or too small (because the standard errors are wrong)
4) F-test is unreliable

5
Q

Detecting Heteroskedasticity

A
  • examine the scatter plots of the residuals
  • Breusch-Pagan chi-square test: BP = n × R2resid, where R2resid is the R2 from a second regression of the squared residuals from the first regression on the independent variables
  • one-tailed test (with k degrees of freedom), because heteroskedasticity is only a problem if R2resid and the BP test statistic are too large
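
A hedged numpy sketch of the BP statistic (assumes X already includes a constant column; names are placeholders):

```python
import numpy as np
from scipy import stats

def breusch_pagan(resid, X):
    """BP = n * R2 from regressing squared residuals on the independent variables.

    X is assumed to already include a constant column, so k = X.shape[1] - 1.
    """
    n = len(resid)
    y = resid ** 2
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # OLS of squared residuals on X
    fitted = X @ beta
    r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    bp = n * r2
    p_value = stats.chi2.sf(bp, df=X.shape[1] - 1)  # one-tailed chi-square test
    return bp, p_value
```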
6
Q

Correcting Heteroskedasticity

A

Option 1: Calculate robust standard errors (White-corrected standard errors)
Option 2: Generalized least squares: eliminates heteroskedasticity by modifying the original equation
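
For Option 1, a minimal sketch with statsmodels (simulated data; HC0 is one of several White-style robust estimators):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
# error variance rises with the first regressor: conditional heteroskedasticity
y = 1 + x @ np.array([0.5, -0.3]) + rng.normal(size=200) * (1 + np.abs(x[:, 0]))

model = sm.OLS(y, sm.add_constant(x))
fit_white = model.fit(cov_type="HC0")  # White-corrected (robust) standard errors
print(fit_white.bse)                   # coefficients match plain OLS; only the SEs change
```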

7
Q

Serial Correlation (autocorrelation)

A

-residual (error) terms are correlated with one another

Positive: a positive regression error in one time period increases the probability of observing a positive regression error in the next time period.
Negative: a regression error in one time period increases the probability of observing an error of the opposite sign in the next time period.

8
Q

Effect of Serial Correlation on Regression Analysis

A
  • Positive serial correlation results in standard errors that are too small
  • Small standard errors cause computed t-stats to be larger than they should be, leading to too many Type I errors (rejecting the null when it is actually true)
  • F-test will also be unreliable because the MSE is underestimated, again leading to too many Type I errors
9
Q

Detecting Serial Correlation

A

-Residual plots
-Durbin-Watson statistic: DW ≈ 2(1 - r)
-r = correlation coefficient between residuals from one period and those from the previous period

Rules:

  • DW ≈ 2: homoskedastic and not serially correlated (r = 0)
  • DW < 2: positively serially correlated
  • DW > 2: negatively serially correlated
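
A minimal numpy sketch of the exact DW statistic alongside the 2(1 - r) approximation (placeholder residuals):

```python
import numpy as np

def durbin_watson(resid):
    """Exact DW: sum of squared successive differences over sum of squared residuals."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(1)
resid = rng.normal(size=500)                  # serially uncorrelated residuals
r = np.corrcoef(resid[1:], resid[:-1])[0, 1]  # lag-1 correlation
print(durbin_watson(resid), 2 * (1 - r))      # both close to 2
```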
10
Q

Durbin Watson decision rule

A

Ho: Regression has no positive serial correlation

There are upper (du) and lower (dl) critical DW-values:

  • If DW < dl, the error terms are positively serially correlated (reject the null)
  • If dl < DW < du, the test is inconclusive
  • If DW > du, there is no evidence that the error terms are positively correlated (fail to reject the null)
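
The rule as a small helper function (dl and du come from a DW table for the given n and k; the values below are purely illustrative):

```python
def dw_decision(dw, dl, du):
    """H0: the regression has no positive serial correlation."""
    if dw < dl:
        return "reject H0: positive serial correlation"
    elif dw < du:
        return "inconclusive"
    return "fail to reject H0: no evidence of positive serial correlation"

print(dw_decision(1.20, dl=1.46, du=1.63))  # reject H0
```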
11
Q

Correcting Serial Correlation

A
  • Adjust the coefficient standard errors: Hansen method
    • The Hansen method also corrects for conditional heteroskedasticity (use it if both are an issue)
  • Improve the specification of the model
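
As a sketch, statsmodels offers the closely related Newey-West (HAC) standard errors, used here as a stand-in for the Hansen method, with simulated AR(1) errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(200, 1)))
e = np.zeros(200)
for t in range(1, 200):                 # AR(1) errors: positive serial correlation
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = X @ np.array([1.0, 0.5]) + e

# HAC (Newey-West) standard errors; maxlags is a tuning choice
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(fit.bse)  # serial-correlation-consistent standard errors
```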
12
Q

Multicollinearity

A

-refers to the condition in which two or more independent variables, or linear combinations of independent variables, are highly correlated with each other

13
Q

Effects of multicollinearity on regression analysis

A
  • coefficient estimates are unreliable
  • standard errors are artificially inflated, so t-stats are too small
  • greater probability of Type II errors (failing to reject a false null)
14
Q

Detecting Multicollinearity

A
  • Classic symptom: individual t-tests show coefficients not significantly different from zero, while the F-test is significant and R2 is high
  • A correlation of about 0.7 between two independent variables is typically the level at which multicollinearity becomes an issue
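
Beyond pairwise correlations, the variance inflation factor is a common screen (an illustration added here, not part of the card); a sketch with simulated collinear regressors:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
x1 = rng.normal(size=300)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=300)  # nearly collinear with x1
X = pd.DataFrame({"const": 1.0, "x1": x1, "x2": x2})

print(X[["x1", "x2"]].corr())               # pairwise correlation well above 0.7
for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))
```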
15
Q

Levels of Misspecification

A

1) Functional form can be misspecified
- important variables are omitted
- variables should be transformed
- data is improperly pooled (wrong time period chosen)
2) Explanatory variables are correlated with the error term in time-series models
- a lagged dependent variable is used as an independent variable
- a function of the dependent variable is used as an independent variable (“forecasting the past”)
- independent variables are measured with error
3) Other time-series misspecifications that result in nonstationarity

16
Q

Unbiased estimator

A

-expected value of the estimator is equal to the parameter you are trying to estimate.

17
Q

Consistent estimator

A
  • accuracy of the parameter estimate increases as the sample size increases.
  • as sample size approaches infinity, standard error approaches zero
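
A quick simulation of this idea for the sample mean (illustrative numbers; the standard error s/√n shrinks as n grows):

```python
import numpy as np

rng = np.random.default_rng(4)
for n in (10, 100, 10_000):
    sample = rng.normal(loc=5.0, scale=2.0, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)             # standard error of the sample mean
    print(n, round(sample.mean(), 3), round(se, 4))  # estimate tightens around 5 as n grows
```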