Multicollinearity Flashcards

1
Q

Perfect Multicollinearity

A

R^2 = 1: an exact (perfect) linear relationship exists among the explanatory variables.
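
A minimal numpy sketch (all values illustrative) of why this breaks OLS: with an exact linear relationship among the columns, the design matrix is rank deficient and X'X cannot be inverted, so no unique least-squares solution exists.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 3 * x1 + 2                          # x2 is an exact linear function of x1
X = np.column_stack([np.ones(50), x1, x2])

print(np.linalg.matrix_rank(X))          # 2, not 3: the design matrix is rank deficient
print(np.linalg.cond(X.T @ X))           # huge condition number; X'X is not invertible
```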

2
Q

What happens to the estimators when perfect multicollinearity exists?

A

OLS cannot identify unique estimates for the parameters, and therefore no statistical inference (i.e., hypothesis testing) can be drawn from the sample.

3
Q

Near (Imperfect) Multicollinearity

A

Two or more explanatory variables are highly, but not exactly, linearly related.

4
Q

Inferior Good

A

A good for which demand declines as income increases.

5
Q

r

A

Coefficient of correlation; used to measure the strength or degree of collinearity between two variables. May not be adequate when more than two variables are involved.

6
Q

Ordinary Least Squares (OLS)

A

OLS produces estimators with the smallest variances among all linear unbiased estimators; they are BLUE: Best Linear Unbiased Estimators. OLS estimators remain BLUE even when one or more of the partial regression coefficients is statistically insignificant.

7
Q

Unbiasedness

A

In repeated sampling, the expected value of the estimator equals the true population parameter; unbiasedness is a repeated-sampling property, not a guarantee about any single estimate.
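
A small Monte Carlo sketch of the repeated-sampling idea (true coefficients and sample sizes are arbitrary): individual estimates bounce around, but their average sits on the true parameter values.

```python
import numpy as np

rng = np.random.default_rng(1)
beta_true = np.array([1.0, 2.0])         # true intercept and slope
estimates = []
for _ in range(5000):                    # draw 5000 independent samples
    x = rng.normal(size=100)
    X = np.column_stack([np.ones(100), x])
    y = X @ beta_true + rng.normal(size=100)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    estimates.append(b)

print(np.mean(estimates, axis=0))        # approximately [1.0, 2.0]: unbiased
```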

8
Q

Consequences of Multicollinearity

A
  1. OLS retains the minimum-variance property; however, the numerical value of the variance need not be small.
  2. Large variances and standard errors, and therefore wider confidence intervals
  3. Small t values; insignificant t ratios
  4. Failure to reject the null hypothesis, resulting in a Type II error
  5. Cannot reliably estimate each X's individual influence on Y
  6. High R^2 but few statistically significant t ratios (illustrated in the simulation sketch after this list)
  7. OLS estimators and their standard errors become more sensitive to small changes in the data
  8. Wrong signs for regression coefficients
  9. Difficulty in assessing the individual contributions of explanatory variables to the explained sum of squares or R^2, because the variables are so collinear that when one moves the other does also, making their effects impossible to separate
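
A hedged simulation sketch (all numbers illustrative) of consequences 2, 3, and 6: two nearly collinear regressors yield a high overall R^2 while the individual standard errors balloon and the t ratios collapse.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # nearly a copy of x1
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(res.rsquared)                          # high: the model fits well overall
print(res.bse)                               # inflated standard errors on x1 and x2
print(res.tvalues)                           # small, insignificant individual t ratios
```
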
9
Q

What kind of problem is Multicollinearity?

A

It is a sampling (regression) phenomenon: in some samples the X's can be so collinear that regression analysis breaks down, even though the X's may not be linearly related in the population. The problem arises because data are typically nonexperimental, observed as they occur.

10
Q

t critical vs t value

A

The critical t values bound the rejection region of the t distribution; if the computed statistic falls outside them, we reject the null hypothesis at the chosen significance level. The t value is the statistic we compute: the estimated coefficient divided by its standard error.
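
A quick scipy sketch (significance level, degrees of freedom, estimate, and standard error all hypothetical) contrasting the two:

```python
from scipy import stats

dof = 30                                       # residual degrees of freedom
t_crit = stats.t.ppf(1 - 0.05 / 2, dof)        # critical value, two-tailed 5% test
b_hat, se = 0.8, 0.5                           # hypothetical coefficient and standard error
t_value = b_hat / se                           # the computed t statistic
print(t_crit, t_value, abs(t_value) > t_crit)  # reject H0 only if the last is True
```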

11
Q

Sample Specific

A
  1. A matter of degree, not of the mere presence or absence, of multicollinearity
  2. A condition of the X variables (assumed non-stochastic); a feature of the sample, not the population
12
Q

Indicators of Multicollinearity

A
  1. High R^2 but few significant t ratios
  2. High pairwise correlations among explanatory variables; above about 0.8, there is a possibility of multicollinearity
  3. Examination of partial correlations
  4. Subsidiary or auxiliary regressions
  5. Variance Inflation Factor
13
Q

Partial correlation coefficient

A

The correlation between two variables, holding the influence of the other X variables constant.
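
One standard way to compute it, sketched with numpy (function name and data are illustrative): residualize both variables on the controls, then correlate the residuals.

```python
import numpy as np

def partial_corr(x, y, controls):
    """Correlation of x and y after removing the linear influence of the controls."""
    Z = np.column_stack([np.ones(len(x)), controls])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # residual of x on Z
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # residual of y on Z
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(3)
x3 = rng.normal(size=200)
x1 = x3 + rng.normal(size=200)
x2 = x3 + rng.normal(size=200)
print(partial_corr(x1, x2, x3.reshape(-1, 1)))  # near 0 once x3 is held constant
```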

14
Q

Auxiliary Regression

A

Regress each X variable on the remaining X's and compute the corresponding R^2; these regressions are "subsidiary" or "auxiliary" to the main regression. The aim is to find each coefficient of determination and then test whether it is statistically significant using an F test. A high auxiliary R^2 is a surface indicator of collinearity.
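
A sketch of the procedure with statsmodels, assuming the explanatory variables sit in a pandas DataFrame `X_df` (names hypothetical):

```python
import statsmodels.api as sm

def auxiliary_r2(X_df):
    """Regress each X on all the other X's; return each auxiliary R^2 and its F-test p-value."""
    results = {}
    for col in X_df.columns:
        others = sm.add_constant(X_df.drop(columns=col))
        fit = sm.OLS(X_df[col], others).fit()
        results[col] = (fit.rsquared, fit.f_pvalue)   # high R^2 + small p-value => collinear
    return results
```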

15
Q

Variance Inflation Factor

A

VIF = 1/(1 - R^2), where R^2 comes from the auxiliary regression of that X on the other explanatory variables. As R^2 increases, the variance and standard error are inflated. VIF is undefined under perfect collinearity (R^2 = 1) and equals 1 when there is no collinearity (R^2 = 0).
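
statsmodels ships a helper for this; a small self-contained sketch (data illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
X_df = pd.DataFrame({"x1": x1, "x2": x1 + rng.normal(scale=0.1, size=100)})

X = sm.add_constant(X_df)                     # VIF expects the constant to be included
for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))   # a VIF above ~10 flags trouble
```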

16
Q

Additional takeaways

A

A high auxiliary R^2 can be counterbalanced by a low error variance or a large variation in X (the sum of squared deviations of X), though it doesn't have to be.
Multicollinearity by itself need not cause high standard errors.

17
Q

When is Multicollinearity bad/not so bad?

A
  1. Not so bad: when the model is used only as a predictor and the same relationship is expected to continue into the future (a big if)
  2. Bad news: when the objective is prediction plus reliable estimation of the individual parameters of the chosen model
18
Q

Correlation Diagnostics

A
  1. Correlation Matrix
  2. Auxiliary Regression

19
Q

Correlation Matrix

A

Computing the pairwise correlations among all of the explanatory variables.
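
With pandas this is a one-liner on a DataFrame of regressors; a self-contained sketch (data illustrative):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
X_df = pd.DataFrame({"x1": x1,
                     "x2": x1 + rng.normal(scale=0.2, size=100),
                     "x3": rng.normal(size=100)})
print(X_df.corr().round(2))   # |r| > 0.8 (here x1 vs x2) suggests multicollinearity
```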

20
Q

Issues with Solutions

A

There is no surefire way to cure MC. Moreover, the problem lies with the sample and may not reflect the population, and the OLS estimators retain their BLUE property regardless.

21
Q

Solutions to Multicollinearity

A
  1. Dropping the Variable from the model
  2. New data or new sample
  3. Rethinking the model
  4. Prior Information about Some Parameters
  5. Transformation of Variables
22
Q

Dropping a variable from the model

A

You might want to remove a variable, but always return to theory first, because the regression may be economically appropriate as specified. Dropping a relevant variable can lead to model specification error, which results in biased estimates; don't drop a variable from an economically viable model just because of collinearity.

23
Q

Acquiring Additional Data or a New Sample

A

The new sample might not display as much MC as your original dataset, and a sample with a larger n may reduce some of the MC.

24
Q

Rethinking the model

A
  1. Did you omit any variables?
  2. Is it the correct functional form?
  3. Does it align with theory?
25
Q

Prior Information about some Parameters

A

Use a coefficient obtained in another study that you believe still holds (a tall order) and apply it to the regression: subtract coefficient × variable from y, then regress the result on the remaining variables. This method is also difficult because you must rely on extraneous prior information.
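
A sketch of the restricted regression, assuming a slope of 0.9 on x2 borrowed from a prior study (all values hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)    # collinear with x1
y = 1 + 2 * x1 + 0.9 * x2 + rng.normal(size=100)

b2_prior = 0.9                    # coefficient believed from a prior study
y_star = y - b2_prior * x2        # subtract the known effect from y
res = sm.OLS(y_star, sm.add_constant(x1)).fit()
print(res.params)                 # intercept and the remaining slope on x1
```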

26
Q

Transformation of Variables

A

E.g., aggregate values are transformed to per-capita values; this can reduce some of the multicollinearity.
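
A pandas sketch of the per-capita transformation (column names hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"gdp": [100.0, 110.0, 121.0],
                   "consumption": [80.0, 86.0, 92.0],
                   "population": [10.0, 10.4, 10.8]})
df["gdp_pc"] = df["gdp"] / df["population"]                   # per-capita GDP
df["consumption_pc"] = df["consumption"] / df["population"]   # per-capita consumption
# regress consumption_pc on gdp_pc instead of the highly collinear aggregates
```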