Topic 9: Multicollinearity Flashcards
What is meant by the assumption of no collinearity between X variables?
That there is no perfect linear relationship between the regressors
List some sources of multicollinearity
- Data collection method employed
- Constraints on the model, where there is some causal reason for variables to be correlated
- Model specification problems (adding needless polynomials)
- Overdetermined model (more regressors than observations)
- In time series data, regressors share a common trend
Why does regression fail given perfect multicollinearity?
Because each estimate represents the expected change in the regressand for a unit change in one regressor, with all other variables held constant. With perfect multicollinearity, there is no way to change one variable without also changing the other, so their separate effects cannot be identified.
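A minimal sketch of the point above, using made-up numbers in which x2 is exactly twice x1 (the data and coefficient values are illustrative assumptions, not from the course material). It shows that the design matrix loses rank and that two different coefficient vectors fit the data identically.

```python
import numpy as np

# Hypothetical data: x2 is an exact multiple of x1 (perfect collinearity).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2.0 * x1
X = np.column_stack([np.ones_like(x1), x1, x2])  # intercept, x1, x2

# X has 3 columns but only 2 of them are linearly independent.
rank = np.linalg.matrix_rank(X)

# Two different coefficient vectors produce identical fitted values,
# so no amount of this data can tell them apart.
beta_a = np.array([1.0, 3.0, 0.0])  # whole effect assigned to x1
beta_b = np.array([1.0, 1.0, 1.0])  # effect split between x1 and x2
same_fit = np.allclose(X @ beta_a, X @ beta_b)

print("rank of X:", rank)
print("identical fitted values:", same_fit)
```

Because X'X is singular here, the normal equations have no unique solution, which is the sense in which the regression "fails".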
What is the effect of imperfect multicollinearity? Is it an assumption violation?
No. Under the CLRM, OLS remains BLUE, and under the CNLRM it remains BUE. Standard errors are larger, just as they are when there are few observations or little variation in the data.
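To see how the standard errors grow, the sketch below builds a small artificial design in which the two regressors have an exactly known sample correlation r, then reads the slope's entry from the diagonal of (X'X)^-1. The slope variance is sigma^2 times that entry. The function name and the four-observation design are assumptions made purely for illustration.

```python
import numpy as np

def slope_variance_factor(r):
    """Diagonal entry of (X'X)^-1 for the first slope when the two
    regressors have sample correlation r (illustrative 4-point design)."""
    # u and v are zero-mean, unit-length and orthogonal to each other.
    u = np.array([1.0, -1.0, 1.0, -1.0]) / 2.0
    v = np.array([1.0, 1.0, -1.0, -1.0]) / 2.0
    x1 = u
    x2 = r * u + np.sqrt(1.0 - r**2) * v  # corr(x1, x2) equals r
    X = np.column_stack([np.ones(4), x1, x2])
    return np.linalg.inv(X.T @ X)[1, 1]

for r in (0.0, 0.5, 0.9, 0.99):
    # The second number is the textbook inflation factor 1 / (1 - r^2).
    print(r, slope_variance_factor(r), 1.0 / (1.0 - r**2))
```

In this design the factor equals 1 / (1 - r^2), so the variance grows without bound as r approaches 1 even though no assumption is violated.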
How does multicollinearity affect the F test that B1 = B2 = B3 = 0, and R2?
It does not affect either. A key sign of multicollinearity is a high R2 alongside insignificant individual t-tests
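The "high R2, insignificant t" pattern can be reproduced with a small constructed data set. The sketch below is an illustration under strong assumptions: the regressors and the disturbance are built from orthogonal vectors so that every quantity can be checked by hand, and the correlation of 0.99 and disturbance scale of 0.5 are arbitrary choices.

```python
import numpy as np

# Constructed (hypothetical) data: three mutually orthogonal, zero-mean,
# unit-length vectors, so the results below can be verified by hand.
n = 8
u = np.array([1, -1, 1, -1, 1, -1, 1, -1]) / np.sqrt(n)
v = np.array([1, 1, -1, -1, 1, 1, -1, -1]) / np.sqrt(n)
e = np.array([1, 1, 1, 1, -1, -1, -1, -1]) / np.sqrt(n)

r = 0.99                                # sample correlation of x1 and x2
x1 = u
x2 = r * u + np.sqrt(1 - r**2) * v
y = 1.0 * x1 + 1.0 * x2 + 0.5 * e       # both true slopes are 1

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

k = X.shape[1]                          # parameters, including intercept
rss = resid @ resid
tss = ((y - y.mean()) ** 2).sum()
r2 = 1 - rss / tss
sigma2 = rss / (n - k)                  # unbiased error-variance estimate
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = beta / se
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

print("R2:", r2)
print("t-statistics (intercept, x1, x2):", t_stats)
print("F-statistic:", f_stat)
```

Here R2 is about 0.94 and the joint F-statistic is large, while each slope's t-statistic is well below 1: the regressors jointly explain y, but the data cannot apportion the effect between them.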
Why is multicollinearity a sample phenomenon?
Because if one could run a proper experiment, one could set all the inputs independently. Multicollinearity therefore has little to do with the PRF and everything to do with the sample we happen to have
How reliably can pairwise correlation detect multicollinearity problems?
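Only when the problem is between two regressors. It can miss cases where one regressor is close to a linear combination of several others, e.g. Xh = aXi + bXj, because each pairwise correlation can stay moderate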
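A hedged sketch of the Xh = aXi + bXj case, taking a = b = 1 plus a small independent component (these values, and the use of Hadamard-matrix columns as regressors, are assumptions chosen so the numbers are easy to verify). It compares the largest pairwise correlation with the variance inflation factors, computed here as the diagonal of the inverse correlation matrix.

```python
import numpy as np

# Columns of an 8x8 Sylvester Hadamard matrix are mutually orthogonal;
# column 0 is all ones, so the other columns have zero mean.
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H = np.kron(np.kron(H2, H2), H2)

x1 = H[:, 1]
x2 = H[:, 2]
w = H[:, 3]                     # small component unrelated to x1 and x2
x3 = x1 + x2 + 0.1 * w          # x3 is almost exactly x1 + x2

corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)

# Largest absolute pairwise correlation (off-diagonal entries only).
off_diag = corr[~np.eye(3, dtype=bool)]
max_pairwise = np.abs(off_diag).max()

# VIF_j is the j-th diagonal element of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

print("max pairwise correlation:", max_pairwise)
print("VIFs:", vif)
```

The largest pairwise correlation is about 0.71, which would not normally be flagged as near-perfect, yet the VIFs are about 101, 101 and 201. A diagnostic based on each regressor's R2 against all the others catches what the correlation matrix misses.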
How can we counter multicollinearity?
- Use prior knowledge to assume a constraint
- Respecify the polynomial terms of the model
- Drop a variable if it is clearly a case of perfect collinearity
- Transform the variables (e.g. take first differences), but this may cause correlation between the error terms
- Combine cross sectional & time series to make panel data
- More data
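As an illustration of the first-difference remedy in the list above, the sketch below uses two made-up series that share a linear time trend. The series definitions are hypothetical; the point is only that the levels are almost perfectly correlated while the differences are not.

```python
import numpy as np

# Hypothetical time series: both regressors share the trend t and differ
# only in a small cyclical component.
t = np.arange(50, dtype=float)
x1 = t + np.sin(t)
x2 = t + np.cos(t)

# Correlation in levels is dominated by the common trend.
corr_levels = np.corrcoef(x1, x2)[0, 1]

# First differences remove the trend, leaving only the cyclical parts.
corr_diffs = np.corrcoef(np.diff(x1), np.diff(x2))[0, 1]

print("correlation in levels:", corr_levels)
print("correlation in first differences:", corr_diffs)
```

As the card notes, the differenced regression has a differenced error term, which may be serially correlated even if the original errors were not.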
What applications does multicollinearity matter for?
- When trying to establish causal relationships
- But not when forecasting, because R2 is not affected, provided the same collinearity pattern continues in the forecast period