10 Multicollinearity in MLR Flashcards
min_β ||Xβ − y||²
MLR solution; requires n ≥ p
If XᵀX is not full rank:
* No unique solution to the normal equations.
If the columns of X are highly correlated:
* Leads to an unstable fitted plane in the direction with little variability.
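A small numpy sketch of the card above (toy data assumed, not from the notes): two nearly collinear columns make XᵀX badly conditioned, and tiny perturbations of y swing the individual coefficients, while their sum stays well determined.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)   # x2 ~ x1: highly correlated columns
X = np.column_stack([x1, x2])
y = x1 + rng.normal(scale=0.1, size=n)

# Condition number of X^T X blows up as the columns become collinear.
cond = np.linalg.cond(X.T @ X)

# A tiny perturbation of y can swing the fitted coefficients
# "in the direction with little variability".
beta_a, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_b, *_ = np.linalg.lstsq(X, y + 1e-3 * rng.normal(size=n), rcond=None)
print(cond)              # very large
print(beta_a, beta_b)    # individual coefficients unstable; sums stay near 1
```

Note that beta_a[0] + beta_a[1] stays close to the true combined effect (1.0) even when the two coefficients individually are unreliable.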
Considerations in High Dimensions
While p can be extremely large, the number of observations n is often limited due to cost, sample availability, etc.
Data sets containing more features than observations are often referred to as high-dimensional.
When the number of features p is as large as, or larger than, the number of observations n, OLS should not be performed:
* It is too flexible and hence overfits the data.
Forward stepwise selection, ridge regression, lasso, and PCR are particularly useful for performing regression in the high-dimensional setting.
Solutions to Multicollinearity
Subset selection (already discussed in the context of linear regression)
* best subset, backward, forward, stepwise selection of features
Using derived inputs
* principal component regression (PCR)
* partial least squares (PLS)
Coefficient shrinkage (regularization) => next class
* ridge regression
* lasso (least absolute shrinkage and selection operator)
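As a preview of the shrinkage idea (toy collinear data assumed; the closed-form ridge solution, not a library call): adding lam·I to XᵀX restores full rank and shrinks the unstable coefficients toward zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 1e-3 * rng.normal(size=n)])  # collinear columns
y = x1 + rng.normal(scale=0.1, size=n)

def ridge(X, y, lam):
    # Closed-form ridge estimate: (X^T X + lam * I)^{-1} X^T y.
    # For lam > 0 the matrix is always invertible, even if X^T X is not.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)      # ordinary least squares (lam = 0)
beta_ridge = ridge(X, y, 1.0)    # shrunken, stabilized coefficients
print(beta_ols, beta_ridge)
```

The lasso replaces the squared-norm penalty with an L1 penalty, which has no closed form but can set coefficients exactly to zero, performing selection as well as shrinkage.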