Jupyter Notebook 2.2 - Regularization Flashcards
What is regularization, and how does it improve machine learning models?
Regularization is a technique used to improve machine learning models by adding constraints to prevent overfitting. It achieves this by introducing a penalty for large coefficients in the model, which discourages complexity and encourages simpler models.
L1 Regularization (Lasso): Adds the sum of the absolute values of the coefficients as a penalty term to the loss function, which can lead to sparse models (some coefficients become exactly zero).
L2 Regularization (Ridge): Adds the sum of the squared coefficients as a penalty term, which shrinks the magnitude of all coefficients but rarely eliminates any of them entirely; see the cost functions below.
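As a concrete sketch of the two penalized cost functions (notation assumed here, not taken from the notebook: MSE(θ) is the unregularized loss, α the regularization strength, θ_i the coefficients):

$$J_{\text{lasso}}(\theta) = \mathrm{MSE}(\theta) + \alpha \sum_{i=1}^{n} |\theta_i| \qquad\qquad J_{\text{ridge}}(\theta) = \mathrm{MSE}(\theta) + \alpha \sum_{i=1}^{n} \theta_i^{2}$$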
By constraining the model, regularization helps strike a balance between bias and variance, ultimately leading to better generalization on unseen data.
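A minimal scikit-learn sketch of this difference, using synthetic data invented purely for illustration: lasso drives the irrelevant coefficients to exactly zero, while ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
# Only the first two features matter; the remaining three are pure noise.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty
ridge = Ridge(alpha=0.1).fit(X, y)  # L2 penalty

print("Lasso:", lasso.coef_.round(3))  # noise coefficients driven to exactly 0
print("Ridge:", ridge.coef_.round(3))  # all coefficients kept, just shrunk
```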
When should you use plain linear regression, ridge regression, lasso regression, or elastic net in machine learning?
Plain Linear Regression: Generally not recommended due to its tendency to overfit, especially in the presence of multicollinearity or when the number of features is large. It is almost always preferable to include at least a small amount of regularization.
Ridge Regression: Use when you want to handle multicollinearity or when you believe all features are potentially useful. Ridge helps shrink coefficients without eliminating them, improving model stability.
Lasso Regression: Use when you suspect that only a few features are important for the model, as it performs feature selection by driving some coefficients to zero.
Elastic Net: Use when you want a combination of ridge and lasso benefits, especially when you have many features and suspect that some are highly correlated. Set the mix ratio r strictly between 0 and 1 to blend both penalties (r = 0 reduces to ridge regression, r = 1 to lasso).
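A hedged sketch of how these choices map onto scikit-learn (the alpha values are illustrative, not tuned; note that scikit-learn calls the elastic-net mix ratio r l1_ratio):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Toy data for illustration only.
X, y = make_regression(n_samples=200, n_features=20, noise=5.0, random_state=0)

models = {
    "ridge": Ridge(alpha=1.0),                      # all features plausibly useful
    "lasso": Lasso(alpha=1.0),                      # only a few features expected to matter
    "elastic": ElasticNet(alpha=1.0, l1_ratio=0.5), # 0 < l1_ratio < 1 blends both penalties
}

for name, model in models.items():
    model.fit(X, y)
    print(name, model.score(X, y))
```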