L1 Lasso Flashcards
L1 regularization, also known as Lasso (Least Absolute Shrinkage and Selection Operator), is a method for preventing overfitting in machine learning models by adding a penalty term to the loss function. It controls model complexity and performs automatic feature selection, but it requires careful tuning of the regularization strength hyperparameter.
- Definition
L1 regularization is a type of regularization that adds a penalty term to the loss function equal to the sum of the absolute values of the model's coefficients.
- Mathematical Formulation
In L1 regularization, the penalty added to the loss function is the sum of the absolute values of the coefficients, scaled by a hyperparameter usually denoted λ (lambda). If L(f) is the unregularized loss, the regularized loss L'(f) is given by L'(f) = L(f) + λ||w||_1, where w is the vector of model parameters.
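The formula above can be sketched directly in NumPy; here the unregularized loss L(f) is assumed to be mean squared error, and `l1_regularized_loss` is a hypothetical helper name, not a library function:

```python
import numpy as np

def l1_regularized_loss(X, y, w, lam):
    """Mean squared error plus the L1 penalty: L'(f) = L(f) + lam * ||w||_1."""
    residuals = X @ w - y              # prediction errors
    mse = np.mean(residuals ** 2)      # unregularized loss L(f)
    penalty = lam * np.sum(np.abs(w))  # lam * ||w||_1
    return mse + penalty
```

Note that the penalty depends only on the weights, not on the data: larger coefficients are taxed in direct proportion to their magnitude.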
- Feature Selection
One of the notable properties of L1 regularization is that it can drive the weights of irrelevant or less important features to exactly zero, effectively performing feature selection. This results in a sparse model in which only a subset of the features is used.
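This sparsity effect is easy to demonstrate with scikit-learn's `Lasso` on synthetic data where only the first two of five features carry signal (the data and `alpha` value below are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features influence the target; the rest are pure noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # coefficients of the three noise features shrink to (near) zero
```

Compare this with L2 (Ridge) regularization, which shrinks all coefficients toward zero but rarely makes any of them exactly zero.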
- Advantages
L1 regularization helps to prevent overfitting by penalizing model complexity, which improves generalization. It also provides an automatic feature selection mechanism, which can be useful when dealing with high-dimensional data.
- Limitations
A potential drawback of L1 regularization is that when the model is trained on highly correlated features, it tends to select one of them arbitrarily and shrink the others to zero, which can make the selected feature set unstable. Also, choosing the right value for the regularization strength parameter (lambda) can be challenging and may require techniques such as cross-validation.
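The correlated-features behavior can be observed with a small experiment: two nearly identical copies of the same signal are fed to `Lasso`, and typically only one of them receives a substantial coefficient (the data and `alpha` below are illustrative assumptions, and the exact split can vary with the solver and data):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
x = rng.normal(size=300)
# Two almost perfectly correlated copies of the same underlying signal.
X = np.column_stack([x, x + rng.normal(scale=0.01, size=300)])
y = 2.0 * x + rng.normal(scale=0.1, size=300)

model = Lasso(alpha=0.05).fit(X, y)
print(model.coef_)  # typically one coefficient is large and the other near zero
```

Elastic Net, which mixes L1 and L2 penalties, is a common remedy when correlated features should share weight rather than compete.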
- Usage
L1 regularization is used in linear regression (Lasso regression), logistic regression, and neural networks, among other machine learning models.
- Parameter Tuning
The strength of the L1 regularization is controlled by a hyperparameter, usually denoted by lambda or alpha. This hyperparameter needs to be carefully tuned to find the right level of regularization. Too high a value can cause underfitting, while too low a value might not effectively control overfitting.
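One standard way to tune this hyperparameter is cross-validation over a grid of candidate values, which scikit-learn packages as `LassoCV` (the data and alpha grid below are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# LassoCV fits the model for each candidate alpha using 5-fold cross-validation
# and keeps the alpha with the lowest average validation error.
model = LassoCV(cv=5, alphas=np.logspace(-3, 0, 30)).fit(X, y)
print(model.alpha_)  # the selected regularization strength
```

Note that scikit-learn calls this hyperparameter `alpha`; it plays the role of λ in the formulation above.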