Hinge Loss Flashcards
Hinge loss
Hinge loss is a loss function used primarily with Support Vector Machine (SVM) models. It measures the error made by the model and aids in maximizing the margin between the decision boundary and the closest instances from different classes in the training dataset.
- Introduction
Hinge loss is used for “maximum-margin” classification, most notably for support vector machines (SVMs). The hinge loss function encourages the model to correctly classify instances and simultaneously pushes the decision boundary away from instances.
- Mathematical Formulation
Hinge loss is mathematically defined as max(0, 1 - t), where t = y * f(x) is the product of the true label y ∈ {−1, +1} and the raw model output f(x). If the instance is on the correct side and outside the margin (t ≥ 1), the loss is zero. If the instance is on the correct side but inside the margin, or on the wrong side, the loss grows linearly with the distance from the margin boundary.
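The definition above can be sketched in a few lines of Python (the function name and example values are illustrative):

```python
import numpy as np

def hinge_loss(y, fx):
    """Hinge loss max(0, 1 - t) with t = y * f(x); labels y are in {-1, +1}."""
    t = y * fx
    return np.maximum(0.0, 1.0 - t)

print(hinge_loss(1, 2.0))   # correct, outside the margin (t >= 1): loss 0.0
print(hinge_loss(1, 0.5))   # correct, but inside the margin: loss 0.5
print(hinge_loss(1, -1.0))  # wrong side of the boundary: loss 2.0
```

Note how the loss is exactly zero once an instance clears the margin, and increases linearly as t drops below 1.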
- Use in SVMs
In SVMs, the model output f(x) is the dot product between the instance vector and the model’s weight vector, offset by the model’s bias term: f(x) = w · x + b. The SVM learning algorithm adjusts the weights and bias to minimize a combination of the hinge loss and a regularization term that encourages small weights.
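A minimal sketch of that objective for a linear SVM, assuming a mean hinge loss plus an L2 penalty (the function name and the value of `lam` are illustrative, not canonical):

```python
import numpy as np

def svm_objective(w, b, X, y, lam=0.01):
    """Regularized linear-SVM objective:
    mean(max(0, 1 - y * (X @ w + b))) + lam * ||w||^2."""
    margins = y * (X @ w + b)                     # t = y * f(x) per instance
    hinge = np.maximum(0.0, 1.0 - margins).mean() # average hinge loss
    return hinge + lam * np.dot(w, w)             # add L2 regularizer

# Two instances classified correctly with margin 2: hinge term is zero,
# so only the regularizer contributes.
w, b = np.array([1.0, 0.0]), 0.0
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
print(svm_objective(w, b, X, y))  # 0.01
```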
- Advantages
Hinge loss allows for efficient computation and optimization, and it promotes sparsity in the solution: only the support vectors (instances on or inside the margin) influence the decision boundary, so in the dual formulation many coefficients are exactly zero. This can make the resulting SVM model compact and efficient for prediction.
- Disadvantages
Hinge loss is not differentiable at t = 1, which can pose problems for optimization algorithms requiring differentiability. However, this issue is typically addressed using specific optimization algorithms such as sub-gradient descent.
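One sub-gradient step on the regularized hinge loss can be sketched as follows; at the kink (t = 1) this picks the zero sub-gradient, a common convention, and the learning rate and `lam` are illustrative:

```python
import numpy as np

def hinge_subgradient_step(w, b, x, y, lr=0.1, lam=0.01):
    """One sub-gradient step on max(0, 1 - y*(w.x + b)) + lam*||w||^2."""
    t = y * (np.dot(w, x) + b)
    if t < 1:                           # inside the margin or misclassified
        grad_w = -y * x + 2 * lam * w
        grad_b = -y
    else:                               # hinge term is flat: only the regularizer acts
        grad_w = 2 * lam * w
        grad_b = 0.0
    return w - lr * grad_w, b - lr * grad_b

# A misclassified point pushes w toward y * x and shifts the bias.
w, b = hinge_subgradient_step(np.zeros(2), 0.0, np.array([1.0, 1.0]), 1)
print(w, b)  # [0.1 0.1] 0.1
```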
- Comparison to Other Loss Functions
Unlike cross-entropy loss, which assigns a nonzero penalty to every prediction even when it is confidently correct, hinge loss is exactly zero for instances classified correctly with a margin of at least 1. This property allows SVMs to focus on the hardest instances near the decision boundary.
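This contrast is easy to see numerically by writing both losses as functions of the margin t = y * f(x) (the logistic form log(1 + exp(−t)) is the binary cross-entropy in margin notation):

```python
import numpy as np

def hinge(t):
    return max(0.0, 1.0 - t)

def logistic(t):
    # Binary cross-entropy in margin form: log(1 + exp(-t))
    return float(np.log1p(np.exp(-t)))

for t in [3.0, 0.5, -1.0]:
    # For t = 3.0 the hinge loss is exactly 0, while the logistic
    # loss is small but strictly positive.
    print(f"t={t:5.1f}  hinge={hinge(t):.3f}  logistic={logistic(t):.3f}")
```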
- Extension to Multi-class Classification
Hinge loss can also be used for multi-class classification tasks. In the Crammer–Singer formulation, the loss compares the score of the correct class with the highest score among the incorrect classes: max(0, 1 + max_{j ≠ y} s_j − s_y), where s_y is the correct class’s score.
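The multi-class case can be sketched directly from that definition (the function name and score values are illustrative):

```python
import numpy as np

def multiclass_hinge(scores, correct):
    """Crammer-Singer multi-class hinge loss:
    max(0, 1 + max_{j != correct} scores[j] - scores[correct])."""
    s_y = scores[correct]
    rival = np.max(np.delete(scores, correct))  # best-scoring incorrect class
    return max(0.0, 1.0 + rival - s_y)

scores = np.array([2.0, 0.5, -1.0])
print(multiclass_hinge(scores, 0))  # correct class wins by >= 1: loss 0.0
print(multiclass_hinge(scores, 1))  # rival scores 2.0 vs 0.5: loss 2.5
```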