multi-class classification Flashcards
Unnamed: 0
Unnamed: 1
- Definition of Multi-Class Classification
Multi-class classification is a type of machine learning problem where an instance can be classified into one of three or more classes. It extends the idea of binary classification which deals with only two classes.
- Examples
An example of a multi-class classification problem is digit recognition, where the aim is to categorize images into one of the 10 classes (digits 0 to 9). Another example is predicting the species of iris flowers based on sepal and petal measurements, a classic dataset in machine learning, which has 3 classes.
- Techniques
Various machine learning algorithms can handle multi-class classification out-of-the-box like K-nearest neighbors (KNN), decision trees, random forests, naive bayes, gradient boosting, and neural networks.
- One-vs-Rest (OvR) or One-vs-All (OvA)
This strategy involves training a single classifier per class, with the samples of that class as positive samples and all other samples as negatives. During inference, we select the class which outputs the highest confidence score.
- One-vs-One (OvO)
In this approach, one classifier is trained per pair of classes. At the prediction stage, the class which received the most votes is selected. OvO requires to train N(N-1)/2 classifiers where N is the number of classes.
- Softmax Function
In the context of multi-class classification, softmax function is often used in the output layer of a neural network. It converts a vector of scores into a probability distribution over the classes, which can be used to predict the most likely class, or to provide confidence scores for the possible classes.
- Loss Functions
Multi-class classification problems often use the cross-entropy loss function, which measures the dissimilarity between the predicted probability distribution and the true distribution.
- Performance Measures
Accuracy, confusion matrix, precision, recall, and F1-score are commonly used to evaluate the performance of multi-class classification models. Some of these metrics can be calculated globally, or for each class individually.
- Challenges
Some challenges include handling imbalanced datasets, where some classes have many more samples than others, and dealing with high-dimensional output spaces, where the number of classes is very large.
- Extensions
Some problems might involve predicting multiple classes for each instance (multi-label classification), or the classes might be related in a hierarchical structure.