Chapter 3 - Classification Flashcards
What performance measures can we use for a classifier?
1) Accuracy, estimated with cross-validation
2) Confusion matrix
3) Precision and recall
4) The ROC curve
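To make measures 3) and 4) concrete, here is a minimal sketch in plain Python. The labels, scores, and the helper name metrics_at are made up for illustration; they are not from the chapter. Each threshold gives one precision/recall pair and one point (false positive rate, recall) on the ROC curve.

```python
# Hypothetical data: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical classifier scores (higher = more likely positive).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]


def metrics_at(threshold):
    """Return (precision, recall, false positive rate) at a score threshold."""
    y_pred = [int(s >= threshold) for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn)  # also called the true positive rate
    fpr = fp / (fp + tn)     # false positive rate, the x-axis of the ROC curve
    return precision, recall, fpr


print(metrics_at(0.5))   # (0.75, 0.75, 0.25)
print(metrics_at(0.35))  # (0.8, 1.0, 0.25)
```

Lowering the threshold from 0.5 to 0.35 raises recall here; sweeping the threshold over all scores traces out the full curve.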
What do we mean by an imbalanced classification problem?
We have two classes to identify (for example, terrorists and non-terrorists), and one class makes up the overwhelming majority of the data points.
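A quick way to see why imbalance matters: a model that always predicts the majority class looks excellent on accuracy while finding none of the rare cases. The sketch below uses made-up counts (10 positives out of 1000), chosen only for illustration.

```python
# Hypothetical labels: 1 = rare positive class, 0 = majority class.
y_true = [1] * 10 + [0] * 990

# A "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall: fraction of actual positives that were detected.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
actual_positives = sum(t == 1 for t in y_true)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.99
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```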
What is the difference between a binary classifier and a multiclass classifier?
A binary classifier assigns each object to one of 2 categories. A multiclass classifier assigns each object to one of more than 2 categories. In both cases, each object belongs to one and only one class.
For multiclass classifiers, what is the difference between a one-versus-the-rest (OvR) strategy and a one-versus-one (OvO) strategy?
For example, say we have 10 classes to predict. With OvR, we train 10 binary classifiers, one per class, each predicting whether an object belongs to that class or not, and we pick the class whose classifier outputs the highest score. With OvO, we train one binary classifier for every pair of classes (1 versus 2, 1 versus 3, …, 9 versus 10), which gives 10 × 9 / 2 = 45 classifiers, and we pick the class that wins the most pairwise duels.
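The two strategies can be sketched in plain Python. Nothing is trained here: the OvR scores and the duel_winner rule are hypothetical stand-ins for real binary classifiers, picked so that class 3 is the expected answer. The point is only the bookkeeping: how many classifiers each strategy needs and how the final class is chosen.

```python
from collections import Counter
from itertools import combinations

classes = list(range(10))

# OvR: one binary classifier per class.
ovr_count = len(classes)

# OvO: one binary classifier per unordered pair of classes.
ovo_pairs = list(combinations(classes, 2))
ovo_count = len(ovo_pairs)

print(ovr_count, ovo_count)  # 10 45

# OvR decision: pick the class with the highest score.
# Hypothetical scores: every class scores 0.1 except class 3.
ovr_scores = {c: 0.1 for c in classes}
ovr_scores[3] = 0.9
ovr_prediction = max(ovr_scores, key=ovr_scores.get)


def duel_winner(a, b):
    """Hypothetical pairwise classifier: class 3 wins any duel it is in;
    otherwise the lower label wins."""
    if 3 in (a, b):
        return 3
    return min(a, b)


# OvO decision: run every duel and pick the class with the most wins.
votes = Counter(duel_winner(a, b) for a, b in ovo_pairs)
ovo_prediction = votes.most_common(1)[0][0]

print(ovr_prediction, ovo_prediction)  # 3 3
```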
What do the rows and columns represent in a confusion matrix?
Each row represents an actual class, while each column represents a predicted class.
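A minimal sketch of that layout for a binary problem, built by hand on made-up labels. The entry at matrix[actual][predicted] counts how often that combination occurred.

```python
# Hypothetical labels: 0 = negative class, 1 = positive class.
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0, 1, 0]

# Rows are indexed by the actual class, columns by the predicted class.
matrix = [[0, 0], [0, 0]]
for actual, predicted in zip(y_true, y_pred):
    matrix[actual][predicted] += 1

for row in matrix:
    print(row)
# [5, 1]   actual 0: 5 true negatives, 1 false positive
# [1, 3]   actual 1: 1 false negative, 3 true positives

# Unpack the four cells by their usual names.
(tn, fp), (fn, tp) = matrix
```

A perfect classifier would have nonzero values only on the main diagonal.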
What do we mean by multilabel classification?
In multilabel classification, an object can belong to multiple classes at the same time. For example, say a classifier is trained to recognize three faces: Alice, Bob, and Charlie. When it is shown a picture of Alice and Charlie, it should output [1,0,1], meaning Alice yes, Bob no, and Charlie yes.
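The output format can be sketched in plain Python. The helper names encode and decode are hypothetical; they only convert between a set of names and the binary label vector described above.

```python
# Fixed label order: one output position per person.
labels = ["Alice", "Bob", "Charlie"]


def encode(people_in_picture):
    """Turn a collection of names into a multilabel target vector."""
    return [int(name in people_in_picture) for name in labels]


def decode(prediction):
    """Turn a multilabel output vector back into a list of names."""
    return [name for name, flag in zip(labels, prediction) if flag]


print(encode({"Alice", "Charlie"}))  # [1, 0, 1]
print(decode([1, 0, 1]))             # ['Alice', 'Charlie']

# Unlike a multiclass target, any number of positions may be 1,
# including none at all.
print(encode(set()))                 # [0, 0, 0]
```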