Chapter 3: Classification Flashcards
When scoring a classification model, what does accuracy refer to?
The ratio of correct predictions to the total number of predictions made.
Why is accuracy not always the best way of quantifying a classification model's performance?
When a dataset has a class imbalance (some classes are much more frequent than others), accuracy can be misleadingly high and won't show how well the model actually discriminates between classes.
For example, if a model trained to classify images of cats guesses "not cat" every time and the dataset contains 95% images of dogs, the model's accuracy will be 95%, yet it never produces a useful prediction.
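A minimal sketch of this effect, assuming scikit-learn and NumPy are available; the 95/5 "not cat"/"cat" split and the classifier that always answers "not cat" are made-up stand-ins for the example above.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced dataset: ~95% "not cat", ~5% "cat".
rng = np.random.default_rng(42)
y_true = rng.choice(["cat", "not cat"], size=1000, p=[0.05, 0.95])

# A "classifier" that ignores the input and always predicts "not cat".
y_pred = np.array(["not cat"] * 1000)

print(accuracy_score(y_true, y_pred))  # ~0.95, yet the model never finds a cat
```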
What is preferred over accuracy?
The confusion matrix.
What is a confusion matrix?
A confusion matrix counts the number of times instances of each class have been classed as a given class. The rows of the matrix are the true labels and the columns of the matrix are the predicted labels.
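A small illustration of this layout, assuming scikit-learn; the labels below are invented purely to show the row/column convention.

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat", "cat", "cat", "not cat", "not cat", "not cat"]
y_pred = ["cat", "not cat", "cat", "not cat", "not cat", "cat"]

# labels fixes the row/column order: rows are true labels, columns are predictions
print(confusion_matrix(y_true, y_pred, labels=["not cat", "cat"]))
# [[2 1]   true "not cat": 2 correctly classified, 1 wrongly called "cat"
#  [1 2]]  true "cat":     1 wrongly called "not cat", 2 correctly classified
```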
What is a Type I error?
Type I error refers to false positive classifications. False positives are instances of the negative class that are classified as positive.
What is a Type II error?
Type II error refers to false negative classifications. False negatives are instances of the positive class that are classified as negative.
What would the confusion matrix of a perfect classifier look like?
A perfect classifier would only have true positives and true negatives, therefore the confusion matrix would have non-zero values only on its main diagonal (top left to bottom right), with zeros everywhere else.
What is a harmonic mean?
A type of average that is calculated by dividing the number of values in a data series by the sum of the reciprocals (1/x_i) of each value in the data series.
A harmonic mean gives much more weight to low values.
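A quick sketch in plain Python contrasting the arithmetic and harmonic means of the same pair of values, to show how a single low value drags the harmonic mean down; the numbers are arbitrary.

```python
values = [0.9, 0.1]

arithmetic_mean = sum(values) / len(values)               # 0.5
harmonic_mean = len(values) / sum(1 / v for v in values)  # ~0.18

print(arithmetic_mean, harmonic_mean)
```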
What is the F1 score?
The F1 score of a classifier is the harmonic mean of the precision and recall of that classifier. As it is a harmonic mean, it is only possible to get a high F1 score if both precision and recall are high.
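A short sketch of the F1 formula, F1 = 2 * precision * recall / (precision + recall); the precision and recall values are arbitrary, and in practice sklearn.metrics.f1_score would normally be computed directly from labels and predictions.

```python
precision, recall = 0.9, 0.1  # hypothetical values: high precision, low recall

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(f1)  # 0.18 -- dragged down by the low recall despite the high precision
```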
Is it always preferable to have similar values for precision and recall?
No, in some instances you may favour a higher value of precision or recall and not be too concerned about the other.
What is an example where a high precision and low recall would be acceptable?
Instances where a false positive is far more costly than a false negative.
Consider a content classifier that decides whether videos are suitable for children. High precision means that nearly every video it accepts is actually safe, while low recall means that many safe videos are rejected.
You would rather have a classifier that rejects many safe videos (low recall) but keeps only safe ones (high precision) than one that accepts more safe videos (higher recall) but lets more harmful content through (lower precision).
What is an example where a low precision and high recall would be acceptable?
Instances where a false negative is far more costly than a false positive.
Consider a classifier that detects whether someone is carrying a weapon by scanning an X-ray image. Low precision may produce false positives that need to be checked manually, but high recall would ensure that nearly all weapon-carrying individuals are stopped.
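One common way to push a classifier towards either regime is to move its decision threshold. The sketch below is a rough illustration with synthetic data, a LogisticRegression stand-in model, and arbitrary thresholds, not anything prescribed by the text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# Synthetic, imbalanced binary dataset (placeholder for a real problem)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]

for threshold in (0.2, 0.5, 0.8):
    y_pred = (scores >= threshold).astype(int)
    print(threshold, precision_score(y, y_pred), recall_score(y, y_pred))
# Raising the threshold generally trades recall away for precision, and vice versa.
```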
What is the receiver operating characteristic (ROC) curve?
A plot of true positive rate (recall) vs false positive rate (fall-out).
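A minimal sketch of computing the points on an ROC curve with scikit-learn; the labels and scores below are invented for illustration.

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1]             # hypothetical true labels
y_scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical predicted scores/probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(fpr, tpr)                         # the (fall-out, recall) points of the curve
print(roc_auc_score(y_true, y_scores))  # area under the ROC curve (0.75 here)
```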
What does specificity refer to?
Specificity is the true negative rate: the proportion of negative instances that are correctly classified as negative.
What does sensitivity refer to?
Sensitivity is the true positive rate: the proportion of positive instances that are correctly classified as positive. Recall is another term for sensitivity.
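A small sketch computing both quantities from confusion-matrix counts, assuming scikit-learn; the labels are invented for illustration.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # 3 positives, 5 negatives (hypothetical)
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)  # true positive rate (recall): 2/3
specificity = tn / (tn + fp)  # true negative rate: 4/5
print(sensitivity, specificity)
```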