Confusion Matrices Flashcards
Positive Predictive Value:
This is very similar to precision, except that it takes prevalence into account. When the classes are perfectly balanced (that is, the prevalence is 50%), the positive predictive value (PPV) is equivalent to precision.
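As a minimal sketch of that adjustment (the `ppv` helper is hypothetical, not part of the flashcards), the snippet below computes PPV from sensitivity, specificity, and a chosen prevalence via Bayes' rule. The example's sensitivity (TP/actual yes = 100/105) is implied by the counts used elsewhere in this list; evaluated at the sample's own prevalence, the formula reproduces the precision of 0.91.

```python
# Hypothetical helper: PPV via Bayes' rule (a sketch, not from the flashcards).
def ppv(sensitivity, specificity, prevalence):
    tp_mass = sensitivity * prevalence              # P(predict yes AND actual yes)
    fp_mass = (1 - specificity) * (1 - prevalence)  # P(predict yes AND actual no)
    return tp_mass / (tp_mass + fp_mass)

# At the sample's own prevalence (105/165), PPV recovers the precision:
print(round(ppv(100/105, 50/60, 105/165), 2))  # 0.91
# Re-weighted to a balanced 50% prevalence, the value shifts:
print(round(ppv(100/105, 50/60, 0.5), 2))      # 0.85
```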
Null Error Rate:
This is how often you would be wrong if you always predicted the majority class. (In our example, the null error rate would be 60/165 = 0.36, because if you always predicted yes, you would be wrong only on the 60 “no” cases.) This can be a useful baseline metric to compare your classifier against. However, the best classifier for a particular application will sometimes have a higher error rate than the null error rate, as demonstrated by the Accuracy Paradox.
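A quick way to sanity-check that baseline is to count the majority class directly; a minimal sketch, assuming plain string labels:

```python
from collections import Counter

def null_error_rate(y_true):
    """Error rate of a classifier that always predicts the majority class."""
    majority_count = Counter(y_true).most_common(1)[0][1]
    return 1 - majority_count / len(y_true)

# 105 actual "yes" and 60 actual "no", as in the example:
labels = ["yes"] * 105 + ["no"] * 60
print(round(null_error_rate(labels), 2))  # 0.36
```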
Cohen’s Kappa:
This is essentially a measure of how well the classifier performed compared to how well it would have performed simply by chance. In other words, a model will have a high Kappa score if there is a big difference between its accuracy and the null error rate.
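For concreteness, Kappa can be computed straight from the four confusion-matrix cells. The sketch below uses the counts implied by the example (TP=100, FP=10, FN=5, TN=50) and the standard formula: observed accuracy minus chance accuracy, divided by one minus chance accuracy.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Kappa = (observed accuracy - chance accuracy) / (1 - chance accuracy)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: for each class, the product of its actual and
    # predicted marginal rates, summed over both classes.
    chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return (observed - chance) / (1 - chance)

print(round(cohens_kappa(100, 10, 5, 50), 2))  # 0.8
```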
ROC Curve:
This is a commonly used graph that summarizes the performance of a classifier over all possible thresholds. It is generated by plotting the True Positive Rate (y-axis) against the False Positive Rate (x-axis) as you vary the threshold for assigning observations to a given class.
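As a sketch of how such a curve is typically generated in practice (scikit-learn assumed available; the scores below are made up purely for illustration):

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Illustrative data: true classes and predicted probabilities of "yes".
y_true  = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6, 0.75, 0.55]

# roc_curve sweeps the threshold and returns one (FPR, TPR) point per value.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(roc_auc_score(y_true, y_score))  # area under the ROC curve
```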
Accuracy:
Overall, how often is the classifier correct?
(TP+TN)/total = (100+50)/165 = 0.91
Specificity:
When it’s actually no, how often does it predict no?
TN/actual no = 50/60 = 0.83
equivalent to 1 minus False Positive Rate
Precision:
When it predicts yes, how often is it correct?
TP/predicted yes = 100/110 = 0.91
Prevalence:
How often does the yes condition actually occur in our sample?
actual yes/total = 105/165 = 0.64
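The four counting-based metrics above can be verified in a few lines; this sketch uses the cell counts implied by the example (TP=100, FP=10, FN=5, TN=50):

```python
tp, fp, fn, tn = 100, 10, 5, 50
total = tp + fp + fn + tn  # 165

metrics = {
    "accuracy":    (tp + tn) / total,  # 0.91
    "specificity": tn / (tn + fp),     # 0.83
    "precision":   tp / (tp + fp),     # 0.91
    "prevalence":  (tp + fn) / total,  # 0.64
}
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```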