Evaluate Classifiers Flashcards

Question 1

Q

Why would you need to evaluate classifiers?

Answer

A

To help choose the optimal method and parameters

Question 2

Q

What are 3 causes of overfitting?

Answer

A

Too many variables
Excessive model complexity
Data leakage

Question 3

Q

What is the consequence of launching an overfit model?

Answer

A

The deployed model will not generalize to new data

Question 4

Q

What is the formula for accuracy rate?

Answer

A

Accuracy rate = (# of correct classifications) / (# of records in dataset)

Question 5

Q

When is accuracy alone not a sufficient metric?

Answer

A

For imbalanced classification problems
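To see why, consider a minimal sketch with made-up numbers (95 negatives, 5 positives, and a hypothetical model that always predicts the majority class). Accuracy looks strong even though no positive record is ever found:

```python
# Hypothetical imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority (negative) class.
y_pred = [0] * 100

# Accuracy = correct classifications / records in dataset.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Count how many actual positives the model identified.
positives_found = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

print(accuracy)         # 0.95
print(positives_found)  # 0
```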

Question 6

Q

In terms of the confusion matrix, what is the formula for accuracy?

Answer

A

Accuracy = (TP+TN) / (P + N)
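As a rough illustration, the four confusion-matrix counts can be tallied from labels and then plugged into the formula. The helper name `confusion_counts` and the sample labels are assumptions made for this sketch, with 1 treated as the positive class:

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Made-up example labels.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)

# P = TP + FN (actual positives), N = TN + FP (actual negatives).
accuracy = (tp + tn) / ((tp + fn) + (tn + fp))

print(tp, tn, fp, fn)  # 2 4 1 1
print(accuracy)        # 0.75
```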

Question 7

Q

What is the 2-step process for classifying with a cutoff value?

Answer

A

Compute the probability of belonging to the positive class
Compare that probability with the cutoff value and classify
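The second step can be sketched as follows, assuming the probabilities have already been produced by some model. The function name `classify`, the 0.5 default, and the "greater than or equal" convention are assumptions for illustration:

```python
def classify(probabilities, cutoff=0.5):
    """Label a record positive (1) when its probability meets the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]

# Step 1 output (made-up): probability of belonging to the positive class.
probs = [0.92, 0.61, 0.48, 0.30, 0.05]

# Step 2: compare with the cutoff and classify.
print(classify(probs))               # [1, 1, 0, 0, 0]
print(classify(probs, cutoff=0.25))  # [1, 1, 1, 1, 0]
```

Lowering the cutoff labels more records as positive, which is why the choice of cutoff shifts the other confusion-matrix metrics.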

Question 8

Q

In terms of the confusion matrix, what is the formula for precision?

Answer

A

Precision = TP / (TP + FP)

Question 9

Q

In terms of the confusion matrix, what is the formula for False Discovery Rate?

Answer

A

FDR = 1 - Precision = FP / (TP + FP)
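A small worked example, using made-up counts, shows that precision and FDR split the positive predictions between them:

```python
import math

# Made-up counts: 50 records were predicted positive.
tp, fp = 40, 10

precision = tp / (tp + fp)  # share of positive predictions that are correct
fdr = fp / (tp + fp)        # share of positive predictions that are wrong

# The two always sum to 1 (compared with a tolerance because of floating point).
assert math.isclose(fdr, 1 - precision)

print(precision, fdr)  # 0.8 0.2
```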

Question 10

Q

In terms of the confusion matrix, what is the formula for False Omission Rate?

Answer

A

FOR = FN / (TN + FN)
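In other words, FOR is the share of negative predictions that are actually positive. A minimal sketch with made-up counts (the function name and the `None` return for the undefined case are assumptions):

```python
def false_omission_rate(fn, tn):
    """FOR = FN / (TN + FN): share of negative predictions that are wrong."""
    predicted_negative = tn + fn
    if predicted_negative == 0:
        return None  # undefined when the model never predicts negative
    return fn / predicted_negative

# Made-up counts: 100 records were predicted negative, 10 of them wrongly.
print(false_omission_rate(fn=10, tn=90))  # 0.1
```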

Question 11

Q

In terms of the confusion matrix, what is the formula for Recall?

Answer

A

Recall = (TP) / (TP + FN)

Question 12

Q

In terms of the confusion matrix, what is the formula for False Negative Rate?

Answer

A

False Negative Rate = 1 - Recall = FN / (TP + FN)
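The sketch below, on made-up labels and probabilities, computes recall and the false negative rate at two cutoffs. The helper name `recall_and_fnr` is an assumption, and the example assumes at least one actual positive is present:

```python
def recall_and_fnr(y_true, probs, cutoff):
    """Return (recall, FNR) when records with probability >= cutoff are labeled positive."""
    tp = sum(1 for t, p in zip(y_true, probs) if t == 1 and p >= cutoff)
    fn = sum(1 for t, p in zip(y_true, probs) if t == 1 and p < cutoff)
    actual_positives = tp + fn
    return tp / actual_positives, fn / actual_positives

# Made-up data: 4 actual positives, 4 actual negatives.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]

print(recall_and_fnr(y_true, probs, cutoff=0.5))   # (0.75, 0.25)
print(recall_and_fnr(y_true, probs, cutoff=0.35))  # (1.0, 0.0)
```

Every actual positive is either caught (TP) or missed (FN), so the two rates sum to 1.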

Question 13

Q

In terms of the confusion matrix, what is the formula for False Positive Rate?

Answer

A

FPR = FP / (FP + TN)

Question 14

Q

What does accuracy measure overall?

Answer

A

Correctness

Question 15

Q

What type of data is accuracy good to use with?

Answer

A

Balanced datasets

Question 16

Q

When is accuracy misleading?

Answer

A

When working with imbalanced classes

Question 17

Q

What does precision focus on?

Answer

A

Positive predictions (how many of them are correct)

Question 18

Q

When false positives are costly, what is a good metric to use?

Answer

A

Precision

Question 19

Q

What do the ROC curve and AUC evaluate?

Answer

A

Model performance at different thresholds
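Each threshold (cutoff) yields one point on the ROC curve, with the false positive rate on the x-axis and the true positive rate (recall) on the y-axis. A minimal sketch on made-up data, where the helper name `roc_point` is an assumption:

```python
def roc_point(y_true, probs, cutoff):
    """Return (FPR, TPR) when records with probability >= cutoff are labeled positive."""
    pairs = list(zip(y_true, probs))
    tp = sum(1 for t, p in pairs if t == 1 and p >= cutoff)
    fn = sum(1 for t, p in pairs if t == 1 and p < cutoff)
    fp = sum(1 for t, p in pairs if t == 0 and p >= cutoff)
    tn = sum(1 for t, p in pairs if t == 0 and p < cutoff)
    return fp / (fp + tn), tp / (tp + fn)

# Made-up data: 4 actual positives, 4 actual negatives.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]

print(roc_point(y_true, probs, 0.75))  # (0.0, 0.5)
print(roc_point(y_true, probs, 0.55))  # (0.25, 0.75)
print(roc_point(y_true, probs, 0.2))   # (0.75, 1.0)
```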

Question 20

Q

What does a higher AUC mean?

Answer

A

Better discrimination between classes
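One way to read AUC is as the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The sketch below computes it by comparing every positive/negative pair, counting ties as half. The function name and the scores are made up for illustration, and this pairwise approach is only practical for small data:

```python
def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

y_true = [1, 0, 1, 0]
strong_scores = [0.9, 0.2, 0.8, 0.1]  # every positive outranks every negative
weak_scores = [0.9, 0.8, 0.3, 0.1]    # one negative outranks a positive

print(auc(y_true, strong_scores))  # 1.0
print(auc(y_true, weak_scores))    # 0.75
```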

Question 21

Q

What do the ROC curve and AUC help find?

Answer

A

The optimal balance between the true positive rate and the false positive rate

Question 22

Q

What do the lift and gains charts help assess?

Answer

A

How well the model ranks and prioritizes high-value cases

Question 23

Q

What does the lift show?

Answer

A

Improvement over random selection
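As a sketch, lift for the top-scoring fraction of records can be computed as the positive rate in that group divided by the overall positive rate, which is what random selection would be expected to achieve. The function name `lift_at` and the data are assumptions for illustration:

```python
def lift_at(y_true, scores, fraction):
    """Positive rate among the top-scoring fraction, divided by the overall positive rate."""
    ranked = [t for _, t in sorted(zip(scores, y_true),
                                   key=lambda pair: pair[0], reverse=True)]
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(ranked[:k]) / k
    base_rate = sum(ranked) / len(ranked)
    return top_rate / base_rate

# Made-up data: 8 records, 2 of them positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
y_true = [1, 0, 1, 0, 0, 0, 0, 0]

print(lift_at(y_true, scores, 0.25))  # 2.0
print(lift_at(y_true, scores, 1.0))   # 1.0
```

A lift of 2.0 means the top quarter contains positives at twice the overall rate. Selecting everyone gives a lift of 1.0, the same as random selection.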

Question 24

Q

What does the gains chart help visualize?

Answer

A

How well the model captures true positives early
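The values plotted on a gains chart can be sketched as the cumulative share of all positives captured while moving down the records sorted by score. The function name and the data are made up, and the example assumes at least one positive is present:

```python
def cumulative_gains(y_true, scores):
    """Share of all positives captured after each record, ranked by descending score."""
    ranked = [t for _, t in sorted(zip(scores, y_true),
                                   key=lambda pair: pair[0], reverse=True)]
    total_positives = sum(ranked)
    gains, captured = [], 0
    for label in ranked:
        captured += label
        gains.append(captured / total_positives)
    return gains

# Made-up data: 8 records, 4 of them positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
y_true = [1, 1, 0, 1, 0, 0, 1, 0]

print(cumulative_gains(y_true, scores))
# [0.25, 0.5, 0.5, 0.75, 0.75, 0.75, 1.0, 1.0]
```

Here the top half of the ranked records captures 75% of the positives, compared with the 50% expected from random ordering.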

Question 25

Q

Which evaluation metric is good for fraud detection and spam filtering?

Answer

A

Precision

Question 26

Q

Which evaluation metric is good for medical diagnosis and security alerts?