Evaluate Classifiers Flashcards

1
Q

Why would you need to evaluate classifiers?

A

To help choose the optimal method and parameters

2
Q

What are 3 causes of overfitting?

A
  1. Too many variables
  2. Excessive model complexity
  3. Data leakage
3
Q

What is the consequence of launching an overfit model?

A

The deployed model will not generalize

4
Q

What is the formula for accuracy rate?

A

Accuracy rate = (# of correct classifications) / (# of records in dataset)

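A minimal sketch of this calculation in Python (the labels below are made up for illustration):

  actual    = [1, 0, 1, 1, 0, 0, 1, 0]
  predicted = [1, 0, 0, 1, 0, 1, 1, 0]

  # Count records where the predicted class matches the actual class
  correct = sum(a == p for a, p in zip(actual, predicted))

  # Accuracy rate = (# of correct classifications) / (# of records in dataset)
  accuracy_rate = correct / len(actual)
  print(accuracy_rate)  # 6 correct out of 8 records -> 0.75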
5
Q

When is accuracy alone not a sufficient metric?

A

For imbalanced classification problems

6
Q

Along the lines of the confusion matrix, what is the formula for accuracy?

A

Accuracy = (TP+TN) / (P + N)

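A small sketch of the same calculation from a confusion matrix, assuming scikit-learn is available (the labels are illustrative):

  from sklearn.metrics import confusion_matrix

  y_true = [1, 1, 1, 0, 0, 0, 0, 1]
  y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

  # scikit-learn orders the binary matrix [[TN, FP], [FN, TP]]
  tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

  # P = actual positives (TP + FN), N = actual negatives (TN + FP)
  accuracy = (tp + tn) / (tp + fn + tn + fp)
  print(accuracy)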
7
Q

What is the 2-step process for using a cutoff value in classification?

A
  1. Compute the probability of belonging to the positive class
  2. Compare it to the cutoff value and classify (sketched below)
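A rough sketch of the two steps in Python, assuming scikit-learn; the model choice, tiny dataset, and 0.5 cutoff are all illustrative assumptions:

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  # Illustrative training data (not from the deck)
  X = np.array([[0.2], [0.4], [0.6], [0.8], [1.0], [1.2]])
  y = np.array([0, 0, 0, 1, 1, 1])

  model = LogisticRegression().fit(X, y)

  # Step 1: compute the probability of belonging to the positive class
  prob_positive = model.predict_proba(X)[:, 1]

  # Step 2: compare against the cutoff value and classify
  cutoff = 0.5
  predictions = (prob_positive >= cutoff).astype(int)
  print(predictions)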
8
Q

Along the lines of the confusion matrix, what is the formula for precision?

A

Precision = (TP) / (TP + FP)

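A minimal sketch with assumed counts (it also shows the False Discovery Rate from the next card):

  # Illustrative confusion-matrix counts (assumed, not from the deck)
  TP, FP = 40, 10

  precision = TP / (TP + FP)             # 0.8
  false_discovery_rate = 1 - precision   # 0.2, same as FP / (TP + FP)
  print(precision, false_discovery_rate)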
9
Q

Along the lines of the confusion matrix, what is the formula for the False Discovery Rate?

A

FDR = 1 - Precision

10
Q

Along the lines of the confusion matrix, what is the formula for the False Omission Rate?

A

FOR = (FN) / (TN + FN)

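A minimal sketch with assumed counts:

  # Illustrative counts (assumed): the predicted-negative column of a confusion matrix
  FN, TN = 5, 45

  # Fraction of predicted negatives that are actually positive
  false_omission_rate = FN / (TN + FN)
  print(false_omission_rate)  # 5 / 50 = 0.1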
11
Q

Along the lines of the confusion matrix, what is the formula for Recall?

A

Recall = (TP) / (TP + FN)

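A minimal sketch with assumed counts (it also shows the False Negative Rate from the next card):

  # Illustrative counts (assumed): the actual-positive row of a confusion matrix
  TP, FN = 30, 10

  recall = TP / (TP + FN)              # 0.75
  false_negative_rate = 1 - recall     # 0.25, same as FN / (TP + FN)
  print(recall, false_negative_rate)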
12
Q

Along the lines of the confusion matrix, what is the formula for the False Negative Rate?

A

False Negative Rate = 1 - Recall

13
Q

Along the lines of the confusion matrix, what is the formula for the false positive rate?

A

FPR = (FP) / (FP +TN)

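A minimal sketch with assumed counts:

  # Illustrative counts (assumed): the actual-negative row of a confusion matrix
  FP, TN = 20, 80

  # Fraction of actual negatives misclassified as positive
  false_positive_rate = FP / (FP + TN)
  print(false_positive_rate)  # 20 / 100 = 0.2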
14
Q

What does accuracy measure overall?

A

Correctness

15
Q

What type of data is accuracy good to use with?

A

Balanced datasets

16
Q

When is accuracy misleading?

A

When working with imbalanced classes

17
Q

What does precision focus on?

A

Positive predictions

18
Q

When false positives are costly, what is a good metric to use?

A

Precision

19
Q

What does the ROC Curve & AUC evaluate?

A

Model performance at different thresholds

20
Q

What does a higher AUC mean?

A

Better discrimination between classes

21
Q

What does the ROC curve and AUC help find?

A

Optimal balance between true positives and false positives

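A sketch tying these three cards together, assuming scikit-learn; the labels and scores are made up for illustration:

  import numpy as np
  from sklearn.metrics import roc_curve, roc_auc_score

  y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
  y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

  # ROC curve: model performance (TPR vs FPR) at every threshold
  fpr, tpr, thresholds = roc_curve(y_true, y_score)

  # AUC: higher means better discrimination between the classes
  auc = roc_auc_score(y_true, y_score)

  # One common way to pick a balance point: maximize TPR - FPR (Youden's J)
  best = np.argmax(tpr - fpr)
  print(auc, thresholds[best])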
22
Q

What does the Lift & Gain chart help assess?

A

How well the model ranks and prioritizes high-value cases

23
Q

What does the lift show?

A

Improvement over random selection

24
Q

What does the gains chart help visualize?

A

How well the model captures true positives early

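A rough sketch of cumulative gains and lift computed with NumPy (labels and scores are made up for illustration):

  import numpy as np

  y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])
  y_score = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

  # Rank records from highest to lowest predicted score
  order = np.argsort(-y_score)
  ranked = y_true[order]

  # Gains: cumulative share of all true positives captured as we go down the ranking
  gains = np.cumsum(ranked) / ranked.sum()

  # Lift: how much better the ranking does than random selection at each depth
  depth = np.arange(1, len(ranked) + 1) / len(ranked)
  lift = gains / depth
  print(gains)
  print(lift)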
25
Q

Which evaluation metric is good for fraud detection, spam filter?

26
Q

Which evaluation is good for medical diagnosis and security alerts?