Chapter 26 Probability Scoring Metrics Flashcards

1
Q

Log loss, also called ____, ____, or ____, can be used as a measure for evaluating predicted probabilities. Each predicted probability is compared to the actual class output value (0 or 1), and a score is calculated that penalizes the probability based on the distance from the expected value. The penalty is logarithmic.

P 261

A

logistic loss, logarithmic loss, cross-entropy
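
For a single prediction, the logarithmic penalty works out to:
LogLoss = −(y × log(p) + (1 − y) × log(1 − p))
where y is the true class (0 or 1) and p is the predicted probability of class 1. A confident but wrong prediction (e.g. p near 0 when y = 1) drives the loss toward infinity, which is why the penalty is described as logarithmic.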

2
Q

The log loss can be implemented in Python using the ____ function in scikit-learn.

P 262

A

log_loss()

In the binary classification case, the function takes a list of true outcome values and a list of probabilities as arguments and calculates the average log loss for the predictions.
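
A minimal sketch of the binary case (the labels and probabilities below are made-up illustrative values):

from sklearn.metrics import log_loss

# true class values (0 or 1) for a small test set
y_true = [0, 0, 1, 1]
# predicted probabilities of class 1 for each example
y_prob = [0.1, 0.3, 0.8, 0.9]

# average log loss across all predictions; smaller is better
print(log_loss(y_true, y_prob))  # ~0.198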

3
Q

Why is log loss not suitable for imbalanced data?

P 263

A

Log loss is reported as an average over all predictions, so the majority class dominates the score. We can expect the score to be suitable with a balanced dataset and misleading when there is a large imbalance between the two classes in the test set.
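
A quick sketch of the effect (illustrative numbers, assuming a 90:10 split in the test set): a constant low probability scores deceptively well because the majority class dominates the average.

from sklearn.metrics import log_loss

# 90 negative and 10 positive examples: a heavily imbalanced test set
y_true = [0] * 90 + [1] * 10
# a no-skill model predicting the same probability for every example
y_prob = [0.1] * 100

# the average looks respectable despite zero discrimination
print(log_loss(y_true, y_prob))  # ~0.325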

4
Q

In Brier scoring, predictions that are farther away from the expected probability are penalized, but more severely than in the case of log loss. True/False

P 265

A

False. Predictions that are farther away from the expected probability are penalized, but less severely than in the case of log loss.
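
For reference, the Brier score is the mean squared error between the predicted probabilities and the expected values:
BS = (1/N) × Σ (p_i − y_i)²
The squared penalty grows more gently than the logarithmic penalty of log loss, so a confident but wrong prediction contributes at most an error of 1.0 rather than diverging.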

5
Q

The Brier score can be calculated in Python using the ____ function in scikit-learn.

P 265

A

brier_score_loss()

The skill of a model can be summarized as the average Brier score across all probabilities predicted for a test dataset. This function takes the true class values (0, 1) and the predicted probabilities for all examples in a test dataset as arguments and returns the average Brier score.
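
A minimal sketch with made-up values:

from sklearn.metrics import brier_score_loss

# true class values and predicted probabilities of class 1
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.3, 0.8, 0.9]

# mean squared error between probabilities and outcomes; smaller is better
print(brier_score_loss(y_true, y_prob))  # 0.0375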

6
Q

Why can the Brier score be misleading when there is a large imbalance between the classes?

P 266

A

Model skill is reported as the average Brier score across the predictions in a test dataset. As with log loss, we can expect the score to be suitable with a balanced dataset and misleading when there is a large imbalance between the two classes in the test set.
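
The same effect can be sketched with illustrative numbers on a 90:10 test set: a constant probability of 0.0 looks strong on a Brier basis despite having no skill.

from sklearn.metrics import brier_score_loss

# 90 negative and 10 positive examples
y_true = [0] * 90 + [1] * 10
# a naive model predicting probability 0.0 for every example
y_prob = [0.0] * 100

# only the 10 positives contribute error, so the score looks good
print(brier_score_loss(y_true, y_prob))  # 0.1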

7
Q

The Brier error score is always between ____ and ____, where a model with perfect skill has a score of ____.

P 265

A

0.0, 1.0, 0.0

8
Q

What’s Brier Skill Score (BSS)?

P 268

A

The Brier Skill Score reports the relative skill of the probability prediction over the naive forecast.
BSS = 1 − (BS / BSref)
where BS is the Brier score of the model, and BSref is the Brier score of the naive prediction.
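
A sketch of computing the BSS, assuming the naive forecast predicts the base rate of the positive class for every example (all values below are made-up):

from sklearn.metrics import brier_score_loss

# 90 negative and 10 positive examples
y_true = [0] * 90 + [1] * 10
# model probabilities (illustrative) and the naive base-rate forecast
y_model = [0.05] * 90 + [0.8] * 10
y_naive = [0.1] * 100  # base rate of the positive class

bs = brier_score_loss(y_true, y_model)
bs_ref = brier_score_loss(y_true, y_naive)

# BSS > 0 means skill over the naive forecast; a perfect model scores 1.0
print(1.0 - (bs / bs_ref))  # ~0.93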

9
Q

When does tuning the threshold become important?

P 269

A

Tuning the threshold by the operator is particularly important on problems where one type of error is more or less important than another, or where a model makes disproportionately more or fewer errors of a specific type.

10
Q

The Receiver Operating Characteristic, or ROC, curve is a plot of ____ versus ____ for the predictions of a model for multiple thresholds between 0.0 and 1.0.

P 269

A

the true positive rate, the false positive rate
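
The points of the curve can be computed with the roc_curve() function in scikit-learn (a minimal sketch with made-up values):

from sklearn.metrics import roc_curve

# true class values and predicted probabilities of class 1
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

# false positive rate and true positive rate at each candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
print(fpr)  # x-axis of the ROC curve
print(tpr)  # y-axis of the ROC curve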

11
Q

What does ROC-AUC show?

P 271

A

The integrated area under the ROC curve, called AUC or ROC AUC, provides a measure of the skill of the model across all evaluated thresholds.

12
Q

The ROC-AUC score can be calculated in Python using the ____ function in scikit-learn.

P 271

A

roc_auc_score()
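
A minimal sketch with made-up values:

from sklearn.metrics import roc_auc_score

# true class values and predicted probabilities of class 1
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

# area under the ROC curve across all thresholds; 0.5 is no skill, 1.0 is perfect
print(roc_auc_score(y_true, y_prob))  # 0.75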

13
Q

An ROC-AUC score is a measure of the likelihood that the model that produced the predictions will rank a randomly chosen positive example above a randomly chosen negative example. Specifically, that the probability will be higher for a real event (class = 1) than a real non-event (class = 0). This is an instructive definition that offers two important intuitions:
• Naive Prediction (under ROC AUC)
• Insensitivity to Class Imbalance
Explain what each of these intuitions means.

P 271

A
  • A naive prediction under ROC AUC is any constant probability. If the same probability is predicted for every example, there is no discrimination between positive and negative cases, so the model has no skill (AUC = 0.5).
  • ROC AUC is a summary of the model's ability to correctly discriminate examples across different thresholds. As such, it is unconcerned with the base likelihood of each class.
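
The naive-prediction intuition is easy to check (illustrative values): a constant probability yields an AUC of 0.5 regardless of the class balance.

from sklearn.metrics import roc_auc_score

# heavily imbalanced test set: 90 negatives, 10 positives
y_true = [0] * 90 + [1] * 10
# the same probability for every example: no discrimination
y_prob = [0.5] * 100

print(roc_auc_score(y_true, y_prob))  # 0.5
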
14
Q

Why is ROC AUC a better tool for model selection than for quantifying the practical skill of a model's predicted probabilities?

P 272

A

An important consideration in choosing ROC AUC is that it does not summarize the specific discriminative power of the model, but rather the general discriminative power across all thresholds. This makes it a better tool for model selection than for quantifying the practical skill of a model's predicted probabilities.
