Evaluation Metrics Flashcards
In Hypothesis Testing, what is the null hypothesis?
The null hypothesis H_0 states the assumption to be tested; it is held to be true unless the data provide sufficient evidence to reject it.
What are the two types of error in hypothesis testing?
Type I Error “false positive”: Rejecting H_0 when it is in fact true
Type II Error “false negative”: Failing to reject H_0 when it is in fact false
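A minimal sketch of the decision and both error types, assuming SciPy's one-sample t-test (the sample values are invented):

```python
from scipy import stats

# H_0: the population mean equals 5.0
sample = [5.1, 4.8, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]
alpha = 0.05  # significance level: the Type I error rate we accept

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

if p_value < alpha:
    # Rejecting H_0; if H_0 is in fact true, this is a Type I error
    print(f"Reject H_0 (p = {p_value:.3f})")
else:
    # Failing to reject H_0; if H_0 is in fact false, this is a Type II error
    print(f"Fail to reject H_0 (p = {p_value:.3f})")
```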
Give the formulas for misclassification rate (MR) and accuracy (ACC) for a classifier
MR = (# incorrect predictions) / (total # predictions)
ACC = (# correct predictions) / (total # predictions)
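A minimal sketch of both formulas in plain Python (function names and data are my own illustration):

```python
def misclassification_rate(y_true, y_pred):
    """MR = # incorrect predictions / total # predictions."""
    incorrect = sum(t != p for t, p in zip(y_true, y_pred))
    return incorrect / len(y_true)

def accuracy(y_true, y_pred):
    """ACC = # correct predictions / total # predictions = 1 - MR."""
    return 1 - misclassification_rate(y_true, y_pred)

# Example: 3 of 4 predictions correct -> ACC = 0.75, MR = 0.25
print(accuracy([1, 0, 1, 1], [1, 0, 1, 0]))  # 0.75
```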
What do confusion matrices summarise?
The performance of a classification algorithm: predicted classes are tabulated against the real classes, giving the counts TP, FP, TN, and FN.
Give the formulas for ACC, TPR, FPR, TNR
ACC = (TP + TN) / (TP + FP + TN + FN)
TPR = TP / (TP + FN)
FPR = FP / (FP + TN)
TNR = TN / (FP + TN)
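A small sketch computing all four rates from raw confusion-matrix counts (the counts here are made up for illustration):

```python
def confusion_rates(tp, fp, tn, fn):
    # Rates defined on the confusion matrix, as above
    acc = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn)  # true positive rate (sensitivity)
    fpr = fp / (fp + tn)  # false positive rate
    tnr = tn / (fp + tn)  # true negative rate (specificity)
    return acc, tpr, fpr, tnr

print(confusion_rates(tp=40, fp=10, tn=45, fn=5))
# approximately (0.85, 0.889, 0.182, 0.818)
```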
Give the formulas for precision and recall
A = Retrieved and relevant results
B = Retrieved but not relevant results
C = Relevant but not retrieved results
Precision = A / (A + B)
Recall = A / (A + C)
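A sketch of the same quantities using Python sets, with illustrative document IDs:

```python
retrieved = {1, 2, 3, 4, 5}
relevant = {3, 4, 5, 6, 7}

a = len(retrieved & relevant)   # A: retrieved and relevant
b = len(retrieved - relevant)   # B: retrieved but not relevant
c = len(relevant - retrieved)   # C: relevant but not retrieved

precision = a / (a + b)  # 3 / 5 = 0.6
recall = a / (a + c)     # 3 / 5 = 0.6
```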
What is a Precision-Recall (PR) curve used for?
- To study the output of a binary classifier
- Measure precision at fixed recall intervals
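A minimal sketch of a PR curve, assuming scikit-learn's precision_recall_curve and matplotlib (the labels and scores are invented):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

# Illustrative true labels and classifier scores
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.75, 0.6, 0.5]

precision, recall, thresholds = precision_recall_curve(y_true, scores)
plt.plot(recall, precision)
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()
```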
Give the formulas for Balanced Accuracy Rate (BAR) and Balanced Error Rate (BER)
BAR = (TPR + TNR) / 2 (the mean of TPR and TNR)
BER = (FPR + FNR) / 2 (the mean of FPR and FNR)
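A one-function sketch, using the facts that FPR = 1 - TNR and FNR = 1 - TPR (so BER = 1 - BAR); scikit-learn's balanced_accuracy_score computes BAR directly from labels:

```python
def bar_ber(tpr, tnr):
    """BAR = (TPR + TNR) / 2; BER = (FPR + FNR) / 2 = 1 - BAR."""
    fpr, fnr = 1 - tnr, 1 - tpr
    bar = (tpr + tnr) / 2
    ber = (fpr + fnr) / 2
    return bar, ber

print(bar_ber(tpr=0.9, tnr=0.8))  # (0.85, 0.15)
```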
Give the formula for the F1 measure
F1 = 2 * (Precision * Recall) / (Precision + Recall)
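A direct translation into Python, with a guard for the degenerate zero case (my own addition):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# With precision = recall = 0.6 (as in the retrieval example above)
print(f1_score(0.6, 0.6))  # 0.6
```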
What is a decision threshold and what is its most common value?
The value (theta) that a classifier's score is compared against when deciding between a positive and a negative outcome.
Most common value = 0.5
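A minimal sketch of thresholding a classifier's scores (function name and scores are illustrative):

```python
def classify(probabilities, theta=0.5):
    # Predict positive when the score meets the threshold theta
    return [1 if p >= theta else 0 for p in probabilities]

print(classify([0.2, 0.7, 0.5, 0.9]))       # [0, 1, 1, 1]
print(classify([0.2, 0.7, 0.5, 0.9], 0.8))  # [0, 0, 0, 1]
```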
What does a ROC plot visualise?
How the TPR and FPR change over many different thresholds.
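A minimal sketch, assuming scikit-learn's roc_curve and matplotlib, reusing the toy scores from the PR example:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.75, 0.6, 0.5]

# One (FPR, TPR) point per threshold, swept over all score values
fpr, tpr, thresholds = roc_curve(y_true, scores)
plt.plot(fpr, tpr)
plt.plot([0, 1], [0, 1], linestyle="--")  # chance diagonal
plt.xlabel("FPR")
plt.ylabel("TPR")
plt.show()
```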
What is overfitting?
The model is fitted too closely to the training data (including its noise). It cannot generalise to situations not presented during training, so it is not useful when applied to unseen data.
Possible causes of overfitting
- Small training set
- Complex model (see the sketch below)
- Noise in the training data
- High dimensionality
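A miniature demonstration of the first two causes combined (small training set, complex model), assuming NumPy; the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)  # small, noisy sample

# Degree-9 polynomial: complex enough to interpolate the noise exactly
coeffs = np.polyfit(x, y, deg=9)

x_unseen = np.linspace(0, 1, 100)
y_fit = np.polyval(coeffs, x_unseen)
# Training error is ~0, but the fit oscillates wildly between the
# training points, so error on unseen x is large: overfitting.
```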
What is peeking and what can you use to avoid it?
When the performance of a model is evaluated using the same data used to train it.
Avoid peeking by using a hold-out set
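A minimal sketch of a hold-out split, assuming scikit-learn's train_test_split (the toy X and y are placeholders):

```python
from sklearn.model_selection import train_test_split

# X, y stand in for your real features and labels
X, y = [[i] for i in range(100)], [i % 2 for i in range(100)]

# Train on one part, evaluate on data the model has never seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```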
What are some drawbacks to using a random split as the hold-out strategy?
- Sometimes we don’t have the “luxury” of setting aside data for testing.
- Since it is a single experiment, the hold-out estimate of error rate can be misleading if we get an “unfortunate” split of the data (illustrated in the sketch below).
- Even if we use multiple splits, some examples will never be included for training or testing, while others might be selected many times.
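The “unfortunate split” point can be made concrete by repeating the hold-out experiment with different random seeds; the estimate moves noticeably from split to split. A sketch assuming scikit-learn and its bundled iris data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Same data, same model -- only the random split changes
for seed in range(5):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = KNeighborsClassifier().fit(X_tr, y_tr)
    print(f"seed {seed}: accuracy = {model.score(X_te, y_te):.3f}")
```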