Measuring Performance Flashcards

Question 1

Q

What is Classification Accuracy?

Answer

A

number of correctly classified samples / total number of samples

Question 2

Q

What is Classification Error Rate?

Answer

A

number of wrongly classified samples / total number of samples
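
The accuracy and error rate definitions above can be sketched in a few lines of Python. The function names and the toy labels are illustrative assumptions, not part of the original cards.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def error_rate(y_true, y_pred):
    """Fraction of wrongly classified samples, i.e. 1 - accuracy."""
    return 1.0 - accuracy(y_true, y_pred)


# Hypothetical labels: 4 of the 5 predictions match the truth.
y_true = ["cat", "dog", "dog", "cat", "dog"]
y_pred = ["cat", "dog", "cat", "cat", "dog"]
acc = accuracy(y_true, y_pred)
err = error_rate(y_true, y_pred)
```

Accuracy and error rate always sum to 1, so reporting either one is enough.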

Question 3

Q

What is Recall?

Answer

A

TP / (TP + FN), where TP is the number of true positives and FN is the number of false negatives

Question 4

Q

What is Precision?

Answer

A

TP / (TP + FP), where TP is the number of true positives and FP is the number of false positives
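
A minimal sketch of recall and precision computed from binary labels. The helper names, the choice of 1 as the positive class, and returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
    return tp, fp, fn


def recall(tp, fn):
    """TP / (TP + FN); 0.0 by convention when there are no positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """TP / (TP + FP); 0.0 by convention when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels giving TP = 3, FP = 2, FN = 1.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 1, 0]
tp, fp, fn = confusion_counts(y_true, y_pred)
r = recall(tp, fn)
p = precision(tp, fp)
```

Recall asks "of the real positives, how many did we find?", while precision asks "of the predicted positives, how many were right?".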

Question 5

Q

What are some measures we can use to measure performance in regression?

Answer

A

Root Mean Squared Error
Mean Absolute Error
Mean Absolute Percentage Error
Coefficient of Determination

Question 6

Q

What is Root Mean Squared Error?

Answer

A

Take the average of the squared differences between the predictions and the true values, then take the square root
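
As a rough sketch, the calculation can be written as follows. The function name and the sample values are assumptions for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root of the mean squared difference between predictions and truth."""
    squared_diffs = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


# Hypothetical values: differences are 1, 0, -2, -1, so the mean square is 1.5.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
score = rmse(y_true, y_pred)
```

Because the differences are squared before averaging, a few large errors raise RMSE more than many small ones.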

Question 7

Q

What is Mean Absolute Error?

Answer

A

Take the average of the absolute differences between the predictions and the true values
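
A minimal sketch of the same idea in code, with an assumed function name and made-up values.

```python
def mae(y_true, y_pred):
    """Mean of the absolute differences between predictions and truth."""
    abs_diffs = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(abs_diffs) / len(abs_diffs)


# Hypothetical values: absolute differences are 1, 0, 2, 1.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
score = mae(y_true, y_pred)
```

The result is in the same units as the target variable, which makes it easy to read.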

Question 8

Q

What is the Mean Absolute Percentage Error?

Answer

A

Take the average of the absolute differences, each divided by the corresponding true value
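
A sketch of the calculation, assuming a hypothetical function name and sample values. The result is returned as a fraction (multiply by 100 for a percentage), and the measure is undefined when a true value is zero, so this sketch raises an error in that case.

```python
def mape(y_true, y_pred):
    """Mean of |true - predicted| / |true|, returned as a fraction."""
    ratios = []
    for t, p in zip(y_true, y_pred):
        if t == 0:
            raise ValueError("MAPE is undefined when a true value is zero")
        ratios.append(abs(t - p) / abs(t))
    return sum(ratios) / len(ratios)


# Hypothetical values: relative errors are 10%, 10%, 10% and 0%.
y_true = [100.0, 200.0, 50.0, 400.0]
y_pred = [110.0, 180.0, 55.0, 400.0]
score = mape(y_true, y_pred)
```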

Question 9

Q

When is Mean Absolute Percentage Error useful?

Answer

A

When different outputs take drastically different ranges of values.

E.g., one output might be a temperature and another might be in kilocalories. We should therefore normalise the error of each output by its true value before summing them up
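
The effect can be illustrated with made-up numbers in which both outputs are off by 10%. The values and helper names below are assumptions chosen for the example.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the units of the output."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; assumes no true value is zero."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical outputs on very different scales, each prediction 10% too high.
temp_true, temp_pred = [20.0, 25.0], [22.0, 27.5]
kcal_true, kcal_pred = [2000.0, 2500.0], [2200.0, 2750.0]

temp_mae, kcal_mae = mae(temp_true, temp_pred), mae(kcal_true, kcal_pred)
temp_mape, kcal_mape = mape(temp_true, temp_pred), mape(kcal_true, kcal_pred)
```

Summing the two MAE values would let the kilocalorie error dominate, whereas the two MAPE values are directly comparable.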

Question 10

Q

What is the Bias Issue?

Answer

A

The idea that the accuracy measured on the training samples can be a poor (typically over-optimistic) estimator of the accuracy on unseen samples

Question 11

Q

What is the Variance Issue?

Answer

A

The idea that the accuracy on a new set of test samples can still vary from the true accuracy, depending on the makeup of the test samples

A smaller set of test samples results in a higher variance of the accuracy estimate
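
One way to see this, as a sketch under an added assumption: if the test samples are independent and each is classified correctly with probability equal to the true accuracy, the measured accuracy follows a binomial proportion, whose standard error is sqrt(p(1 - p) / n). The function name and the example accuracy of 0.9 are illustrative.

```python
import math


def accuracy_standard_error(true_accuracy, n_test):
    """Standard error of measured accuracy on n_test independent samples."""
    return math.sqrt(true_accuracy * (1.0 - true_accuracy) / n_test)


# The spread of the estimate shrinks as the test set grows.
for n_test in (50, 500, 5000):
    print(n_test, accuracy_standard_error(0.9, n_test))
```

Under this assumption, making the test set 100 times larger only reduces the spread by a factor of 10.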

Question 12

Q

What is the Holdout Method?

Answer

A

Splitting the dataset into a training set and a separate testing set, so that the model is evaluated on samples it was not trained on
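
A minimal sketch of such a split. The function name, the 30% test fraction and the fixed seed are assumptions, since the card does not specify them.

```python
import random


def holdout_split(samples, test_fraction=0.3, seed=0):
    """Shuffle the sample indices and split them into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [samples[i] for i in train_idx]
    test = [samples[i] for i in test_idx]
    return train, test


data = list(range(10))
train, test = holdout_split(data)
```

Every sample lands in exactly one of the two sets, so the test samples stay unseen during training.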

Question 13

Q

What is Random Sub-Sampling?

Answer

A

1. Randomly select a fixed number of samples for testing (without replacement) and use the rest for training
2. Train the classifier from scratch using the training data, then test it to compute an error
3. Repeat steps 1 and 2 K times, drawing a fresh random split each time
4. The final error is the average error of all K experiments
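
The procedure can be sketched as below. The majority-class "classifier" is a toy stand-in for a real model, and all names, sizes and the seed are assumptions for illustration.

```python
import random


def majority_class_error(train, test):
    """Toy stand-in for 'train a classifier, then test it'."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)  # "training"
    wrong = sum(1 for _, label in test if label != majority)
    return wrong / len(test)


def random_subsampling(samples, n_test, k, evaluate, seed=0):
    """Average the test error over k independent random splits."""
    rng = random.Random(seed)
    errors = []
    for _ in range(k):
        shuffled = samples[:]
        rng.shuffle(shuffled)  # a fresh random split for each experiment
        test, train = shuffled[:n_test], shuffled[n_test:]
        errors.append(evaluate(train, test))
    return sum(errors) / k


# Hypothetical dataset of (feature, label) pairs: 12 of class 0, 8 of class 1.
samples = [(x, 0) for x in range(12)] + [(x, 1) for x in range(12, 20)]
avg_error = random_subsampling(samples, n_test=5, k=10, evaluate=majority_class_error)
```

Because each split is drawn independently, some samples may never be tested while others are tested several times.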

Question 14

Q

What is K-Fold Cross Validation?

Answer

A

Divide the dataset into K partitions
For each of the K experiments, use K-1 partitions for training, and estimate the error using the remaining partition
The final error estimate is the average error of all K experiments
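
A sketch of the three steps above. The mean-predictor "model" is a toy stand-in, the data are made up, and the folds are taken in order without shuffling, which is a simplification (data are often shuffled first).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the K experiments."""
    indices = list(range(n_samples))
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def mean_predictor_error(train_values, test_values):
    """Toy model: predict the training mean, score with mean absolute error."""
    prediction = sum(train_values) / len(train_values)
    return sum(abs(v - prediction) for v in test_values) / len(test_values)


values = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
k = 3
fold_errors = []
for train_idx, test_idx in k_fold_indices(len(values), k):
    train = [values[i] for i in train_idx]
    test = [values[i] for i in test_idx]
    fold_errors.append(mean_predictor_error(train, test))
cv_error = sum(fold_errors) / k  # average over the K experiments
```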

Question 15

Q

What is an advantage of using K-Fold Cross Validation?

Answer

A

All the examples in the dataset are eventually used for both training and testing

Question 16

Q

What is Leave One Out Cross Validation?

Answer

A

This is the extreme case of K-Fold Cross Validation.

If we have a dataset of N samples, we do N-Fold Cross Validation: each experiment trains on N-1 samples and tests on the single remaining sample.
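
A minimal sketch of the splits this produces, using an assumed helper name.

```python
def leave_one_out_indices(n_samples):
    """Yield (train_indices, test_indices) with exactly one test sample each."""
    for held_out in range(n_samples):
        train_idx = [i for i in range(n_samples) if i != held_out]
        yield train_idx, [held_out]


splits = list(leave_one_out_indices(4))
```

With N samples this means training the model N times, which can be expensive for large datasets.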

Question 17

Q

What is Bootstrap?

Answer

A

For each of K experiments, select the training samples by sampling with replacement,
and use the data that was never selected as the testing data

The final error is the average error of the K experiments
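
A sketch of this procedure under stated assumptions: each experiment draws N training samples (the card does not give the size), the mean-predictor "model" is a toy stand-in, and an experiment with an empty test set is skipped.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples training indices with replacement; the rest are test."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    selected = set(train_idx)
    test_idx = [i for i in range(n_samples) if i not in selected]
    return train_idx, test_idx


def mean_predictor_error(train_values, test_values):
    """Toy model: predict the training mean, score with mean absolute error."""
    prediction = sum(train_values) / len(train_values)
    return sum(abs(v - prediction) for v in test_values) / len(test_values)


def bootstrap_error(values, k, seed=0):
    """Average the test error over k bootstrap experiments."""
    rng = random.Random(seed)
    errors = []
    for _ in range(k):
        train_idx, test_idx = bootstrap_split(len(values), rng)
        if not test_idx:
            continue  # every sample was drawn, so nothing is left to test on
        train = [values[i] for i in train_idx]
        test = [values[i] for i in test_idx]
        errors.append(mean_predictor_error(train, test))
    if not errors:
        raise ValueError("no experiment produced a non-empty test set")
    return sum(errors) / len(errors)


values = [float(v) for v in range(1, 21)]  # hypothetical targets
estimate = bootstrap_error(values, k=50)
```

Because sampling is with replacement, the training set usually contains duplicates, and the samples that were never drawn form the test set.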

Question 18

Q

Comparing the errors of two models using Z-Tests

Answer

A

???

Question 19

Q

What is F-1 Score?

Answer

A

(2 * P * R) / (P + R), where P is Precision and R is Recall
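
A short sketch of the formula. The function name, the example inputs, and returning 0.0 when both precision and recall are zero are assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention assumed here to avoid dividing by zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall values.
f1 = f1_score(0.6, 0.75)
```

The harmonic mean stays low unless both precision and recall are high, so a model cannot score well by maximising only one of them.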

Question 20

Q

What is the Variance Error?

Answer

A

This tells you how much the predictions change under different realisations of the same model

Question 21

Q

What is the Bias Error?

Answer

A

This is a measure of how close the average prediction is to the true value, under different realisations of the same model.
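
As a rough illustration of the bias and variance errors for a single test point: suppose several realisations of a model (for example, the same model trained on different training sets) each predict the same true value. The numbers and the function name are made up, and the variance here divides by the number of realisations.

```python
def bias_and_variance(predictions, true_value):
    """Bias: average prediction minus truth. Variance: spread around that average."""
    mean_prediction = sum(predictions) / len(predictions)
    bias = mean_prediction - true_value
    variance = sum((p - mean_prediction) ** 2 for p in predictions) / len(predictions)
    return bias, variance


true_value = 5.0
# Hypothetical predictions from four realisations of each model.
scattered = [3.0, 7.0, 4.0, 6.0]    # right on average, but unstable
steady_off = [7.9, 8.0, 8.1, 8.0]   # stable, but consistently wrong

scattered_bias, scattered_var = bias_and_variance(scattered, true_value)
steady_bias, steady_var = bias_and_variance(steady_off, true_value)
```

The first model shows low bias with high variance, and the second shows high bias with low variance.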

Question 22

Q

If I have high variance error, and low bias error, will I suffer from over-fitting or under-fitting?

Answer

A

Over-fitting

Question 23

Q

If I have low variance error, and high bias error, will I suffer from over-fitting or under-fitting?

Answer

A

Under-fitting

(23 cards)