Classification and Machine Learning Evaluation Flashcards

1
Q

What is classification?

A

This is when you find patterns in input data and use them to assign each item to one of a set of discrete categories

2
Q

What is regression?

A

Building a model that predicts a continuous numeric value (rather than a category) from the input data

3
Q

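Define clustering.

A

This is grouping data points so that similar points end up in the same cluster (group), without being given category labels in advance. Data from different categories often forms visible clusters when graphed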

4
Q

What is the machine learning process? (5 steps)

A
  1. Data collection
  2. Feature selection
  3. Algorithm choice
  4. Training
  5. Evaluation
5
Q

What is overfitting?

A

A model overfits when it describes the randomness associated with the data, rather than the underlying relationship between the data points

6
Q

What is underfitting?

A

This is when the model is too simple to capture the complexity of the problem (so it is not very accurate)

7
Q

What is the rule with under/overfitting (Occam’s razor)?

A

We should use the simplest model unless we have to use a more complex one

8
Q

What are the 3 types of data set and what are they each used for?

A

> Training data: Used to train the model

> Validation data: Used to evaluate the different models that we have created

> Test data: Used to benchmark the final model at the end

[Picture 1]

9
Q

What is important about the data in the 3 types of data set?

A

The training, validation and test data must not overlap (they must not contain any of the same data points)
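The no-overlap rule can be sketched in code. This is a minimal illustration only: the function name `split_dataset`, the 60/20/20 proportions and the fixed seed are assumptions, not something the cards specify.

```python
import random

def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of items and cut it into three disjoint parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything left over
    return train, val, test

train, val, test = split_dataset(range(10))
```

Because each item lands in exactly one slice of the shuffled list, no data point can appear in two of the sets.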

10
Q

What can happen with regards to the training data if we make the model too complex?

A

It doesn't learn the underlying principle; it just starts to memorise the data set.

11
Q

Is it possible to get the error during training to 0? Would we want to do this? Why?

A

Yes, it is possible. No, we do not want to do this, because the AI would just be memorising the data set rather than learning the underlying relationship

12
Q

When do we stop training a machine learning AI? Why?

A

We want to stop when the error on the validation data set starts to increase. The error increases because the model is starting to overfit

[Picture 2]
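The stopping rule can be sketched as a small function over a list of validation errors, one per training epoch. This is a hedged illustration: the name `early_stopping_epoch`, the `patience` parameter (how many non-improving epochs to tolerate) and the error values are all assumptions made for the example.

```python
def early_stopping_epoch(val_errors, patience=1):
    """Return the epoch with the lowest validation error seen before
    the error failed to improve for `patience` consecutive epochs."""
    best_epoch = 0
    best_error = float("inf")
    bad_epochs = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # validation error is rising: stop training
    return best_epoch

# Hypothetical validation errors: they fall, then start to rise.
val_errors = [0.50, 0.40, 0.33, 0.30, 0.32, 0.35]
stop_at = early_stopping_epoch(val_errors)
```

Here `stop_at` is 3, the last epoch before the validation error begins to increase.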

13
Q

What happens if you don't stop training a machine learning AI?

A

The model would become too complex and it would start to overfit

14
Q

What is cross validation?

A

> When there is not enough data to create three sets large enough, cross validation is a common way to test the learned model on more data points.

> The data is split into several batches

> Each batch is used for testing in turn whilst the others are used for training
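The batching described above can be sketched as follows. This is only an illustration: the function name `k_fold_splits` and the choice of 6 items in 3 batches are assumptions, and real use would normally shuffle the data first.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) once for each of k batches."""
    indices = list(range(n_items))
    # Spread any remainder over the first few batches.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]                # the batch under test
        train_idx = indices[:start] + indices[start + size:]  # all other batches
        yield train_idx, test_idx
        start += size

splits = list(k_fold_splits(6, 3))
```

Every data point is used for testing exactly once and for training in all the other rounds.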

15
Q

What is a binary classifier?

A

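This is a classifier that assigns each input to one of two classes (positive or negative). To evaluate it, you count the true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) and form the following diagram: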

[Picture 3]

16
Q

How is accuracy calculated for a binary classifier? [Picture 3]

A

Accuracy % = 100 × (#TP + #TN) / (#TP + #TN + #FP + #FN)
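Accuracy is the share of all predictions that are correct, that is, true positives plus true negatives over the total. A minimal sketch, assuming labels are stored as booleans (True = positive); the function names and the example labels are made up for illustration.

```python
def count_outcomes(actual, predicted):
    """Count TP, FP, FN and TN for boolean labels (True = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)
    fp = sum((not a) and p for a, p in pairs)
    fn = sum(a and (not p) for a, p in pairs)
    tn = sum((not a) and (not p) for a, p in pairs)
    return tp, fp, fn, tn

def accuracy_percent(tp, fp, fn, tn):
    """Correct predictions (TP + TN) as a percentage of all predictions."""
    return 100 * (tp + tn) / (tp + tn + fp + fn)

actual    = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
tp, fp, fn, tn = count_outcomes(actual, predicted)
acc = accuracy_percent(tp, fp, fn, tn)
```

With these labels there are 2 TP, 1 FP, 1 FN and 4 TN, so 6 of the 8 predictions are correct.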

17
Q

How is left column sensitivity calculated? [Picture 3]

A

Sensitivity % = 100 × #TP / (#TP + #FN)

18
Q

How is right column sensitivity calculated? [Picture 3]

A

Sensitivity % = 100 × #TN / (#TN + #FP) (this is usually called specificity)

19
Q

What is the equation for recall? [Picture 3]

A

Recall % = 100 × #TP / (#TP + #FN)

20
Q

What is recall also known as?

A

Recall = Left column Sensitivity

21
Q

How is top row precision calculated? [Picture 3]

A

Precision % = 100 × #TP / (#TP + #FP)

22
Q

How is bottom row precision calculated? [Picture 3]

A

Precision % = 100 × #TN / (#TN + #FN)
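The two column measures and the two row measures can be computed together from the four counts. This is a sketch with made-up counts; the dictionary keys are names chosen for the example, with "specificity" being the usual name for right column sensitivity.

```python
def column_and_row_metrics(tp, fp, fn, tn):
    """Percentages read off the 2x2 table of TP, FP, FN and TN."""
    return {
        "sensitivity": 100 * tp / (tp + fn),         # left column (recall)
        "specificity": 100 * tn / (tn + fp),         # right column
        "precision_positive": 100 * tp / (tp + fp),  # top row
        "precision_negative": 100 * tn / (tn + fn),  # bottom row
    }

# Hypothetical counts for illustration only.
metrics = column_and_row_metrics(tp=40, fp=10, fn=20, tn=30)
```

Column measures divide by the number of items that really belong to a class; row measures divide by the number of items predicted to be in that class.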

23
Q

Which is the least informative method of evaluation?

A

Accuracy alone
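A small worked example shows why. Assume a made-up data set with 10 positives and 990 negatives, and a classifier that always answers "negative":

```python
# Hypothetical counts: the classifier never predicts positive.
tp, fp, fn, tn = 0, 0, 10, 990

accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)
recall = 100 * tp / (tp + fn)
```

Accuracy comes out at 99% even though recall is 0%: the classifier never finds a single positive, and accuracy on its own hides that.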

24
Q

What is the equation for F1?

A

F1 = 2 × (precision × recall) / (precision + recall)
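The formula translates directly into code. In this sketch, returning 0 when precision and recall are both 0 is an assumed convention to avoid dividing by zero; the input values are made up.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions or both as %)."""
    if precision + recall == 0:
        return 0.0  # assumed convention: avoids division by zero
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.8, 0.5)
```

F1 is pulled towards the smaller of the two values, so a model cannot score well by being good at only one of them.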

25
Q

What is MCC (not the equation)?

A

Matthews correlation coefficient

26
Q

What is the equation for MCC?

A

MCC = (#TP × #TN - #FP × #FN) / √ ((#TP + #FP) (#TP + #FN) (#TN + #FP) (#TN + #FN))
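A minimal sketch of the calculation, using the standard numerator TP × TN - FP × FN. Returning 0 when the denominator is 0 is a common convention that is assumed here; the counts in the example are made up.

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient, ranging from -1 to +1."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a whole row or column is empty
    return (tp * tn - fp * fn) / denominator

score = mcc(tp=40, fp=10, fn=20, tn=30)
always_negative = mcc(tp=0, fp=0, fn=10, tn=990)
```

A perfect classifier scores +1, and the always-negative classifier above scores 0 under this convention.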

27
Q

What is the benefit of MCC?

A

It is a better measurement when the data set is unbalanced

28
Q

What is the purpose of the confusion matrix?

A

It is a convenient way to represent the accuracy of a multi-class (non-binary) classifier. It can be drawn like a heat map

[Picture 4]

29
Q

What are the axes of the confusion matrix?

A

Each entry at coordinate (x,y) in the matrix corresponds to the number of elements of class x classified as y

[Picture 4]
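Building the matrix from two lists of labels can be sketched like this. The example assumes classes are numbered 0, 1, 2 and that the first index is the true class; the labels themselves are made up.

```python
def confusion_matrix(actual, predicted, n_classes):
    """matrix[x][y] = number of elements of class x classified as y."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

actual    = [0, 0, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 0, 2]
matrix = confusion_matrix(actual, predicted, 3)

# Correct classifications are the entries where x == y.
correct = sum(matrix[i][i] for i in range(3))
```

Here 5 of the 7 elements sit on the diagonal; the other two show which classes were confused with each other.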

30
Q

What is the ideal result of a confusion matrix (best classification)?

A

All the elements should lie on the main diagonal (every off-diagonal entry is zero) [Picture 4]

31
Q

What is a ROC curve?

A

A receiver operating characteristic (ROC) curve plots the true positive rate against the false positive rate. It is a convenient way to compare different models

32
Q

What does the diagonal line on a ROC curve mean?

A

A classifier on the diagonal line is no better than random guessing: its true positive rate equals its false positive rate

33
Q

What is the perfect classifier on an ROC curve?

A

The perfect classifier would be in the top left corner (0,100): 0% false positives and 100% true positives

34
Q

On an ROC curve, which line is best?

A

The one with the largest area under it
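The area can be estimated with the trapezoid rule. This sketch assumes the curve is given as (false positive rate, true positive rate) points expressed as fractions between 0 and 1 rather than percentages; the function name is made up.

```python
def area_under_curve(points):
    """Trapezoid-rule area under a curve given as (fpr, tpr) points."""
    points = sorted(points)  # order by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

diagonal = [(0.0, 0.0), (1.0, 1.0)]             # random guessing
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # passes through the top left corner

auc_diagonal = area_under_curve(diagonal)
auc_perfect = area_under_curve(perfect)
```

The diagonal gives an area of 0.5 and the perfect classifier gives 1.0, so useful models fall somewhere in between.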

35
Q

What is a parametric method?

A

This is when you decide which parameters, and how many, the AI is going to learn before training starts

36
Q

What happens to the training data after the AI has been trained with a parametric method?

A

After the rules have been learned by the AI the data can be discarded

37
Q

What are some issues with the parametric method?

A

We cannot always separate everything with a single hyperplane

38
Q

What is a non-parametric method?

A

There are no defined parameters at the start. The parameters will be learned as training happens. This method focusses on the data rather than a particular structure
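Nearest-neighbour classification is one commonly cited non-parametric method. It is not named in these cards and is used here only as an illustration, with made-up 2D points and labels: the "model" is simply the stored training data, so the data cannot be discarded after training.

```python
def nearest_neighbour_predict(training_data, query):
    """training_data: list of ((x, y), label). Returns the label of the
    stored point closest to `query`."""
    def squared_distance(point):
        return (point[0] - query[0]) ** 2 + (point[1] - query[1]) ** 2

    _, closest_label = min(training_data, key=lambda item: squared_distance(item[0]))
    return closest_label

# Hypothetical training points for two classes, "A" and "B".
training_data = [
    ((0.0, 0.0), "A"), ((1.0, 0.0), "A"),
    ((5.0, 5.0), "B"), ((6.0, 5.0), "B"),
]

label_near_a = nearest_neighbour_predict(training_data, (0.5, 0.2))
label_near_b = nearest_neighbour_predict(training_data, (5.5, 4.0))
```

Adding more training points changes the predictions directly, without any fixed set of parameters being chosen in advance.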