Week 9: Evaluation & Data Analysis Flashcards

1
Q

What is training data used for?

A

to train the algorithm

2
Q

3 steps of train, evaluate, test

A
  1. train the model using training set
  2. evaluate/tune model using validation set
  3. test model performance on unseen test set
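
A minimal sketch of these three steps in Python with scikit-learn. The iris dataset, the k-NN model and the candidate values of k are illustrative assumptions, not part of the flashcards.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set first, then carve a validation set out of the remainder.
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.25, random_state=0)

best_k, best_score = None, 0.0
for k in (1, 3, 5, 7):                        # step 2: tune on the validation set
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)               # step 1: train on the training set
    score = model.score(X_val, y_val)
    if score > best_score:
        best_k, best_score = k, score

final_model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
print("test accuracy:", final_model.score(X_test, y_test))   # step 3: unseen test set
```
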
3
Q

During training, what 2 data sets are available?

A
  • training
  • validation
4
Q

How is a data set split into test and training sets?

A

split more or less randomly, while making sure that important classes are represented in both sets (e.g. with a stratified split)

5
Q

What percentages are the training and testing sets typically split into?

A
  • 80% for training
  • 20% for testing
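
A minimal sketch of such an 80/20 split, covering this card and card 4. The wine dataset and the random seed are illustrative assumptions; stratify=y is one way to keep important classes represented in both sets.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)            # illustrative dataset (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,       # 20% for testing, 80% for training
    stratify=y,          # keep the class proportions the same in both sets
    random_state=42,     # reproducible "more or less random" split
)
print(len(X_train), len(X_test))             # roughly 80% / 20% of the data
```
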
6
Q

N-fold cross validation

A
  1. Randomise the dataset
  2. Create N equal-size partitions
  3. Use 1 of the N partitions as the test set
  4. Train on the remaining N-1 partitions
  5. Repeat so each partition is used as the test set once, and average the results
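
A minimal sketch of N-fold cross-validation in scikit-learn. N = 5, the iris dataset and the logistic-regression model are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
N = 5
scores = []
# shuffle=True randomises the dataset; each split holds out one partition
# as the test set and trains on the remaining N-1 partitions.
for train_idx, test_idx in KFold(n_splits=N, shuffle=True, random_state=0).split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print("mean accuracy over", N, "folds:", np.mean(scores))
```
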
7
Q

What bias does cross-validation hold?

A

Cross-validation is almost unbiased: every example is eventually used for testing, so the performance estimate has very little bias.

8
Q

What is a confusion matrix used for?

It is used to describe…

A
used to describe the performance of a classification model on a set of test data for which the true values are known
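
A minimal sketch of a confusion matrix for a binary classifier; the true and predicted labels below are made-up illustrative values.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # known true values of the test data
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # the model's predictions

cm = confusion_matrix(y_true, y_pred)
print(cm)                                 # rows = actual class, columns = predicted class

# For a binary problem the four cells are the quantities in cards 9-12:
tn, fp, fn, tp = cm.ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
```
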
9
Q

True positives (TP) means

A

predicted yes = actual yes

10
Q

True negatives (TN) means

A

predicted no = actual no

11
Q

False positives (FP) means…

A
  • predicted yes but actual no
  • type 1 error
12
Q

False negatives (FN) means…

A
  • predicted no but actual yes
  • type 2 error
13
Q

How is accuracy measured in a confusion matrix?

A

(TP + TN) / total, i.e. (TP + TN) / (TP + TN + FP + FN)
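
A minimal sketch of this calculation; the four counts are made-up illustrative values.

```python
# Accuracy computed directly from confusion-matrix counts.
tp, tn, fp, fn = 40, 45, 5, 10

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
print(accuracy)   # (40 + 45) / 100 = 0.85
```
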

14
Q

Name 3 regression evaluation metrics

A
  1. Mean Absolute Error (MAE)
  2. Mean Squared Error (MSE)
  3. Root Mean Squared Error (RMSE)
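
A minimal sketch computing all three metrics (defined in cards 15-17) on made-up illustrative values.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

mae  = mean_absolute_error(y_true, y_pred)   # mean of |error|
mse  = mean_squared_error(y_true, y_pred)    # mean of error^2
rmse = np.sqrt(mse)                          # square root of the MSE
print(mae, mse, rmse)                        # 0.5, 0.375, ~0.612
```
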
15
Q

Mean absolute error describes…

A

the mean of the absolute value of the errors

16
Q

Mean squared error describes…

A

the mean of the squared errors

17
Q

Root Mean Squared Error (RMSE) describes…

A

The square root of the mean of the squared errors