Chapter 4: Experimental Methods 1 Flashcards

Question 1

Q

what is model selection?

Answer

A

picking the best from a pool of possible models

Question 2

Q

what is cross validation error?

Answer

A

the average of the errors measured on the held-out fold in each round
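
As a rough illustration of that averaging, here is a minimal sketch in plain Python. The toy "model" (predict the mean of the training targets) and the helper names `k_fold_indices` and `cross_validation_error` are assumptions made for this example, not part of the course material.

```python
from statistics import mean

def k_fold_indices(n_examples, k):
    """Split range(n_examples) into k contiguous folds of near-equal size."""
    sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validation_error(ys, k):
    """Return (average error over folds, list of per-fold errors)."""
    fold_errors = []
    for test_idx in k_fold_indices(len(ys), k):
        held_out = set(test_idx)
        train = [y for i, y in enumerate(ys) if i not in held_out]
        prediction = mean(train)  # "fit" the toy model on the training folds
        # mean squared error on the held-out fold
        fold_errors.append(mean((ys[i] - prediction) ** 2 for i in test_idx))
    return mean(fold_errors), fold_errors

cv_error, fold_errors = cross_validation_error([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], k=3)
```

Each example is held out exactly once, and `cv_error` is simply the mean of `fold_errors`.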

Question 3

Q

what makes a model more “stable”?

Answer

A

a lower standard deviation of its error across the folds
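
A small sketch of how that comparison might look. The per-fold errors below are made-up numbers chosen only so that both models share the same mean error; `summarise` is a hypothetical helper.

```python
from statistics import mean, stdev

# Hypothetical per-fold errors for two models (illustrative values, not real results).
fold_errors_a = [0.20, 0.21, 0.19, 0.20, 0.20]
fold_errors_b = [0.05, 0.35, 0.10, 0.30, 0.20]

def summarise(fold_errors):
    """Return the mean error and its standard deviation across folds."""
    return mean(fold_errors), stdev(fold_errors)

mean_a, std_a = summarise(fold_errors_a)
mean_b, std_b = summarise(fold_errors_b)

# Same average error, but model A varies far less from fold to fold.
more_stable = "A" if std_a < std_b else "B"
```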

Question 4

Q

what is LOOCV?

Answer

A

leave-one-out cross validation: the number of folds equals the number of examples, so each fold holds out a single example
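
A minimal sketch of the splits this produces, assuming examples are referred to by index; `leave_one_out_splits` is a name invented for this example.

```python
def leave_one_out_splits(n_examples):
    """Yield (train_indices, test_index) pairs, one pair per example."""
    for held_out in range(n_examples):
        train = [i for i in range(n_examples) if i != held_out]
        yield train, held_out

# With 4 examples there are 4 folds, each testing on exactly one example.
splits = list(leave_one_out_splits(4))
```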

Question 5

Q

is the cross validation error a good estimate of future generalisation error?

Answer

A

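no, once it has been used to pick the model it is an optimistic estimate, because the winner was chosen for scoring well on those same folds.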

Question 6

Q

how do you choose a model from a pool and still get a good estimate of future generalisation error using cross validation?

Answer

A

split the data into folds
keep the last fold as a hold out set
perform cross validation on the remaining folds
select the model with the lowest cross validation error on these
evaluate the selected model once on the hold out set
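
The steps above can be sketched roughly as follows. Everything here is an assumption for illustration: the data are made up, and the "pool" is just two constant predictors (mean and median of the training targets).

```python
from statistics import mean, median

def mse(values, prediction):
    """Mean squared error of a constant prediction."""
    return mean((v - prediction) ** 2 for v in values)

def make_folds(data, k):
    """Deal the examples round-robin into k folds."""
    return [data[i::k] for i in range(k)]

def cv_error(fit, folds):
    """Average held-out error of a fitting function over the given folds."""
    errors = []
    for i, test in enumerate(folds):
        train = [y for j, fold in enumerate(folds) if j != i for y in fold]
        errors.append(mse(test, fit(train)))
    return mean(errors)

# Hypothetical pool: each "model" predicts a constant fitted on the training data.
candidates = {"mean": mean, "median": median}

data = [1.0, 2.0, 2.0, 3.0, 2.0, 50.0, 2.0, 3.0, 1.0, 2.0, 3.0, 2.0]  # made-up targets
folds = make_folds(data, k=4)
hold_out, cv_folds = folds[-1], folds[:-1]        # keep the last fold aside

scores = {name: cv_error(fit, cv_folds) for name, fit in candidates.items()}
best_name = min(scores, key=scores.get)           # selection uses CV folds only

train_all = [y for fold in cv_folds for y in fold]
hold_out_error = mse(hold_out, candidates[best_name](train_all))  # used exactly once
```

Because the hold out fold played no part in choosing `best_name`, `hold_out_error` is not biased by the selection step.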

Question 7

Q

why do we perform feature scaling?

Answer

A

speeds up gradient descent by avoiding many extra iterations that are required when one or more features take on much larger values than the rest

Question 8

Q

what are two methods of data normalisation?

Answer

A

zero mean, unit variance

restrict range

Question 9

Q

give the equation for zero mean, unit variance normalisation

Answer

A

(x - x_mean) / sigma, where x_mean and sigma are the mean and standard deviation of the feature
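
A minimal sketch of this formula applied to one feature. It assumes sigma is the population standard deviation and is non-zero; `standardise` is a name chosen for this example.

```python
from statistics import mean, pstdev

def standardise(values):
    """Rescale values to zero mean and unit variance: (x - x_mean) / sigma."""
    x_mean = mean(values)
    sigma = pstdev(values)  # population standard deviation; assumed to be > 0
    return [(x - x_mean) / sigma for x in values]

# This feature has mean 5 and standard deviation 2.
standardised = standardise([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```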

Question 10

Q

give the equation for restrict range normalisation

Answer

A

(x - x_min) / (x_max - x_min), which maps the feature to the range [0, 1]
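
A minimal sketch of this formula, assuming the feature is not constant (so x_max - x_min is non-zero); `restrict_range` is a name chosen for this example.

```python
def restrict_range(values):
    """Rescale values to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min  # assumed to be > 0, i.e. the feature is not constant
    return [(x - x_min) / span for x in values]

rescaled = restrict_range([10.0, 20.0, 25.0, 50.0])
```

The smallest value maps to 0 and the largest to 1, with everything else in between.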

Question 11

Q

in scikit-learn, which two parameters do we use to define convergence?

Answer

A

tol (the tolerance for the stopping criterion)

max_iter (the maximum number of iterations)
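
To show how two such parameters interact, here is a hand-written gradient descent loop. It borrows the names `tol` and `max_iter` but is only a sketch: it is not scikit-learn's code, and the exact stopping criterion differs between scikit-learn estimators.

```python
def gradient_descent(gradient, start, learning_rate=0.1, tol=1e-6, max_iter=1000):
    """Minimise a 1-D function from its gradient; returns (x, iterations, converged)."""
    x = start
    for iteration in range(1, max_iter + 1):
        step = learning_rate * gradient(x)
        x -= step
        if abs(step) < tol:       # update smaller than the tolerance: stop early
            return x, iteration, True
    return x, max_iter, False     # iteration cap reached before meeting the tolerance

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_opt, n_iter, converged = gradient_descent(lambda x: 2 * (x - 3), start=0.0)

# With a very small cap the loop stops before the tolerance is met.
_, _, converged_capped = gradient_descent(lambda x: 2 * (x - 3), start=0.0, max_iter=5)
```

A looser `tol` stops sooner with a less precise answer; a small `max_iter` bounds the run time but may end before convergence.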

Question 12

Q

what are two resampling methods for class imbalance?

Answer

A

undersampling

oversampling
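
A minimal sketch of both ideas in their simplest random form. The class sizes and the helper names are assumptions made for this example.

```python
import random

def random_oversample(minority, target_size, rng):
    """Duplicate randomly chosen minority examples until there are target_size of them."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra

def random_undersample(majority, target_size, rng):
    """Keep a random subset of target_size majority examples, without repeats."""
    return rng.sample(majority, target_size)

rng = random.Random(0)               # fixed seed so the sketch is repeatable
majority = list(range(100))          # hypothetical: 100 majority-class examples
minority = list(range(100, 110))     # hypothetical: 10 minority-class examples

oversampled_minority = random_oversample(minority, len(majority), rng)
undersampled_majority = random_undersample(majority, len(minority), rng)
```

Oversampling grows the minority class to match the majority; undersampling shrinks the majority class to match the minority and discards data in the process.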

Question 13

Q

what are the main methods of oversampling for class imbalance?

Answer

A

data augmentation, SMOTE (Synthetic Minority Over-sampling Technique)
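
The core idea behind SMOTE is to create synthetic minority points by interpolating between a minority example and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea only, not the reference algorithm or any library's implementation; the points and the function name `smote_like_sample` are made up.

```python
import math
import random

def smote_like_sample(minority, k, rng):
    """Create one synthetic point between a minority example and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment from base to neighbour
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]  # hypothetical 2-D minority points
synthetic = [smote_like_sample(minority, k=2, rng=rng) for _ in range(6)]
```

Unlike plain duplication, each synthetic point is new but stays inside the region already occupied by the minority class.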

Question 14

Q

what are the two methods of dealing with missing data?

Answer

A

data imputation

remove the row

Question 15

Q

what are the three methods of data imputation for missing data?

Answer

A

mean imputation

regression

multiple imputation
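
A minimal sketch of the simplest of these, mean imputation, shown next to plain row removal for contrast. It assumes missing values are marked with `None`; the table and helper names are invented for this example. Regression and multiple imputation are not shown.

```python
from statistics import mean

def drop_incomplete_rows(rows):
    """Remove every row that contains a missing value."""
    return [row for row in rows if None not in row]

def mean_impute(rows):
    """Replace each None with the mean of the observed values in that column."""
    columns = list(zip(*rows))
    column_means = [mean(v for v in col if v is not None) for col in columns]
    return [[column_means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

rows = [[1.0, 10.0], [2.0, None], [None, 30.0], [3.0, 20.0]]  # None marks a missing value
complete_only = drop_incomplete_rows(rows)
imputed = mean_impute(rows)
```

Row removal keeps only two of the four rows here, while mean imputation keeps all four at the cost of filling the gaps with column averages.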

(15 cards)