Practical Issues Flashcards
1
Q
What is overfitting?
A
- training error decreases while validation error increases
- the model does not generalize well
- overly complex decision surfaces
- too many parameters
- high variance of the model (it fits noise in the data)
- regularization can help
  - additional term in the error function, dependent on the norm of the weights (sketched below)
- variance component of the error
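A minimal sketch of the regularization term as an L2 (weight-decay) penalty on the weight norm; the function name, `lam` and the weight values are hypothetical illustrations, not a specific library API:

```python
import numpy as np

def regularized_error(data_error, weights, lam=0.01):
    """Data error plus an L2 penalty: E(w) = E_data + (lambda/2) * ||w||^2."""
    penalty = 0.5 * lam * np.sum(weights ** 2)  # squared norm of the weights
    return data_error + penalty

w = np.array([0.5, -1.2, 3.0])
print(regularized_error(data_error=0.8, weights=w, lam=0.01))
```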
2
Q
What is underfitting?
A
- the model neither learns nor generalizes well
- model too simple
- cannot capture the full complexity of the data
- too few parameters
- high bias of the model
- bias component of the error
3
Q
What are bias and variance?
A
- bias
  - measures the distortion of an estimate
  - a systematic distance/shift from the true value
- variance
  - measures the dispersion of an estimate
  - how spread out the estimates are
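A minimal numerical illustration of the two notions, assuming we estimate a hypothetical true mean from many repeated samples with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 5.0  # hypothetical quantity we try to estimate

# Repeat the same estimation on 1000 fresh samples of size 20.
estimates = np.array([rng.normal(true_mean, 2.0, size=20).mean()
                      for _ in range(1000)])

bias = estimates.mean() - true_mean  # distortion/shift of the estimator
variance = estimates.var()           # dispersion of the estimator
print(f"bias={bias:.4f}, variance={variance:.4f}")
```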
4
Q
What is the hold-out procedure? What does it consist of?
A
- a technique to evaluate a classifier
- a small portion of the training set is kept separate (the validation set)
- a model is trained on the remaining portion of the dataset
- the model is tested against the validation set [unbiased estimate of the error]
- this is done multiple times with different parameter values
- the best combination is selected
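A minimal sketch of the procedure with scikit-learn; the dataset (iris), the classifier (SVC) and the candidate values of C are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Keep a small portion of the training data aside as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Try several hyperparameter values; keep the one with the best validation score.
best_C, best_score = None, -1.0
for C in (0.1, 1.0, 10.0):
    model = SVC(C=C).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_C, best_score = C, score
print(best_C, best_score)
```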
5
Q
What is k-fold cross-validation? What does it consist of?
A
- a technique to evaluate a classifier
- the training set is divided into k partitions (folds)
- the hold-out method is applied to each of those partitions in turn
- k different models are built
- the final error is obtained by averaging the individual errors
- this is repeated for different values of the hyperparameters
- the architecture with the smallest error is selected
- useful for limited training sets [a large test set is still better]
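A minimal sketch with scikit-learn's cross_val_score; the dataset, classifier and hyperparameter grid are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# For each hyperparameter value, train k models and average their scores.
for C in (0.1, 1.0, 10.0):
    scores = cross_val_score(SVC(C=C), X, y, cv=5)  # k = 5 folds
    print(f"C={C}: mean accuracy = {scores.mean():.3f}")
```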
6
Q
What is the leave-one-out cross-validation?
A
A special case of k-fold cross-validation where k = |Tr|, i.e. one fold per training example.
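A minimal sketch, assuming scikit-learn's LeaveOneOut splitter; the dataset and classifier are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# One fold per training example: |Tr| models, each tested on a single sample.
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=LeaveOneOut())
print(scores.mean())
```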
7
Q
What happens for different values of k in the k-fold cross-validation?
A
- high k -> larger training sets, smaller validation sets
  - less bias but more variance
- low k -> smaller training sets, larger validation sets
  - more bias but less variance
8
Q
Model evaluation - accuracy
A
- very common
- proportion of correct decisions
- not appropriate when the number of positive examples is much lower than the number of negative examples [precision, recall and F1 are better]
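A minimal sketch of why accuracy misleads on unbalanced data; the 95/5 class split and the always-negative classifier are hypothetical:

```python
import numpy as np

# Hypothetical unbalanced problem: 95% negative, 5% positive examples.
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.zeros(100, dtype=int)  # degenerate classifier: always predicts 0

accuracy = (y_true == y_pred).mean()
print(accuracy)  # 0.95 -- looks high, yet not a single positive is found
```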
9
Q
Model evaluation - precision
A
- TP / (TP + FP)
- P(relevant | returned)
- “degree of soundness” of the system
10
Q
Model evaluation - recall
A
- TP / (TP + FN)
- P(returned | relevant)
- “degree of completeness” of the system
11
Q
Model evaluation - FScore
A
- trade-off between precision and recall
- the weighted harmonic mean of precision and recall (F1 weights them equally)
- F1 = 2pr / (p + r)
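A minimal sketch computing the three measures from raw counts; the TP/FP/FN values are hypothetical:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute the three measures from raw counts (assumes nonzero denominators)."""
    p = tp / (tp + fp)        # soundness: fraction of returned items that are relevant
    r = tp / (tp + fn)        # completeness: fraction of relevant items returned
    f1 = 2 * p * r / (p + r)  # harmonic mean of p and r
    return p, r, f1

print(precision_recall_f1(tp=40, fp=10, fn=20))  # (0.8, 0.666..., 0.727...)
```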
12
Q
Multiclass classification
A
- classification task with more than two classes
- assumption that each sample is assigned to one and only one label
13
Q
Multiclass classification to binary problems
A
- one-vs-rest strategy
  - one classifier per class (that class against all the rest)
  - interpretable (each classifier can be inspected) and efficient (n classifiers for n classes)
  - most commonly used strategy
  - each classifier is trained on the whole dataset
- one-vs-one strategy
  - one classifier per pair of classes
  - the class with the most votes is selected
  - computationally slower (n*(n-1)/2 models)
  - useful for kernel algorithms
  - each classifier is trained on only the subset of the dataset belonging to its two classes
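A minimal sketch of both strategies, assuming scikit-learn's wrappers; the dataset and base classifier are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(SVC()).fit(X, y)  # n classifiers: each class vs the rest
ovo = OneVsOneClassifier(SVC()).fit(X, y)   # n*(n-1)/2 classifiers, one per pair

# With n = 3 classes both counts happen to be 3 (3 and 3*2/2 respectively).
print(len(ovr.estimators_), len(ovo.estimators_))
```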
14
Q
Evaluation of multiclass classification
A
- confusion matrix
  - an intuitive way to show the results
  - rows: true labels, columns: predicted labels
- precision
  - calculated separately for each class
  - diagonal element of the confusion matrix / sum of all elements in its column
- recall
  - calculated separately for each class
  - diagonal element of the confusion matrix / sum of all elements in its row
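A minimal sketch of the per-class computation, assuming scikit-learn's confusion_matrix; the label vectors are hypothetical:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred)  # rows: true labels, columns: predictions

diag = np.diag(cm)
precision_per_class = diag / cm.sum(axis=0)  # diagonal / column totals
recall_per_class = diag / cm.sum(axis=1)     # diagonal / row totals
print(precision_per_class, recall_per_class)
```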
15
Q
Micro and Macro Averaging
A
- extract a single number from the per-class evaluation
  - for precision, recall or F1-score
  - two different kinds of average can be adopted
- macro averaging
  - compute the score separately for each class, then take the average
  - useful when the dataset is unbalanced
  - (vc1 + vc2 + vc3) / 3
- micro averaging
  - calculate the measure from the grand totals of the numerators and denominators
  - biased towards the most populated class
  - (nc1 + nc2 + nc3) / (dc1 + dc2 + dc3)
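A minimal sketch of the two averages, assuming scikit-learn's f1_score; the label vectors are hypothetical:

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

# Macro: average the per-class scores, each class weighted equally.
print(f1_score(y_true, y_pred, average="macro"))
# Micro: pool all counts first, so populated classes dominate the result.
print(f1_score(y_true, y_pred, average="micro"))
```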