Practical Issues Flashcards
1
Q
What is overfitting?
A
- training error decreases while validation error increases
- the model does not generalize well
- overly complex decision surfaces
- too many parameters
- high variance of the model (it fits noise in the data)
- regularization can help
  - additional term in the error function, dependent on the norm of the weights (sketched below)
- variance component of the error
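A minimal sketch of the regularization term as an L2 (weight-decay) penalty on the weight norm; the function name, `lam` and the weight values are hypothetical illustrations, not a specific library API:

```python
import numpy as np

def regularized_error(data_error, weights, lam=0.01):
    """Data error plus an L2 penalty: E(w) = E_data + (lambda/2) * ||w||^2."""
    penalty = 0.5 * lam * np.sum(weights ** 2)  # squared norm of the weights
    return data_error + penalty

w = np.array([0.5, -1.2, 3.0])
print(regularized_error(data_error=0.8, weights=w, lam=0.01))
```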
2
Q
What is underfitting?
A
- the model neither learns nor generalizes well
- model too simple
- cannot capture the full complexity of the data
- too few parameters
- high bias of the model
- bias component of the error
3
Q
What are bias and variance?
A
- bias
  - measures the distortion of an estimate
  - a systematic distance/shift from the true value
- variance
  - measures the dispersion of an estimate
  - how spread out the estimates are
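A minimal numerical illustration of the two notions, assuming we estimate a hypothetical true mean from many repeated samples with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 5.0  # hypothetical quantity we try to estimate

# Repeat the same estimation on 1000 fresh samples of size 20.
estimates = np.array([rng.normal(true_mean, 2.0, size=20).mean()
                      for _ in range(1000)])

bias = estimates.mean() - true_mean  # distortion/shift of the estimator
variance = estimates.var()           # dispersion of the estimator
print(f"bias={bias:.4f}, variance={variance:.4f}")
```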
4
Q
What is the hold-out procedure? What does it consist of?
A
- a technique to evaluate a classifier
- a small portion of the training set is kept separate (the validation set)
- a model is trained on the remaining portion of the dataset
- the model is tested against the validation set [unbiased estimate of the error]
- this is done multiple times with different parameter values
- the best combination is selected
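A minimal sketch of the procedure with scikit-learn; the dataset (iris), the classifier (SVC) and the candidate values of C are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Keep a small portion of the training data aside as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Try several hyperparameter values; keep the one with the best validation score.
best_C, best_score = None, -1.0
for C in (0.1, 1.0, 10.0):
    model = SVC(C=C).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_C, best_score = C, score
print(best_C, best_score)
```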
5
Q
What is k-fold cross-validation? What does it consist of?
A
- a technique to evaluate a classifier
- the training set is divided into k partitions (folds)
- the hold-out method is applied to each of those partitions in turn
- k different models are built
- the final error is obtained by averaging the individual errors
- this is repeated for different values of the hyperparameters
- the architecture with the smallest error is selected
- useful for limited training sets [a large test set is still better]
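A minimal sketch with scikit-learn's cross_val_score; the dataset, classifier and hyperparameter grid are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# For each hyperparameter value, train k models and average their scores.
for C in (0.1, 1.0, 10.0):
    scores = cross_val_score(SVC(C=C), X, y, cv=5)  # k = 5 folds
    print(f"C={C}: mean accuracy = {scores.mean():.3f}")
```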
6
Q
What is the leave-one-out cross-validation?
A
A special case of k-fold cross-validation where k = |Tr|, i.e. one fold per training example.
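A minimal sketch, assuming scikit-learn's LeaveOneOut splitter; the dataset and classifier are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# One fold per training example: |Tr| models, each tested on a single sample.
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=LeaveOneOut())
print(scores.mean())
```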
7
Q
What happens for different values of k in the k-fold cross-validation?
A
- high k -> larger training sets, smaller validation sets
  - less bias but more variance
- low k -> smaller training sets, larger validation sets
  - more bias but less variance
8
Q
Model evaluation - accuracy
A
- very common
- proportion of correct decisions
- not appropriate when the number of positive examples is much lower than the number of negative examples [precision, recall and F1 are better]
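A minimal sketch of why accuracy misleads on unbalanced data; the 95/5 class split and the always-negative classifier are hypothetical:

```python
import numpy as np

# Hypothetical unbalanced problem: 95% negative, 5% positive examples.
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.zeros(100, dtype=int)  # degenerate classifier: always predicts 0

accuracy = (y_true == y_pred).mean()
print(accuracy)  # 0.95 -- looks high, yet not a single positive is found
```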
9
Q
Model evaluation - precision
A
- TP / (TP + FP)
- P(relevant | returned)
- “degree of soundness” of the system
10
Q
Model evaluation - recall
A
- TP / (TP + FN)
- P(returned | relevant)
- “degree of completeness” of the system
11
Q
Model evaluation - FScore
A
- trade-off between precision and recall
- the weighted harmonic mean of precision and recall (F1 weights them equally)
- F1 = 2pr / (p + r)
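A minimal sketch computing the three measures from raw counts; the TP/FP/FN values are hypothetical:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute the three measures from raw counts (assumes nonzero denominators)."""
    p = tp / (tp + fp)        # soundness: fraction of returned items that are relevant
    r = tp / (tp + fn)        # completeness: fraction of relevant items returned
    f1 = 2 * p * r / (p + r)  # harmonic mean of p and r
    return p, r, f1

print(precision_recall_f1(tp=40, fp=10, fn=20))  # (0.8, 0.666..., 0.727...)
```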
12
Q
Multiclass classification
A
- classification task with more than two classes
- assumption that each sample is assigned to one and only one label
13
Q
Multiclass classification to binary problems
A
- one-vs-rest strategy
  - one classifier per class (that class against all the rest)
  - interpretable (each classifier can be inspected) and efficient (n classifiers for n classes)
  - most commonly used strategy
  - each classifier is trained on the whole dataset
- one-vs-one strategy
  - one classifier per pair of classes
  - the class with the most votes is selected
  - computationally slower (n*(n-1)/2 models)
  - useful for kernel algorithms
  - each classifier is trained on only the subset of the dataset belonging to its two classes
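A minimal sketch of both strategies, assuming scikit-learn's wrappers; the dataset and base classifier are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(SVC()).fit(X, y)  # n classifiers: each class vs the rest
ovo = OneVsOneClassifier(SVC()).fit(X, y)   # n*(n-1)/2 classifiers, one per pair

# With n = 3 classes both counts happen to be 3 (3 and 3*2/2 respectively).
print(len(ovr.estimators_), len(ovo.estimators_))
```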
14
Q
Evaluation of multiclass classification
A
- confusion matrix
  - an intuitive way to show the results
  - rows: true labels, columns: predicted labels
- precision
  - calculated separately for each class
  - diagonal element of the confusion matrix / sum of all elements in its column
- recall
  - calculated separately for each class
  - diagonal element of the confusion matrix / sum of all elements in its row
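A minimal sketch of the per-class computation, assuming scikit-learn's confusion_matrix; the label vectors are hypothetical:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred)  # rows: true labels, columns: predictions

diag = np.diag(cm)
precision_per_class = diag / cm.sum(axis=0)  # diagonal / column totals
recall_per_class = diag / cm.sum(axis=1)     # diagonal / row totals
print(precision_per_class, recall_per_class)
```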
15
Q
Micro and Macro Averaging
A
- extract a single number from the per-class evaluation
  - for precision, recall or F1-score
  - two different kinds of average can be adopted
- macro averaging
  - compute the score separately for each class, then take the average
  - useful when the dataset is unbalanced
  - (vc1 + vc2 + vc3) / 3
- micro averaging
  - calculate the measure from the grand totals of the numerators and denominators
  - biased towards the most populated class
  - (nc1 + nc2 + nc3) / (dc1 + dc2 + dc3)
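A minimal sketch of the two averages, assuming scikit-learn's f1_score; the label vectors are hypothetical:

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

# Macro: average the per-class scores, each class weighted equally.
print(f1_score(y_true, y_pred, average="macro"))
# Micro: pool all counts first, so populated classes dominate the result.
print(f1_score(y_true, y_pred, average="micro"))
```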