Practical Issues Flashcards

1
Q

What is overfitting?

A
  • training error decreases while validation error increases
    • the model does not generalize well
  • complex decision surfaces
    • too many parameters
    • high variance (the model fits noise in the training data)
  • regularization can help (sketch below)
    • an additional term in the error function, dependent on the norm of the weights
  • corresponds to variance error
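A minimal sketch of the regularization bullet, assuming a sum-of-squares data error with an L2 weight penalty; the function name and the lam value are illustrative, not from the cards.

    import numpy as np

    def regularized_error(w, X, y, lam=0.1):
        # Data term: ordinary sum-of-squares error on the training set.
        data_error = 0.5 * np.sum((X @ w - y) ** 2)
        # Regularization term: depends only on the norm of the weights,
        # discouraging the large weights typical of overfit models.
        penalty = 0.5 * lam * np.dot(w, w)
        return data_error + penalty
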
2
Q

What is underfitting?

A
  • model is neither learning nor generalizing well
  • model too simple
    • cannot capture the full complexity of the data
    • too few parameters
    • high bias (overly strong assumptions about the data)
  • corresponds to bias error
3
Q

What are bias and variance?

A
  • bias
    • measures the distortion of an estimate
    • a systematic distance/shift from the true value
  • variance
    • measures the dispersion of an estimate
    • how spread out the estimates are (see the decomposition below)
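The two quantities combine in the standard bias-variance decomposition of an estimator's expected squared error, a well-known identity added here for reference:

    \mathbb{E}\left[(\hat\theta - \theta)^2\right]
      = \underbrace{\left(\mathbb{E}[\hat\theta] - \theta\right)^2}_{\text{bias}^2}
      + \underbrace{\mathbb{E}\left[(\hat\theta - \mathbb{E}[\hat\theta])^2\right]}_{\text{variance}}
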
4
Q

What is the hold-out procedure? What does it consist of?

A
  • a technique to evaluate a classifier
    • a small portion of the training set is kept separate (the validation set)
    • this is done multiple times with different parameter values
      • a model is trained on the remaining portion of the dataset
      • the model is tested against the validation set [an unbiased estimate of the error]
  • the best parameter combination is selected (sketch below)
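A minimal sketch of the procedure, assuming scikit-learn; the dataset, the LogisticRegression model, and the candidate C values are stand-in assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    X, y = np.random.rand(200, 5), np.random.randint(0, 2, 200)  # stand-in data

    # Keep a small portion of the training set separate as a validation set.
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)

    best_C, best_acc = None, -1.0
    for C in [0.01, 0.1, 1.0, 10.0]:          # different parameter values
        model = LogisticRegression(C=C).fit(X_train, y_train)
        acc = model.score(X_val, y_val)       # tested against the validation set
        if acc > best_acc:
            best_C, best_acc = C, acc         # best combination is selected
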
5
Q

What is k-fold cross validation? What does it consist of?

A
  • a technique to evaluate a classifier
    • the training set is divided into k partitions
    • repeat for different values of the hyperparameter (sketch below)
      • the hold-out method is applied to each of those partitions in turn
        • k different models are built
      • the final error is obtained by averaging the individual errors
    • the architecture with the smallest error is selected
  • useful for limited training sets [a large separate test set is still better]
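A minimal sketch, again assuming scikit-learn with stand-in data and C values; cross_val_score builds and scores the k models for each hyperparameter value.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LogisticRegression

    X, y = np.random.rand(200, 5), np.random.randint(0, 2, 200)  # stand-in data

    best_C, best_score = None, -1.0
    for C in [0.01, 0.1, 1.0, 10.0]:            # repeat per hyperparameter value
        # cv=5: five partitions, five models, one score per held-out fold.
        scores = cross_val_score(LogisticRegression(C=C), X, y, cv=5)
        mean_score = scores.mean()              # average the individual errors
        if mean_score > best_score:
            best_C, best_score = C, mean_score  # smallest-error setting wins
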
6
Q

What is the leave-one-out cross-validation?

A

A special case of k-fold cross-validation where k = |Tr| (the size of the training set), i.e., each training example is held out once.
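A minimal sketch, assuming scikit-learn and stand-in data; LeaveOneOut builds one fold per training example.

    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.linear_model import LogisticRegression

    X, y = np.random.rand(50, 5), np.random.randint(0, 2, 50)  # stand-in data

    # |Tr| folds: each fold holds out exactly one example.
    scores = cross_val_score(LogisticRegression(), X, y, cv=LeaveOneOut())
    print(scores.mean())   # average over |Tr| single-example tests
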

7
Q

What happens for different values of k in the k-fold cross-validation?

A
  • high k -> larger training sets, smaller validation sets
    • less bias but more variance in the error estimate
  • low k -> smaller training sets, larger validation sets
    • more bias but less variance in the error estimate
8
Q

Model evaluation - accuracy

A
  • very common
  • proportion of correct decisions: (TP+TN)/(TP+TN+FP+FN)
  • not appropriate when the number of positive examples is much lower than the number of negative examples [precision, recall and F1 are better] (example below)
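A tiny worked example of why accuracy misleads on unbalanced data; the 95/5 split and the always-negative classifier are made-up for illustration.

    import numpy as np

    # 95 negatives, 5 positives; the classifier always predicts negative.
    y_true = np.array([0] * 95 + [1] * 5)
    y_pred = np.zeros(100, dtype=int)

    accuracy = (y_true == y_pred).mean()
    print(accuracy)   # 0.95 -- looks strong, yet no positive is ever found
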
9
Q

Model evaluation - precision

A
  • TP/(TP+FP)
    • P(relevant | returned)
  • “degree of soundness” of the system
10
Q

Model evaluation - recall

A
  • TP/(TP+FN)
    • P(returned | relevant)
  • “degree of completeness” of the system
11
Q

Model evaluation - FScore

A
  • trade-off between precision and recall
  • weighted harmonic mean of the precision and the recall
    • with equal weights this gives F1 = 2pr/(p+r) (example below)
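A small sketch tying the last three cards together; the TP/FP/FN counts are made-up numbers.

    # Made-up counts for illustration.
    tp, fp, fn = 40, 10, 20

    precision = tp / (tp + fp)   # P(relevant | returned), "soundness"
    recall = tp / (tp + fn)      # P(returned | relevant), "completeness"
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

    print(precision, recall, f1)   # 0.8  0.666...  0.727...
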
12
Q

Multiclass classification

A
  • classification task with more than two classes
  • assumption that each sample is assigned to one and only one label

13
Q

Reducing multiclass classification to binary problems

A
  • one-vs-rest strategy
    • one classifier per class (that class against the rest)
    • interpretable (each classifier can be inspected) and efficient (only n models)
    • most commonly used strategy
    • each classifier is trained on the whole dataset
  • one-vs-one strategy
    • one classifier per pair of classes
    • the class with the most votes is selected
    • computationally slower (n*(n-1)/2 models)
    • useful for kernel algorithms
    • each classifier is trained on only the subset of the dataset belonging to its two classes (sketch below)
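A minimal sketch of both strategies, assuming scikit-learn's meta-estimators over a stand-in 3-class dataset.

    import numpy as np
    from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
    from sklearn.svm import LinearSVC

    X, y = np.random.rand(150, 4), np.random.randint(0, 3, 150)  # stand-in data

    ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)  # n binary classifiers
    ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)   # n*(n-1)/2 classifiers

    print(len(ovr.estimators_))   # 3: one model per class
    print(len(ovo.estimators_))   # 3: 3*2/2 pairwise models
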
14
Q

Evaluation of multiclass classification

A
  • confusion matrix
    • intuitive way to show the results
    • rows: true labels, columns: predicted labels
  • precision
    • calculated separately for each class
    • number on the diagonal of the cm / sum of all elements in its column
  • recall
    • calculated separately for each class
    • number on the diagonal of the cm / sum of all elements in its row (sketch below)
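A minimal sketch of the per-class computation; the 3x3 confusion matrix is made-up, with rows as true labels and columns as predictions.

    import numpy as np

    cm = np.array([[50,  3,  2],    # made-up confusion matrix
                   [ 5, 40,  5],    # rows = true, columns = predicted
                   [ 2,  8, 35]])

    diag = np.diag(cm).astype(float)
    precision = diag / cm.sum(axis=0)   # diagonal / column totals
    recall = diag / cm.sum(axis=1)      # diagonal / row totals
    print(precision, recall)
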
15
Q

Micro and Macro Averaging

A
  • extract a single number from the per-class evaluation
    • precision, recall or F1-score
    • two different kinds of average can be adopted (sketch below)
  • macro averaging
    • compute the score separately for each class and then take the average
    • useful when the dataset is unbalanced
    • (v_c1 + v_c2 + v_c3)/3
  • micro averaging
    • calculate the measure from the grand total of the numerator and denominator
    • biased towards the most populated class
    • (n_c1 + n_c2 + n_c3)/(d_c1 + d_c2 + d_c3)
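A minimal sketch contrasting the two averages for precision, reusing the made-up confusion matrix convention from the previous card.

    import numpy as np

    cm = np.array([[50,  3,  2],    # made-up confusion matrix
                   [ 5, 40,  5],    # rows = true, columns = predicted
                   [ 2,  8, 35]])

    diag = np.diag(cm).astype(float)

    # Macro: per-class precision first, then a plain average over classes.
    macro_precision = (diag / cm.sum(axis=0)).mean()

    # Micro: grand total of numerators (TP) over grand total of denominators
    # (TP + FP); for single-label multiclass this equals overall accuracy.
    micro_precision = diag.sum() / cm.sum()

    print(macro_precision, micro_precision)
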