Classification - Part 3 Flashcards
What is Naive Bayes classification? Name the method and its goal
- Probabilistic classification technique that considers each attribute and class label as random variables
Goal: Find the class C that maximizes the conditional probability
P(C|A) -> Probability of class C given Attribute A
When is the application of the Bayes Theorem useful?
Bayes Theorem: P(C|A) = (P(A|C)*P(C)) / P(A)
Useful situations:
- P(C|A) is unknown
- P(A|C), P(A) and P(C) are known or easy to estimate
What's the difference between prior and posterior probability (Bayes Theorem)?
- Prior probability describes the probability of an event before evidence is seen
- Posterior probability describes the probability of an event after evidence is seen
How do you apply Bayes Theorem to the classification task?
- Compute the probability P(C|A) for all values of C using Bayes Theorem
- Optionally: normalize the likelihoods of the classes
- Choose value of C that maximizes P(C|A)
- P(A) is the same for all classes (so it can be neglected when comparing the class probabilities)
- Only need to estimate P(C) and P(A|C)
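A minimal sketch of this decision rule in Python (the class labels, attribute values, and probabilities below are made up for illustration; in practice they come from the estimation steps in the following cards):

```python
# Naive Bayes decision rule: choose the class C that maximizes
# P(C) * product of P(Ai|C); P(A) is dropped because it is the same for all classes.

# Toy estimates (made up for illustration)
priors = {"yes": 0.6, "no": 0.4}
cond = {  # cond[class][attribute value] = P(attribute value | class)
    "yes": {"outlook=sunny": 0.2, "wind=strong": 0.5},
    "no":  {"outlook=sunny": 0.6, "wind=strong": 0.4},
}

def classify(attribute_values):
    scores = {}
    for c, prior in priors.items():
        score = prior
        for a in attribute_values:
            score *= cond[c][a]
        scores[c] = score
    # the class with the highest unnormalized score wins
    return max(scores, key=scores.get), scores

label, scores = classify(["outlook=sunny", "wind=strong"])
print(label, scores)  # ('no', {'yes': 0.06, 'no': 0.096})
```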
How to estimate the prior probability P(C)?
- Count the records in the training set that are labeled with class C
- Divide this count by the overall number of records in the training data
Explain the independence assumption and its implications for estimating P(A|C) for Naive Bayes
- Naive Bayes assumes that all attributes are statistically independent
- This assumption is almost never correct
-> This assumption allows the joint probability P(A|C) to be reformulated as the product of the individual probabilities: P(A1, ..., An | Cj) = P(A1|Cj) * P(A2|Cj) * ... * P(An|Cj)
-> The individual probabilities P(Ai|Cj) can then be estimated directly from the training data
How to estimate the probabilities P(Ai|Cj)?
- Count how often an attribute value appears together with class Cj
- Divide the count by the overall number of records belonging to class Cj
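A minimal counting sketch, assuming a tiny made-up training set; the same counts also give the prior P(C) from the previous card:

```python
from collections import Counter, defaultdict

# Toy training data: ({attribute: value}, class label); values are made up.
training_data = [
    ({"outlook": "sunny", "wind": "strong"}, "no"),
    ({"outlook": "sunny", "wind": "weak"},   "yes"),
    ({"outlook": "rain",  "wind": "strong"}, "no"),
    ({"outlook": "rain",  "wind": "weak"},   "yes"),
]

class_counts = Counter(label for _, label in training_data)
pair_counts = defaultdict(Counter)  # pair_counts[label][(attribute, value)]
for attrs, label in training_data:
    for attr, value in attrs.items():
        pair_counts[label][(attr, value)] += 1

n = len(training_data)
prior = {c: count / n for c, count in class_counts.items()}  # P(C)
cond = {c: {pair: cnt / class_counts[c] for pair, cnt in pairs.items()}  # P(Ai|Cj)
        for c, pairs in pair_counts.items()}

print(prior["yes"])                    # 0.5
print(cond["no"][("wind", "strong")])  # 1.0
```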
What are the names of the parts of the Bayes Theorem?
- P(A|C) Class conditional probability of evidence
- P(C) Prior probability of class
- P(A) Prior probability of evidence
- P(C|A) Posterior probability of class C
How to normalize the likelihoods of the two classes P(C|A)?
- Divide each class's probability by the sum of the probabilities of all classes
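A small worked example, reusing the made-up scores 0.06 and 0.096 from the sketch above: P(yes|A) = 0.06 / (0.06 + 0.096) ≈ 0.38 and P(no|A) = 0.096 / (0.06 + 0.096) ≈ 0.62, so the normalized values sum to 1.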
How should you handle numerical attributes when applying Naive Bayes?
Option 1) Discretize the numerical attributes (map the numerical values to categories)
Option 2) Assume that the numerical attributes follow a normal distribution given the class
- estimate the distribution parameters from the training data
- use the resulting probability density to estimate the conditional probability P(Ai|C)
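A minimal sketch of option 2, assuming the per-class sample mean and standard deviation have already been estimated from the training data (the numbers are made up):

```python
import math

def gaussian_density(x, mean, std):
    """Normal density used as an estimate of P(Ai = x | C)."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

# Made-up per-class parameters for a numerical attribute "temperature"
params = {"yes": (21.0, 3.0), "no": (27.0, 4.0)}  # (sample mean, standard deviation)

for c, (mean, std) in params.items():
    # density value is plugged into the Naive Bayes product like any other P(Ai|C)
    print(c, gaussian_density(24.0, mean, std))
```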
Which distribution parameters can be estimated from the training data?
- Sample mean
- Standard deviation
How to handle missing values in the training data?
- Don't include the record in the frequency counts for the attribute value-class combination (just pretend that this record does not exist)
How to handle missing values in the test data?
The attribute is omitted from the calculation
Explain the zero-frequency problem
- If an attribute value never occurs together with a class value in the training data, the estimated P(Ai|C) is zero and therefore the posterior probability for that class is also zero!
Solution: Laplace estimator -> add 1 to the count of every attribute value-class combination
Laplace: P(Ai|C) = (Nic + 1) / (Nc + |Vi|)
Nic = number of records with attribute value Ai and class C, Nc = number of records with class C, |Vi| = number of distinct values of attribute Ai in the training set
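A quick worked example with made-up counts: if Nic = 0, Nc = 7, and attribute Ai has 3 distinct values (|Vi| = 3), the smoothed estimate is P(Ai|C) = (0 + 1) / (7 + 3) = 0.1 instead of 0, so a single unseen combination no longer rules out the class.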
What are the characteristics of Naive Bayes?
- Works well because classification only requires that the maximum probability is assigned to the correct class (even if the violated independence assumption leads to inaccurate probability estimates)
- Robust to isolated noise points (averaged out)
- Robust to irrelevant attributes (P(Ai|C) distributed uniformly for Ai)
- Redundant attributes can cause problems -> use subset of attributes
What is the technical advantage of Naive Bayes?
- Learning is computationally cheap because the probabilities can be calculated by one pass over the training data
- Storing the probabilities does not require a lot of memory
For which problems can you use Support Vector Machines?
- Two-class problems
- Examples described by continuous attributes
When do SVMs achieve good results?
- For high dimensional data
How do SVMs work?
- They find a linear hyperplane (decision boundary) that separates the data
How does an SVM find the best hyperplane?
- To avoid overfitting and to generalize better to unseen data, the hyperplane that maximizes the margin to the closest points (the support vectors) is chosen
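In standard textbook notation: for a hyperplane w·x + b = 0 with labels y_i in {-1, +1}, the margin is 2/||w||, so maximizing the margin amounts to solving: minimize (1/2)*||w||^2 subject to y_i*(w·x_i + b) >= 1 for every training point i.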
How to deal with noise points in SVMs?
- Use slack variables for margin computation
- Slack variables indicate whether a record is used or ignored; they result in a penalty for each data point that violates the decision boundary
Goal: Have a large margin without ignoring too many data points
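In the usual soft-margin formulation, each training point gets a slack variable ξ_i and the penalty is controlled by a parameter C: minimize (1/2)*||w||^2 + C * sum_i ξ_i subject to y_i*(w·x_i + b) >= 1 - ξ_i and ξ_i >= 0. A large C punishes violations heavily (fewer ignored points, smaller margin); a small C tolerates more violations (larger margin).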
How to handle decision boundaries that are not linear with SVMs?
- Transform the data into a higher-dimensional space in which the classes are linearly separable
- Different kernel functions can be used for this transformation
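A minimal sketch using scikit-learn (the library and the toy dataset are assumptions for illustration, not prescribed by the cards): it compares a linear kernel with an RBF kernel on two-class data that is not linearly separable, with C controlling the slack penalty.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two-class data that is not linearly separable in the original space
X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0)  # C = penalty for margin violations (slack)
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))

# The RBF kernel implicitly maps the data into a higher-dimensional space
# where a linear separation exists, so it scores much higher here.
```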
What are the characteristics of SVMs?
- Most successful classification technique for high dimensional data before DNNs appeared
- Hyperparameter selection often has a high impact on the performance of SVMs
What are the application areas for SVMs?
- Text classification
- Computer vision
- Handwritten digit recognition
- SPAM detection
- Bioinformatics