Chapter 10 Flashcards
What is cross-validation?
In an attempt to estimate the true generalization error of an model on a set of data, the data-set is split up in different sets of trai9ning and test data, and paramter estimation and training and test error is calculated.
What is the generalization error?
A measure of how well our model performs on average assuming that we have infinite test sets.
Hold-out cross validation:
Just 1 iteration where test and training are split up and the generalization error is assumed to be equal to the test error.
K-fold-CV:
Dataset split up K times into training and test data with N/K observations in each.
What is forward selection?
Start by having 0 attributes. Then add attribute that minimizes generalization error the most (cross-validation). Then add another variable and do cross validation again - stop when generalization error does not decrease anymore.
What is backwards selection?
start with full model and remove stuff.