Testing and validating Flashcards
what is the generalisation error?
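the error rate on new cases (also called the out-of-sample error); it tells you how well your model will perform on instances it has never seen before, and it is estimated by evaluating the model on a held-out test set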
what does a low training error + high generalisation error mean?
your model is overfitting the training data
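A minimal sketch of this symptom, assuming made-up noisy data and a 1-nearest-neighbour classifier chosen only because it memorises its training set (the names make_data, predict_1nn and error_rate are hypothetical):

```python
import random

random.seed(0)

def make_data(n, noise=0.3):
    """Points x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = random.random()
        y = x > 0.5
        if random.random() < noise:
            y = not y  # label noise that a memorising model will also learn
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def error_rate(train, data):
    """Fraction of points in `data` that the 1-NN model built on `train` gets wrong."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)

train_set = make_data(100)
test_set = make_data(200)  # new cases the model has never seen

train_error = error_rate(train_set, train_set)
test_error = error_rate(train_set, test_set)
print(f"training error: {train_error:.2f}")
print(f"estimated generalisation error: {test_error:.2f}")
```

Each training point is its own nearest neighbour, so the training error is zero, while the error on the held-out points is clearly higher because the model has memorised the noise.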
what is meant by testing and validating?
trying out a model on new cases (data held out from training) to see how well the model generalises
what is a validation set?
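a second holdout set, kept separate from the test set, that is used to compare models and tune hyperparameters. Without it you end up adapting the model and its hyperparameters to the test set, so the measured test error is too optimistic and the model doesn't work as well in production.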
what do you do with the validation set?
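you train multiple models with various hyperparameters on the training set, then you select the model and hyperparameters that perform best on the validation set. Once you are happy with the model, you run a single final test on the test set to estimate the generalisation error.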
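The selection procedure can be sketched as follows. This is an illustration only: the toy data, the k-nearest-neighbours classifier, the candidate values and the 400/100/100 split are all assumptions, and the helper names are hypothetical.

```python
import random

random.seed(1)

def make_data(n, noise=0.2):
    """Points x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = random.random()
        y = x > 0.5
        if random.random() < noise:
            y = not y
        data.append((x, y))
    return data

def predict_knn(train, x, n_neighbours):
    """Majority vote among the n_neighbours training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbours]
    votes = sum(y for _, y in nearest)  # True counts as 1
    return votes * 2 > n_neighbours

def error_rate(train, data, n_neighbours):
    wrong = sum(predict_knn(train, x, n_neighbours) != y for x, y in data)
    return wrong / len(data)

data = make_data(600)
train_set, val_set, test_set = data[:400], data[400:500], data[500:]

# The hyperparameter here is the number of neighbours (odd values avoid ties).
candidates = [1, 5, 15, 45]

# Every candidate is fitted on the training set and scored on the validation set.
val_errors = {n: error_rate(train_set, val_set, n) for n in candidates}
best_n = min(val_errors, key=val_errors.get)

# The test set is touched once, only for the selected model.
test_error = error_rate(train_set, test_set, best_n)
print("validation errors:", val_errors)
print("selected n_neighbours:", best_n)
print(f"estimated generalisation error: {test_error:.2f}")
```

Because the test set plays no part in choosing n_neighbours, its error remains an unbiased estimate for the selected model.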
what is cross-validation?
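the training set is split into k folds; each model is trained k times, each time on a different combination of k-1 folds and validated on the remaining fold, and the k validation scores are averaged. This avoids setting aside a separate validation set, at the cost of k times more training.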
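A minimal k-fold sketch using only the standard library. The "model" is a deliberately trivial mean predictor and the data are synthetic; both are assumptions for illustration, and the function names are hypothetical.

```python
import random
from statistics import mean

def k_fold_splits(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; each fold is the validation part exactly once."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = list(range(start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop

def cross_val_errors(data, n_folds, fit, error):
    """Train on every combination of n_folds - 1 folds, validate on the remaining one."""
    errors = []
    for train_idx, val_idx in k_fold_splits(len(data), n_folds):
        model = fit([data[i] for i in train_idx])
        errors.append(error(model, [data[i] for i in val_idx]))
    return errors

def fit_mean(rows):
    """Toy model: always predict the mean target seen during training."""
    return mean(y for _, y in rows)

def mse(model, rows):
    """Mean squared error of the constant prediction `model` on `rows`."""
    return mean((y - model) ** 2 for _, y in rows)

random.seed(2)
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(20)]
random.shuffle(data)  # shuffle so the folds are not ordered by x

fold_errors = cross_val_errors(data, n_folds=5, fit=fit_mean, error=mse)
cv_error = mean(fold_errors)
print("per-fold validation errors:", [round(e, 2) for e in fold_errors])
print("cross-validation error:", round(cv_error, 2))
```

To compare several models or hyperparameter settings, cross_val_errors would be run once per candidate and the candidate with the lowest averaged error kept.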