Lecture 8 Flashcards
What level of variance and bias is most desirable (high/low)?
Low bias and low variance
Which models do not perform well by themselves due to high bias or high variance?
Logistic Regression, Naive Bayes, KNN, (Shallow) Decision Trees, Linear SVMs, and Kernel SVMs
What ways can we combine models?
Voting
Bagging
Boosting
and Stacking
Bagging
Trains many models on bootstrapped samples of the data, then averages their predictions
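A minimal sketch of bagging in Python; the toy dataset, the mean-predictor "weak model", and the 100 bootstrap rounds are illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (illustrative).
X = np.linspace(0, 1, 50)
y = X + rng.normal(0, 0.1, 50)

def fit_and_predict(X_boot, y_boot):
    """Stand-in 'weak model': predicts the mean of its training targets."""
    return y_boot.mean()

# Bagging: train one model per bootstrap sample, then average.
B = 100
preds = []
for _ in range(B):
    idx = rng.integers(0, len(X), len(X))  # draw N indices with replacement
    preds.append(fit_and_predict(X[idx], y[idx]))

bagged_prediction = np.mean(preds)
```

Averaging over many bootstrap-trained models smooths out the fluctuations of any single fit, which is why bagging reduces variance.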
Boosting
Given a weak model, boosting trains it multiple times on reweighted training data, then lets the learned classifiers take a weighted vote (e.g., AdaBoost, Gradient Boosting)
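A compact AdaBoost-style sketch of the reweight-and-vote loop; the separable toy data, the threshold "stump" weak learner, and the 5 boosting rounds are illustrative assumptions:

```python
import numpy as np

# Toy 1-D binary classification: label is +1 when x > 0.5 (illustrative).
X = np.linspace(0, 1, 20)
y = np.where(X > 0.5, 1, -1)

def fit_stump(X, y, w):
    """Weak learner: best single-threshold classifier under sample weights w."""
    best = (None, None, np.inf)  # (threshold, polarity, weighted error)
    for t in X:
        for pol in (1, -1):
            pred = np.where(X > t, pol, -pol)
            err = w[pred != y].sum()
            if err < best[2]:
                best = (t, pol, err)
    return best

w = np.full(len(X), 1 / len(X))  # start with uniform weights
stumps = []
for _ in range(5):
    t, pol, err = fit_stump(X, y, w)
    err = max(err, 1e-10)                    # avoid division by zero
    alpha = 0.5 * np.log((1 - err) / err)    # classifier's vote weight
    pred = np.where(X > t, pol, -pol)
    w *= np.exp(-alpha * y * pred)           # misclassified points gain weight
    w /= w.sum()
    stumps.append((alpha, t, pol))

def predict(x):
    """Weighted vote of the learned classifiers."""
    score = sum(a * (pol if x > t else -pol) for a, t, pol in stumps)
    return 1 if score > 0 else -1
```

Each round focuses the next weak learner on the examples the previous ones got wrong, which is how boosting drives down bias.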
Stacking
Trains many models in parallel and combines them by training a meta-model that outputs a prediction based on the different weak models' predictions.
The training data is split into two folds. One fold is used to train the base models and the other is used to train the meta-model.
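A two-fold stacking sketch matching the description above; the two base models (a mean predictor and a straight-line fit) and the least-squares meta-model are illustrative choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(0, 1, 40)
y = 2 * X + rng.normal(0, 0.1, 40)

# Split the training data into two folds.
X1, y1 = X[::2], y[::2]    # fold 1: train the base models
X2, y2 = X[1::2], y[1::2]  # fold 2: train the meta-model

# Base model A: constant (mean) predictor; base model B: degree-1 fit.
mean_pred = y1.mean()
slope, intercept = np.polyfit(X1, y1, 1)

# Base-model predictions on fold 2 become the meta-model's features.
features = np.column_stack([
    np.full(len(X2), mean_pred),
    slope * X2 + intercept,
])
meta_weights, *_ = np.linalg.lstsq(features, y2, rcond=None)

def stacked_predict(x):
    """Meta-model combines the base models' predictions for a new input."""
    return meta_weights @ np.array([mean_pred, slope * x + intercept])
```

Training the meta-model on a held-out fold keeps it from simply trusting base models that overfit fold 1.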
Random Forest approach
a bagging method where deep trees, fitted on bootstrap samples, are combined to produce an output with lower variance
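A toy random-forest-style sketch combining bootstrap sampling with random feature selection; one-level "stumps" stand in for the deep trees to keep the sketch short, and the 2-feature data and 25 trees are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 2-feature binary classification (illustrative).
N = 100
X = rng.uniform(0, 1, (N, 2))
y = (X[:, 0] + X[:, 1] > 1).astype(int)

def fit_stump(X, y, feat):
    """Best threshold on one feature (stands in for a deep tree)."""
    best_t, best_acc = 0.5, 0.0
    for t in X[:, feat]:
        acc = np.mean((X[:, feat] > t).astype(int) == y)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

trees = []
for _ in range(25):
    idx = rng.integers(0, N, N)   # bootstrap sample, as in bagging
    feat = rng.integers(0, 2)     # random feature choice (here: one of two)
    t = fit_stump(X[idx], y[idx], feat)
    trees.append((feat, t))

def forest_predict(x):
    """Majority vote across the trees."""
    votes = sum(int(x[feat] > t) for feat, t in trees)
    return int(votes > len(trees) / 2)
```

Randomizing both the rows (bootstrap) and the features decorrelates the trees, so their averaged vote has lower variance than any single tree.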
What does Bagging aim to reduce?
variance
What does Boosting aim to reduce?
bias
Bootstrapping
a resampling technique that generates samples of size N (called bootstrap samples) from an initial dataset by randomly drawing N observations with replacement
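A short sketch of bootstrapping in Python; the 5-point dataset and the use of the mean as the statistic of interest are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
data = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # initial dataset of size N = 5
N = len(data)

# One bootstrap sample: draw N observations with replacement.
sample = rng.choice(data, size=N, replace=True)

# Typical use: estimate the sampling variability of a statistic (here, the mean)
# by recomputing it on many bootstrap samples.
boot_means = np.array([rng.choice(data, size=N, replace=True).mean()
                       for _ in range(1000)])
standard_error = boot_means.std()
```

Because draws are with replacement, a bootstrap sample can repeat some observations and omit others, which is what makes each resampled model (as in bagging) slightly different.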