19-Ensemble learning Flashcards
What is ensemble learning?
Ensemble learning constructs a set of base classifiers from a given set of training data and aggregates their outputs into a single meta-classifier
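A minimal sketch of the aggregation step, assuming the base classifiers output labels and the ensemble combines them by majority voting (names are illustrative):

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate the labels predicted by several base classifiers
    for one instance into a single ensemble prediction."""
    return Counter(predictions).most_common(1)[0][0]

# e.g. three base classifiers disagree; the majority label wins
print(majority_vote(["spam", "ham", "spam"]))  # -> "spam"
```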
What are the approaches for ensemble learning?
Instance manipulation
Feature manipulation
Class label manipulation
Algorithm manipulation
What is instance manipulation?
Generate multiple training datasets through sampling and train a base classifier over each dataset
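A minimal sketch of instance manipulation, assuming X and y are NumPy arrays and sampling is done with replacement (the choice of DecisionTreeClassifier is illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_on_resampled_data(X, y, n_classifiers=10, seed=0):
    """Train one base classifier per resampled training dataset."""
    rng = np.random.default_rng(seed)
    models = []
    n = len(X)
    for _ in range(n_classifiers):
        idx = rng.integers(0, n, size=n)  # draw N instances with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models
```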
What is feature manipulation in the context of ensemble learning?
Generate multiple training datasets through different feature subsets and train a base classifier over each dataset
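A minimal sketch of feature manipulation (the random subspace idea), assuming X is a NumPy array and subset_size is at most the number of columns; all names are illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_on_feature_subsets(X, y, n_classifiers=10, subset_size=5, seed=0):
    """Train one base classifier per random feature subset, remembering
    which columns each model saw so test data can be sliced the same way."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_classifiers):
        cols = rng.choice(X.shape[1], size=subset_size, replace=False)
        models.append((cols, DecisionTreeClassifier().fit(X[:, cols], y)))
    return models
```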
What is class label manipulation?
Generate multiple training datasets by manipulating the class labels in a reversible manner
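One concrete instance of this idea is error-correcting output codes, where each base classifier is trained on a binary regrouping of the classes and predictions are decoded back to the original labels. A minimal sketch with scikit-learn's OutputCodeClassifier (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)
# Each base classifier sees a binary relabelling of the classes;
# the code matrix is used to decode votes back to the original labels.
clf = OutputCodeClassifier(LogisticRegression(max_iter=1000),
                           code_size=2.0, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:5]))
```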
What is algorithm manipulation?
Semi randomly tweak internal parameters within an algorithm to generate multiple base classifiers over a given dataset
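A minimal sketch of algorithm manipulation, assuming the "tweak" is varying the internal randomness of an otherwise fixed algorithm on a fixed dataset:

```python
from sklearn.tree import DecisionTreeClassifier

def train_randomized_models(X, y, n_classifiers=10):
    """Same algorithm, same data; only internal randomness differs.
    splitter='random' makes the tree pick split thresholds semi-randomly."""
    return [
        DecisionTreeClassifier(splitter="random", random_state=seed).fit(X, y)
        for seed in range(n_classifiers)
    ]
```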
What is the intuition behind ensemble learning?
- A combination of many weak classifiers can be at least as good as one strong classifier
- A combination of strong classifiers is at least as good as the best of the individual classifiers
What is the relationship between base and ensemble classifiers error if they’re independent?
Sigmoidal (logit-like). Under majority voting, the ensemble errs only when more than half of the n independent base classifiers err, so if each has error rate ε the ensemble error is the binomial tail: the sum over k > n/2 of C(n, k) ε^k (1 − ε)^(n−k). For ε < 0.5 this falls rapidly below ε as n grows.
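The binomial tail can be computed directly. A minimal sketch, using the classic illustration of 25 independent base classifiers each with error rate ε = 0.35 (numbers are illustrative):

```python
from math import comb

def ensemble_error(n, eps):
    """P(majority vote errs) = P(more than half of n independent
    base classifiers, each with error rate eps, err together)."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * eps**k * (1 - eps)**(n - k)
               for k in range(k_min, n + 1))

print(ensemble_error(25, 0.35))  # ~0.06, far below the base error of 0.35
```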
What is the relationship between base and ensemble classifiers error if they’re identical?
Linear (identity): identical base classifiers all make the same mistakes, so the ensemble error equals the base classifier error
What is stacking?
Use different algorithms to train multiple base classifiers
Use the base classifiers to generate predictions over held-out samples
Train a meta-classifier (level-1 model) on those predictions to produce the final output
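A minimal sketch with scikit-learn's StackingClassifier, which trains the level-1 meta-classifier on cross-validated predictions of the heterogeneous base classifiers (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()), ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-classifier
    cv=5,  # base predictions for the meta-classifier come from held-out folds
)
print(cross_val_score(stack, X, y, cv=5).mean())
```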
What are the pros of stacking?
Mathematically simple
Able to combine heterogeneous classifiers
Generally results in as good or better results than the best of the base classifiers
What are the cons of stacking?
Computationally expensive
What is bagging?
Bagging (bootstrap aggregating) is used to reduce variance.
Create multiple training datasets by bootstrap sampling, train a classifier based on the same algorithm over each dataset, and combine the predictions by voting or averaging
How do we generate datasets?
Randomly sample the original dataset (N instances) N times, with replacement. Any individual instance is absent from a given sample with probability (1 − 1/N)^N ≈ e^(−1) ≈ 0.368 for large N, so each bootstrap sample contains about 63.2% of the distinct instances
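A quick check of that probability, assuming N = 1000 (the value of N is illustrative):

```python
import math
import random

N = 1000
analytic = (1 - 1 / N) ** N    # chance a given instance is never drawn
print(analytic, math.exp(-1))  # both ~0.368

# Empirical check: fraction of distinct instances missing from one bootstrap sample
sample = [random.randrange(N) for _ in range(N)]
print(1 - len(set(sample)) / N)  # also ~0.368
```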
What is the benefit of bagging?
The base classifiers can be trained in parallel, since each is built independently
Highly effective over noisy datasets
Produces the best results for unstable base models with high variance and low bias (e.g. fully grown decision trees)
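A minimal sketch with scikit-learn's BaggingClassifier, bagging a high-variance decision tree; n_jobs=-1 reflects the parallelisation point above (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier()  # unstable: high variance, low bias
bagged = BaggingClassifier(tree, n_estimators=50, n_jobs=-1, random_state=0)

print(cross_val_score(tree, X, y, cv=5).mean())    # single tree
print(cross_val_score(bagged, X, y, cv=5).mean())  # bagged ensemble, typically higher
```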