19-Ensemble learning Flashcards
What is ensemble learning?
Ensemble learning constructs a set of base classifiers from a given set of training data and aggregates their outputs into a single meta-classifier
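A minimal sketch of the aggregation step, assuming the base classifiers output labels and the ensemble combines them by majority voting (names are illustrative):

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate the labels predicted by several base classifiers
    for one instance into a single ensemble prediction."""
    return Counter(predictions).most_common(1)[0][0]

# e.g. three base classifiers disagree; the majority label wins
print(majority_vote(["spam", "ham", "spam"]))  # -> "spam"
```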
What are the approaches for ensemble learning?
Instance manipulation
Feature manipulation
Class label manipulation
Algorithm manipulation
What is instance manipulation?
Generate multiple training datasets through sampling and train a base classifier over each dataset
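A minimal sketch of instance manipulation, assuming X and y are NumPy arrays and sampling is done with replacement (the choice of DecisionTreeClassifier is illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_on_resampled_data(X, y, n_classifiers=10, seed=0):
    """Train one base classifier per resampled training dataset."""
    rng = np.random.default_rng(seed)
    models = []
    n = len(X)
    for _ in range(n_classifiers):
        idx = rng.integers(0, n, size=n)  # draw N instances with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models
```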
What is feature manipulation in the context of ensemble learning?
Generate multiple training datasets through different feature subsets and train a base classifier over each dataset
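A minimal sketch of feature manipulation (the random subspace idea), assuming X is a NumPy array and subset_size is at most the number of columns; all names are illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_on_feature_subsets(X, y, n_classifiers=10, subset_size=5, seed=0):
    """Train one base classifier per random feature subset, remembering
    which columns each model saw so test data can be sliced the same way."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_classifiers):
        cols = rng.choice(X.shape[1], size=subset_size, replace=False)
        models.append((cols, DecisionTreeClassifier().fit(X[:, cols], y)))
    return models
```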
What is class label manipulation?
Generate multiple training datasets by manipulating the class labels in a reversible manner
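One concrete instance of this idea is error-correcting output codes, where each base classifier is trained on a binary regrouping of the classes and predictions are decoded back to the original labels. A minimal sketch with scikit-learn's OutputCodeClassifier (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)
# Each base classifier sees a binary relabelling of the classes;
# the code matrix is used to decode votes back to the original labels.
clf = OutputCodeClassifier(LogisticRegression(max_iter=1000),
                           code_size=2.0, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:5]))
```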
What is algorithm manipulation?
Semi randomly tweak internal parameters within an algorithm to generate multiple base classifiers over a given dataset
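A minimal sketch of algorithm manipulation, assuming the "tweak" is varying the internal randomness of an otherwise fixed algorithm on a fixed dataset:

```python
from sklearn.tree import DecisionTreeClassifier

def train_randomized_models(X, y, n_classifiers=10):
    """Same algorithm, same data; only internal randomness differs.
    splitter='random' makes the tree pick split thresholds semi-randomly."""
    return [
        DecisionTreeClassifier(splitter="random", random_state=seed).fit(X, y)
        for seed in range(n_classifiers)
    ]
```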
What is the intuition behind ensemble learning?
- A combination of many weak classifiers can be at least as good as one strong classifier
- A combination of strong classifiers is at least as good as the best of the individual classifiers
What is the relationship between base and ensemble classifiers error if they’re independent?
Sigmoidal (logit-like). Under majority voting, the ensemble errs only when more than half of the n independent base classifiers err, so if each has error rate ε the ensemble error is the binomial tail: the sum over k > n/2 of C(n, k) ε^k (1 − ε)^(n−k). For ε < 0.5 this falls rapidly below ε as n grows.
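The binomial tail can be computed directly. A minimal sketch, using the classic illustration of 25 independent base classifiers each with error rate ε = 0.35 (numbers are illustrative):

```python
from math import comb

def ensemble_error(n, eps):
    """P(majority vote errs) = P(more than half of n independent
    base classifiers, each with error rate eps, err together)."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * eps**k * (1 - eps)**(n - k)
               for k in range(k_min, n + 1))

print(ensemble_error(25, 0.35))  # ~0.06, far below the base error of 0.35
```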
What is the relationship between base and ensemble classifiers error if they’re identical?
Linear (identity): identical base classifiers all make the same mistakes, so the ensemble error equals the base classifier error
What is stacking?
Use different algorithms to train multiple base classifiers
Use the base classifiers to generate predictions over held-out samples
Train a meta-classifier (level-1 model) on those predictions to produce the final output
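A minimal sketch with scikit-learn's StackingClassifier, which trains the level-1 meta-classifier on cross-validated predictions of the heterogeneous base classifiers (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
stack = StackingClassifier(
    estimators=[("nb", GaussianNB()), ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-classifier
    cv=5,  # base predictions for the meta-classifier come from held-out folds
)
print(cross_val_score(stack, X, y, cv=5).mean())
```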
What are the pros of stacking?
Mathematically simple
Able to combine heterogeneous classifiers
Generally results in as good or better results than the best of the base classifiers
What are the cons of stacking?
Computationally expensive
What is bagging?
Bagging (bootstrap aggregating) is used to reduce variance.
Create multiple training datasets by bootstrap sampling, train a classifier based on the same algorithm over each dataset, and combine the predictions by voting or averaging
How do we generate datasets?
Randomly sample the original dataset (N instances) N times, with replacement. Any individual instance is absent from a given sample with probability (1 − 1/N)^N ≈ e^(−1) ≈ 0.368 for large N, so each bootstrap sample contains about 63.2% of the distinct instances
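A quick check of that probability, assuming N = 1000 (the value of N is illustrative):

```python
import math
import random

N = 1000
analytic = (1 - 1 / N) ** N    # chance a given instance is never drawn
print(analytic, math.exp(-1))  # both ~0.368

# Empirical check: fraction of distinct instances missing from one bootstrap sample
sample = [random.randrange(N) for _ in range(N)]
print(1 - len(set(sample)) / N)  # also ~0.368
```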
What is the benefit of bagging?
The base classifiers can be trained in parallel, since each is built independently
Highly effective over noisy datasets
Produces the best results for unstable base models with high variance and low bias (e.g. fully grown decision trees)
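A minimal sketch with scikit-learn's BaggingClassifier, bagging a high-variance decision tree; n_jobs=-1 reflects the parallelisation point above (the dataset and estimator choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier()  # unstable: high variance, low bias
bagged = BaggingClassifier(tree, n_estimators=50, n_jobs=-1, random_state=0)

print(cross_val_score(tree, X, y, cv=5).mean())    # single tree
print(cross_val_score(bagged, X, y, cv=5).mean())  # bagged ensemble, typically higher
```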