Chapter 15 Improve Performance with Ensembles Flashcards

1
Q

EXTERNAL Q: WHY DOESN’T GRADIENT BOOSTING USE SAMPLING WITH REPLACEMENT?

A

In scikit-learn, the gradient boosting subsample parameter defaults to 1.0, so by default no sampling happens at all: every stage is fit on the full training set. Setting it to a value in (0, 1) enables stochastic gradient boosting, where each stage is fit on that fraction of the training data, drawn without replacement (not with replacement). This subsampling reduces variance and increases bias.
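
A minimal sketch of the two modes, assuming scikit-learn's GradientBoostingClassifier and a synthetic dataset from make_classification (the subsample values are illustrative only):

```python
# Sketch: default vs. stochastic gradient boosting in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data for illustration.
X, y = make_classification(n_samples=500, random_state=7)

# Default: subsample=1.0, so each stage is fit on the full training set.
gbm_default = GradientBoostingClassifier(subsample=1.0, random_state=7)

# Stochastic gradient boosting: each stage is fit on 50% of the rows,
# drawn without replacement (reduces variance, increases bias).
gbm_stochastic = GradientBoostingClassifier(subsample=0.5, random_state=7)

for name, model in [("default", gbm_default), ("stochastic", gbm_stochastic)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```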

2
Q

WHAT IS THE DIFFERENCE BETWEEN STOCHASTIC GRADIENT BOOSTING AND RANDOM FOREST IN SAMPLING?

A

Stochastic gradient boosting (as used by XGBoost) adds randomness by sampling observations and features at each stage, which is similar to the random forest algorithm except that the sampling is done without replacement.
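
A sketch of where the sampling knobs live in each algorithm; the parameter names come from the xgboost scikit-learn wrapper and from sklearn's RandomForestClassifier, and the fractions are illustrative:

```python
# Sketch: the sampling hyperparameters of each algorithm.
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Stochastic gradient boosting (XGBoost): rows and columns are sampled
# WITHOUT replacement at each stage.
xgb = XGBClassifier(
    subsample=0.8,          # fraction of rows used per tree
    colsample_bytree=0.8,   # fraction of features used per tree
)

# Random forest: rows are bootstrap-sampled WITH replacement, and a
# random feature subset is considered at each split.
rf = RandomForestClassifier(
    bootstrap=True,
    max_features="sqrt",
)
```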

3
Q

WHAT ARE THE THREE MOST POPULAR METHODS FOR COMBINING THE PREDICTIONS FROM DIFFERENT MODELS? P100

A

- Bagging. Building multiple models (typically of the same type) from different subsamples of the training dataset.
- Boosting. Building multiple models (typically of the same type), each of which learns to fix the prediction errors of a prior model in the sequence.
- Voting. Building multiple models (typically of differing types) and combining their predictions using simple statistics, such as the mean. (Each method is sketched in the code below.)
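
As a rough illustration (not from the book), one scikit-learn representative of each method; the choice of estimators and the synthetic dataset are arbitrary:

```python
# Sketch: one scikit-learn representative of each combination method.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=7)

ensembles = {
    # Bagging: same model type, different subsamples of the training data.
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50),
    # Boosting: a sequence of models, each fixing its predecessor's errors.
    "boosting": AdaBoostClassifier(n_estimators=50),
    # Voting: differing model types, predictions combined by simple statistics.
    "voting": VotingClassifier(estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("cart", DecisionTreeClassifier()),
    ]),
}

for name, model in ensembles.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```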

4
Q

WHAT IS ANOTHER NAME FOR BAGGING? P101

A

Bootstrap Aggregation

5
Q

HOW IS THE BAGGING PROCESS DONE? P101

A

By taking multiple samples from your training dataset (with replacement) and training a model for each sample. The final output prediction is averaged across the predictions of all of the sub-models.
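
A toy sketch of this process written from scratch (illustrative only, not library internals); it assumes NumPy arrays and uses decision trees as the sub-models:

```python
# Toy sketch of the bagging process: bootstrap-sample the training set,
# fit one sub-model per sample, average the predictions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_test, n_models=25, seed=7):
    rng = np.random.default_rng(seed)
    n = len(X_train)
    all_predictions = []
    for _ in range(n_models):
        # Draw row indices WITH replacement (a bootstrap sample).
        idx = rng.integers(0, n, size=n)
        model = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        all_predictions.append(model.predict(X_test))
    # Final output: the average across all sub-models.
    return np.mean(all_predictions, axis=0)
```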

6
Q

WITH WHICH TYPE OF ALGORITHM DOES BAGGING PERFORM THE BEST? P101

A

Algorithms with high variance, such as decision trees constructed without pruning.

7
Q

WHAT IS THE SIMILARITY AND THE DIFFERENCE BETWEEN RANDOM FOREST AND BAGGED DECISION TREES? P102

A

SIMILARITY:
Random forest is an extension of bagged decision trees: in both, samples of the training dataset are taken with replacement.
DIFFERENCE: The trees in a random forest are constructed in a way that reduces the correlation between the individual classifiers. Specifically, rather than greedily choosing the best split point over all features in the construction of each tree, only a random subset of features is considered for each split.
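
A sketch of the one hyperparameter that separates the two, using scikit-learn estimators; n_estimators and the sqrt choice for max_features are illustrative:

```python
# Sketch: the split-time feature sampling that separates the two.
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagged trees: every tree greedily picks the best split over ALL features.
bagged_trees = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100)

# Random forest: only a random subset of features (sqrt(n_features) here)
# is considered at each split, which decorrelates the individual trees.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt")
```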

8
Q

HOW DO BOOSTING ALGORITHMS WORK? P103

A

Boosting ensemble algorithms create a sequence of models that attempt to correct the mistakes of the models before them in the sequence.
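
A toy regression sketch of this sequential error-correction idea, in the style of a simplified gradient-boosting loop (not the book's code; the tree depth and learning rate are arbitrary):

```python
# Toy regression sketch of boosting: each stage fits the residual errors
# left by the models before it, and the corrections are summed.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_fit(X, y, n_stages=10, learning_rate=0.1):
    stages = []
    prediction = np.zeros(len(y))
    for _ in range(n_stages):
        residuals = y - prediction            # mistakes of the sequence so far
        stage = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        prediction += learning_rate * stage.predict(X)
        stages.append(stage)
    return stages

def boosted_predict(stages, X, learning_rate=0.1):
    # Sum the (shrunken) corrections from every model in the sequence.
    return learning_rate * sum(stage.predict(X) for stage in stages)
```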

9
Q

WHAT ARE THE TWO MOST COMMON BOOSTING ENSEMBLE ML ALGORITHMS? P103

A

- AdaBoost.
- Stochastic Gradient Boosting (also called Gradient Boosting Machines).

10
Q

HOW DOES ADABOOST WORK? P103

A

It generally works by weighting instances in the dataset by how easy or difficult they are to classify, allowing the algorithm to pay more or less attention to them in the construction of subsequent models.
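
A minimal sketch with scikit-learn's AdaBoostClassifier on a synthetic dataset (n_estimators is arbitrary); the instance re-weighting happens inside fit:

```python
# Minimal sketch: AdaBoost on a synthetic dataset. Instances misclassified
# by earlier models get more weight when fitting the next model.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=7)
model = AdaBoostClassifier(n_estimators=30, random_state=7)
print(cross_val_score(model, X, y, cv=5).mean())
```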

11
Q

WHAT IS THE DIFFERENCE BETWEEN STACKING AND VOTING ENSEMBLE MODELS? P105

A

In voting, the predictions of the sub-models are combined using simple statistics, such as the mean. The predictions can also be weighted, but specifying the weights for the classifiers manually, or even heuristically, is difficult. Stacking (stacked generalization) instead learns how to best weight the predictions of the sub-models: a meta-model is trained to assign a weight to each sub-model's output.
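
A sketch contrasting the two in scikit-learn, which provides both VotingClassifier and StackingClassifier; the sub-models chosen are arbitrary:

```python
# Sketch: voting vs. stacking in scikit-learn.
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

sub_models = [
    ("cart", DecisionTreeClassifier()),
    ("svm", SVC(probability=True)),
]

# Voting: combine predictions with a simple statistic (here, the mean of
# the predicted class probabilities).
voting = VotingClassifier(estimators=sub_models, voting="soft")

# Stacking: a meta-model (logistic regression here) LEARNS how to weight
# the sub-models' out-of-fold predictions.
stacking = StackingClassifier(
    estimators=sub_models,
    final_estimator=LogisticRegression(),
)
```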
