Ensembles Flashcards

1
Q

Ensembles

A
Predict the class label by combining multiple classifiers.
Different experts; they vote.
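
A minimal sketch of the idea, assuming scikit-learn and its bundled iris dataset (all names and parameter choices here are illustrative, not from the card):

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# three different "experts" vote on the class label
experts = [("lr", LogisticRegression(max_iter=1000)),
           ("knn", KNeighborsClassifier()),
           ("tree", DecisionTreeClassifier())]
ensemble = VotingClassifier(estimators=experts, voting="hard")  # hard = majority vote
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
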
2
Q

Bagging

A

Analogy: multiple doctors take a majority vote.
D_i ——> M_i ——> prediction ——> vote/average
Each bootstrap sample D_i trains a model M_i; predictions are combined by majority vote (classification) or averaging (regression).
Unstable learners: small changes to the training set cause large changes in the classifier; bagging improves them. Examples: regression trees, decision trees, linear regression, neural networks.
Stable learners (e.g., k-nearest neighbors): bagging is not a good idea.

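A minimal bagging sketch, assuming scikit-learn (dataset and parameters are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
bag = BaggingClassifier(
    DecisionTreeClassifier(),  # unstable base learner
    n_estimators=50,           # one model M_i per bootstrap sample D_i
    bootstrap=True,            # draw each D_i with replacement
)
bag.fit(X, y)
print(bag.score(X, y))         # class decided by majority vote of the M_i
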
3
Q

Random Forests

A

Combination of tree predictors.
D_i ——> T_i (each split chooses from a random subset F of the n variables) ——> grow and save the tree, no pruning.
Majority voting for classification, averaging for regression.
Good for classification; less suited to regression.

Out-of-bag (OOB) error
Predict each observation by averaging only the trees where it did not appear (i.e., it was not in their bootstrap sample).
The resulting error estimate is almost identical to n-fold cross-validation.

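Sketch, assuming scikit-learn (parameters are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(
    n_estimators=200,
    max_features="sqrt",  # random subset F of the variables at each split
    oob_score=True,       # out-of-bag estimate, close to n-fold CV
)
rf.fit(X, y)
print(rf.oob_score_)      # accuracy from trees where each point did not appear
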
4
Q

Boosting (Adaboost)

A

Builds a strong classifier as a weighted combination of weak classifiers:
H(x) = sign(sum(alpha_t * h_t(x)))
alpha_t is the weight of weak classifier t:
alpha_t = (1/2) * ln((1 - e_t) / e_t), with e_t the weighted error of h_t
Example weights are multiplied by exp(-alpha_t) if classified correctly, by exp(alpha_t) if misclassified,
then normalized.

Boosting works well with shallow trees (stumps).
Boosting continues to reduce test error even after training error reaches zero.

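A from-scratch sketch of these update rules, assuming numpy and scikit-learn stumps, with y a numpy array of labels in {-1, +1} (function and variable names are illustrative):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # uniform example weights
    weak, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1)  # weak classifier h_t (a stump)
        h.fit(X, y, sample_weight=w)
        pred = h.predict(X)
        e = max(w[pred != y].sum(), 1e-10)       # weighted error e_t (guard e=0)
        alpha = 0.5 * np.log((1 - e) / e)        # alpha_t = (1/2) ln((1-e_t)/e_t)
        w *= np.exp(-alpha * y * pred)           # exp(-alpha) if correct, exp(alpha) if not
        w /= w.sum()                             # normalize weights
        weak.append(h)
        alphas.append(alpha)
    return weak, alphas

def predict(weak, alphas, X):
    # H(x) = sign(sum(alpha_t * h_t(x)))
    return np.sign(sum(a * h.predict(X) for h, a in zip(weak, alphas)))
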
5
Q

Gradient Boosting

A

1. Learn a regression predictor.
2. Compute the error residuals.
3. Learn a model to predict the residuals.
4. Go to 2.

For instance, with MSE loss the negative gradient is just the residual y - F(x),
so each update is F(x) <- F(x) + alpha * h(x),
where h fits the residuals and alpha is the learning rate.

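A from-scratch sketch of this loop, assuming numpy and scikit-learn regression trees (names and parameters are illustrative):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, rounds=100, alpha=0.1):
    F = np.full(len(y), y.mean())       # 1) initial regression predictor
    trees = []
    for _ in range(rounds):
        residual = y - F                # 2) error residual = negative MSE gradient
        h = DecisionTreeRegressor(max_depth=3)
        h.fit(X, residual)              # 3) learn to predict the residual
        F += alpha * h.predict(X)       #    update with learning rate alpha
        trees.append(h)                 # 4) go to 2
    return y.mean(), trees

def predict(base, trees, X, alpha=0.1):
    return base + alpha * sum(h.predict(X) for h in trees)
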
6
Q

Extreme Gradient Boost (XGBoost)

A

Efficient and scalable gradient boosting (classification and regression trees).
Only numerical features (categoricals must be encoded first).
Uses a quantile sketch to find approximate split points.
Faster than other implementations on a single machine.

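Usage sketch, assuming the xgboost Python package is installed (dataset and parameters are illustrative):

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.1,
    tree_method="approx",  # approximate split finding via the quantile sketch
)
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
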
7
Q

Learning Ensembles

A

Random forests and boosting both compute a set of weak models
and combine them to build a stronger model:
F(X) = sum(alpha_m * T_m(X))

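A tiny sketch of this combination rule, assuming numpy-style predictions from any list of fitted weak models (names are illustrative):

def ensemble_predict(models, alphas, X):
    # F(X) = sum(alpha_m * T_m(X))
    return sum(a * m.predict(X) for m, a in zip(models, alphas))
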
8
Q

Stacked Generalization (Stacking)

A

Trains a learning algorithm to combine the predictions of a heterogeneous set of models.
First, a set of base learners is trained.
Then, a meta-learner is trained on the base learners' predictions,
produced with a cross-validation-like scheme (out-of-fold predictions).
Typically performs better than any single one of the trained models.

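Sketch with scikit-learn's StackingClassifier (dataset and base learners are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("svm", SVC())],  # heterogeneous base learners
    final_estimator=LogisticRegression(max_iter=1000),              # meta-learner
    cv=5,  # meta-learner trains on out-of-fold base predictions
)
stack.fit(X, y)
print(stack.score(X, y))
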