Ensemble Flashcards
How do ensemble methods work?
They work by combining predictions from several estimators built with a given learning algorithm in order to improve generalization.
What kind of methods are used?
- averaging - reduces the variance of ‘strong’ (low-bias, high-variance) estimators
- boosting - reduces the bias of ‘weak’ (high-bias, low-variance) estimators
How does averaging work?
It works by building several estimators independently and averaging their predictions to reduce the variance.
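A minimal sketch of the averaging idea, assuming scikit-learn and NumPy are available; the dataset and the choice of randomized trees are illustrative only:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Build several estimators independently (each tree is perturbed by its own seed).
trees = [
    DecisionTreeClassifier(splitter="random", random_state=i).fit(X, y)
    for i in range(10)
]

# Average their predicted probabilities and take the most likely class.
avg_proba = np.mean([t.predict_proba(X) for t in trees], axis=0)
y_pred = avg_proba.argmax(axis=1)
```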
How does boosting work?
It works by building several estimators sequentially, each one trying to correct its predecessors, so that the combined estimator has a reduced bias.
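A hedged sketch of a boosting ensemble, assuming scikit-learn; the dataset and hyperparameters are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)

# 100 shallow trees fit sequentially; each stage tries to reduce the remaining error.
boosted = GradientBoostingClassifier(n_estimators=100, max_depth=3).fit(X, y)
```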
What is ‘Pasting’ - averaging methods?
Algorithm that trains each independent estimator on a random subset of the samples, drawn without replacement.
What is ‘Bagging’?
For averaging methods, this means that the random subsets of samples are drawn with replacement (bootstrap sampling).
What is ‘Random Subspaces’ - averaging methods?
Algorithm that trains each estimator on a random subset of the features (rather than of the samples).
What is ‘Random Patches’ - averaging methods?
Algorithm that trains each estimator on random subsets of both the samples and the features.
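A sketch of how the four resampling schemes map onto scikit-learn's BaggingClassifier parameters (the `estimator` parameter name follows recent releases; older versions used `base_estimator`):

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier()

# Pasting: random sample subsets drawn without replacement.
pasting = BaggingClassifier(estimator=tree, max_samples=0.5, bootstrap=False)

# Bagging: random sample subsets drawn with replacement (bootstrap).
bagging = BaggingClassifier(estimator=tree, max_samples=0.5, bootstrap=True)

# Random Subspaces: random feature subsets, all samples.
subspaces = BaggingClassifier(estimator=tree, max_features=0.5, bootstrap=False)

# Random Patches: random subsets of both samples and features.
patches = BaggingClassifier(estimator=tree, max_samples=0.5, max_features=0.5)
```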
RandomForests and Extra-Trees - averaging methods
They are perturb-and-combine methods based on constructing randomized decision trees and then averaging their predictions.
Differences between RandomForests and ExtraTrees
During tree construction:
- in RFs the node split is picked based on the best split among a random subset of features
- in ETs the split thresholds are drawn at random for each candidate feature, and the best of these randomly generated thresholds is picked for the node split
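A minimal sketch comparing the two ensembles in scikit-learn; the dataset and hyperparameters are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# RF: best split chosen among a random subset of features at each node.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
# ET: thresholds drawn at random per candidate feature, best random threshold kept.
et = ExtraTreesClassifier(n_estimators=100, max_features="sqrt", random_state=0)

print(cross_val_score(rf, X, y).mean())
print(cross_val_score(et, X, y).mean())
```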
What is bias?
It is the error from erroneous assumptions in the learning algorithm.
High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
What is variance?
It is the error from sensitivity to small fluctuations in the training set.
High variance can cause overfitting: modeling the random noise in the training data, rather than the intended outputs.
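A rough sketch of how the two errors typically show up with decision trees, assuming scikit-learn; a depth-1 tree tends to underfit (high bias) while an unpruned tree can overfit the noise (high variance):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stump = DecisionTreeRegressor(max_depth=1).fit(X_train, y_train)    # high bias
deep = DecisionTreeRegressor(max_depth=None).fit(X_train, y_train)  # high variance

# Underfitting: low score on both sets; overfitting: train score much higher than test score.
print(stump.score(X_train, y_train), stump.score(X_test, y_test))
print(deep.score(X_train, y_train), deep.score(X_test, y_test))
```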
What is a decision tree?
It is a model for predicting a dependent variable Y using an independent variable X by checking a collection of splits.
What is a decision tree split?
A split is a condition or query on a single independent variable that is either true or false.
Splits are arranged as a tree in which each internal node has 2 child nodes: left for a true condition, right for a false condition.
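A hedged sketch of inspecting the splits of a fitted tree, assuming scikit-learn; each internal node tests one feature against a threshold, and samples for which the condition holds go to the left child:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Prints the tree as nested "feature <= threshold" conditions.
print(export_text(clf, feature_names=list(iris.feature_names)))
```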
Boosting intuition
It minimizes bias by combining estimators with low variance and high bias (e.g. shallow decision trees).
Bagging intuition
It minimizes variance by averaging estimators with low bias and high variance (e.g. fully grown decision trees).
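A short sketch of both intuitions side by side, assuming scikit-learn (the `estimator` parameter name follows recent releases; older versions used `base_estimator`):

```python
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Boosting: many shallow, high-bias trees (decision stumps) combined sequentially.
boosted_stumps = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1), n_estimators=200
)

# Bagging: many fully grown, high-variance trees averaged.
bagged_trees = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=None), n_estimators=200
)
```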
What is Gini impurity?
- a measure of how often a randomly chosen element from a set will be classified incorrectly
- the classification is done randomly according to the distribution of the labels in the set
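A minimal sketch of the definition as code, assuming NumPy; for class proportions p_k the impurity is 1 - sum_k p_k^2:

```python
import numpy as np

def gini_impurity(labels):
    # Probability that a randomly drawn element is misclassified when it is
    # labelled at random according to the label distribution of the set.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

gini_impurity(["a", "a", "b", "b"])  # 0.5 for a perfectly mixed binary set
```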
What is information gain?
- a measure given by the difference between the entropy of the target variable and the entropy of the target variable conditioned on a splitting variable (regressor)
IG(T, rgr) = H(T) - H(T | rgr)
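A hedged sketch, assuming NumPy, where H(T | rgr) is taken as the weighted average of the child entropies induced by a split on the regressor:

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, left_mask):
    # IG = H(parent) - weighted average of the child entropies.
    labels = np.asarray(labels)
    left, right = labels[left_mask], labels[~left_mask]
    w_left = len(left) / len(labels)
    return entropy(labels) - (w_left * entropy(left) + (1.0 - w_left) * entropy(right))

labels = np.array([0, 0, 0, 1, 1, 1])
left_mask = np.array([True, True, True, False, False, False])
information_gain(labels, left_mask)  # 1.0 bit: the split separates the classes perfectly
```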
What is ‘reduced error’ pruning?
- method of reducing overfitting in decision trees
- a ‘bottom-up’ algorithm
- starting at the leaves, each node is replaced with its most popular class. If prediction accuracy (on a validation set) does not get worse, the change is kept.
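A rough sketch of the bottom-up pass on a toy node structure; `Node` and `accuracy_on_validation` (a callable that re-evaluates the tree on a held-out validation set, honouring the `pruned` flags) are hypothetical, not part of any library API:

```python
class Node:
    def __init__(self, left=None, right=None, majority_class=None):
        self.left, self.right = left, right
        self.majority_class = majority_class  # most popular class among training samples here
        self.pruned = False                   # when True, the node acts as a leaf

def reduced_error_prune(node, accuracy_on_validation):
    # Bottom-up: prune the children first, then try replacing this node with a
    # leaf predicting its majority class; keep the change only if validation
    # accuracy does not get worse.
    if node is None or (node.left is None and node.right is None):
        return
    reduced_error_prune(node.left, accuracy_on_validation)
    reduced_error_prune(node.right, accuracy_on_validation)

    before = accuracy_on_validation()  # accuracy with this subtree intact
    node.pruned = True                 # temporarily collapse the subtree to a leaf
    after = accuracy_on_validation()
    if after < before:
        node.pruned = False            # revert: pruning hurt validation accuracy
```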