5 : Predictive Analytics: Nearest Neighbors and Ensembles Flashcards

1
Q

Question 1
Level: easy
Ensemble learners combine classifiers to increase performance. Applying the same algorithm independently to each of several bootstrapped samples from the dataset with the same set of input variables and then aggregating the results by means of majority voting is called:
a) Random Forests
b) Boosting
c) Bagging
d) Stacking

A

c) Bagging

Bagging = bootstrap aggregation: an ensemble learning method commonly used to reduce variance within a noisy dataset (see the sketch after these definitions).

Boosting = train multiple classifiers on weighted samples of the training data, iteratively reweighting the training distribution according to the classification error.

Stacking = a two-level model: base-learners are first trained on the inputs, and a meta-model is then trained on their predictions.

Random forest = an average of multiple deep decision trees, trained on different bootstrap samples of the same training set, with the goal of reducing variance.
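A minimal sketch of the bagging idea described above: trees trained on bootstrap samples, aggregated by majority vote. It assumes numpy and scikit-learn are available; the dataset and the 25-tree ensemble size are illustrative choices, not from the source.

```python
# Bagging by hand: train trees on bootstrap samples, then aggregate by majority vote.
# Assumes numpy and scikit-learn; dataset and ensemble size are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)  # binary labels 0/1
rng = np.random.default_rng(0)

preds = []
for _ in range(25):
    # Bootstrap sample: draw indices with replacement from the original data.
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])
    preds.append(tree.predict(X))

# Majority vote across the 25 trees (odd number, so no ties).
votes = np.stack(preds)                      # shape (25, n_samples)
majority = (votes.mean(axis=0) > 0.5).astype(int)
print("bagged training accuracy:", (majority == y).mean())
```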

2
Q

Question 2
Level: easy
Which kind of models are typically improved by developing an ensemble of them:
a) Models with high bias
b) Models with high variance
c) Stable models
d) Models with high bias and variance

A

b) Models with high variance

3
Q

Question 1
Explain the bias-variance trade-off, and relate to the AIC and BIC measures.

A

The bias-variance trade-off: simple models tend to underfit (high bias, low variance), while complex models tend to overfit (low bias, high variance); the aim is to choose a complexity that minimizes total expected error. AIC and BIC operationalize this trade-off.

Akaike Information Criterion (AIC)
AIC estimates the relative quality of a statistical model for a given set of data. It balances goodness of fit against the simplicity of the model, penalizing complexity.
AIC incorporates both bias and variance considerations by favoring models that fit the data well while penalizing those with more parameters (higher complexity).

Bayesian Information Criterion (BIC)
BIC is similar to AIC but places a stronger penalty on models with more parameters. It is particularly useful when the focus is on selecting a simpler model.
BIC tends to favor more parsimonious models, helping to avoid overfitting.

High Bias (Underfitting):
AIC and BIC may prefer a more complex model to capture underlying patterns, potentially mitigating underfitting.

High Variance (Overfitting):
AIC and BIC favor simpler models, discouraging overfitting and reducing variance.
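For reference, a standard formulation of both criteria, with k estimated parameters, n observations, and maximized likelihood \hat{L}:

$$\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln(n) - 2\ln\hat{L}$$

Because the BIC penalty k\ln(n) exceeds the AIC penalty 2k once n ≥ 8, BIC tends to select more parsimonious models on larger datasets.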

4
Q

Question 2
How are bagging, boosting and random forests different?

A

Bagging creates diverse models through parallel (independent) training, each model fitted to a bootstrap sample (drawn with replacement from the dataset to form a new subset of the data).

Boosting emphasizes sequential training, giving more weight to difficult instances. Each subsequent model focuses on correcting errors made by the previous ones.

Random Forests combine bagging with feature randomization to enhance model diversity.
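A minimal sketch, assuming scikit-learn, of how the three ensembles contrasted above are typically instantiated (the dataset and hyperparameters are illustrative):

```python
# Contrasting the three ensembles; assumes scikit-learn, hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: independent base-learners (decision trees by default) on bootstrap samples,
# aggregated by majority vote.
bagging = BaggingClassifier(n_estimators=50, random_state=0)

# Boosting: base-learners trained sequentially on reweighted data,
# each focusing on the errors of its predecessors.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

# Random forest: bagging of deep trees plus a random feature subset at each split.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=0)

for model in (bagging, boosting, forest):
    print(type(model).__name__, model.fit(X, y).score(X, y))
```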

5
Q

Question 3
Why do ensemble methods typically use decision trees, decision stumps or neural networks as base-learners?

A

Because they are unstable classifiers: small changes in the training data yield noticeably different models. That instability is exactly what you want in an ensemble, since combining diverse estimates into a final estimate is what improves performance.

6
Q

Question 4
What is a meta-learner?

A

A meta-learner is the model in an ensemble method (e.g. stacking) that combines the predictions of multiple base-learners: it takes the outputs of the individual base-learners as input features and generates the final prediction.

More broadly, meta-learning is a subfield of machine learning in which automatic learning algorithms are applied to metadata about learning experiments.
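A minimal stacking sketch, assuming scikit-learn: a logistic regression acts as the meta-learner, trained on the base-learners' predictions (the specific base-learners are illustrative choices).

```python
# Stacking: the meta-learner (final_estimator) is trained on the base-learners' predictions.
# Assumes scikit-learn; the choice of base-learners is illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),   # the meta-learner
)
print("stacked training accuracy:", stack.fit(X, y).score(X, y))
```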

7
Q

Question 5
What is the conceptual similarity between kNN and ensembles, and between kNN and a decision tree?

A

kNN & ensembles similarity
- both rely on many "opinions": the k neighbours in kNN play a role similar to the multiple models in an ensemble
- both aggregate predictions (over the k neighbours, or over the multiple models); see the sketch after this list
- both gain robustness through diversity, i.e. diversity among the individual neighbours/models

kNN & decision tree similarity
- both capture non-linear relationships
- both offer interpretability
- both can handle data with mixed data types
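A minimal sketch of the aggregation analogy above, assuming scikit-learn: kNN takes a majority vote over its k nearest neighbours, while a voting ensemble takes a majority vote over its member models (the member models are illustrative choices).

```python
# Both predict by aggregating several "opinions": k neighbours vs. several models.
# Assumes scikit-learn; the ensemble members are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)      # majority vote of 5 neighbours
voting = VotingClassifier(                     # majority vote of 3 different models
    estimators=[("tree", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB()),
                ("knn", KNeighborsClassifier())],
    voting="hard",
)
for model in (knn, voting):
    print(type(model).__name__, model.fit(X, y).score(X, y))
```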
