Chapter 7: Ensemble Learning and Random Forests Flashcards
What is a random forest?
A random forest is an ensemble method in machine learning. It works by (see the sketch after this list):
- Training multiple decision tree classifiers, each on a different random subset of the training data.
- Obtaining a predicted class from every tree.
- Taking the class with the most votes as the ensemble's prediction.
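A minimal sketch using Scikit-Learn's RandomForestClassifier; the make_moons toy dataset, split, and hyperparameter values are illustrative assumptions, not part of the card:
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy dataset, assumed only for illustration
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 decision trees, each trained on a random bootstrap sample of the training data
rnd_clf = RandomForestClassifier(n_estimators=100, max_leaf_nodes=16, random_state=42)
rnd_clf.fit(X_train, y_train)
print(rnd_clf.score(X_test, y_test))  # accuracy of the majority-vote prediction
```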
In ensemble learning, what is meant by a Hard Voting Classifier?
- Predictions from an ensemble of classifiers are aggregated.
- The class output by the majority of classifiers is taken as the ensemble's prediction.
For example, if an ensemble of 4 classifiers is asked to predict the risk of a bank transaction and 3 of the 4 classify it as high-risk, then high-risk is the majority vote and becomes the ensemble's prediction.
Theoretically, what are the conditions for ensembles to produce the highest accuracy?
Conditions:
- All classifiers are perfectly independent.
- All classifiers make uncorrelated errors.
Why are the theoretical conditions for optimal ensemble predictions never met in practice?
The models in an ensemble are trained on the same data, so they tend to make similar (correlated) errors.
How can we reduce the similarity between the errors made by models in the ensemble?
Use very different types of algorithms in the ensemble.
In practice, how can you create a VotingClassifier in python?
- Use Scikit-Learn's VotingClassifier class, passing it a list of named estimators.
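For example, a hard-voting ensemble of three different classifiers might look like this (X_train and y_train are assumed to exist, e.g. from the earlier random forest sketch):
```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hard voting: each classifier casts one vote for a class
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    voting="hard",
)
voting_clf.fit(X_train, y_train)
```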
What is soft voting in ensemble learning?
When the classifiers can predict class probabilities, soft voting averages the predicted probabilities across all classifiers, and the ensemble predicts the class with the highest average probability.
Why can soft voting outperform hard voting?
Soft voting can give more weight to highly confident votes, whereas hard voting only counts each classifier's single class vote.
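A sketch of the change needed for soft voting with Scikit-Learn (SVC must be given probability=True so it exposes predict_proba; the estimator names are illustrative):
```python
# Soft voting: average the predicted class probabilities across classifiers
soft_voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),  # enables predict_proba
    ],
    voting="soft",
)
soft_voting_clf.fit(X_train, y_train)
```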
What is bagging in ensemble learning?
Bagging (short for bootstrap aggregating) is the ensemble method where a single type of algorithm is trained on multiple samples of the training data to create multiple predictors (see the sketch after this list):
- The same training algorithm is trained on different random subsets of the training set.
- Each random subset produces one predictor.
- The random sampling of the training set is performed with replacement.
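A minimal bagging sketch with Scikit-Learn's BaggingClassifier; the 500 trees and 100-instance subsets are illustrative values:
```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),   # one type of algorithm for every predictor
    n_estimators=500,           # number of predictors in the ensemble
    max_samples=100,            # size of each random subset
    bootstrap=True,             # sampling with replacement = bagging
    n_jobs=-1,
    random_state=42,
)
bag_clf.fit(X_train, y_train)
```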
What is pasting?
Pasting is the same as bagging, except the training data is randomly sampled without replacement.
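In Scikit-Learn this is the same BaggingClassifier sketch as above, but with bootstrap=False:
```python
# Pasting: sample the training instances without replacement
paste_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    max_samples=100,
    bootstrap=False,  # without replacement = pasting
    random_state=42,
)
```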
What is the ensemble prediction from bagging or pasting?
The aggregate of predictions from all of the predictors:
- Statistical mode for classification.
- Statistical mean for regression.
What is the benefit of using bagging or pasting?
- Each individual predictor has a higher bias than if it were trained on the full training set.
- Aggregation reduces both bias and variance.
- The net result is that the ensemble typically has a similar bias but a lower variance than a single predictor trained on the full training set.
What is the Random Patches method in bagging?
The Random Patches method refers to sampling both the training instances (random subsets of the training data) and the features (random subsets of the input columns).
What is the Random Subspaces method?
This is similar to Random Patches, except only the features are randomly sampled; all training instances are kept.
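Both can be expressed with BaggingClassifier's feature-sampling parameters; the fractions below are illustrative assumptions:
```python
# Random Patches: sample both training instances and features
patches_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=0.7, bootstrap=True,             # random subsets of instances
    max_features=0.6, bootstrap_features=True,   # random subsets of features
    random_state=42,
)

# Random Subspaces: keep all training instances, sample only the features
subspaces_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=1.0, bootstrap=False,            # use every training instance
    max_features=0.6, bootstrap_features=True,
    random_state=42,
)
```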
What does Boosting refer to?
Boosting refers to any ensemble method that combines several weak learners into a strong learner. The general idea of most boosting algorithms is to train predictors sequentially, each one trying to correct its predecessor.
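As one concrete example, AdaBoost in Scikit-Learn trains a sequence of weak learners (decision stumps here), putting more weight on the training instances the previous predictor got wrong; the hyperparameter values below are illustrative:
```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # weak learner: a decision stump
    n_estimators=200,                     # number of sequential predictors
    learning_rate=0.5,
    random_state=42,
)
ada_clf.fit(X_train, y_train)
```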