SUL Topic 5b - Random Forest Flashcards
Ensemble Models
Aggregation of multiple models where final prediction combines component model predictions
Random Forest
Ensemble learning method combining multiple decision trees for improved accuracy and stability
Decision Tree Limitations
Individual trees are easy to build but often inaccurate on new data; random forests overcome this by combining the simplicity of trees with the flexibility of an ensemble
Bagging
Technique using bootstrap data sets and aggregating predictions to enhance predictive power
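A minimal sketch of bagging using only numpy, assuming a toy regression task and a hypothetical one-split "stump" learner for illustration: each stump is fit on a bootstrap sample (rows drawn with replacement) and the final prediction averages all stumps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical): y depends noisily on x.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + rng.normal(0, 1, size=200)

def fit_stump(xs, ys):
    """Fit a one-split 'stump': predict the mean of y on each side of the median x."""
    split = np.median(xs)
    return split, ys[xs <= split].mean(), ys[xs > split].mean()

def predict_stump(model, xq):
    split, left, right = model
    return np.where(xq <= split, left, right)

# Bagging: fit each stump on a bootstrap sample, then aggregate by averaging.
models = []
for _ in range(50):
    idx = rng.integers(0, len(x), size=len(x))  # sample rows WITH replacement
    models.append(fit_stump(x[idx], y[idx]))

xq = np.array([2.5, 7.5])
bagged = np.mean([predict_stump(m, xq) for m in models], axis=0)
```

Averaging across bootstrap-trained models smooths out the high variance of any single stump, which is the point of bagging.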
Out-of-Bag Error Rate
Metric for assessing random forest accuracy and generalization to unseen data
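One way to compute this in practice, assuming scikit-learn and a synthetic dataset for illustration: setting `oob_score=True` makes each tree get evaluated on the rows it never saw during training.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical synthetic data for illustration.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# oob_score=True evaluates each tree on the rows it did NOT see
# during training (its out-of-bag sample).
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

oob_error = 1.0 - rf.oob_score_  # out-of-bag error rate
```

Because out-of-bag rows act like a built-in validation set, this estimates generalization without holding out separate test data.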
Hyperparameter Tuning
Process of optimizing model performance by adjusting settings such as the number of variables considered at each split
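A short tuning sketch, assuming scikit-learn and synthetic data: a grid search over `max_features`, the number of variables each split is allowed to consider.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hypothetical synthetic data for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)

# Tune the number of variables considered at each split (max_features)
# via cross-validated grid search.
grid = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=1),
    param_grid={"max_features": [2, 3, 4]},
    cv=3,
)
grid.fit(X, y)
best = grid.best_params_["max_features"]
```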
Continuous Improvement
Ongoing refinement of random forest techniques to expand capabilities and applications
Forest Algorithm
Random sampling of rows (with replacement) and of columns at each split, producing more diverse trees than bagging alone
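The two sampling steps above can be sketched in numpy (illustrative dimensions only): rows are bootstrapped once per tree, while a fresh column subset is drawn at each split, commonly of size √p.

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows, n_cols = 100, 9
X = rng.normal(size=(n_rows, n_cols))  # hypothetical feature matrix

# Per tree: bootstrap the rows (with replacement).
row_idx = rng.integers(0, n_rows, size=n_rows)

# Per split: sample a subset of columns (without replacement).
m = int(np.sqrt(n_cols))  # a common default: sqrt of the feature count
col_idx = rng.choice(n_cols, size=m, replace=False)

X_split = X[row_idx][:, col_idx]
```

Sampling columns as well as rows decorrelates the trees, which is what distinguishes a random forest from plain bagging.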
Advantages of Random Forests
Automatic handling of missing values
Better prediction
Natural safeguard against overfitting
Bootstrap Sample
Random sample of the training data, drawn with replacement, used to construct an individual tree in a random forest
Out-of-Bag Sample
Training data not drawn into a tree's bootstrap sample, and therefore excluded during that tree's construction; used to estimate its error
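A numpy sketch of how the out-of-bag rows fall out of bootstrap sampling (illustrative sizes only): on average roughly 1/e ≈ 37% of rows are never drawn and become the out-of-bag sample.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Draw one bootstrap sample of row indices (with replacement).
boot = rng.integers(0, n, size=n)

# Out-of-bag rows: those never drawn into the bootstrap sample.
oob = np.setdiff1d(np.arange(n), boot)
frac_oob = len(oob) / n  # about 1/e ≈ 0.368 of the rows, on average
```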
Variable Reduction
Automatic variable selection performed by random forests, reducing the data preparation required
Honest Assessment
Unbiased evaluation of model performance provided naturally by bootstrap sampling and bagging in forests
Interpretation Challenge
Random forests are difficult to interpret but serve as an ideal model for comparison
Iterative Testing
Process of adjusting settings to select the most accurate random forest model