SUL Topic 5b - Random Forest Flashcards
Ensemble Models
Aggregation of multiple models where final prediction combines component model predictions
Random Forest
Ensemble learning method combining multiple decision trees for improved accuracy and stability
Decision Tree Limitations
Individual trees are easy to build but often inaccurate on new data; random forests overcome this by combining the simplicity of trees with the flexibility of an ensemble
Bagging
Technique using bootstrap data sets and aggregating predictions to enhance predictive power
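A minimal sketch of bagging using only numpy, assuming a toy regression task and a hypothetical one-split "stump" learner for illustration: each stump is fit on a bootstrap sample (rows drawn with replacement) and the final prediction averages all stumps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical): y depends noisily on x.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + rng.normal(0, 1, size=200)

def fit_stump(xs, ys):
    """Fit a one-split 'stump': predict the mean of y on each side of the median x."""
    split = np.median(xs)
    return split, ys[xs <= split].mean(), ys[xs > split].mean()

def predict_stump(model, xq):
    split, left, right = model
    return np.where(xq <= split, left, right)

# Bagging: fit each stump on a bootstrap sample, then aggregate by averaging.
models = []
for _ in range(50):
    idx = rng.integers(0, len(x), size=len(x))  # sample rows WITH replacement
    models.append(fit_stump(x[idx], y[idx]))

xq = np.array([2.5, 7.5])
bagged = np.mean([predict_stump(m, xq) for m in models], axis=0)
```

Averaging across bootstrap-trained models smooths out the high variance of any single stump, which is the point of bagging.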
Out-of-Bag Error Rate
Metric for assessing random forest accuracy and generalization to unseen data
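One way to compute this in practice, assuming scikit-learn and a synthetic dataset for illustration: setting `oob_score=True` makes each tree get evaluated on the rows it never saw during training.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical synthetic data for illustration.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# oob_score=True evaluates each tree on the rows it did NOT see
# during training (its out-of-bag sample).
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

oob_error = 1.0 - rf.oob_score_  # out-of-bag error rate
```

Because out-of-bag rows act like a built-in validation set, this estimates generalization without holding out separate test data.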
Hyperparameter Tuning
Process of optimizing model performance by adjusting settings such as the number of variables considered at each split
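A short tuning sketch, assuming scikit-learn and synthetic data: a grid search over `max_features`, the number of variables each split is allowed to consider.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hypothetical synthetic data for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)

# Tune the number of variables considered at each split (max_features)
# via cross-validated grid search.
grid = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=1),
    param_grid={"max_features": [2, 3, 4]},
    cv=3,
)
grid.fit(X, y)
best = grid.best_params_["max_features"]
```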
Continuous Improvement
Ongoing refinement of random forest techniques to expand capabilities and applications
Forest Algorithm
Random sampling of rows (with replacement) and of columns at each split, producing more diverse trees than bagging alone
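The two sampling steps above can be sketched in numpy (illustrative dimensions only): rows are bootstrapped once per tree, while a fresh column subset is drawn at each split, commonly of size √p.

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows, n_cols = 100, 9
X = rng.normal(size=(n_rows, n_cols))  # hypothetical feature matrix

# Per tree: bootstrap the rows (with replacement).
row_idx = rng.integers(0, n_rows, size=n_rows)

# Per split: sample a subset of columns (without replacement).
m = int(np.sqrt(n_cols))  # a common default: sqrt of the feature count
col_idx = rng.choice(n_cols, size=m, replace=False)

X_split = X[row_idx][:, col_idx]
```

Sampling columns as well as rows decorrelates the trees, which is what distinguishes a random forest from plain bagging.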
Advantages of Random Forests
Automatic handling of missing values
Better prediction
Natural safeguard against overfitting
Bootstrap Sample
Random sample of the training data, drawn with replacement, used to construct an individual tree in a random forest
Out-of-Bag Sample
Training data not drawn into a tree's bootstrap sample, and therefore excluded during that tree's construction; used to estimate its error
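A numpy sketch of how the out-of-bag rows fall out of bootstrap sampling (illustrative sizes only): on average roughly 1/e ≈ 37% of rows are never drawn and become the out-of-bag sample.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Draw one bootstrap sample of row indices (with replacement).
boot = rng.integers(0, n, size=n)

# Out-of-bag rows: those never drawn into the bootstrap sample.
oob = np.setdiff1d(np.arange(n), boot)
frac_oob = len(oob) / n  # about 1/e ≈ 0.368 of the rows, on average
```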
Variable Reduction
Automatic variable selection performed by random forests, reducing the data preparation required
Honest Assessment
Unbiased evaluation of model performance provided naturally by bootstrap sampling and bagging in forests
Interpretation Challenge
Random forests are difficult to interpret but serve as an ideal model for comparison
Iterative Testing
Process of adjusting settings to select the most accurate random forest model