Decision Trees Flashcards
What are ensembles?
Ensembles are techniques that combine multiple machine learning models to create a more powerful model. Two widely used ensembles that build on decision trees are random forests and gradient boosted decision trees.
What are random forests?
A random forest is an ensemble method designed to overcome the overfitting disadvantage of decision trees. It builds many different decision trees, each of which overfits in its own way. Averaging their predictions gives an overall model that overfits less than any single tree.
Why are random forests called random?
The aim of this ensemble method is to produce trees that differ from one another, so randomness has to be injected when each tree is built. This is done in two ways: each tree is trained on a different random sample of the dataset, and each split considers only a random subset of the features.
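The two sources of randomness described above can be sketched in a few lines of plain Python. This is only an illustration, not a full tree-building routine: the helper names `bootstrap_sample` and `random_feature_subset` are hypothetical, and it assumes the common setup in which rows are drawn with replacement (a bootstrap sample) and a fixed number of candidate features is drawn at each split.

```python
import random


def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement (assumed bootstrap sampling)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, max_features, rng):
    """Pick max_features distinct feature indices as candidates for one split."""
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_features = 10, 6

for tree_id in range(3):
    rows = bootstrap_sample(n_rows, rng)              # data seen by this tree
    candidates = random_feature_subset(n_features, 2, rng)  # features for one split
    print(tree_id, rows, candidates)
```

Because rows are drawn with replacement, a tree usually sees some rows more than once and misses others entirely, which is one reason the trees end up different from each other.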
When you need to classify a point, how does classification take place in a random forest?
In classification, each test data point is passed down every tree in the forest, and each tree outputs a soft probability for each class. These probabilities are averaged across all the trees, and the class with the highest average probability is predicted.
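The averaging of per-tree class probabilities can be illustrated with a small sketch. The function name `soft_vote` and the probability values are made up for illustration. Each inner list stands for the class probabilities that one tree assigns to a single test point.

```python
def soft_vote(tree_probabilities):
    """Average per-tree class probabilities and return (averages, best class)."""
    n_trees = len(tree_probabilities)
    n_classes = len(tree_probabilities[0])
    averaged = [
        sum(probs[c] for probs in tree_probabilities) / n_trees
        for c in range(n_classes)
    ]
    # The predicted class is the one with the highest averaged probability.
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return averaged, predicted


# Hypothetical output of four trees for one test point and two classes.
per_tree = [
    [0.75, 0.25],
    [0.50, 0.50],
    [0.25, 0.75],
    [1.00, 0.00],
]

averaged, predicted = soft_vote(per_tree)
print(averaged, predicted)  # [0.625, 0.375] 0
```

Note that a confident tree (the last one here) pulls the average more than an undecided one, which is the difference between averaging probabilities and simply counting one vote per tree.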