Wronged Questions: Decision Trees Flashcards
T/F: A small shrinkage parameter requires more iterations because it results in a slower learning rate.
True
T/F: Boosting can lead to overfitting if you have a high number of iterations
True
As K increases, flexibility (increases/decreases).
Decreases
Classification error rate, Gini index, and entropy are (inappropriate/appropriate) when pruning a tree.
Appropriate
T/F: Decision trees are easier to interpret than linear models.
True
T/F: Decision trees are more robust than linear models.
False. Decision trees are generally less robust than linear models; they can produce significantly different outcomes with small changes in the input data.
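As an illustration of that sensitivity, here is a minimal sketch (using scikit-learn and a synthetic dataset, both chosen only for illustration) that fits two trees on nearly identical subsets of the same data; the root split they choose can differ.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=200, noise=1.0, random_state=0)

# Drop a different handful of observations from each copy of the data.
rng = np.random.default_rng(0)
keep_a = rng.choice(len(y), size=190, replace=False)
keep_b = rng.choice(len(y), size=190, replace=False)

tree_a = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X[keep_a], y[keep_a])
tree_b = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X[keep_b], y[keep_b])

# The root split (feature index and threshold) can differ between the two fits.
print("tree A root split:", tree_a.tree_.feature[0], tree_a.tree_.threshold[0])
print("tree B root split:", tree_b.tree_.feature[0], tree_b.tree_.threshold[0])
```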
T/F: Decision trees handle qualitative predictors more easily than linear models.
True. Decision trees naturally handle qualitative (categorical) predictors without the need for preprocessing steps such as creating dummy variables, which are often required in linear models.
T/F: In boosting, the number of terminal nodes in each tree is independent of the number of splits.
False. In boosting, the number of terminal nodes in each tree is directly related to the number of splits. The number of terminal nodes (leaves) in a tree is one more than the number of splits.
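A quick numeric check of that relationship, sketched with scikit-learn (the dataset and tree depth are arbitrary illustrative choices):

```python
from sklearn.datasets import make_friedman1
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=200, random_state=0)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

n_leaves = tree.get_n_leaves()
n_splits = tree.tree_.node_count - n_leaves   # internal nodes = number of splits
assert n_leaves == n_splits + 1               # terminal nodes = splits + 1
print(n_splits, "splits ->", n_leaves, "terminal nodes")
```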
T/F: Boosting does not allow for the adjustment of model complexity through the parameter d.
False. The parameter d (the interaction depth) is specifically used to adjust the complexity of the model in boosting.
T/F: Boosting considers only a random subset of predictors at each node in every tree.
False. In classical boosting algorithms, all available predictors are considered at each split, not a random subset.
T/F: A smaller value of d in boosting necessitates a larger number of trees to adequately model the data.
True. A smaller d implies simpler base learners (trees), which individually capture less of the data’s complexity. Therefore, more trees are needed to aggregate enough information to model the data effectively.
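One way to see this trade-off, sketched with scikit-learn's GradientBoostingRegressor (the synthetic data, learning rate, and tree counts are illustrative assumptions, not from the card):

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)

for n_trees in (50, 500, 5000):
    model = GradientBoostingRegressor(max_depth=1,           # d = 1 (stumps)
                                      n_estimators=n_trees,  # B
                                      learning_rate=0.01,
                                      random_state=0).fit(X, y)
    print(n_trees, "trees -> training MSE",
          round(mean_squared_error(y, model.predict(X)), 3))
```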
T/F: Like bagging, boosting is a general approach that can be applied to many statistical learning methods for regression or classification.
True
T/F: Each tree is fit on a modified version of the bootstrapped samples for boosting.
False. Boosting does not involve bootstrap sampling; instead, each tree is fit on a modified version of the original data set.
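A minimal sketch of this residual-fitting idea, in the spirit of ISLR's boosting-for-regression algorithm; the variable names (n_trees, shrinkage, depth) and the synthetic dataset are illustrative choices:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import make_friedman1

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)

n_trees, shrinkage, depth = 500, 0.01, 1   # B, lambda, d
f_hat = np.zeros_like(y)                   # start with f(x) = 0
residual = y.copy()                        # r = y
trees = []

for _ in range(n_trees):
    # Each tree is fit to the current residuals of the ORIGINAL data,
    # not to a bootstrap sample.
    tree = DecisionTreeRegressor(max_depth=depth).fit(X, residual)
    update = shrinkage * tree.predict(X)
    f_hat += update                        # f <- f + lambda * f_b
    residual -= update                     # r <- r - lambda * f_b
    trees.append(tree)

print("training MSE:", np.mean((y - f_hat) ** 2))
```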
T/F: Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly.
True
T/F: In boosting, unlike in bagging, the construction of each tree depends strongly on the trees that have already been grown.
True
T/F: Like bagging, boosting involves combining a large number of decision trees.
True
T/F: Unlike bagging and random forests, boosting can overfit if B is too large.
True
T/F: Cross-validation is used to select B.
True
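A minimal sketch of selecting B by cross-validation, assuming scikit-learn's GridSearchCV as the tool and an arbitrary candidate grid:

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)

grid = GridSearchCV(
    GradientBoostingRegressor(max_depth=1, learning_rate=0.1, random_state=0),
    param_grid={"n_estimators": [100, 500, 1000, 2000]},  # candidate B values
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)
print("selected B:", grid.best_params_["n_estimators"])
```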
T/F: A very small value of the shrinkage parameter can require using a very large number of trees to achieve good performance.
True
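To illustrate, the sketch below traces test error as trees are added under a very small shrinkage parameter (scikit-learn's learning_rate); the dataset and settings are illustrative:

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(learning_rate=0.001,   # very small shrinkage
                                  n_estimators=5000, max_depth=2,
                                  random_state=0).fit(X_tr, y_tr)

# Test MSE after 100, 1000, and 5000 trees; with tiny shrinkage the error
# typically keeps improving well into the thousands of trees.
for b, y_hat in enumerate(model.staged_predict(X_te), start=1):
    if b in (100, 1000, 5000):
        print(b, "trees -> test MSE", round(mean_squared_error(y_te, y_hat), 3))
```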
T/F: An interaction depth of zero often works well and the boosted ensemble is fitting an additive model.
False. An interaction depth of one often works well, in which case the boosted ensemble is fitting an additive model.
T/F: In boosting, because the growth of a particular tree takes into account the other trees that have already been grown, smaller trees are typically sufficient.
True
T/F: Individual trees in a random forest are left unpruned, contributing to the ensemble’s variance reduction despite their own overfitting.
True. In a random forest, individual trees are typically grown to their full depth without pruning, which might make them prone to overfitting. However, when these overfitted trees are aggregated, the ensemble model achieves a significant reduction in variance.
T/F: The combination of results from unpruned trees in a random forest leads to a reduction in the overall variance of the model.
True. This is the ensemble effect: aggregating multiple unpruned, overfitted trees yields a model with reduced overall variance, with the strength of the ensemble balancing out the overfitting of individual trees.
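A small sketch of that ensemble effect, assuming scikit-learn and a synthetic dataset: a single unpruned tree versus an averaged forest of unpruned trees.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

single = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)   # grown to full depth
forest = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_tr, y_tr)

print("single unpruned tree test MSE:",
      round(mean_squared_error(y_te, single.predict(X_te)), 3))
print("random forest test MSE:      ",
      round(mean_squared_error(y_te, forest.predict(X_te)), 3))
```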
T/F: Increasing m leads to a higher degree of decorrelation between the trees, where m is the number of predictors chosen as split candidates at each split.
False. A larger value of m tends to increase the correlation between the trees, so decorrelation increases as m decreases. The parameter m is the number of predictors chosen as split candidates at each split.
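In scikit-learn, max_features plays the role of m; the hedged comparison below contrasts m = p (effectively bagging) with a smaller m on a synthetic dataset, both chosen only for illustration.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=1000, n_features=10, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for m in (10, 3):   # m = p (bagging) versus a smaller m (random forest)
    forest = RandomForestRegressor(n_estimators=500, max_features=m,
                                   random_state=0).fit(X_tr, y_tr)
    print("m =", m, "-> test MSE",
          round(mean_squared_error(y_te, forest.predict(X_te)), 3))
```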