Wronged Questions: Decision Trees Flashcards

1
Q

T/F: A small shrinkage parameter requires more iterations because it makes the model learn more slowly

A

True

2
Q

T/F: Boosting can lead to overfitting if you have a high number of iterations

A

True

3
Q

In KNN, as K increases, flexibility (increases/decreases).

A

Decreases

4
Q

Classification error rate, Gini index, and entropy are (inappropriate/appropriate) when pruning a tree.

A

Appropriate

5
Q

T/F: Decision trees are easier to interpret than linear models.

A

True

6
Q

T/F: Decision trees are more robust than linear models.

A

False. Decision trees are generally less robust than linear models; they can produce significantly different outcomes with small changes in the input data.

7
Q

T/F: Decision trees handle qualitative predictors more easily than linear models.

A

True. Decision trees naturally handle qualitative (categorical) predictors without the need for preprocessing steps such as creating dummy variables, which are often required in linear models.

8
Q

T/F: In boosting, the number of terminal nodes in each tree is independent of the number of splits.

A

False. In boosting, the number of terminal nodes in each tree is directly related to the number of splits. The number of terminal nodes (leaves) in a tree is one more than the number of splits.

9
Q

T/F: Boosting does not allow for the adjustment of model complexity through the parameter d.

A

False. The parameter d (the interaction depth) is specifically used to adjust the complexity of the model in boosting.

10
Q

T/F: Boosting considers only a random subset of predictors at each node in every tree.

A

False. In classical boosting algorithms, all available predictors are considered at each split, not a random subset.

11
Q

T/F: A smaller value of d in boosting necessitates a larger number of trees to adequately model the data.

A

True. A smaller d implies simpler base learners (trees), which individually capture less of the data's complexity. Therefore, more trees are needed to aggregate enough information to model the data effectively.
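
To make the tuning parameters concrete, here is a minimal scikit-learn sketch (my own illustration, not part of the deck) showing how the shrinkage parameter, the number of trees B, and the interaction depth d appear as hyperparameters; the synthetic data and the specific values are illustrative assumptions.

```python
# A minimal sketch, assuming scikit-learn; data and values are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=5.0, random_state=0)

boost = GradientBoostingRegressor(
    n_estimators=2000,   # B: small shrinkage and small d typically push B up
    learning_rate=0.01,  # shrinkage parameter: learn slowly
    max_depth=1,         # d: interaction depth; d = 1 fits an additive model
    random_state=0,
).fit(X, y)
```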

12
Q

T/F: Like bagging, boosting is a general approach that can be applied to many statistical learning methods for regression or classification.

A

True

13
Q

T/F: Each tree is fit on a modified version of the bootstrapped samples for boosting.

A

False. Boosting does not involve bootstrap sampling; instead each tree is fit on a modified version of the original data set.

14
Q

T/F: Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly.

A

True

15
Q

T/F: In boosting, unlike in bagging, the construction of each tree depends strongly on the trees that have already been grown.

A

True

16
Q

T/F: Like bagging, boosting involves combining a large number of decision trees.

A

True

17
Q

T/F: Unlike bagging and random forests, boosting can overfit if B is too large.

A

True

18
Q

T/F: Cross-validation is used to select B.

A

True

19
Q

T/F: A very small value of the shrinkage parameter can require using a very large number of trees to achieve good performance.

A

True

20
Q

T/F: An interaction depth of zero often works well and the boosted ensemble is fitting an additive model.

A

False. An interaction depth of one often works well, in which case the boosted ensemble fits an additive model.

21
Q

T/F: In boosting, because the growth of a particular tree takes into account the other trees that have already been grown, smaller trees are typically sufficient.

A

True

22
Q

T/F: Individual trees in a random forest are left unpruned, contributing to the ensemble’s variance reduction despite their own overfitting.

A

True. In a random forest, individual trees are typically grown to their full depth without pruning, which might make them prone to overfitting. However, when these overfitted trees are aggregated, the ensemble model achieves a significant reduction in variance.

23
Q

T/F: The combination of results from unpruned trees in a random forest leads to a reduction in the overall variance of the model.

A

True. This is the ensemble effect: aggregating multiple unpruned, overfitted trees produces a model with reduced overall variance, as the ensemble balances out the overfitting of the individual trees.

24
Q

T/F: Increasing m leads to a higher degree of decorrelation between the trees, where m is the number of predictors chosen as split candidates at each split.

A

False. A larger value of m tends to increase the correlation between the trees; decreasing m increases the decorrelation. The parameter m is the number of predictors chosen as split candidates at each split.

25
Q

T/F: Random forests can effectively handle both regression and classification problems.

A

True. Random forests are versatile and can be applied to both regression and classification problems.

26
Q

T/F: A variable importance plot is a useful tool for identifying which predictors are most influential in a random forest model.

A

True. Variable importance plots are indeed utilized to identify the most important predictors in a random forest, offering insights into how different variables contribute to the predictive power of the model.
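
As a concrete illustration, here is a minimal sketch of a variable importance plot assuming scikit-learn and matplotlib; the dataset and hyperparameters are illustrative choices, not anything prescribed by the cards.

```python
# A minimal sketch of a variable importance plot, assuming scikit-learn.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Importances here are the mean decrease in impurity, averaged over all trees.
order = rf.feature_importances_.argsort()
plt.barh(data.feature_names[order], rf.feature_importances_[order])
plt.xlabel("Mean decrease in impurity")
plt.tight_layout()
plt.show()
```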

27
Q

T/F: Bagging and random forests make use of bootstrapped samples in their algorithms, but boosting does not.

A

True. Bagging and random forests employ bootstrapped samples to build multiple decision trees, enhancing model accuracy and robustness by aggregating their predictions.

Conversely, boosting sequentially constructs trees, each focusing on correcting errors from previous ones, without utilizing bootstrapped samples to improve performance.

28
Q

T/F: Boosting can overfit if the number of iterations is set too high, unlike bagging or random forests.

A

True. Boosting’s performance can be sensitive to the number of iterations, leading to potential overfitting.

29
Q

T/F: The optimal number of iterations in boosting is often determined through cross-validation.

A

True. Cross-validation is commonly used to select the optimal B in boosting to balance bias and variance.

30
Q

T/F: For bagging and random forests, the choice of B is less critical to avoiding overfitting compared to boosting.

A

True. Bagging and random forests are generally robust against overfitting due to their aggregation methods, making the specific choice of B less critical.

31
Q

T/F: Pruning is a common strategy in both bagging and boosting to prevent overfitting.

A

False. Pruning is not used in either bagging or boosting as a strategy to prevent overfitting.

32
Q

T/F: Bagging significantly reduces variance by averaging multiple predictions.

A

True. One of the primary advantages of bagging is its ability to reduce the variance of complex models, like deep/large decision trees, by averaging the predictions of multiple bootstrapped models, which tends to make the ensemble prediction more robust than any single model.

33
Q

T/F: Each bagged tree uses approximately one-third of observations from the original training set.

A

False. In bagging, each tree, on average, makes use of around two-thirds of the observations due to the nature of bootstrap sampling, where some observations are repeated, and others are left out.
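
As a quick check of the two-thirds figure, here is the standard bootstrap calculation (a worked equation I am adding for reference, not part of the original card):

```latex
% Probability that a given observation is NOT in a bootstrap sample of size n:
\left(1 - \tfrac{1}{n}\right)^{n} \;\longrightarrow\; e^{-1} \approx 0.368
\quad\text{as } n \to \infty,
% so each tree sees roughly 1 - e^{-1} \approx 2/3 of the original observations,
% and the remaining ~1/3 are its out-of-bag observations.
```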

34
Q

T/F: Bagging is exclusively effective for decision trees and cannot be applied to other statistical learning methods.

A

False. Bagging is a general-purpose procedure that can be applied to many types of statistical learning methods, not just decision trees, although it is particularly beneficial for models that exhibit high variance.

35
Q

T/F: On average, (p-m)/p of the splits will not even consider the strong predictor.

A

True

36
Q

T/F: The main difference between bagging and random forests is the choice of predictor subset size.

A

True

37
Q

T/F: If a random forest is built using m = √p, then this amounts to bagging.

A

False. If a random forest is built using m = p, then this amounts to bagging.
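
If it helps to see this relationship concretely, here is a hedged scikit-learn sketch (my illustration; the synthetic data and n_estimators value are assumptions): letting every split consider all p predictors reproduces bagging, while a subset of size √p gives the usual decorrelated random forest.

```python
# A minimal sketch of the m-vs-p relationship, assuming scikit-learn.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=20, noise=1.0, random_state=0)

# m = p: every split may use all predictors, which is equivalent to bagging.
bagging_like = RandomForestRegressor(
    n_estimators=300, max_features=None, random_state=0).fit(X, y)

# m = sqrt(p): each split sees a random subset, decorrelating the trees.
random_forest = RandomForestRegressor(
    n_estimators=300, max_features="sqrt", random_state=0).fit(X, y)
```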

38
Q

T/F: Using a small value of m in building a random forest will typically be helpful when we have a large number of correlated predictors.

A

True

39
Q

T/F: Random forests will not overfit if we increase B, so in practice we use a value of B sufficiently large for the error rate to have settled down.

A

True

40
Q

T/F: An alpha value of zero results in the largest, unpruned tree.

A

True. An α value of zero implies no penalty on the tree's complexity, and thus the tree grows to its largest size without any pruning. This results in the most complex tree possible.

41
Q

T/F: Increasing alpha leads to a decrease in the variance of the model.

A

True. As the tree becomes simpler with higher α values, its variance decreases due to reduced model flexibility.

42
Q

T/F: Increasing alpha leads to a decrease in the squared bias of the fitted tree.

A

False. Increasing alpha leads to an increase in the squared bias of the fitted tree. This is because increasing the value of α in cost complexity pruning penalizes the addition of splits to the tree, resulting in a simpler tree (lower variance).

A simpler tree is less flexible in fitting the data, which leads to an increase in the squared bias as the model becomes increasingly unable to capture the underlying patterns in the data.

43
Q

Three non-parametric statistical learning methods

A

KNN, Decision Trees, Bagging/Random Forest/Boosting

44
Q

T/F: To build each tree using random forests, a bootstrapped sample of n observations is used, and for each split within the tree, a new random selection of m predictors is made.

A

True. In random forests, each tree is built from a bootstrapped sample of the original dataset, containing n observations. At each split in the construction of a tree within a random forest, a random subset of m predictors is selected from all available predictors.

45
Q

T/F: Out-of-bag estimation can be used to estimate the test error for random forests.

A

True. Out-of-bag estimation, which uses each observation’s predictions from trees where that observation was not in the bootstrap sample, provides an estimate of the test error without needing a separate test set.
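
The sketch below shows what OOB estimation looks like in practice, assuming scikit-learn; the synthetic regression data and the hyperparameters are illustrative assumptions, not anything from the deck.

```python
# A minimal sketch of out-of-bag (OOB) error estimation with scikit-learn.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=5.0, random_state=0)

rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)

# Each observation is predicted only by the trees that did not see it during
# training, so this MSE estimates the test error without a separate test set.
oob_mse = np.mean((y - rf.oob_prediction_) ** 2)
print(f"OOB MSE: {oob_mse:.2f}")
```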

46
Q

T/F: Random forests reduce bias through the averaging of multiple decorrelated trees.

A

False. The main benefit of random forests is variance reduction, not bias reduction. Random forests reduce variance by averaging the results of multiple decorrelated trees. While this ensemble method is effective at addressing overfitting and reducing variance, a reduction in bias is not guaranteed.

47
Q

T/F: Pruning always decreases both training and test error rates.

A

False. Pruning does not always decrease both error rates.

48
Q

T/F: Pruning reduces the training error rate but may increase the test error rate.

A

False. Pruning typically increases the training error rate due to less perfect fitting to the training data; the impact on the test error rate can be mixed.

49
Q

T/F: Pruning increases the training error rate but has a predictable effect on reducing the test error rate.

A

False. There is usually not a predictable effect on the test error rate.

50
Q

T/F: Pruning tends to increase the training error rate, but its effect on the test error rate can vary.

A

True. Pruning a classification tree involves removing splits that provide less generalizable decision-making ability, leading to a simpler model. While this process makes the tree less flexible and typically increases the training error rate due to less perfect fitting to the training data, the impact on the test error rate can be mixed.

51
Q

T/F: Pruning a classification tree has no impact on the training error rate but tends to decrease the test error rate.

A

False. Pruning a classification tree increases the training error rate. We still do it to prevent overfitting.

52
Q

T/F: Bagging increases the interpretability of a model.

A

False. Bagging does not make the model more interpretable; in fact, it makes it less interpretable.

53
Q

T/F: Out-of-bag (OOB) error estimation requires each observation to be predicted by all trees in the ensemble.

A

False. OOB error estimation uses only the trees for which the specific observation was not in the bootstrap sample.

54
Q

T/F: A very high number of bootstrap samples will inevitably lead to overfitting in bagged models.

A

False. Increasing the number of trees does not lead to overfitting due to the aggregation of predictions.

55
Q

T/F: Bagging cannot improve prediction accuracy in classification settings.

A

False. Bagging is indeed useful for improving prediction accuracy in classification settings.

56
Q

T/F: For a sufficiently large number of bootstrap samples, out-of-bag error is virtually equivalent to leave-one-out-cross-validation error.

A

True. For a sufficiently large number of trees, out-of-bag error is virtually equivalent to leave-one-out cross-validation error. OOB error estimation, which uses each tree’s predictions on observations not included in its bootstrap sample, provides a reliable estimate of the model’s performance that becomes increasingly accurate as the number of bootstrap samples increases.

57
Q

T/F: Each bootstrapped dataset in bagging contains a completely different set of observations from the original dataset.

A

False. Each bootstrapped dataset likely contains repeated observations due to sampling with replacement, and not all observations will be different.

58
Q

T/F: First step in building a regression tree: Use recursive binary splitting to grow a large tree on the training data, stopping only when each terminal node has fewer than some minimum number of observations.

A

True

59
Q

T/F: Second step in building a regression tree: Apply cost complexity pruning to the large tree in order to obtain a sequence of best subtrees, as a function of a nonnegative tuning parameter.

A

True

60
Q

T/F: Third step in building a regression tree: Utilize k-fold cross-validation to select the tuning parameter α by evaluating the mean squared prediction error on the data in the left-out kth fold, as a function of k.

A

False. The third step is to utilize k-fold cross-validation to select the tuning parameter α by evaluating the mean squared prediction error on the data in the left-out kth fold, as a function of α (not k).

61
Q

T/F: Fourth step in building a regression tree: Average the test errors from the k-fold cross-validation for each value of the tuning parameter alpha, and pick the alpha that minimizes the average error.

A

True

62
Q

T/F: The final step is to return to the full data set and obtain the subtree corresponding to alpha that minimizes the average error.

A

True
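
Cards 58 through 62 describe the regression-tree recipe end to end; the sketch below is my own illustration of the same steps using scikit-learn (the synthetic data, min_samples_leaf, and cv=5 are assumptions, not part of the deck).

```python
# A minimal sketch of the five-step regression-tree recipe, assuming scikit-learn.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

# Step 1: grow a large tree, stopping only at a minimum node size.
big_tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0).fit(X, y)

# Step 2: cost complexity pruning gives a sequence of candidate alphas, each
# corresponding to a best subtree of the large tree.
alphas = big_tree.cost_complexity_pruning_path(X, y).ccp_alphas

# Steps 3-4: k-fold cross-validation of the MSE as a function of alpha.
cv = GridSearchCV(
    DecisionTreeRegressor(min_samples_leaf=5, random_state=0),
    param_grid={"ccp_alpha": alphas},
    scoring="neg_mean_squared_error",
    cv=5,
).fit(X, y)

# Step 5: refit on the full data set with the chosen alpha (GridSearchCV
# does this automatically via refit=True).
final_tree = cv.best_estimator_
```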

63
Q

Three examples of greedy algorithms

A

Recursive binary splitting, forward stepwise selection, backward stepwise selection

64
Q

T/F: Single decision tree models generally have higher variance than random forest models.

A

True. Random forest models average the result of each individual tree to obtain a single prediction for each observation. The action of averaging reduces the variance.

65
Q

T/F: Random forests provide an improvement over bagging because predictions from the trees in a random forest are less correlated than those in bagged trees.

A

True. Random forests consider a random subset of predictors at each split to decorrelate trees.

66
Q

T/F: Classification error is sufficiently sensitive for tree-growing.

A

False. Classification error is sufficiently sensitive for pruning, but not for tree-growing; the Gini index and entropy are preferred when growing the tree.

67
Q

T/F: The Gini index is a measure of total variance across the K classes.

A

True

68
Q

T/F: The Gini index takes on a small value only if all of the Pmk’s are near zero.

A

False. The Gini index takes on a small value if all of the Pmk’s are near zero or near one. For this reason the Gini index is referred to as a measure of node purity—a small value indicates that a node contains predominantly observations from a single class.

69
Q

T/F: The entropy will take on a value near zero only if the Pmk’s are all near zero.

A

False. The entropy takes on a small value if all of the Pmk’s are near zero or near one. Like the Gini index, entropy is a measure of node purity—a small value indicates that a node contains predominantly observations from a single class.
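
For reference, the two purity measures discussed in the cards above can be written out as follows (standard definitions added by me, not taken from the deck's wording):

```latex
% For node (region) m with estimated class proportions \hat{p}_{mk}, k = 1,...,K:
G_m = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk})      \quad\text{(Gini index)}
D_m = -\sum_{k=1}^{K} \hat{p}_{mk}\,\log \hat{p}_{mk}      \quad\text{(entropy)}
% Both are near zero exactly when every \hat{p}_{mk} is close to 0 or 1, i.e.
% when the node is nearly pure.
```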

70
Q

T/F: Gini index and the entropy are quite different numerically.

A

False. Gini index and the entropy are quite similar numerically.

71
Q

T/F: Bagging is a procedure used to reduce bias of a statistical learning method.

A

False. Bagging is a procedure used to reduce variance of a statistical learning method.

72
Q

T/F: Pruning is a technique applied in the construction of bagging models.

A

False. Pruning is not typically used in bagging, random forests, or boosting as part of their standard methodologies.

73
Q

T/F: Random forests utilize pruning to reduce the complexity of individual trees.

A

False. Random forests do not prune individual trees; the trees are typically grown deep and left unpruned, and variance is controlled by averaging across the ensemble.

74
Q

T/F: Boosting models are pruned to prevent overfitting.

A

False. The complexity control is achieved through parameters like the number of trees (B), the learning rate, and the depth of the trees (d) rather than pruning.

75
Q

T/F: The out-of-bag error is a valid estimate of the test error for a bagged model.

A

True.

76
Q

T/F: In bagging, the bagged trees are grown deep and then pruned.

A

False. In bagging, the bagged trees are grown deep and not pruned. The idea is to produce trees that each has a high variance but a low bias. Averaging the trees reduces the variance.

77
Q

T/F: Random forests are better than bagging at thoroughly exploring the model space.

A

True. In bagging, the greedy tree-growing algorithm tends to make similar splits in every tree (often dominated by the strongest predictors), so the bagged trees can all converge to similar, possibly suboptimal, solutions (local optima). By restricting each split to a random subset of predictors, random forests force different trees to explore different parts of the model space.

78
Q

T/F: In random forests, fitting a very large number of trees will not lead to overfitting.

A

True. It is also true for bagging. After a sufficient number of trees, the test error will settle down and flatline.

79
Q

T/F: Boosting generally requires growing smaller trees than random forests do.

A

True. This is because, in boosting, the growth of a particular tree takes into account the other trees that have been grown, unlike in random forests where the trees are grown independently of each other.

80
Q

T/F: Cost complexity pruning is a way to select a small set of subtrees for consideration, also known as weakest link pruning.

A

True. Weakest link pruning is a method used to select a limited number of subtrees from a larger set of possibilities. This approach focuses on identifying and pruning the weakest links or least important branches in the decision tree, resulting in a simpler and more interpretable model.

81
Q

T/F: Rather than considering every possible subtree, cost complexity pruning considers a sequence of trees indexed by a nonnegative tuning parameter α.

A

True. Instead of exhaustively considering every potential subtree, weakest link pruning involves evaluating a sequence of trees indexed by a nonnegative tuning parameter α. This parameter controls the complexity of the tree, allowing for the exploration of different tree sizes and structures.

82
Q

T/F: As the tuning parameter α value increases, there is a price to pay for having a tree with many terminal nodes, so the error sum of squares plus the number of terminal nodes will be minimized for a smaller subtree.

A

False. As the tuning parameter increases, there is a price to pay for having a tree with many terminal nodes, and so the RSS plus the number of terminal nodes times the tuning parameter α will tend to be minimized for a smaller subtree.
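
The criterion being described is the standard cost complexity objective; writing it out (my addition, for reference) makes the trade-off explicit:

```latex
% Cost complexity (weakest link) pruning: for each \alpha \ge 0, choose the
% subtree T \subseteq T_0 that minimizes
\sum_{m=1}^{|T|} \;\sum_{i:\,x_i \in R_m} \left(y_i - \hat{y}_{R_m}\right)^2 \;+\; \alpha\,|T|
% where |T| is the number of terminal nodes and \hat{y}_{R_m} is the mean
% response in region R_m. As \alpha grows, the penalty \alpha|T| makes the
% minimizing subtree smaller.
```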

83
Q

T/F: One difference between boosting and random forest is that in boosting, smaller trees are typically not sufficient.

A

False. In boosting, because the growth of a particular tree takes into account the other trees that have already been grown, smaller trees are typically sufficient.

84
Q

T/F: Often d=1 works well in boosting.

A

True. Often d=1 works well, in which case each tree is a stump, consisting of a single split. In this case, the boosted ensemble is fitting an additive model, since each term involves only a single variable.

85
Q

T/F: In a bagged model, an average of about one-third of the observations are used to train each bagged tree.

A

False. In a bagged model, an average of about 2/3 of the observations are used to train each bagged tree.

86
Q

T/F: For random forests, the more predictors are considered at each split, the greater the decorrelation benefit.

A

False. For random forests, the fewer predictors are considered at each split, the greater the decorrelation benefit.

87
Q

T/F: If the training dataset has a strong predictor and a few moderately strong predictors, a random forest model offers greater variance reduction than a bagged model.

A

True. This is due to the random forest model only considering a random subset of predictors at each split, preventing a strong predictor from dominating the top split and producing similar bagged trees that result in limited variance reduction.

88
Q

T/F: In a boosted model, the shrinkage parameter determines the number of splits in each tree.

A

False. In a boosted model, the interaction depth determines the number of splits in each tree.

89
Q

T/F: In a boosted model, each tree is independent from the previous tree.

A

False. In a boosted model, the trees are grown sequentially, with each tree fitted to the residuals of the previous tree, which creates dependency between the trees.
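
A minimal from-scratch sketch of this sequential, residual-fitting idea for regression (my own illustration in the spirit of the standard boosting algorithm, not the deck's code; the default B, learning_rate, and max_depth values are assumptions):

```python
# Each tree is fit to the current residuals and added with a shrinkage factor.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_fit(X, y, B=1000, learning_rate=0.01, max_depth=1):
    residuals = y.astype(float).copy()   # start from f(x) = 0, so residuals = y
    trees = []
    for _ in range(B):
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        residuals -= learning_rate * tree.predict(X)   # learn slowly
        trees.append(tree)
    return trees

def boost_predict(trees, X, learning_rate=0.01):
    # The ensemble prediction is the shrunken sum of all the trees' predictions.
    return learning_rate * sum(tree.predict(X) for tree in trees)
```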

90
Q

T/F: The classification error rate is a measure directly related to the most commonly occurring class in a node.

A

True

91
Q

T/F: Node purity increases are always accompanied by decreases in the classification error rate after a split.

A

False. A split can increase node purity (as measured by the Gini index or entropy) without reducing the classification error rate.

92
Q

T/F: Boosting reduces bias

A

True

93
Q

T/F: To apply bagging to regression trees, we simply construct B regression trees using a single bootstrapped training set, and average the resulting predictions.

A

False. To apply bagging to regression trees, we construct B regression trees using B separate bootstrapped training sets, and average the resulting predictions.

94
Q

T/F: Regression trees are grown deep, and are not pruned for bagging.

A

True

95
Q

T/F: Each individual bagged tree has high variance, but low bias.

A

True

96
Q

T/F: Averaging these B (bagged) trees reduces the variance.

A

True

97
Q

T/F: Bagging has been demonstrated to give impressive improvements in accuracy by combining together hundreds or even thousands of trees into a single procedure.

A

True

98
Q

T/F: To obtain a single prediction for the i-th observation, we average the predicted responses obtained using each of the trees where the observation is OOB if regression is the goal or take a majority vote if classification is the goal.

A

True

99
Q

T/F: An OOB prediction can be obtained for each of the entire set of n observations, from which the overall OOB MSE for a regression problem or classification error for a classification problem can be computed.

A

True

100
Q

T/F: Recursive binary splitting divides the predictor space into non-overlapping regions.

A

True. Recursive binary splitting creates a hierarchical partitioning of the predictor space, where each split further subdivides the predictor space into smaller regions.

101
Q

T/F: Recursive binary splitting can make non-orthogonal splits or splits that are not aligned with the axes of the predictor space.

A

False. Recursive binary splitting can only make orthogonal splits or splits that are aligned with the axes of the predictor space.

102
Q

T/F: Setting a strict stopping criterion to build a small tree is preferable to building a large tree and pruning it back.

A

False. Building a large tree and pruning it back is preferable to setting a strict stopping criterion to build a small tree.

103
Q

T/F: Cost complexity pruning produces a series of subtrees, as a function of the tuning parameter α, that may or may not be nested.

A

False. Cost complexity pruning produces a sequence of nested subtrees.

104
Q

T/F: It is not possible for cross-validation to select a stump as the best subtree.

A

False. If cross-validation chooses a tuning parameter value that corresponds to a stump, then the stump is the best subtree.

105
Q

In a classification tree, a split at the bottom of the tree creates two terminal nodes that have the same predicted value.

T/F: Both terminal nodes are pure.

A

False. In order for a split to produce two pure terminal nodes with the same predicted value, the node must be pure before the split, in which case the split would not happen as there would be no node purity improvement to be made.

106
Q

In a classification tree, a split at the bottom of the tree creates two terminal nodes that have the same predicted value.

T/F: The split could increase confidence in the predicted value.

A

True. The increased node purity that results from this split increases our confidence in the predicted value, especially if a test observation belongs to the purer of the two terminal nodes.

107
Q

In a classification tree, a split at the bottom of the tree creates two terminal nodes that have the same predicted value.

T/F: The split decreases the classification error rate.

A

False. It does not decrease the classification error rate.

108
Q

In a classification tree, a split at the bottom of the tree creates two terminal nodes that have the same predicted value.

T/F: The split does not decrease the Gini index.

A

False. The split does decrease the Gini index.

109
Q

In a classification tree, a split at the bottom of the tree creates two terminal nodes that have the same predicted value.

T/F: The split does not decrease the entropy.

A

False. The split does decrease the entropy.
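
A small worked example ties cards 105 through 109 together; the numbers are my own, chosen purely for illustration:

```latex
% Hypothetical node with 8 class-A and 2 class-B observations (predict A):
% classification error = 2/10, Gini = 2(0.8)(0.2) = 0.32.
% Split it into children (4A, 0B) and (4A, 2B); both still predict A, so the
% classification error stays at (0 + 2)/10 = 2/10, but the weighted Gini falls:
\tfrac{4}{10}\cdot 0 \;+\; \tfrac{6}{10}\cdot 2\!\left(\tfrac{4}{6}\right)\!\left(\tfrac{2}{6}\right) \approx 0.27 \;<\; 0.32 .
% Entropy falls similarly, so node purity (and confidence in the prediction)
% improves even though neither the predicted values nor the error rate change.
```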

110
Q

T/F: Decision trees generally have better predictive accuracy compared to other statistical methods.

A

False. They generally have worse predictive accuracy compared to other statistical methods.

111
Q

T/F: Decision trees provide accurate in-sample testing results, but exhibit poor performance in out-of-sample testing.

A

True. Decision trees are prone to overfitting, especially when the trees are grown deep. In cases like this, they fit the training data well, but do not fit the test data well.

112
Q

T/F: It is difficult to measure variable importance in decision trees.

A

False. Identifying important variables in decision trees is straightforward, as these variables typically appear at the top of the tree.

113
Q

T/F: Pruning a tree increases its residual sum of squares.

A

True

114
Q

T/F: When using cost complexity pruning, the penalty decreases as the number of terminal nodes in the subtree increases.

A

False. The penalty increases as the number of terminal nodes in the subtree increases.

115
Q

T/F: A decision tree considers all predictors X1, … Xp, and all possible values of the cutpoint for each of the predictors, and then chooses the predictor and cutpoint such that the resulting tree has the lowest residual sum of squares.

A

True
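
Spelling out the split criterion from the card above (standard notation, added for reference):

```latex
% Recursive binary splitting: at each step choose the predictor j and cutpoint s
% defining the half-planes
R_1(j, s) = \{\,X \mid X_j < s\,\}, \qquad R_2(j, s) = \{\,X \mid X_j \ge s\,\},
% so as to minimize the resulting residual sum of squares:
\min_{j,\,s} \left[ \sum_{i:\,x_i \in R_1(j,s)} \left(y_i - \hat{y}_{R_1}\right)^2
                 + \sum_{i:\,x_i \in R_2(j,s)} \left(y_i - \hat{y}_{R_2}\right)^2 \right]
```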

116
Q

T/F: The difference between random forests and bagging is that decorrelated trees are utilized in random forests.

A

True. When building trees in bagging, the full set of predictors is considered at each split. In random forests, only a random subset of predictors is considered at each split, which makes the trees less correlated and the average of the resulting trees more reliable.

117
Q

T/F: The best hyperparameters are usually possible to determine beforehand.

A

False. The best hyperparameters usually cannot be determined beforehand; they are typically chosen by tuning, for example with cross-validation.