Machine Learning with SAS® Viya® 3.4, Lesson 3: Decision Trees and Ensembles of Trees Flashcards

1
Q

What is a greedy algorithm?

A

one that makes locally optimal choices at each step

2
Q

How does a decision tree predict cases?

A

decision trees use rules that involve the values or categories of the input variables

3
Q

What is a decision tree referred to as when the target is categorical?

A

A classification tree

4
Q

Name the first node at the base (top) of the tree

A

root node

5
Q

What is a decision tree referred to as when the target is continuous?

A

A regression tree

6
Q

What is a leaf node?

A

a terminal node with no child nodes (its only connection is to its parent)

7
Q

Which component of a decision tree provides the predictions?

A

A tree’s leaf nodes provide the predictions.

8
Q

How do decision trees address the curse of dimensionality?

A

The split search process reduces the number of inputs in the model by eliminating irrelevant inputs. Irrelevant inputs do not appear in any splitting rules in the decision tree.

9
Q

How does a decision tree handle missing values?

A

The split search for decision trees treats missing values as their own category, assigning them to one side of the branch at the splitting node.

10
Q

The input variables have missing values. What should you do before running a Decision Tree node with these input variables?

A

Nothing. There is no need to impute the missing values, because trees can handle them.

11
Q

What does Model Studio display in the Tree Diagram?

A

the final tree structure for the model, including the depth of the tree and all of its leaf nodes

12
Q

What is a reduction in node impurity?

A

the reduction of within-node variability induced by the split

13
Q

What is a surrogate rule?

A

A surrogate splitting rule is a backup to the main splitting rule.

When surrogate rules are requested, if a new case has a missing value on the splitting variable, then the best surrogate is used to classify the case.

If several surrogate rules exist, each surrogate is considered in sequence until one can be applied to the observation.

If none can be applied, the main rule assigns the observation to the branch that is designated for missing values.

14
Q

How do you interpret a Gini index?

A

The Gini index can be interpreted as the probability that any two elements of a group, chosen at random (with replacement), are different.

A pure node (with no diversity) has a Gini index of 0. As the number of evenly distributed classes increases, the Gini index approaches 1 (more diverse, less pure).

15
Q

What is a Gini index?

A

a Gini index is a measure of variability for categorical data that can be used as a measure of node impurity

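A minimal sketch of the definition above (illustrative Python, not SAS code): the Gini index of a node is 1 minus the sum of squared class proportions, which matches the two-random-draws interpretation.

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity of a node: 1 minus the sum of squared class
    proportions -- the probability that two members drawn at random
    (with replacement) belong to different classes."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini_index(["yes"] * 10))              # pure node: 0.0
print(gini_index(["yes"] * 5 + ["no"] * 5))  # two even classes: 0.5
```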
16
Q

Which plot shows a Decision Tree model’s performance based on the misclassification rate?

A

the pruning error plot

17
Q

What does the Cumulative Lift chart in the Assessment tab show?

A

how much better the model is than no model / random events

the model’s performance ordered by the percentage of the population

18
Q

How can you set the maximum number of generations of nodes (the tree depth) for a decision tree in Model Studio?

A

Expand the Splitting Options properties and set the Maximum Depth

19
Q

Where would you evaluate model performance based on an assessment measure such as average squared error?

A

the fit statistics table

20
Q

Where would you look to see the input variables that are most significant to the final model?

A

the Variable Importance table

21
Q

What is the standard method used to fit decision trees?

A

Recursive partitioning

22
Q

Allowing a larger tree to be grown by increasing the maximum depth could lead to what problem?

A

overfitting

23
Q

What setting can you change to help prevent overfitting?

A

Increase the minimum leaf size

24
Q

What is the response of the ensemble of simple decision trees for an interval target?

A

For an interval target, the response of the ensemble model is the average of the estimate of the individual decision trees.

25
Q

What is the response of the ensemble of simple decision trees for a categorical target?

A

For a categorical target, the response of the ensemble of simple decision trees is the vote for the most popular class or the average of the posterior probabilities of the individual trees.

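The two ensemble responses above can be sketched as follows (toy data; the helper names are illustrative, not SAS functions):

```python
from collections import Counter

def ensemble_interval(tree_estimates):
    """Interval target: average the individual trees' estimates."""
    return sum(tree_estimates) / len(tree_estimates)

def ensemble_vote(tree_classes):
    """Categorical target: vote for the most popular predicted class."""
    return Counter(tree_classes).most_common(1)[0][0]

def ensemble_posterior(tree_posteriors):
    """Categorical target: average the trees' posterior probabilities."""
    n = len(tree_posteriors)
    return {c: sum(p[c] for p in tree_posteriors) / n
            for c in tree_posteriors[0]}

print(ensemble_interval([102.0, 98.0, 100.0]))   # 100.0
print(ensemble_vote(["yes", "no", "yes"]))       # yes
print(ensemble_posterior([{"yes": 0.9, "no": 0.1},
                          {"yes": 0.5, "no": 0.5}]))  # averaged probabilities
```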
26
Q

What is bagging?

A

Bagging takes bootstrap samples of the rows of training data. All columns are considered for splitting at every step.

27
Q

What is a random forest?

A

A forest is an ensemble of simple (classification or regression) decision trees

28
Q

How does training different trees with different training data improve predictions for a forest?

A

Training different trees with different training data reduces the correlation of the predictions of the trees

29
Q

What is an out-of-bag sample?

A

the training data that are excluded during the construction of an individual tree

30
Q

What data is used to assess the fit of a forest model?

A

the out-of-bag sample
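A minimal sketch of how one tree's bootstrap sample and its out-of-bag sample relate (illustrative only; `random.Random` stands in for the real sampler):

```python
import random

def bootstrap_with_oob(rows, seed=0):
    """Draw a bootstrap sample (same size, with replacement) of the training
    rows; rows never drawn form the out-of-bag sample for this tree."""
    rng = random.Random(seed)
    n = len(rows)
    chosen = [rng.randrange(n) for _ in range(n)]
    in_bag = [rows[i] for i in chosen]
    out_of_bag = [rows[i] for i in range(n) if i not in set(chosen)]
    return in_bag, out_of_bag

rows = list(range(10))
in_bag, out_of_bag = bootstrap_with_oob(rows, seed=1)
# Every row is either in the bag (possibly repeated) or out of bag, never both.
print(sorted(set(in_bag) | set(out_of_bag)) == rows)   # True
```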

31
Q

How does Model Studio calculate the maximum number of inputs per split in a Forest Model when using the default settings?

A

By default, the number of inputs considered per split is the square root of the number of inputs

32
Q

How does the forest algorithm sample the data?

A

The forest algorithm samples the rows and the columns at each step (leading to more perturbed data than the bagging algorithm)

33
Q

What additional chart is available when the target is binary?

A

the ROC curve

34
Q

What does the ROC curve show?

A

the model’s performance considering the true positive rate and the false positive rate
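As an illustrative sketch (made-up labels and scores, not SAS code), each point on the ROC curve is the (true positive rate, false positive rate) pair produced by one classification threshold:

```python
def roc_point(labels, scores, threshold):
    """True-positive rate and false-positive rate when every case with a
    score at or above the threshold is classified as an event (1)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    return tp / pos, fp / neg

labels = [1, 1, 1, 0, 0, 0]              # made-up binary target
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]  # made-up posterior probabilities
# Sweeping the threshold from high to low traces the ROC curve point by point.
for t in (0.85, 0.5, 0.25):
    print(roc_point(labels, scores, t))
```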

35
Q

How does a split-search strategy work?

A
  1. Identify candidate splits based on the splitting criterion
  2. Select a split that is expressed as an IF-THEN-ELSE rule
  3. Repeat process for each child node, continuing until a stopping rule prevents further growth
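The steps above can be sketched for a single interval input and a categorical target (a toy illustration, not Model Studio's algorithm; Gini reduction stands in for the splitting criterion):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Try each candidate cut point on one interval input and keep the one
    with the largest reduction in impurity (parent Gini minus the
    size-weighted Gini of the two child nodes)."""
    n = len(labels)
    parent = gini(labels)
    best = None
    for cut in sorted(set(values))[:-1]:   # candidate rule: IF x <= cut THEN left
        left = [y for x, y in zip(values, labels) if x <= cut]
        right = [y for x, y in zip(values, labels) if x > cut]
        worth = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or worth > best[1]:
            best = (cut, worth)
    return best

x = [1, 2, 3, 10, 11, 12]
y = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(x, y))   # (3, 0.5): IF x <= 3 separates the classes perfectly
```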
36
Q

What is the goal of splitting?

A

to reduce the variability of the target distribution and thus increase purity in the child nodes

37
Q

What is a split search?

A

an iterative process used by recursive partitioning to select the best split for the node

38
Q

Which splitting criteria may be used for categorical targets?

A
  1. Information gain ratio (IGR) (default in Model Studio)
  2. CHAID
  3. Chi-Square
  4. Entropy
  5. GINI
39
Q

Which splitting criteria are appropriate for interval targets?

A
  1. Variance (default in Model Studio)
  2. CHAID
  3. F test
40
Q

What is the purpose of the Bonferroni correction during a decision tree split search?

A

To adjust for the multiple comparisons made during the split search: inflating the p-values (equivalently, lowering the logworth) maintains the overall confidence level.
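A minimal sketch of the adjustment (illustrative numbers; not the exact correction Model Studio applies):

```python
import math

def bonferroni_adjust(p_value, num_comparisons):
    """Inflate a split's p-value by the number of comparisons made while
    searching for the split, capped at 1.0."""
    return min(1.0, p_value * num_comparisons)

raw = 0.001
adjusted = bonferroni_adjust(raw, num_comparisons=20)
# The adjusted p-value is larger, so the split's logworth (-log10 of the
# p-value) is smaller and the split looks less significant.
print(raw, adjusted)
print(-math.log10(raw), -math.log10(adjusted))
```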

41
Q

Which split criteria can request a Bonferroni correction after the split has been determined?

A

Split criteria using the p-value (chi-square, CHAID, or F test)

42
Q

Which window shows the score code for a specific node that may be deployed in production?

A

the Node Score Code window

43
Q

When does Model Studio generate node score code?

A

Model Studio generates node score code for every node in the Data Mining Preprocessing group and the Supervised Learning group that creates DATA step score code.

44
Q

What is another name for the “flow score code?”

A

Path EP Score Code

45
Q

What is included in the Path EP Score Code?

A

the score code for all nodes up to and including that modeling node, which can be used in other SAS environments

46
Q

What does the ‘EP’ refer to in the term Path EP Score Code?

A

Embedded Process

47
Q

Which window contains the SAS training code that may be used to train the model based on different data sets or platforms?

A

The Training Code window

48
Q

What do large values of the F statistic indicate?

A

departures from the null hypothesis that all the node means are equal

49
Q

What does the between-node sum of squares (SSbetween) measure?

A

the distance between the node means and the overall mean

50
Q

What does the within-node sum of squares (SSwithin) measure?

A

the variability within a node

51
Q

The FTEST splitting criteria is appropriate for what type of target?

A

interval

52
Q

How does Model Studio use ENTROPY as a splitting criterion?

A

ENTROPY uses the gain in the information or the decrease in entropy to split each variable and then to determine the split

53
Q

What do the letters in the acronym CHAID represent?

A

chi-squared automatic interaction detection

54
Q

What value does CHAID use for a classification tree?

A

CHAID uses the value of a chi-square statistic for a classification tree

55
Q

What value does the CHAID algorithm use as a splitting criterion for a regression tree?

A

CHAID uses the F statistic as a splitting criterion for a regression tree

56
Q

Which grow criterion can be used for both interval and categorical target variables?

A

CHAID

57
Q

How does the CHISQUARE splitting criteria method work?

A

CHISQUARE uses a chi-square statistic (logworth) to split each variable, and then uses the p-values that correspond to the resulting splits to determine the splitting variable.

58
Q

How does Model Studio use GINI as a splitting criterion in a Decision Tree node?

A

GINI uses the decrease in the Gini index to split each variable and then to determine the split

59
Q

How does Model Studio use IGR as a splitting criterion in a Decision Tree node?

A

Uses the entropy metric to split each variable and then uses the information gain ratio to determine the split
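A toy sketch of entropy and the information gain ratio for one candidate split (illustrative Python, not Model Studio's implementation):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a node's class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain_ratio(parent, children):
    """Information gain of the split divided by the entropy of the split
    itself, which penalizes splits into many small branches."""
    n = len(parent)
    gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum((len(ch) / n) * math.log2(len(ch) / n) for ch in children)
    return gain / split_info

parent = ["yes"] * 4 + ["no"] * 4
print(info_gain_ratio(parent, [["yes"] * 4, ["no"] * 4]))  # perfect split: 1.0
```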

60
Q

Which splitting criteria is the default for a categorical target in Model Studio?

A

Information Gain Ratio (IGR)

61
Q

The Information gain ratio (IGR) splitting criteria is appropriate for what type of target?

A

categorical

62
Q

Which splitting criteria is the default for an interval target in Model Studio?

A

VARIANCE

63
Q

How does Model Studio use VARIANCE as a splitting criterion in a Decision Tree node?

A

VARIANCE uses the change in response variance to split each variable and then to determine the split
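The change in response variance can be sketched as the parent node's variance minus the size-weighted variance of the child nodes (toy responses; illustrative only):

```python
def variance(responses):
    """Population variance of a node's interval responses."""
    mean = sum(responses) / len(responses)
    return sum((y - mean) ** 2 for y in responses) / len(responses)

def variance_reduction(parent, children):
    """Change in response variance: the parent node's variance minus the
    size-weighted variance of the proposed child nodes."""
    n = len(parent)
    return variance(parent) - sum(len(ch) / n * variance(ch) for ch in children)

parent = [1.0, 2.0, 9.0, 10.0]      # made-up interval responses
split = [[1.0, 2.0], [9.0, 10.0]]   # separating low from high responses
print(variance_reduction(parent, split))   # 16.0
```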

64
Q

The FTEST splitting criteria is appropriate for what type of target?

A

interval