Information-Based Learning Flashcards
What are four strengths of decision trees?
- easy to understand
- no need to standardize data
- can work with missing values
- robust to outliers (outlier-driven leaf nodes are typically removed during pruning)
What are two weaknesses of decision trees?
- inductive bias (can only create axis-parallel boundaries to separate classes)
- tend to overfit (poor generalization performance on unseen data)
What is meant by inductive bias?
Inductive bias comprises the assumptions a learning algorithm uses to generalize from training data to unseen instances, guiding the selection among multiple consistent hypotheses.
True or False: In general, oblique trees result in more overfitted trees than non-oblique trees, when both trees are trained on the same training set.
True.
Oblique splits (linear combinations of features) define a more flexible hypothesis space than axis-parallel splits, which makes overfitting more likely.
True or False: Regression trees can use Gini index as a measure of homogeneity, such that a Gini index of 0 represents complete homogeneity.
False.
Regression trees use MSE or SDR, rather than the Gini index.
The Gini index is generally used in classification trees to measure impurity based on class distributions.
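As a quick illustration of the impurity measure mentioned above, a minimal sketch of the Gini index from class proportions (the helper name is mine, not from the cards):

```python
# Gini index for a node's class distribution: 1 - sum(p_i^2).
# A pure node (one class) scores 0; maximal two-class impurity scores 0.5.
def gini(proportions):
    return 1.0 - sum(p ** 2 for p in proportions)

print(gini([1.0, 0.0]))  # pure node -> 0.0
print(gini([0.5, 0.5]))  # maximally impure two-class node -> 0.5
```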
If half, half, and none of the instances in a dataset (D) are of classes A, B, and C, respectively, then the entropy is
1
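The answer can be checked with a small Shannon-entropy sketch (helper name is mine): with class proportions 0.5, 0.5, and 0, the zero-probability class contributes nothing and the two equal classes give 1 bit.

```python
import math

def entropy(proportions):
    # Shannon entropy in bits; terms with p = 0 are treated as 0.
    return -sum(p * math.log2(p) for p in proportions if p > 0)

# Half class A, half class B, none of class C:
print(entropy([0.5, 0.5, 0.0]))  # -> 1.0
```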
A balanced decision tree induced from a dataset with three Boolean descriptive features (with no other descriptive features) and one categorical target feature, can have a maximum of how many leaf nodes in total?
8
True or False: A classification tree that is applied to a problem with two classes can achieve a maximum entropy information gain of 0.5 at any node in the tree.
False.
With two classes, parent entropy can be as high as 1 bit and child entropy as low as 0, so information gain can reach 1, not just 0.5.
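A perfect split of a balanced two-class node shows a gain of a full bit. A minimal sketch (the helper names are mine):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    # Parent entropy minus the size-weighted entropy of the child nodes.
    n = len(parent)
    remainder = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - remainder

# A perfect binary split of a balanced two-class node:
parent = ["A", "A", "B", "B"]
children = [["A", "A"], ["B", "B"]]
print(information_gain(parent, children))  # -> 1.0
```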
True or False: When inducing a classification tree using the information gain metric, the feature that yields the largest reduction in node impurity is selected for partitioning the dataset.
True
True or False: Rules extracted from a decision tree are symmetric, i.e., the rule antecedent implies the consequent/decision and vice versa.
False.
True or False: Classification trees are better suited than regression trees for predicting the probability of it raining tomorrow, since this probability lies between 0 and 1.
False.
The probability is a continuous numeric target, so a regression tree is the appropriate model.
True or False: The number of tree levels in a model tree are generally fewer than the number of tree levels in an ordinary regression tree.
True.
True or False: When converting a numerical feature into a categorical feature with 5 categories for use in a decision tree root node split, the size (in terms of number of instances) of all the child nodes of the root after splitting on the categorical feature is the same regardless of whether equal-width or equal-frequency binning is used.
False.
Equal-frequency binning yields child nodes of (roughly) equal size, while equal-width binning generally does not.
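A small NumPy sketch makes the difference concrete: on a skewed feature, equal-width bins hold very different numbers of instances, while quantile (equal-frequency) bins hold about the same number each.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1000)  # a skewed numeric feature

# Equal-width: 5 bins spanning identical ranges -> very unequal child sizes.
width_edges = np.linspace(x.min(), x.max(), 6)
width_counts = np.histogram(x, bins=width_edges)[0]

# Equal-frequency: edges at quantiles -> roughly equal child sizes.
freq_edges = np.quantile(x, np.linspace(0, 1, 6))
freq_counts = np.histogram(x, bins=freq_edges)[0]

print(width_counts)  # heavily skewed counts
print(freq_counts)   # about 200 instances per bin
```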
True or False: In general, during tree induction, entropy is more sensitive to outliers than Gini index.
True.
The logarithm in entropy reacts more strongly to small class proportions than the squared terms in the Gini index.
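A minimal comparison (helper names are mine): for a nearly pure node with a tiny minority class, entropy sits at a larger fraction of its maximum (1 bit) than the Gini index does of its maximum (0.5), illustrating entropy's greater sensitivity.

```python
import math

def entropy(proportions):
    # Shannon entropy in bits; maximum is 1.0 for two classes.
    return -sum(p * math.log2(p) for p in proportions if p > 0)

def gini(proportions):
    # Gini index; maximum is 0.5 for two classes.
    return 1.0 - sum(p ** 2 for p in proportions)

# A node dominated by one class with a tiny "outlier" class:
almost_pure = [0.99, 0.01]
print(entropy(almost_pure))  # ~0.08 bits (8% of its maximum)
print(gini(almost_pure))     # ~0.02 (4% of its maximum)
```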
Complete: When inducing a tree, one should use the ____ performance instead of the ____ performance to decide which metric to use as the split criterion.
validation
testing