Decision Trees Flashcards

1
Q

True or false: decision trees are a non-parametric alternative to regression

A

True

2
Q

How do decision trees work?

A

They split the predictor space into regions and then assign to each region the average response of its observations in the regression setting and the most common class in the classification setting.
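
A tiny illustration of the prediction rule in the regression setting, assuming the regions are already known (the cut point and the data below are made up for illustration):

import numpy as np

x = np.array([1.0, 2.0, 2.5, 4.0, 5.0, 6.0])
y = np.array([1.1, 0.9, 1.2, 3.8, 4.1, 4.0])

in_r1 = x < 3.0                        # region R1; everything else is region R2
pred_r1, pred_r2 = y[in_r1].mean(), y[~in_r1].mean()

# Every observation falling in a region is assigned that region's average response.
print(pred_r1, pred_r2)                # roughly 1.07 and 3.97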

3
Q

What algorithm is used to grow the trees and how does it work?

A

Recursive binary splitting.
At each step it selects the predictor and cut point whose binary split minimizes the MSE. The algorithm is greedy: it optimizes only the current split, with no look-ahead. Splitting continues until every region contains fewer observations than a specified minimum.
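
A minimal Python sketch of the greedy step for a single numeric predictor with squared-error loss (the function name and toy data are illustrative, not from any library); the full algorithm repeats this search over all predictors and then recurses into each resulting region:

import numpy as np

def best_split(x, y):
    # Try every candidate cut point and keep the one whose two regions
    # have the smallest total squared error (greedy: no look-ahead).
    best_s, best_sse = None, np.inf
    for s in np.unique(x):
        left, right = y[x <= s], y[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_s, best_sse = s, sse
    return best_s, best_sse

x = np.array([1.0, 2.0, 3.0, 8.0, 9.0, 10.0])
y = np.array([1.2, 0.8, 1.0, 5.1, 4.9, 5.0])
print(best_split(x, y))   # the cut point 3.0 separates the two clusters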

4
Q

Why do we prune the tree?

A

The tree produced by recursive binary splitting is probably too big: more splits mean more flexibility, lower bias, and higher variance.

5
Q

True or false: there is no optimal number of splits that minimizes MSE?

A

True

6
Q

What are the two pruning methods?

A
  1. Cost complexity pruning
  2. Weakest link pruning

7
Q

What is the tuning parameter (alpha)?

A

The cost charged per terminal node: for a given alpha, cost-complexity pruning selects the subtree T that minimizes RSS(T) + alpha * |T|, where |T| is the number of terminal nodes. A larger alpha penalizes larger trees.

8
Q

How is the tuning parameter selected?

A

Cross-validation
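
A sketch of choosing alpha by cross-validation with scikit-learn's cost-complexity pruning interface (the dataset is a placeholder; the candidate alphas come from the pruning path of a fully grown tree):

from sklearn.datasets import load_diabetes
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

# Candidate alpha values taken from the pruning path of an unpruned tree.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)

# 5-fold cross-validation over alpha; the winner minimizes the estimated test MSE.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"ccp_alpha": path.ccp_alphas},
    scoring="neg_mean_squared_error",
    cv=5,
).fit(X, y)

print(search.best_params_["ccp_alpha"])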

9
Q

In a classification tree, what measure is used instead of the MSE as the quantity to minimize?

A

The classification error rate

10
Q

For tree growing, why can’t we use the classification error rate and what can we use instead?

A

The classification error rate is not sensitive enough to changes in node purity for growing the tree.

The Gini index or cross-entropy can be used instead.
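
Both measures can be computed from a node's class proportions; a small sketch (the proportions below are illustrative):

import numpy as np

def gini(p):
    # Gini index: sum over classes of p_k * (1 - p_k); smallest when the node is pure.
    p = np.asarray(p, dtype=float)
    return float((p * (1.0 - p)).sum())

def cross_entropy(p):
    # Cross-entropy (deviance): -sum of p_k * log(p_k); also smallest for pure nodes.
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

print(gini([0.5, 0.5]), gini([0.9, 0.1]))                    # 0.5 vs 0.18
print(cross_entropy([0.5, 0.5]), cross_entropy([0.9, 0.1]))  # about 0.69 vs 0.33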

11
Q

True or false: the Gini index is the variance of observations?

A

True: the Gini index, the sum over the K classes of p_k(1 - p_k), is a measure of the total variance across the classes.

12
Q

In a classification tree, what measures can be used for:

  1. Pruning the tree
  2. Splitting the tree
A
  1. Classification error rate
  2. Gini index or cross-entropy

13
Q

What are the advantages of decision trees over linear regression?

A
  1. Easier to explain
  2. Closer to the way human decisions are made
  3. Tree can be graphed, making it easier to interpret
  4. Easier to handle categorical predictors (linear regression requires dummy variables)
14
Q

What are the decision tree’s shortcomings?

A
  1. Do not predict as well as linear regression
  2. Not robust (a small change in the input data can have a big effect on the tree)

15
Q

What methods can be used to address the decision tree’s shortcomings?

A

Bagging, random forest and boosting

16
Q

What is the effect of bagging, random forest and boosting on the variance of decision trees?

A

They lower the variance

17
Q

Explain the bagging method.

A

Bagging (bootstrap aggregation) builds on the bootstrap; see the sketch after this list.

  1. Draw B bootstrap samples from the n observations.
  2. Grow a tree on each bootstrap sample.
  3. Average the B trees’ predictions (take the majority vote in the classification setting).
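
A bare-bones regression bagging sketch (the function name and arguments are placeholders; scikit-learn's BaggingRegressor implements the same idea), assuming X and y are NumPy arrays:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X, y, X_new, B=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    all_preds = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                     # bootstrap sample (drawn with replacement)
        tree = DecisionTreeRegressor().fit(X[idx], y[idx])   # grow a deep, unpruned tree
        all_preds.append(tree.predict(X_new))
    return np.mean(all_preds, axis=0)                        # average the B trees' predictions
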
18
Q

What is a bootstrap sample?

A

Simulating a bootstrap sample of size n means drawing n items from the initial sample with replacement.

19
Q

What is the effect of bagging on a simple tree’s variance?

A

Divides it by B (the number of bootstrap samples), in the idealized case where the B trees are independent.

20
Q

Is there a danger of overfitting by making B too large in bagging?

A

No.

21
Q

What is out-of-bag (OOB) validation?

A

For n sufficiently large, about 1/3 of the initial observations are left out of each bootstrap sample. For each tree, the test MSE can be estimated on the out-of-bag part of the sample, which eliminates the need for cross-validation.
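
A sketch using scikit-learn's bagging implementation, which records out-of-bag predictions automatically (the dataset is a placeholder; the default base learner of BaggingRegressor is a decision tree):

from sklearn.datasets import load_diabetes
from sklearn.ensemble import BaggingRegressor

X, y = load_diabetes(return_X_y=True)

bag = BaggingRegressor(n_estimators=200, oob_score=True, random_state=0).fit(X, y)

# Each observation is scored only by the trees whose bootstrap sample left it out,
# giving a test-performance estimate without a separate validation set or CV.
print(bag.oob_score_)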

22
Q

Explain the random forest method

A
  1. Specify a positive integer m.
  2. At each split, m of the predictors are selected at random, and only those are considered for splitting.
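
In scikit-learn, m corresponds to the max_features argument (a sketch; the dataset and the value m = 2 are placeholders). A common rule of thumb is to take m near the square root of the total number of predictors for classification:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# m = 2: only 2 of the 4 predictors are considered at each split of each tree.
rf = RandomForestClassifier(n_estimators=500, max_features=2, random_state=0).fit(X, y)
print(rf.score(X, y))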

23
Q

Is there a danger of overfitting by making B too large in random forest?

A

No.

24
Q

Why would we use random forest over bagging?

A

Bagged trees may be highly correlated (for example, when one strong predictor dominates the top split of every tree). Randomly selecting m predictors at each split has the effect of decorrelating the trees.

25
Q

True or false: if m = k (the total number of predictors), random forest reduces to bagging?

A

True.

26
Q

Is there a danger of overfitting by making B too large in boosting?

A

Yes.