Quiz 5 Flashcards

1
Q

separating observations into subgroups by creating splits on predictors

A

trees

2
Q

what are the types of trees?

A

classification tree
regression tree

3
Q

what does CART stand for?

A

classification and regression trees

4
Q

nodes that have successors, also called splitting nodes

A

decision node

5
Q

nodes with no successors (the leaves of the tree); they represent the partitioning of the data by the predictors

A

terminal node

6
Q

how many terminal nodes are there?

A

one more than the number of decision nodes (in a tree built from binary splits)
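
The count can be checked on a small example. This is a minimal sketch, assuming a hypothetical nested-dict representation in which a decision node has "left" and "right" children and a terminal node has only a "label"; the class names are made up for illustration.

```python
# Hypothetical tree layout (an assumption, not from the cards):
# decision node -> {"left": ..., "right": ...}; terminal node -> {"label": ...}
def count_nodes(node):
    """Return (decision_nodes, terminal_nodes) for a binary tree."""
    if "label" in node:  # terminal node (leaf)
        return 0, 1
    left_d, left_t = count_nodes(node["left"])
    right_d, right_t = count_nodes(node["right"])
    return 1 + left_d + right_d, left_t + right_t

tree = {
    "left": {"label": "owner"},
    "right": {
        "left": {"label": "nonowner"},
        "right": {"label": "owner"},
    },
}
decision, terminal = count_nodes(tree)
print(decision, terminal)  # 2 decision nodes, 3 terminal nodes
```

The relationship holds because every binary split replaces one terminal node with a decision node and two new terminal nodes.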

7
Q

how are observations moved down the tree?

A

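they are dropped down the tree: at each decision node an observation follows the branch that matches its predictor value until it reaches a terminal node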
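
Dropping an observation down a tree might look like the sketch below. The nested-dict layout, the predictor names (`income`, `lot_size`), the thresholds, and the class labels are all hypothetical, chosen only to illustrate the traversal.

```python
def drop_down(node, record):
    """Follow the splits from the root until a terminal node is reached."""
    while "label" not in node:  # still at a decision node
        if record[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]

# Hypothetical two-split tree (illustrative values only)
tree = {
    "feature": "income", "threshold": 60,
    "left": {"label": "nonowner"},
    "right": {
        "feature": "lot_size", "threshold": 20,
        "left": {"label": "nonowner"},
        "right": {"label": "owner"},
    },
}
result = drop_down(tree, {"income": 75, "lot_size": 22})
print(result)  # owner
```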

8
Q

how are classes assigned?

A

by a vote among the training observations in the terminal node (the majority class wins); regression trees use the average of their outcome values instead
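
A minimal sketch of both rules, using only the standard library. The function names and the sample values are hypothetical; ties in the vote are broken here by first occurrence, which is a simplification.

```python
from collections import Counter
from statistics import mean

def classify_leaf(labels):
    """Classification tree: majority vote among the leaf's training labels."""
    return Counter(labels).most_common(1)[0][0]

def predict_leaf(values):
    """Regression tree: average of the leaf's training outcomes."""
    return mean(values)

leaf_class = classify_leaf(["owner", "owner", "nonowner"])
leaf_value = predict_leaf([10.0, 12.0, 14.0])
print(leaf_class, leaf_value)  # owner 12.0
```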

9
Q

divides the p-dimensional space of the x variables into non-overlapping multidimensional rectangles; each division operates on the results of the prior division

A

recursive partitioning
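
The idea can be sketched in a few lines. This is an illustration under stated assumptions, not a full CART implementation: it uses Gini impurity as the split criterion (the card does not name one), tries midpoints between observed values as thresholds, and stops when no split lowers the impurity. All names and the toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def partition(rows, labels):
    """Split, then recurse on each resulting rectangle."""
    split = best_split(rows, labels)
    if split is None:  # pure rectangle, or no split improves purity
        return {"label": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f, "threshold": t,
        "left": partition([r for r, _ in left], [y for _, y in left]),
        "right": partition([r for r, _ in right], [y for _, y in right]),
    }

rows = [(1, 1), (2, 1), (3, 5), (4, 5)]
labels = ["a", "a", "b", "b"]
tree = partition(rows, labels)
print(tree)
```

On this toy data a single split on the first predictor at 2.5 already produces two pure rectangles, so the recursion stops after one level.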

10
Q

what is a pure rectangle?

A

one that contains observations from only one class

11
Q

each split is depicted as a split of a node into two successor nodes

A

classification tree algorithms

12
Q

statistical test to assess whether splitting a node improves the purity by a statistically significant amount

A

chi-squared automatic interaction detection (CHAID)
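
A minimal sketch of the chi-squared statistic behind such a test, assuming a hypothetical 2x2 table of class counts in the two candidate child nodes. The 3.841 cutoff is the standard 5% critical value for one degree of freedom; full CHAID implementations include further steps (such as category merging and multiplicity adjustments) that are not shown here.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = child nodes, columns = classes
table = [[30, 10], [10, 30]]
stat = chi_squared(table)
significant = stat > 3.841  # 5% critical value, 1 degree of freedom
print(stat, significant)
```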

13
Q

removing the weakest branches, which hardly reduce the error rate, by successively selecting a decision node and re-designating it as a terminal node

A

pruning the tree

14
Q

misclassification error + penalty factor for size of tree

A

cost complexity
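
As a worked example, the criterion can be written as error plus a penalty per terminal node. The sketch assumes the tree's size is measured by its number of terminal nodes and uses a hypothetical penalty factor `alpha`; the error rates and sizes are made up.

```python
def cost_complexity(misclassification_error, n_terminal_nodes, alpha):
    """Misclassification error plus a size penalty of alpha per terminal node."""
    return misclassification_error + alpha * n_terminal_nodes

# Hypothetical trees: a large one with low error, a small one with higher error
large = cost_complexity(0.10, n_terminal_nodes=12, alpha=0.01)
small = cost_complexity(0.14, n_terminal_nodes=4, alpha=0.01)
print(round(large, 2), round(small, 2))  # 0.22 0.18
```

With this penalty the smaller tree scores better despite its higher error, which is the trade-off pruning relies on.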

15
Q

what does the minimum error tree have?

A

lowest misclassification error on validation set

16
Q

smallest tree in the pruning sequence with an error within one standard error of the minimum error tree

A

best-pruned tree
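
The selection rule can be sketched as follows. The pruning sequence, the validation-set size, and the binomial formula used for the standard error of the minimum error rate are assumptions made for illustration; each tree is represented only by a hypothetical (terminal nodes, validation error) pair.

```python
from math import sqrt

def best_pruned(sequence, n_validation):
    """Pick the smallest tree whose error is within one standard error of the minimum."""
    _, min_err = min(sequence, key=lambda t: t[1])      # minimum error tree
    se = sqrt(min_err * (1 - min_err) / n_validation)   # assumed binomial standard error
    candidates = [t for t in sequence if t[1] <= min_err + se]
    return min(candidates, key=lambda t: t[0])          # fewest terminal nodes

# Hypothetical pruning sequence: (terminal nodes, validation error)
sequence = [(1, 0.40), (3, 0.22), (5, 0.17), (8, 0.15), (12, 0.16)]
chosen = best_pruned(sequence, n_validation=200)
print(chosen)  # (5, 0.17)
```

Here the minimum error tree has 8 terminal nodes, but the 5-node tree falls inside the tolerance and is preferred for being smaller.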

17
Q

fit classification trees to different samples and then combine them: draw a random subset of observations and predictors, build a tree, repeat many times, then take a vote (or average) among the trees

A

random forest
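
The sampling and voting steps might be sketched like this. The tree-fitting step is deliberately left out: the three "trees" below are hand-written stand-in functions, and the `fraction` parameter, function names, and data are all hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, fraction, rng):
    """Draw rows with replacement; fraction sets the sample size relative to the data."""
    k = int(len(rows) * fraction)
    return [rng.choice(rows) for _ in range(k)]

def random_predictors(n_predictors, m, rng):
    """Choose m of the predictor indices without replacement."""
    return sorted(rng.sample(range(n_predictors), m))

def forest_vote(trees, record):
    """Each tree predicts a class; the majority class wins."""
    votes = Counter(tree(record) for tree in trees)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
rows = list(range(10))
sample = bootstrap_sample(rows, fraction=0.8, rng=rng)
features = random_predictors(n_predictors=5, m=2, rng=rng)

# Stand-ins for fitted trees (illustration only)
trees = [
    lambda r: "owner" if r["income"] > 60 else "nonowner",
    lambda r: "owner" if r["lot_size"] > 20 else "nonowner",
    lambda r: "owner" if r["income"] > 80 else "nonowner",
]
prediction = forest_vote(trees, {"income": 75, "lot_size": 22})
print(len(sample), features, prediction)
```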

18
Q

a sequence of trees is fitted so that each tree concentrates on the records misclassified by the previous tree: build a tree, see how the model performs, focus on the incorrectly predicted records and fit the next tree to them, then combine the trees' predictions

A

boosted trees/gradient boosting
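
One way to "concentrate on misclassified records" is to reweight them between rounds. The sketch below uses an AdaBoost-style weight update as a stand-in (an assumption: the card does not say which boosting variant is meant, and gradient boosting instead fits each new tree to the current errors). It requires the weighted error to be strictly between 0 and 1.

```python
from math import exp, log

def reweight(weights, correct):
    """AdaBoost-style update: raise weights of misclassified records, lower the rest."""
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    alpha = 0.5 * log((1 - err) / err)  # the tree's say in the final vote
    new = [w * exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new], alpha

# Four records with equal weight; the last one was misclassified
weights = [0.25] * 4
correct = [True, True, True, False]
new_weights, alpha = reweight(weights, correct)
print([round(w, 3) for w in new_weights])  # [0.167, 0.167, 0.167, 0.5]
```

After the update the single misclassified record carries half of the total weight, so the next tree is pushed to get it right.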

19
Q

1 minus the sum over classes k of p_k squared: $1 - \sum_{k} p_k^2$

A

gini measure

20
Q

what is p_k (p of k)?

A

proportion of observations in rectangle A that belong to class k

21
Q

what is the Gini measure of a perfectly pure rectangle?

A

0
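
A minimal sketch of the Gini measure computed from the class labels in a rectangle, with p_k taken as the proportion of labels in class k. The function name and labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """1 minus the sum of squared class proportions."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return 1.0 - sum(p ** 2 for p in proportions)

pure = gini(["a", "a", "a", "a"])   # one class only
mixed = gini(["a", "a", "b", "b"])  # two classes, evenly split
print(pure, mixed)  # 0.0 0.5
```

A pure rectangle scores 0, and an even two-class split scores 0.5, the largest value possible with two classes.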

22
Q

minus the sum over classes k of p_k times log base 2 of p_k: $-\sum_{k} p_k \log_2(p_k)$

A

entropy measure
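
A minimal sketch of the entropy measure, computed from the class labels in a rectangle with p_k as the proportion of labels in class k. Only classes that actually occur are counted, so log2 is never applied to zero. The function name and labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Minus the sum of p_k * log2(p_k) over the classes present."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

pure = entropy(["a", "a", "a", "a"])   # zero for a pure rectangle
mixed = entropy(["a", "a", "b", "b"])  # 1 bit for an even two-class split
```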

23
Q

percentage of data used for each tree in the random forest

A

bootstrap percentage