Classification and Regression Tree Flashcards

1
Q

How do you create a classification and regression tree (CART)?

A
  • Build a tree by splitting on the independent variables
  • To predict, follow the splits to a leaf and predict the most frequent outcome there (or the average outcome, for regression)
  • Interpretable
  • Does not assume a linear model
2
Q

When does CART stop splitting?

A

One way is to set a lower bound on the number of observations in each leaf. In R, rpart uses the minbucket parameter for this.
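A minimal sketch of how the lower bound behaves, assuming the built-in iris data set as a stand-in for real training data. The helper leaf_sizes is a hypothetical name introduced here for illustration, not part of rpart.

```r
library(rpart)

# Hypothetical helper: number of observations in each terminal node (leaf)
leaf_sizes <- function(fit) fit$frame$n[fit$frame$var == "<leaf>"]

# Same data and formula, two different lower bounds on leaf size
small_bucket <- rpart(Species ~ ., data = iris, method = "class", minbucket = 5)
large_bucket <- rpart(Species ~ ., data = iris, method = "class", minbucket = 40)

leaf_sizes(small_bucket)  # every leaf holds at least 5 observations
leaf_sizes(large_bucket)  # every leaf holds at least 40 observations
```

A larger minbucket stops splitting earlier, which tends to give a smaller tree.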

3
Q

General formula for creating a tree in R

A

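library(rpart)
# Fit a classification tree; minbucket is the minimum number of observations per leaf
model = rpart(Y ~ X, data = train, method = "class", minbucket = 25)
# Predict class labels for the test set
predictions = predict(model, newdata = test, type = "class")

# Confusion matrix: actual outcomes (rows) vs. predicted outcomes (columns)
table(test$Y, predictions)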
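The general formula can be tried end to end as follows. This is only a sketch: the iris data set, the 100/50 split, the seed, and minbucket = 5 are all assumptions chosen for illustration.

```r
library(rpart)

set.seed(1)                            # make the random split reproducible
train_rows <- sample(nrow(iris), 100)  # 100 rows for training, the other 50 for testing
train <- iris[train_rows, ]
test  <- iris[-train_rows, ]

model <- rpart(Species ~ ., data = train, method = "class", minbucket = 5)
predictions <- predict(model, newdata = test, type = "class")

# Confusion matrix and overall accuracy (share of test rows predicted correctly)
conf_matrix <- table(actual = test$Species, predicted = predictions)
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
conf_matrix
accuracy
```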

4
Q

Explain sensitivity

A

TP / (TP + FN)

The percentage of all actual True cases that are correctly identified as True (the true positive rate).
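A small worked example of the formula, using made-up counts:

```r
# Hypothetical counts taken from a confusion matrix
TP <- 40  # actual True cases predicted as True
FN <- 10  # actual True cases predicted as False

sensitivity <- TP / (TP + FN)
sensitivity  # 0.8
```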

5
Q

Explain specificity

A

TN / (TN + FP)

The percentage of all actual False cases that are correctly identified as False (the true negative rate).
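As a sketch, specificity can be read off a table() of actual versus predicted labels. The two label vectors below are invented for illustration.

```r
# Hypothetical actual and predicted labels
actual    <- c(TRUE, TRUE,  FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)
predicted <- c(TRUE, FALSE, FALSE, TRUE,  FALSE, FALSE, TRUE, FALSE)

cm <- table(actual, predicted)  # rows: actual, columns: predicted
TN <- cm["FALSE", "FALSE"]      # actual False predicted as False
FP <- cm["FALSE", "TRUE"]       # actual False predicted as True

specificity <- TN / (TN + FP)
specificity  # 0.8 (4 of the 5 actual False cases are identified)
```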

6
Q

What is a bagged/bootstrapped sample of data?

A

Each sample used to build a model is drawn at random, with replacement, from the original data, so a given observation can appear more than once or not at all.
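Drawing one bootstrapped sample can be sketched in base R like this, assuming the built-in iris data set and an arbitrary seed:

```r
set.seed(42)
n <- nrow(iris)

# Draw n row indices at random WITH replacement
boot_rows   <- sample(n, size = n, replace = TRUE)
boot_sample <- iris[boot_rows, ]

nrow(boot_sample)           # same size as the original data
any(duplicated(boot_rows))  # almost certainly TRUE: some rows are drawn more than once

# Rows that were never drawn (often called "out-of-bag")
out_of_bag <- setdiff(seq_len(n), boot_rows)
length(out_of_bag)
```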

7
Q

What is k-fold cross-validation?

A

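Split the training set into k pieces (folds). Use k-1 folds to estimate the model and the remaining kth fold (the validation set) to test each parameter value. Repeat so that every fold serves as the validation set once, then average the results.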
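The procedure can be written out by hand as below. This is a sketch, not a recommended tuning workflow: the iris data, k = 5, the seed, and the candidate cp values are all assumptions made for illustration.

```r
library(rpart)

set.seed(7)
k <- 5
cp_values <- c(0.001, 0.01, 0.1)  # hypothetical candidate parameter values

# Randomly assign each row to one of k folds of equal size
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))

cv_accuracy <- sapply(cp_values, function(cp) {
  fold_acc <- sapply(seq_len(k), function(i) {
    # Fit on the k-1 folds other than i, validate on fold i
    fit  <- rpart(Species ~ ., data = iris[folds != i, ], method = "class", cp = cp)
    pred <- predict(fit, newdata = iris[folds == i, ], type = "class")
    mean(pred == iris$Species[folds == i])  # accuracy on the held-out fold
  })
  mean(fold_acc)                            # average over the k folds
})
names(cv_accuracy) <- cp_values
cv_accuracy
```

The parameter value with the best average validation accuracy would then be used to fit the final model on the full training set.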

8
Q

What is the cp value?

A

cp is the complexity parameter. It measures the trade-off between model complexity and accuracy on the training set: a smaller cp allows a bigger tree, while a larger cp gives a simpler one.
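The effect of cp can be seen by growing a large tree and pruning it back. A minimal sketch, assuming the iris data set and arbitrary cp values; n_leaves is a hypothetical helper defined here, not an rpart function.

```r
library(rpart)

# Grow a deliberately large tree by setting cp very low
big_tree <- rpart(Species ~ ., data = iris, method = "class",
                  cp = 0.0001, minbucket = 2)
big_tree$cptable  # columns: CP, nsplit, rel error, xerror, xstd

# Prune back with a larger cp, which removes the less useful splits
pruned <- prune(big_tree, cp = 0.1)

# Hypothetical helper: count terminal nodes (leaves)
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")
n_leaves(big_tree)
n_leaves(pruned)  # never more leaves than big_tree
```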
