Classification and Regression Tree Flashcards
How do you create a classification and regression tree (CART)?
- Build a tree by splitting on variables
- Follow the splits down to a leaf and predict the most frequent outcome there (for regression, the mean outcome)
- Interpretable
- Does not assume a linear model
When does CART stop splitting?
One way is to set a lower bound on the number of observations in each leaf. In R, rpart uses the minbucket parameter for this.
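A minimal sketch of the lower bound in action. The kyphosis data set (bundled with rpart), the minbucket values 5 and 20, and the helper name smallest_leaf are assumptions chosen for illustration, not part of the card.

```r
library(rpart)

# Helper (hypothetical name): size of the smallest leaf in a fitted tree
smallest_leaf <- function(fit) {
  leaves <- fit$frame$var == "<leaf>"
  min(fit$frame$n[leaves])
}

# kyphosis is an example data set bundled with rpart (assumed for illustration)
small_bucket <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                      method = "class", minbucket = 5)
large_bucket <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                      method = "class", minbucket = 20)

# No leaf should hold fewer observations than its minbucket
smallest_leaf(small_bucket)
smallest_leaf(large_bucket)
```

A larger minbucket stops splitting earlier, so it generally gives a smaller tree.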
General formula for creating a tree in R
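library(rpart)
# Fit a classification tree; minbucket is the minimum number of observations per leaf
model <- rpart(Y ~ X, data = train, method = "class", minbucket = 25)
# Predict class labels for the test set
predictions <- predict(model, newdata = test, type = "class")
# Confusion matrix: actual values in rows, predicted values in columns
table(test$Y, predictions)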
Explain sensitivity
TP / (TP + FN)
The proportion of all actual True cases that are correctly identified as True (the true positive rate).
Explain specificity
TN / (TN + FP)
The proportion of all actual False cases that are correctly identified as False (the true negative rate).
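A small worked example of both formulas, computed from a confusion matrix. The actual and predicted vectors are made up for illustration.

```r
# Hypothetical actual and predicted labels (invented for illustration)
actual    <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
predicted <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Rows are actual values, columns are predictions
confusion <- table(actual, predicted)

TP <- confusion["TRUE", "TRUE"]    # actual True, predicted True
FN <- confusion["TRUE", "FALSE"]   # actual True, predicted False
TN <- confusion["FALSE", "FALSE"]  # actual False, predicted False
FP <- confusion["FALSE", "TRUE"]   # actual False, predicted True

sensitivity <- TP / (TP + FN)  # 3 / 4 = 0.75
specificity <- TN / (TN + FP)  # 4 / 6, about 0.67
```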
What is a bagged/bootstrapped sample of data?
Each sample used to build a model is drawn at random, with replacement, from the training data, so some observations can appear more than once and others not at all.
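A minimal sketch of drawing one bootstrapped sample in base R. The train data frame and the seed are assumptions made up for illustration.

```r
set.seed(1)  # arbitrary seed, only so the draw is repeatable

# Hypothetical training data, invented for illustration
train <- data.frame(id = 1:10, x = rnorm(10))

# Draw nrow(train) row indices at random, with replacement
boot_rows <- sample(nrow(train), size = nrow(train), replace = TRUE)
boot_sample <- train[boot_rows, ]

# Rows that were never drawn are left out of this particular sample
not_drawn <- setdiff(seq_len(nrow(train)), boot_rows)
```

Repeating the draw gives a different sample each time, and one model is fit per sample.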
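What is k-fold cross-validation?
Split the training set into k pieces (folds). For each candidate parameter value, use k-1 folds to estimate the model and the remaining fold (the validation set) to test it, repeating so that each fold serves as the validation set once.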
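A rough sketch of the procedure, written by hand so each step is visible. The kyphosis data set (bundled with rpart), k = 5, and the candidate minbucket values are assumptions chosen for illustration.

```r
library(rpart)
set.seed(1)  # arbitrary seed so the fold assignment is repeatable

k <- 5
# Assign each row of the example data to one of k folds at random
folds <- sample(rep(1:k, length.out = nrow(kyphosis)))

# Candidate parameter values to compare (chosen arbitrarily)
candidates <- c(5, 10, 20)

cv_accuracy <- sapply(candidates, function(mb) {
  fold_acc <- sapply(1:k, function(i) {
    fit_data <- kyphosis[folds != i, ]  # k-1 folds to estimate the model
    val_data <- kyphosis[folds == i, ]  # held-out fold to test it
    fit <- rpart(Kyphosis ~ Age + Number + Start, data = fit_data,
                 method = "class", minbucket = mb)
    pred <- predict(fit, newdata = val_data, type = "class")
    mean(as.character(pred) == as.character(val_data$Kyphosis))
  })
  mean(fold_acc)  # average validation accuracy across the k folds
})

names(cv_accuracy) <- paste0("minbucket=", candidates)
cv_accuracy
```

The parameter value with the best average validation accuracy would then be used to fit the final model on the whole training set.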
What is the cp value?
cp is the complexity parameter. It measures the trade-off between model complexity and accuracy on the training set. A smaller cp allows a bigger tree, which may overfit.
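A minimal sketch of how cp changes tree size in rpart. The kyphosis data set, the cp values 0 and 0.05, and the helper name n_leaves are assumptions chosen for illustration.

```r
library(rpart)

# Grow a deliberately large tree by setting cp = 0 (no complexity penalty)
full_tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   method = "class", cp = 0, minbucket = 2)

# Print the cp table: one row per cp threshold, with split counts and errors
printcp(full_tree)

# Prune back with a larger cp (0.05 is an arbitrary illustrative value)
pruned_tree <- prune(full_tree, cp = 0.05)

# Helper (hypothetical name): number of leaves in a fitted tree
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")
n_leaves(full_tree)
n_leaves(pruned_tree)
```

Pruning never adds leaves, so the pruned tree is at most as large as the full one.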