Error Analysis Flashcards
Exam2
Precision
true positives / total predicted positives - the accuracy of the model's positive predictions
Recall
true positives / total actual positives - the fraction of actual positive cases that the model correctly identifies
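The two definitions above can be checked with a minimal sketch. The function name `precision_recall` and the toy labels are assumptions made up for illustration (1 = positive, 0 = negative); they are not from the course material.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when there are no predicted/actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # TP=2, FP=1, FN=2
```

Here 2 of the 3 predicted positives are correct (precision 2/3), while only 2 of the 4 actual positives are found (recall 1/2).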
Suppose we don’t want to overdiagnose people with gluten intolerance, since removing gluten from a diet can lead to nutritional deficiencies. Which measure is more important, precision or recall? Why?
Precision is more important than recall because overdiagnosis means false positives, and precision is the measure that drops when false positives rise. Higher precision means that individuals diagnosed with gluten intolerance are more likely to truly have the condition.
Suppose we want to make sure we catch an early form of treatable cancer. Which measure is more important, precision or recall? Why?
Recall is more important than precision because maximizing recall ensures that the model identifies as many cases of the early form of treatable cancer as possible, even if it results in some false positives. While false positives may lead to unnecessary medical procedures, the benefits of detecting and treating cancer at an early stage far outweigh the risks.
What does the tree predict on a rainy day with weak wind?
YES, play tennis
How does a decision tree select a feature for the root node?
Choose the feature whose split maximizes purity (or minimizes impurity) across all the samples in the training set
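One common way to measure purity is entropy, with the root feature chosen by the largest information gain. The sketch below assumes that criterion; the helper names (`entropy`, `information_gain`, `best_root_feature`) and the toy dataset are invented for illustration and are not the data from these cards.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Parent entropy minus the size-weighted entropy of each split group."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def best_root_feature(features, labels):
    """Return the feature name with the highest information gain."""
    return max(features, key=lambda name: information_gain(features[name], labels))


# Hypothetical toy data: feature_a separates the classes perfectly, feature_b not at all.
labels = ["yes", "yes", "no", "no"]
features = {
    "feature_a": ["x", "x", "y", "y"],
    "feature_b": ["x", "y", "x", "y"],
}
gain_a = information_gain(features["feature_a"], labels)
gain_b = information_gain(features["feature_b"], labels)
root = best_root_feature(features, labels)
print(gain_a, gain_b, root)
```

In this toy case the split on `feature_a` leaves two pure groups, so it removes all of the parent's 1 bit of entropy, while `feature_b` leaves both groups evenly mixed and gains nothing.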
Using information gain and entropy, which is the better feature to split on, Hank or free?