Chapter 7 Flashcards
When discussing classifiers, how are outcomes generally referred to?
- Bad outcome is referred to as a “positive” example
- Is noteworthy of an alarm (e.g. in a medical test)
- Good outcome is referred to as a “negative” example
- Classifiers go through a large population consisting mainly of negatives and look for a small number of positive instances
Which one is usually more costly, false negative or false positive?
A false negative, as these could be patients with a disease that are not indentified.
What are false positives and false negatives?
A false positive would result in an innocent party being found guilty.
A false negative would produce an innocent verdict for a guilty person.
What does Expected Value provide to data-analytic problems?
- Expected value computation decomposes data-analytic thinking into three classes
- The structure of the problem
- The elements of the analysis that can be extracted from the data
- The elements of the analysis that need to be acquired from other sources
What is the general Expected Value equation?
How can the expected value calculation be applied?
- Expected value calculation can be used to determine e.g. above which threshold (probability of acceptance) a customer should be targeted
- Alternative application of expected value calculation is to evaluate one model, compared to another → classifier comparison
- What is the EV of a given model compared to another?
Expected Benefit: What is it’s calculation and how do we find out if we should target a customer?
How to find out (at what estimated probability of responding) if we should target a customer?
Target a customer if the estimated prob. of responding is higher than the value found using expected benefits equation.
What is a common way of expressing the expected profit equation?
How do the expected profit’s conditional probabilities relate to the rates?
How do the cost benefit cells relate to their specific values in the equation?
p(Y | p) relates to the tp rate (top left)
p(N | p) relates to the fn rate (bottom left)
p(N | n) relates to the tn rate (bottom right)
p(Y | n) relates to the fp rate (top right)
Cost benefit cells relate in the same way.
Based on Cost-Benefit matrix of (99 , -1, / 0, 0) and the following confusion matrix, calculate the expected profit.
Book has rounding errors, you can use the simple equation as well, all you need to do is use 56/110 and 7/110.
What are some other distance functions other than Euclidean distance?