Chapter 7 Flashcards
What is the general form of an expected value calculation?
EV = p(o1) · v(o1) + p(o2) · v(o2) + p(o3) · v(o3) + …
where oi is a possible decision outcome,
p(oi) is its probability, and
v(oi) is its value.
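A minimal Python sketch of this sum, assuming the outcomes are supplied as (probability, value) pairs. The function name and the numbers are hypothetical and only illustrate the formula.

```python
def expected_value(outcomes):
    """Return sum of p(o_i) * v(o_i) over (probability, value) pairs."""
    outcomes = list(outcomes)
    total_p = sum(p for p, _ in outcomes)
    # The outcomes are assumed to be exhaustive, so probabilities must sum to 1.
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in outcomes)


# Hypothetical outcomes: (probability, value)
ev = expected_value([(0.2, 100.0), (0.5, 10.0), (0.3, -20.0)])
print(ev)
```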
What is the formula for the "expected benefit of targeting"?
Expected benefit of targeting = pR(x) · vR + [1 - pR(x)] · vNR
where pR(x) is the estimated probability that consumer x responds, vR is the value we get from a response, and vNR is the value we get from no response.
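As a sketch, the same calculation in Python. The whole complement 1 - pR(x) weights vNR. The function name and the example values (a 5% response probability, 99 for a response, -1 for no response) are assumptions made for illustration.

```python
def expected_benefit_of_targeting(p_response, v_response, v_no_response):
    """pR(x) * vR + [1 - pR(x)] * vNR for a single consumer x."""
    if not 0.0 <= p_response <= 1.0:
        raise ValueError("p_response must be a probability")
    return p_response * v_response + (1.0 - p_response) * v_no_response


# Hypothetical values: 5% response probability, +99 on response, -1 otherwise.
benefit = expected_benefit_of_targeting(0.05, 99.0, -1.0)
print(benefit)
```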
Explain the following formula:
p(h, a) = count(h, a) / T
count(h, a): the number of decisions that fall into the given (predicted, actual) combination.
T: the total number of instances.
p(h, a): the counts reduced to rates, i.e. estimated probabilities.
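The reduction from counts to rates can be sketched in Python as follows. The label encoding ("Y"/"N" for predictions, "p"/"n" for actual classes) follows the notation of these cards; the function name and the data are hypothetical.

```python
from collections import Counter


def joint_rates(predicted, actual):
    """Estimate p(h, a) = count(h, a) / T from paired label lists."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = Counter(zip(predicted, actual))  # count(h, a)
    total = len(predicted)                    # T
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical decisions for eight instances.
predicted = ["Y", "Y", "N", "N", "Y", "N", "N", "N"]
actual = ["p", "n", "n", "p", "p", "n", "n", "n"]
rates = joint_rates(predicted, actual)
print(rates)
```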
What are the components of a cost-benefit matrix?
The cost-benefit matrix specifies, for each (predicted, actual) pair, the cost or benefit of making that decision:
True positive, true negative -> the benefits b(Y,p), b(N,n)
False positive, false negative -> the costs c(Y,n), c(N,p)
What is the expected profit equation with the priors p(p) and p(n) factored out?
Expected profit =
p(p) · [p(Y | p) · b(Y, p) + p(N | p) · c(N, p)] +
p(n) · [p(N | n) · b(N, n) + p(Y | n) · c(Y, n)]
It is obtained by factoring p(p) and p(n) out of:
Expected profit =
p(Y | p) · p(p) · b(Y, p) + p(N | p) · p(p) · c(N, p) +
p(N | n) · p(n) · b(N, n) + p(Y | n) · p(n) · c(Y, n)
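A short Python sketch can confirm that the factored form and the unfactored (joint) form give the same number. The confusion counts and the cost-benefit values below are hypothetical; costs are written as negative values, and both classes are assumed to have at least one instance.

```python
def expected_profit_joint(counts, value):
    """Sum of p(h, a) * value(h, a) over all (predicted, actual) pairs."""
    total = sum(counts.values())
    return sum(counts[pair] / total * value[pair] for pair in counts)


def expected_profit_factored(counts, value):
    """p(p) * [...] + p(n) * [...] with class priors factored out."""
    total = sum(counts.values())
    profit = 0.0
    for cls in ("p", "n"):
        pairs = [pair for pair in counts if pair[1] == cls]
        class_total = sum(counts[pair] for pair in pairs)
        prior = class_total / total  # p(p) or p(n)
        # Conditional rates p(h | class) times the matching cost or benefit.
        inner = sum(counts[pair] / class_total * value[pair] for pair in pairs)
        profit += prior * inner
    return profit


# Hypothetical confusion counts keyed by (predicted, actual).
counts = {("Y", "p"): 30, ("N", "p"): 10, ("Y", "n"): 20, ("N", "n"): 140}
# Hypothetical cost-benefit matrix: b(Y,p)=99, c(N,p)=0, c(Y,n)=-1, b(N,n)=0.
value = {("Y", "p"): 99.0, ("N", "p"): 0.0, ("Y", "n"): -1.0, ("N", "n"): 0.0}

joint = expected_profit_joint(counts, value)
factored = expected_profit_factored(counts, value)
print(joint, factored)
```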
Why should we factor out the probabilities of seeing each class when expressing expected profit?
Factoring these out allows us to separate the influence of class imbalance from the fundamental predictive power of the model.
Explain the following formula:
Expected profit =
p(p) · [p(Y | p) · b(Y, p) + p(N | p) · c(N, p)] +
p(n) · [p(N | n) · b(N, n) + p(Y | n) · c(Y, n)]
The first term, p(p) · [p(Y | p) · b(Y, p) + p(N | p) · c(N, p)], is the expected profit from the positive examples, weighted by the prior p(p).
The second term, p(n) · [p(N | n) · b(N, n) + p(Y | n) · c(Y, n)], is the expected profit from the negative examples, weighted by the prior p(n).
"if positives are very rare, their contribution to the overall expected profit will be correspondingly small"
Explain the following formula:
VT = EBT(x) - EBnotT(x)
VT is the value of targeting: the difference between the expected benefit of targeting consumer x, EBT(x), and the expected benefit of not targeting that consumer, EBnotT(x).
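A small Python sketch of the value of targeting. It assumes each expected benefit has the two-outcome form probability · value + (1 - probability) · other value; the probabilities and values below are hypothetical.

```python
def expected_benefit(p_event, v_event, v_otherwise):
    """Two-outcome expected value: p * v_event + (1 - p) * v_otherwise."""
    return p_event * v_event + (1.0 - p_event) * v_otherwise


def value_of_targeting(eb_targeted, eb_not_targeted):
    """VT = EBT(x) - EBnotT(x)."""
    return eb_targeted - eb_not_targeted


# Hypothetical consumer: targeting raises the response probability
# from 2% to 10% but costs 1 per offer sent.
eb_t = expected_benefit(0.10, 99.0, -1.0)     # targeted
eb_not_t = expected_benefit(0.02, 100.0, 0.0)  # not targeted
vt = value_of_targeting(eb_t, eb_not_t)
print(vt)
```

A positive VT suggests that targeting this consumer is worth more than leaving them alone, under the assumed numbers.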
What is the goal of the learning curve?
The learning curve helps us to understand the relationship between the amount of data and the resultant improvement in generalization performance.
What is the problem with simple classification accuracy as a metric?
It makes no distinction between false positive and false negative errors.
By counting them together, it assumes that both kinds of error are equally important.
"we should estimate the cost or benefit of each decision a classifier can make"
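To illustrate, here is a Python sketch with two hypothetical classifiers that have identical accuracy but very different expected profit, because they make different kinds of errors. All counts and cost-benefit values are invented for the example.

```python
def accuracy(c):
    """Fraction of correct decisions; FP and FN are lumped together."""
    total = c["TP"] + c["FN"] + c["FP"] + c["TN"]
    return (c["TP"] + c["TN"]) / total


def expected_profit(c, v):
    """Average value per decision under the cost-benefit values v."""
    total = c["TP"] + c["FN"] + c["FP"] + c["TN"]
    return sum(c[k] * v[k] for k in c) / total


# Hypothetical values: a true positive earns 99, a false positive costs 1.
values = {"TP": 99.0, "FN": 0.0, "FP": -1.0, "TN": 0.0}

# Two hypothetical classifiers evaluated on the same 200 instances.
clf_a = {"TP": 40, "FN": 10, "FP": 30, "TN": 120}
clf_b = {"TP": 20, "FN": 30, "FP": 10, "TN": 140}

acc_a, acc_b = accuracy(clf_a), accuracy(clf_b)
profit_a, profit_b = expected_profit(clf_a, values), expected_profit(clf_b, values)
print(acc_a, acc_b)        # same accuracy
print(profit_a, profit_b)  # different expected profit
```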
What are the benefits of the expected value computation?
It provides a framework that is extremely useful in organizing thinking about data-analytic problems.
It decomposes data-analytic thinking into
(1) the structure of the problem.
(2) the elements of the analysis that can be extracted from the data.
(3) the elements of the analysis that need to be acquired from other sources (e.g., business knowledge).
Above which probability of response should we target a consumer in the dataset?
We use the formula for the expected benefit of targeting:
pR(x) · vR + [1 - pR(x)] · vNR
and target consumer x when this expected benefit is greater than zero.
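Setting the expected benefit of targeting to zero and solving for pR(x) gives a break-even probability, -vNR / (vR - vNR), provided vR > vNR. The Python sketch below computes it; the function name and the example values (99 for a response, -1 for no response) are assumptions for illustration.

```python
def break_even_probability(v_response, v_no_response):
    """Solve p * vR + (1 - p) * vNR = 0 for p; requires vR > vNR."""
    if v_response <= v_no_response:
        raise ValueError("v_response must exceed v_no_response")
    # A result <= 0 means targeting pays off at any response probability.
    return -v_no_response / (v_response - v_no_response)


# Hypothetical values: +99 on a response, -1 when there is no response.
threshold = break_even_probability(99.0, -1.0)
print(threshold)

# Check: the expected benefit at the threshold is (approximately) zero.
at_threshold = threshold * 99.0 + (1.0 - threshold) * -1.0
print(at_threshold)
```

Under these assumed values, consumers whose estimated response probability exceeds the threshold have a positive expected benefit of targeting.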