8 Flashcards

1
Q

Question 1
In churn prediction in Telco, network features are usually
a) not predictive.
b) highly predictive.

A

b) highly predictive.

2
Q

Question 2
A process is said to be a finite-valued Markov chain if P(X_{t+1} = j | X_0 = k_0, X_1 = k_1, …, X_{t-1} = k_{t-1}, X_t = i) =
a) P(X_t = i) for all t, i, j where 1 ≤ i, j ≤ M.
b) P(X_t = i) for all t, i, j where 1 ≤ i, j ≤ M.
c) P(X_{t+1} = j | X_t = i) for all t, i, j where 1 ≤ i, j ≤ M.
d) P(X_{t+1} = j) for all t, i, j where 1 ≤ i, j ≤ M.

A

c) P(X_{t+1} = j | X_t = i) for all t, i, j where 1 ≤ i, j ≤ M.

3
Q

Question 3
Consider the following statement “Essentially, a Markov process is a memoryless random process.”
This statement is
a) correct.
b) not correct.

A

a) correct.

4
Q

Question 4
In a transition matrix, the sum of the probabilities
a) across the columns equals 1.
b) across the rows equals 1.

A

b) across the rows equals 1.
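
A minimal sketch of the row-sum property, assuming the convention that entry T[i][j] is the probability of moving from state i to state j. The matrix values and the helper name `is_row_stochastic` are invented for illustration.

```python
# Hypothetical 3-state customer transition matrix (values invented for illustration).
# Entry T[i][j] is the probability of moving from state i to state j in one period.
T = [
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.00, 0.00, 1.00],
]

def is_row_stochastic(matrix, tol=1e-9):
    """Return True if all entries are non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )

row_sums = [sum(row) for row in T]            # each row is a full distribution
column_sums = [sum(col) for col in zip(*T)]   # columns carry no such constraint
```

Each row describes everything that can happen to a customer currently in that state, so it must sum to 1. The columns of this example do not.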

5
Q

Question 5
Because of the Markov assumption, it is really easy to start doing simulations or projections by
a) adding the transition matrix to itself.
b) dividing the transition matrix by itself.
c) multiplying the transition matrix by itself.
d) subtracting the transition matrix from itself.

A

c) multiplying the transition matrix by itself.
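
A minimal sketch of a multi-period projection, assuming a hypothetical two-state chain with invented probabilities. Under the Markov assumption, the n-step transition probabilities are the n-th power of the one-step matrix.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_power(T, n):
    """Return T multiplied by itself n times (n = 0 gives the identity)."""
    size = len(T)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, T)
    return result

# Hypothetical one-period transition matrix (values invented for illustration).
T = [
    [0.9, 0.1],
    [0.2, 0.8],
]

T2 = mat_power(T, 2)   # two-period transition probabilities

# Project a starting distribution two periods ahead: start x T^2.
start = [1.0, 0.0]
after_two = [sum(start[i] * T2[i][j] for i in range(2)) for j in range(2)]
```

The rows of `T2` still sum to 1, so the result is again a valid transition matrix.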

6
Q

Question 6
A Markov reward process is essentially a Markov chain with values which represent rewards assigned to
a) a state.
b) a transition.
c) a state or transition.

A

c) a state or transition.

7
Q

Question 7
In a Markov Decision Process, the probability of moving to a new state depends upon
a) the current state.
b) the action taken.
c) the current state and action taken.

A

c) the current state and action taken.

8
Q

Question 8
The well-known bandit problem in reinforcement learning is an example of a
a) Markov Decision Process.
b) Markov Reward Process.

A

a) Markov Decision Process.

9
Q

Question 9
In a Markov Decision Process, the discount factor motivates the decision maker to
a) favor taking actions early, rather than postpone them indefinitely.
b) favor taking actions late, rather than early.

A

a) favor taking actions early, rather than postpone them indefinitely.
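
A minimal sketch of why discounting favors early action, assuming a discount factor gamma below 1 and an invented reward of 100. The function name `discounted_value` is hypothetical.

```python
def discounted_value(reward, period, gamma):
    """Present value of a reward received `period` steps from now."""
    return (gamma ** period) * reward

gamma = 0.9  # assumed discount factor, 0 < gamma < 1

early = discounted_value(100.0, 1, gamma)  # reward collected after 1 period
late = discounted_value(100.0, 5, gamma)   # same reward collected after 5 periods
```

The same reward is worth less the longer it is postponed, so a decision maker who maximizes discounted reward prefers to collect it sooner.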

10
Q

Question 10
A Markov Decision Process can be solved by
a) brute force evaluation.
b) value iteration.
c) policy iteration.
d) dynamic programming.
e) all of the above.

A

e) all of the above.
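
A minimal sketch of value iteration, one of the listed solution methods, on a hypothetical two-state MDP. The states, actions, probabilities, rewards, and the discount factor of 0.9 are all invented for illustration.

```python
# Hypothetical 2-state MDP; all probabilities and rewards are invented.
# P[state][action] is a list of (probability, next_state, reward) outcomes.
P = {
    0: {"wait": [(1.0, 0, 0.0)], "invest": [(0.8, 1, -1.0), (0.2, 0, -1.0)]},
    1: {"wait": [(1.0, 1, 2.0)], "invest": [(1.0, 1, 1.0)]},
}

def action_value(P, V, state, action, gamma):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[state][action])

def value_iteration(P, gamma=0.9, tol=1e-10):
    """Repeat the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(action_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: action_value(P, V, s, a, gamma)) for s in P}
    return V, policy

V, policy = value_iteration(P)
```

In this toy example the resulting policy pays the one-off cost in state 0 to reach state 1, then stays there to collect the recurring reward.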

11
Q

Question 11
A mover/stayer model is a special type of Markov Chain model where we differentiate between 2 types of customers: movers who switch states and stayers who keep their states. The latter can represent stable, loyal customers since they never leave their initial state. Movers make transitions according to a Markov chain with transition matrix T. Let s represent the vector of stayers and m represent the vector of movers. The state transition after 1 period then becomes:
a) s×T + m.
b) (s + m)×T.
c) s + m×T.
d) s + m.

A

c) s + m×T.
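
A minimal sketch of the one-period mover/stayer update s + m×T, assuming s and m are row vectors of customer counts per state. The counts and the matrix T are invented for illustration.

```python
def vec_mat(v, T):
    """Multiply a row vector v by matrix T."""
    return [sum(v[i] * T[i][j] for i in range(len(v))) for j in range(len(T[0]))]

# Hypothetical customer counts per state (values invented for illustration).
s = [50.0, 20.0]    # stayers: never leave their initial state
m = [100.0, 30.0]   # movers: follow the Markov chain T

T = [
    [0.7, 0.3],
    [0.4, 0.6],
]

# Stayers are carried over unchanged; only the movers are multiplied by T.
next_period = [si + mi for si, mi in zip(s, vec_mat(m, T))]
```

The total number of customers is preserved because the rows of T sum to 1.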

12
Q

Question 12
Customers can migrate between states due to different reasons such as
a) marketing actions.
b) competitor actions.
c) macro-economic effects.
d) changing customer needs.
e) all of the above.

A

e) all of the above.

13
Q

Question 13
Consider the following statement: “The most extreme example of a stable migration matrix is the identity matrix I.”
This statement is
a) correct.
b) not correct.

A

a) correct.

14
Q

Question 14
Modeling why customers migrate between states can be very useful
a) from an explanatory perspective.
b) from a predictive perspective.
c) from both an explanatory and a predictive perspective.

A

c) from both an explanatory and a predictive perspective.

15
Q

Question 15
A common technique to model customer migrations is
a) linear regression.
b) decision trees.
c) neural networks.
d) cumulative logistic regression.

A

d) cumulative logistic regression.

16
Q

Question 16
Which statement is NOT CORRECT?
a) Empirical evidence has demonstrated that downgrades tend to be more easily followed by further downgrades such that you can often observe an autocorrelation effect over time.
b) A duration dependence effect sometimes occurs which implies that the longer a customer keeps the same state, the higher the migration probability.
c) Migration probabilities also tend to be correlated with business cycles.
d) Hidden Markov models are also referred to as dynamic segmentation. These are essentially algorithms that automatically uncover the different states of customer behavior as well as how those states evolve.

A

b) A duration dependence effect sometimes occurs which implies that the longer a customer keeps the same state, the higher the migration probability.

17
Q

Question 17
Which statement is NOT CORRECT?
a) Logistic regression is linear in the odds.
b) Because of its high interpretability, logistic regression is one of the most popular classification techniques used in industry for churn prediction. It is also used quite extensively in other settings, such as credit risk modeling.
c) Logistic regression essentially uses a non-linear transformation to transform a linear regression such that the outcome is always bounded between 0 and 1.
d) The logistic regression parameters are typically optimized using the idea of maximum likelihood.

A

a) Logistic regression is linear in the odds.
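
A minimal sketch of why statement a) is the incorrect one: logistic regression is linear in the log-odds (the logit), not in the odds. The single-feature model and its coefficients are invented for illustration.

```python
import math

def predict_proba(x, intercept, slope):
    """Logistic regression with one feature: sigmoid of a linear score."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def log_odds(p):
    """Logit of a probability."""
    return math.log(p / (1.0 - p))

b0, b1 = -1.0, 0.5          # hypothetical coefficients
xs = [0.0, 1.0, 2.0, 3.0]   # equally spaced feature values

logits = [log_odds(predict_proba(x, b0, b1)) for x in xs]
odds = [math.exp(z) for z in logits]

# Log-odds increase by a constant amount per unit of x (linear) ...
logit_steps = [b - a for a, b in zip(logits, logits[1:])]
# ... whereas the odds are multiplied by a constant factor (exponential).
odds_ratios = [b / a for a, b in zip(odds, odds[1:])]
```

Every entry of `logit_steps` equals the slope, while the odds grow by the factor exp(slope) per unit of x.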

18
Q

Question 18
When penalizing regression coefficients, the L2 norm is based on
a) the absolute values of the coefficients.
b) the squared values of the coefficients.

A

b) the squared values of the coefficients.

19
Q

Question 19
Elastic net penalizes the regression coefficients using
a) only the L1 norm.
b) only the L2 norm.
c) both the L1 and L2 norm.

A

c) both the L1 and L2 norm.
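
A minimal sketch of the three penalties, assuming one common parameterization in which `lam` sets the overall strength and `alpha` mixes the L1 and L2 parts. Conventions differ between libraries, and the coefficient values are invented for illustration.

```python
def l1_penalty(coefs):
    """Sum of absolute coefficient values (lasso-style penalty)."""
    return sum(abs(b) for b in coefs)

def l2_penalty(coefs):
    """Sum of squared coefficient values (ridge-style penalty)."""
    return sum(b * b for b in coefs)

def elastic_net_penalty(coefs, lam, alpha):
    """Weighted mix of both: alpha = 1 is pure L1, alpha = 0 is pure L2."""
    return lam * (alpha * l1_penalty(coefs) + (1.0 - alpha) * l2_penalty(coefs))

coefs = [0.5, -2.0, 0.0]   # hypothetical regression coefficients

l1 = l1_penalty(coefs)
l2 = l2_penalty(coefs)
mixed = elastic_net_penalty(coefs, lam=0.1, alpha=0.5)
```

The penalty is added to the loss (or subtracted from the likelihood) so that large coefficients are discouraged.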

20
Q

Question 20
The core idea of ProfLogit is to replace the regularized maximum likelihood objective function with
a) a regularized AUC-based variant thereof.
b) a regularized lift-based variant thereof.
c) a regularized profit-based variant thereof.
d) a regularized precision-based variant thereof.

A

c) a regularized profit-based variant thereof.

21
Q

Question 21
A real-coded genetic algorithm (RGA) is used in ProfLogit because
a) it is better than gradient descent methods.
b) it is faster than gradient descent methods.
c) it is impossible to find the derivative of the objective function, so we cannot resort to classical gradient-descent-based optimization methods.

A

c) it is impossible to find the derivative of the objective function, so we cannot resort to classical gradient-descent-based optimization methods.

22
Q

Question 22
Which statement is CORRECT?
a) In RGAs, the selection operator either deterministically or stochastically picks candidate solutions on the basis of their fitness values.
b) In RGAs, crossover recombines two or more selected chromosomes, generating one or more new chromosomes or children. That is, in crossover, values of the parent chromosomes are exchanged in a predefined manner.
c) In RGAs, mutation introduces random perturbations into the chromosome pool, creating new chromosomes that are often quite distinct from those produced by crossover alone.
d) All statements are correct.

A

d) All statements are correct.
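
A minimal sketch of the three operators in a real-coded genetic algorithm. This is not the ProfLogit implementation: the toy fitness function, tournament selection, blend crossover, Gaussian mutation, elitism, and all parameter values are assumptions chosen for illustration, and the loop simply stops after a fixed number of generations.

```python
import random

def fitness(chrom):
    """Toy fitness to maximize; its optimum is at (1, -2)."""
    return -((chrom[0] - 1.0) ** 2 + (chrom[1] + 2.0) ** 2)

def select(pop, rng, k=3):
    """Tournament selection: stochastically sample k candidates, keep the fittest."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(p1, p2, rng):
    """Blend crossover: the child is a random weighted average of two parents."""
    w = rng.random()
    return [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]

def mutate(chrom, rng, rate=0.2, sigma=0.3):
    """Perturb each gene with Gaussian noise with probability `rate`."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g for g in chrom]

def run_rga(generations=50, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(pop_size)]
    initial_best = max(pop, key=fitness)
    for _ in range(generations):               # termination: fixed number of steps
        elite = max(pop, key=fitness)          # elitism keeps the best chromosome
        children = [
            mutate(crossover(select(pop, rng), select(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
    return initial_best, max(pop, key=fitness)

initial_best, final_best = run_rga()
```

Because the best chromosome is always carried over, the best fitness in the population cannot get worse from one generation to the next. No derivative of the fitness function is needed at any point.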

23
Q

Question 23
The termination criterion of the RGA could be
a) a fixed number of steps.
b) when the fitness value no longer significantly changes.
c) when the fitness on an independent validation set starts to decrease.
d) all of the above.

A

d) all of the above.

24
Q

Question 24
The empirical evaluation of ProfLogit demonstrated that ProfLogit has
a) overall the best performance in terms of the EMPC and MPC, thus being the most profitable churn model. However, it has the worst profit-based hit rate and recall as well as the worst F1 measure.
b) overall the best performance in terms of the EMPC and MPC, thus being the most profitable churn model. Additionally, it has the overall best profit-based hit rate and recall as well as the highest F1 measure.
c) overall the best performance in terms of the EMPC but not MPC.
d) overall the best performance in terms of the MPC but not EMPC.

A

b) overall the best performance in terms of the EMPC and MPC, thus being the most profitable churn model. Additionally, it has the overall best profit-based hit rate and recall as well as the highest F1 measure.

25
Q

Question 25
When a churn model is constructed for maximum profit rather than with accuracy-centric techniques, the variables considered significant are
a) the same.
b) different.

A

b) different.

26
Q

Question 26
In ProfTree, the fitness function of a tree consists of
a) the expected maximum profit for churn, or EMPC of the tree.
b) the expected maximum profit for churn, or EMPC of the tree, minus lambda times its complexity.
c) the AUC of the tree, minus lambda times its complexity.
d) none of the above.

A

b) the expected maximum profit for churn, or EMPC of the tree, minus lambda times its complexity.
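
A minimal sketch of the penalized fitness EMPC − lambda × complexity. The EMPC values, the use of leaf count as the complexity measure, and the value of lambda are all assumptions made for illustration, not figures from ProfTree.

```python
def tree_fitness(empc, complexity, lam):
    """Profit-based fitness penalized by tree complexity."""
    return empc - lam * complexity

lam = 0.05  # hypothetical trade-off parameter

# Two hypothetical candidate trees: (EMPC, number of leaves); values invented.
large_tree = tree_fitness(empc=12.0, complexity=20, lam=lam)
small_tree = tree_fitness(empc=11.5, complexity=5, lam=lam)
```

With these made-up numbers the smaller tree has the higher fitness despite its lower EMPC, which shows how the penalty steers the search toward simpler trees.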

27
Q

Question 27
A classical CART tree typically returns
a) the same tree as ProfTree.
b) a different tree than ProfTree.

A

b) a different tree than ProfTree.

28
Q

Question 28
Which statement about the empirical evaluation of ProfTree is CORRECT?
a) When averaging the ranks over the data sets used in the empirical evaluation, ProfTree has the overall best performance in terms of EMPC and MPC.
b) ProfTree exhibits the overall highest precision, so it most effectively identifies churners correctly.
c) ProfTree has the highest average recall, thus it is capable of detecting the most would-be churners.
d) ProfTree has the overall poorest MER performance because ProfTree maximizes EMPC, rather than minimizing the misclassification error.
e) In terms of AUC performance, ProfTree is ranked quite low.
f) All statements are correct.

A

f) All statements are correct.