Machine Learning with Viya® 3.4, Lesson 6: Model Assessment and Deployment Flashcards

1
Q

How would you add a challenger model to a pipeline comparison in Model Studio?

A

Select the model in its pipeline.

2
Q

Which assessment measure should be used to determine the champion model for predicting an interval target?

A

Average Square Error
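As a quick sanity check, Average Square Error is just the mean of the squared residuals. A minimal sketch (the function name and data are made up for illustration, not SAS code):

```python
def average_square_error(actual, predicted):
    """Average Square Error (ASE): mean of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Lower ASE means a better fit on an interval target.
actual = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 11.5, 9.5, 14.0]
print(average_square_error(actual, predicted))  # 0.625
```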

3
Q

How would you identify which model has the best classification accuracy using C-statistic values?

A

The model with the highest C-statistic value has the best classification accuracy.

4
Q

Which dataset partition will be used to select the champion model when using the default settings in Model Studio?

A

Validation

5
Q

What type of data is used during champion-challenger testing to compare the performance of the currently deployed model and a challenger model during the model deployment phase?

A

Champion-challenger testing compares performance on historic data during model deployment.

6
Q

What are the primary considerations for choosing an appropriate model selection statistic?

A
  • business needs
  • the prediction type
  • the measurement level
7
Q

The confusion matrix is the foundation of which assessment plot?

A

the ROC chart

8
Q

A confusion matrix helps you classify which type of target?

A

Binary

9
Q

Which validation method would you recommend for a small dataset?

A

Cross-validation

10
Q

A cumulative lift chart shows that a machine learning model has a lift of 2.6 at a depth of 10%. What does this mean?

A

For the top 10% of cases, the machine learning model captures 2.6 times more primary outcome cases than a random model.
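The lift-at-depth idea can be sketched in a few lines of Python (hypothetical helper and toy data, not SAS code): sort cases by score, take the top fraction, and compare its response rate to the overall rate.

```python
def cumulative_lift(scores, targets, depth):
    """Lift at a given depth: response rate in the top `depth` fraction
    of cases (sorted by descending score) over the overall response rate."""
    ranked = sorted(zip(scores, targets), key=lambda st: st[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * depth)))
    top_rate = sum(t for _, t in ranked[:n_top]) / n_top
    overall_rate = sum(targets) / len(targets)
    return top_rate / overall_rate

scores  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1,   1,   0,   0,   1,   0,   0,   0,   0,   0]
print(cumulative_lift(scores, targets, 0.10))  # top 10% is all hits -> 1.0 / 0.3
```

A lift of 2.6 at 10% depth would mean the top decile's response rate is 2.6 times the overall response rate.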

11
Q

What model fit statistics are recommended for a decision prediction?

A

accuracy or misclassification rate

the Kolmogorov–Smirnov (KS) statistic

12
Q

What are two commonly used performance statistics for estimate predictions?

A

Schwarz’s Bayesian Criterion (SBC)

Weighted Square Error

13
Q

What assessment measure would you use to assess the probability of a customer responding to a targeted ad campaign?

A

the Gains chart

14
Q

What is another term for gains chart?

A

Cumulative Percent Hits (CPH) chart

15
Q

What is another term for a Cumulative Lift chart?

A

a Gains chart

17
Q

What is a cumulative lift chart?

A

A cumulative lift chart indicates how well the model performs compared with no model: lift is the ratio between the result obtained using the model and the result obtained using no model.

Cumulative lift is plotted on the vertical axis against depth (the percentile of the sorted data) on the horizontal axis.

18
Q

What’s the calculation for the lift for a given percentile when evaluating a Cumulative Captured Response (Gains) Chart?

A

Divide the Model Response Rate by the Random Response Rate:

Lift(P, M) = CPH(P, M) / P

where P is a given percentile
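The formula above can be checked numerically. A minimal sketch (toy data, hypothetical helper name), where CPH(P, M) is the share of all primary-outcome cases captured in the top P fraction of the sorted data:

```python
def cph(scores, targets, p):
    """Cumulative percent hits: fraction of all primary cases
    captured in the top p fraction of cases, sorted by score."""
    ranked = sorted(zip(scores, targets), key=lambda st: st[0], reverse=True)
    n_top = int(round(len(ranked) * p))
    return sum(t for _, t in ranked[:n_top]) / sum(targets)

scores  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1,   1,   0,   1,   0,   0,   0,   0,   0,   0]
p = 0.3
lift = cph(scores, targets, p) / p  # Lift(P, M) = CPH(P, M) / P
```

Here the top 30% of cases captures 2 of the 3 primary cases, so CPH = 2/3 and lift ≈ 2.22.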

19
Q

What are two things you can investigate in an ICE plot?

A

Subgroups and interactions among model variables.

20
Q

What do level differences in an ICE plot suggest?

A

group effects

21
Q

What does an intersecting slope in an ICE plot indicate?

A

Interactions between the plot variable and one or more additional model variables

22
Q

What are two things to look for in an ICE plot?

A
  1. intersecting slopes
  2. level differences
23
Q

Which machine learning model is the easiest to interpret?

A

Decision trees are highly interpretable because they can be expressed as English rules: simple if-then statements built from Boolean logic.
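As an illustration of such English rules, a toy tree can be written directly as if-then statements (the variables and thresholds here are invented for this sketch):

```python
def score_applicant(income, debt_ratio):
    """A toy decision tree expressed as English-style Boolean rules.
    Splits and variable names are hypothetical."""
    # Rule 1: IF income >= 50000 AND debt_ratio < 0.4 THEN "low risk"
    if income >= 50000 and debt_ratio < 0.4:
        return "low risk"
    # Rule 2: IF income >= 50000 AND debt_ratio >= 0.4 THEN "medium risk"
    if income >= 50000:
        return "medium risk"
    # Rule 3: IF income < 50000 THEN "high risk"
    return "high risk"
```

Each leaf of the tree corresponds to one such rule, which is why the model's decisions can be read off directly.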

24
Q

Which dataset partition assists in comparing possible models?

A

Validation

25
Q

Which dataset partition generates the possible models?

A

Training data

26
Q

Which dataset partition should be used to assess how the final model generalizes to new data?

A

Test data

27
Q

What does a PD plot show you?

A

How changes in a model input affect the model's predictions, on average

28
Q

What does publishing a model do?

A

Publishing a model makes its score code available in CAS, Teradata, or Hadoop.

29
Q

You want to predict the rankings of a target variable and have built multiple models. Which selection statistic should be used to compare your models?

A

the ROC Index or the Gini Coefficient

30
Q

Which model fit statistics are commonly used for ranking predictions?

A

ROC Index

Gini Coefficient

31
Q

How do you register a model in Model Studio?

A

Select Register Models from the Project Pipeline menu on the Pipeline Comparison tab.

32
Q

What does registering a model do?

A

Registering the model in Model Studio makes the model available in Model Manager.

33
Q

Which of the following statements is true about an ROC chart?

a. The selection value of each point is displayed on the chart.
b. Each point on the chart corresponds to a specific fraction of the sorted data.
c. True positives are on the x axis.
d. For a perfect model, the ROC curve is a straight line from the bottom left corner to the top right corner of the plot.

A

In an ROC chart, each point corresponds to a specific fraction of the sorted data.
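To see how each point arises, here is a minimal sketch (hypothetical helper, toy data): each cutoff through the sorted scores yields one confusion matrix, and hence one (false positive rate, true positive rate) point on the curve.

```python
def roc_point(scores, labels, cutoff):
    """One ROC point: classify scores >= cutoff as positive, return (FPR, TPR)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    return fp / (fp + tn), tp / (tp + fn)

# Sweeping the cutoff through the sorted scores traces the full curve;
# each point corresponds to a specific fraction of the sorted data.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1,   1,   0,   1,   0,   0]
curve = [roc_point(scores, labels, c) for c in sorted(set(scores), reverse=True)]
```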

34
Q

What kind of model can you import using a Score Code Import node?

A

A Score Code Import node can import only an ASTORE model or a single DATA step score code file.

35
Q

What must you do before you can score and manage a model in Model Studio?

A

Register the model in Model Studio.

36
Q

In model comparison, the best model has the highest value of which measure?

A

Sensitivity

37
Q

What is the true positive rate referred to as?

A

Sensitivity

38
Q

What is sensitivity?

A

The true positive rate

39
Q

Calculate Sensitivity:

A

divide the true positive decisions by the total number of known primary cases:

Sensitivity = TruePositive / (TruePositive + FalseNegative)

40
Q

What is the true negative rate referred to as?

A

Specificity

41
Q

Calculate Specificity:

A

divide the true negative decisions by the total number of known secondary cases:

Specificity = TrueNegative / (TrueNegative + FalsePositive)
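Both rates fall straight out of the confusion matrix. A minimal sketch with made-up counts:

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical confusion matrix: TP=40, FN=10, TN=45, FP=5.
print(sensitivity(40, 10))  # 0.8
print(specificity(45, 5))   # 0.9
```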