Machine Learning with Viya® 3.4, Lesson 6: Model Assessment and Deployment Flashcards

1
Q

How would you add a challenger model to a pipeline comparison in Model Studio?

A

Select the model in its pipeline.

2
Q

Which assessment measure should be used to determine the champion model for predicting an interval target?

A

Average Square Error
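As a quick sanity check, Average Square Error is just the mean of the squared residuals. A minimal sketch (the function name and data are made up for illustration, not SAS code):

```python
def average_square_error(actual, predicted):
    """Average Square Error (ASE): mean of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Lower ASE means a better fit on an interval target.
actual = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 11.5, 9.5, 14.0]
print(average_square_error(actual, predicted))  # 0.625
```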

3
Q

How would you identify which model has the best classification accuracy using C-statistic values?

A

The model with the highest C-statistic value has the best classification accuracy.

4
Q

Which dataset partition will be used to select the champion model when using the default settings in Model Studio?

A

Validation

5
Q

What type of data is used during champion-challenger testing to compare the performance of the currently deployed model and a challenger model during the model deployment phase?

A

Champion-challenger testing compares performance on historic data during model deployment.

6
Q

What are the primary considerations for choosing an appropriate model selection statistic?

A
  • business needs
  • the prediction type
  • the measurement level
7
Q

The confusion matrix is the foundation of which assessment plot?

A

the ROC chart

8
Q

A confusion matrix helps you classify which type of target?

A

Binary

9
Q

Which validation method would you recommend for a small dataset?

A

Cross-validation

10
Q

A cumulative lift chart shows that a machine learning model has a lift of 2.6 at a depth of 10%. What does this mean?

A

For the top 10% of cases, the machine learning model captures 2.6 times more primary outcome cases than a random model.
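The lift-at-depth idea can be sketched in a few lines of Python (hypothetical helper and toy data, not SAS code): sort cases by score, take the top fraction, and compare its response rate to the overall rate.

```python
def cumulative_lift(scores, targets, depth):
    """Lift at a given depth: response rate in the top `depth` fraction
    of cases (sorted by descending score) over the overall response rate."""
    ranked = sorted(zip(scores, targets), key=lambda st: st[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * depth)))
    top_rate = sum(t for _, t in ranked[:n_top]) / n_top
    overall_rate = sum(targets) / len(targets)
    return top_rate / overall_rate

scores  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1,   1,   0,   0,   1,   0,   0,   0,   0,   0]
print(cumulative_lift(scores, targets, 0.10))  # top 10% is all hits -> 1.0 / 0.3
```

A lift of 2.6 at 10% depth would mean the top decile's response rate is 2.6 times the overall response rate.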

11
Q

What model fit statistics are recommended for a decision prediction?

A

accuracy or misclassification rate

the Kolmogorov–Smirnov (KS) statistic

12
Q

What are two commonly used performance statistics for estimate predictions?

A

Schwarz’s Bayesian Criterion (SBC)

Weighted Square Error

13
Q

What assessment measure would you use to assess the probability of a customer responding to a targeted ad campaign?

A

the Gains chart

14
Q

What is another term for gains chart?

A

Cumulative Percent Hits (CPH) chart

15
Q

What is another term for a Cumulative Lift chart?

A

a Gains chart

17
Q

What is a cumulative lift chart?

A

A cumulative lift chart indicates how well the model performs compared with no model: lift is the ratio between the result obtained using the model and the result obtained using no model.

Cumulative lift is plotted on the vertical axis against depth (the percentile of the sorted data) on the horizontal axis.

18
Q

What’s the calculation for the lift for a given percentile when evaluating a Cumulative Captured Response (Gains) Chart?

A

Divide the Model Response Rate by the Random Response Rate:

Lift(P, M) = CPH(P, M) / P

where P is a given percentile
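The formula above can be checked numerically. A minimal sketch (toy data, hypothetical helper name), where CPH(P, M) is the share of all primary-outcome cases captured in the top P fraction of the sorted data:

```python
def cph(scores, targets, p):
    """Cumulative percent hits: fraction of all primary cases
    captured in the top p fraction of cases, sorted by score."""
    ranked = sorted(zip(scores, targets), key=lambda st: st[0], reverse=True)
    n_top = int(round(len(ranked) * p))
    return sum(t for _, t in ranked[:n_top]) / sum(targets)

scores  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1,   1,   0,   1,   0,   0,   0,   0,   0,   0]
p = 0.3
lift = cph(scores, targets, p) / p  # Lift(P, M) = CPH(P, M) / P
```

Here the top 30% of cases captures 2 of the 3 primary cases, so CPH = 2/3 and lift ≈ 2.22.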

19
Q

What are two things you can investigate in an ICE plot?

A

Subgroups and interactions among model variables.

20
Q

What do level differences in an ICE plot suggest?

A

group effects

21
Q

What does an intersecting slope in an ICE plot indicate?

A

Interactions between the plot variable and one or more additional model variables

22
Q

What are two things to look for in an ICE plot?

A
  1. intersecting slopes
  2. level differences
23
Q

Which machine learning model is the easiest to interpret?

A

Decision trees are highly interpretable because they can be expressed as English rules: simple if-then statements built from Boolean logic.
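As an illustration of such English rules, a toy tree can be written directly as if-then statements (the variables and thresholds here are invented for this sketch):

```python
def score_applicant(income, debt_ratio):
    """A toy decision tree expressed as English-style Boolean rules.
    Splits and variable names are hypothetical."""
    # Rule 1: IF income >= 50000 AND debt_ratio < 0.4 THEN "low risk"
    if income >= 50000 and debt_ratio < 0.4:
        return "low risk"
    # Rule 2: IF income >= 50000 AND debt_ratio >= 0.4 THEN "medium risk"
    if income >= 50000:
        return "medium risk"
    # Rule 3: IF income < 50000 THEN "high risk"
    return "high risk"
```

Each leaf of the tree corresponds to one such rule, which is why the model's decisions can be read off directly.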

24
Q

Which dataset partition assists in comparing possible models?

A

Validation

25
Q

Which dataset partition generates the possible models?

A

Training data

26
Q

Which dataset partition should be used to assess how the final model generalizes to new data?

A

Test data

27
Q

What does a PD plot show you?

A

How changes in a model input affect the model's predictions, on average

28
Q

What does publishing a model do?

A

Publishing a model makes its score code available in CAS, Teradata, or Hadoop.

29
Q

You want to predict the rankings of a target variable and have built multiple models. Which selection statistic should be used to compare your models?

A

the ROC Index or the Gini Coefficient

30
Q

Which model fit statistics are commonly used for ranking predictions?

A

ROC Index

Gini Coefficient

31
Q

How do you register a model in Model Studio?

A

Select Register Models from the Project Pipeline menu on the Pipeline Comparison tab.

32
Q

What does registering a model do?

A

Registering the model in Model Studio makes the model available in Model Manager.

33
Q

Which of the following statements is true about an ROC chart?

a. The selection value of each point is displayed on the chart.
b. Each point on the chart corresponds to a specific fraction of the sorted data.
c. True positives are on the x axis.
d. For a perfect model, the ROC curve is a straight line from the bottom left corner to the top right corner of the plot.

A

In an ROC chart, each point corresponds to a specific fraction of the sorted data.
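To see how each point arises, here is a minimal sketch (hypothetical helper, toy data): each cutoff through the sorted scores yields one confusion matrix, and hence one (false positive rate, true positive rate) point on the curve.

```python
def roc_point(scores, labels, cutoff):
    """One ROC point: classify scores >= cutoff as positive, return (FPR, TPR)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    return fp / (fp + tn), tp / (tp + fn)

# Sweeping the cutoff through the sorted scores traces the full curve;
# each point corresponds to a specific fraction of the sorted data.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1,   1,   0,   1,   0,   0]
curve = [roc_point(scores, labels, c) for c in sorted(set(scores), reverse=True)]
```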

34
Q

What kind of model can you import using a Score Code Import node?

A

A Score Code Import node can import only an ASTORE model or a single DATA step score code file.

35
Q

What must you do before you can score and manage a model in Model Studio?

A

Register the model in Model Studio.

36
Q

In model comparison, the best model has the highest value of which measure?

A

Sensitivity

37
Q

What is the true positive rate referred to as?

A

Sensitivity

38
Q

What is sensitivity?

A

The true positive rate

39
Q

Calculate Sensitivity:

A

divide the true positive decisions by the total number of known primary cases:

Sensitivity = TruePositive / (TruePositive + FalseNegative)

40
Q

What is the true negative rate referred to as?

A

Specificity

41
Q

Calculate Specificity:

A

divide the true negative decisions by the total number of known secondary cases:

Specificity = TrueNegative / (TrueNegative + FalsePositive)
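Both rates fall straight out of the confusion matrix. A minimal sketch with made-up counts:

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical confusion matrix: TP=40, FN=10, TN=45, FP=5.
print(sensitivity(40, 10))  # 0.8
print(specificity(45, 5))   # 0.9
```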