Machine Learning with Viya® 3.4 Lesson 5: Support Vector Machines (SVM) and Additional Topics Flashcards

Question 1

Q

What is a dot product?

Answer

A

A dot product is a way to multiply two vectors that results in a scalar, or a single number, as the answer. It is an element-by-element multiplication, followed by a sum across the products.
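
As a minimal sketch of the definition above (the function name `dot` and the sample vectors are illustrative, not from the course), the dot product can be computed in plain Python:

```python
def dot(u, v):
    """Return the dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Multiply element by element, then sum across the products.
    return sum(a * b for a, b in zip(u, v))


u = [1, 2, 3]
v = [4, 5, 6]
result = dot(u, v)  # 1*4 + 2*5 + 3*6
print(result)  # 32
```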

Question 2

Q

How is a support vector machine constructed in order to avoid the curse of dimensionality?

Answer

A

By using only the observations closest to the separating hyperplane.

Question 3

Q

How does using only the observations closest to the separating hyperplane avoid the curse of dimensionality?

Answer

A

By limiting the number of data points in the solution.

Question 4

Q

What kind of information is in the Training Results table in an SVM run?

Answer

A

The Training Results table shows the parameters for the final Support Vector Machine model such as the number of support vectors and the bias.

Question 5

Q

Where can you find the average square error on the VALIDATE partition?

Answer

A

In the Fit Statistics table on the Assessment tab.

Question 6

Q

Where can you view the misclassification matrix?

Answer

A

The Output Window

Question 7

Q

What are the two constraints used to solve for optimization in a support vector machine?

Answer

A

If the target variable equals 1, then H must be greater than or equal to 1. If the target is -1, then H must be less than or equal to -1.
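
The two constraints can be checked directly once H is written as a normal vector times the inputs plus a bias. The sketch below assumes that form; the values of `w`, `b`, and the sample points are made up for illustration and are not the output of a fitted model. Both constraints are equivalent to requiring target × H ≥ 1 for every observation.

```python
def h(w, b, x):
    """Classifier value H = w . x + b (w and b are illustrative, not fitted)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def satisfies_constraints(w, b, points, targets):
    """Return True if every observation meets the constraint for its target."""
    for x, y in zip(points, targets):
        value = h(w, b, x)
        if y == 1 and value < 1:      # target +1 requires H >= 1
            return False
        if y == -1 and value > -1:    # target -1 requires H <= -1
            return False
    return True


w, b = [1.0, 1.0], -3.0
points = [(3.0, 2.0), (2.0, 2.0), (1.0, 0.5), (0.0, 1.0)]
targets = [1, 1, -1, -1]
print(satisfies_constraints(w, b, points, targets))  # True
```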

Question 8

Q

What’s a term for describing data points that are not linearly separable?

Answer

A

soft margin hyperplane

Question 9

Q

What do you need to do when you encounter a soft margin hyperplane?

Answer

A

Account for errors that the separating hyperplane might make.

Question 10

Q

TRUE or FALSE: When the data are not linearly separable, the process of optimizing the location of the hyperplane must account for classification errors.

Answer

A

TRUE: When the data are not linearly separable, the hyperplane will misclassify some data points. In this situation, the process of optimizing the location of the hyperplane must account for these classification errors.
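
One common textbook way to account for these errors is a slack value per observation, max(0, 1 − target × H), which is zero when the constraint is met and grows as the point moves to the wrong side. The sketch below assumes that formulation; the hyperplane and data are illustrative, and this is not necessarily the exact computation used by the software.

```python
def slack(w, b, x, y):
    """Amount by which observation (x, y) violates the constraint y * H >= 1."""
    h = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * h)


w, b = [1.0, 1.0], -3.0  # illustrative hyperplane, not a fitted model
data = [
    ((3.0, 2.0), 1),    # correctly classified, outside the margin
    ((1.0, 1.0), 1),    # misclassified positive
    ((2.0, 1.5), -1),   # misclassified negative
]
slacks = [slack(w, b, x, y) for x, y in data]
print(slacks)  # [0.0, 2.0, 1.5]
```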

Question 11

Q

What is a kernel function?

Answer

A

A kernel function operates as a dot product in a higher dimension (that is, in a feature space), but it is applied to the raw data.
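
A small sketch can make this concrete. It assumes a degree-2 polynomial kernel K(x, z) = (x · z)² on two inputs, whose matching feature map is φ(x) = (x₁², √2·x₁x₂, x₂²); the kernel choice and the function names are illustrative. The kernel applied to the raw data gives the same number as the dot product of the transformed data.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel on 2 inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Kernel computed on the raw inputs only."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # dot product in the feature space
via_kernel = poly_kernel(x, z)   # same value without building phi
print(math.isclose(explicit, via_kernel))  # True
```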

Question 12

Q

Suppose you are modeling data with a binary target and three inputs. The data are linearly separable. How many possible solutions exist that classify the target?

Answer

A

An infinite number of solutions can classify the binary target when the data are linearly separable.

Question 13

Q

What type of target variable is supported in a support vector machine in Model Studio?

Answer

A

Support vector machines are used exclusively with binary targets in Model Studio.

Question 14

Q

What are the elements of a classifier model for a Support Vector Machine?

Answer

A

The classifier model (H) has two elements: a normal vector and a bias term.

Question 15

Q

What is the maximum-margin hyperplane in a two-dimensional input space?

Answer

A

The exact center of the thickest line that touches the innermost values of one target outcome and the innermost values of the other target outcome.

Question 16

Q

What are support vectors?

Answer

A

Support vectors are the points in the data that are closest to the maximum-margin hyperplane.
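
As a rough illustration, the distance from a point to the hyperplane is |w · x + b| / ‖w‖, and the support vectors are the points with the smallest distance. The sketch below assumes the hyperplane is already known (the `w`, `b`, and points are toy values); in practice the optimizer identifies the support vectors while fitting the model.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


w, b = [1.0, 1.0], -3.0  # assumed maximum-margin hyperplane for the toy data
points = [(3.0, 2.0), (2.0, 2.0), (1.0, 1.0), (1.0, 0.5), (0.0, 1.0)]

distances = [distance_to_hyperplane(w, b, p) for p in points]
closest = min(distances)
support_vectors = [p for p, d in zip(points, distances) if math.isclose(d, closest)]
print(support_vectors)  # [(2.0, 2.0), (1.0, 1.0)]
```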

Question 17

Q

In support vector machines, finding the separating hyperplane is an optimization problem with constraints that involve the values of the binary target.

a. True
b. False

Answer

A

A: True

Solving for the support vector machine is actually an optimization problem with two constraints. The first constraint is based on a target value of +1, and the second constraint is based on a target value of -1.

Question 18

Q

How is a feature space constructed?

Answer

A

A feature space is constructed by applying a nonlinear transformation to the data so that linear separation exists in the resulting higher-dimensional space.

Question 19

Q

What is a kernel function?

Answer

A

A kernel function is a math trick used to avoid having to calculate dot products on transformed data.

Question 20

Q

What information is provided in the Local Interpretable Model-Agnostic Explanation (LIME) plot when the Model Interpretability feature is used in Model Studio?

Answer

A

A LIME plot creates a localized linear regression model around a particular observation based on a perturbed sample set of data.
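
The idea of a local linear fit can be sketched for a single input. This is a heavily simplified illustration, not the Model Studio procedure: `black_box` is a hypothetical model, the perturbations are a fixed symmetric set rather than random draws, and no proximity weighting is applied.

```python
def black_box(x):
    """Hypothetical nonlinear model standing in for the model being explained."""
    return x * x


def local_linear_fit(model, x0, offsets):
    """Fit y = intercept + slope * x by least squares on perturbed copies of x0."""
    xs = [x0 + e for e in offsets]
    ys = [model(x) for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Explain the prediction at x0 = 3.0 using small perturbations around it.
slope, intercept = local_linear_fit(black_box, 3.0, [-0.2, -0.1, 0.0, 0.1, 0.2])
print(round(slope, 6))  # 6.0
```

The fitted slope describes how the prediction changes with the input near this one observation only; a different observation would generally yield a different local model.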

Question 21

Q

How is the Input Relative Importance table that appears in the results calculated when the Model Interpretability feature is used in Model Studio?

Answer

A

The Input Relative Importance table is calculated by depth-one decision trees using each input to estimate the predicted values of the model being interpreted.

Question 22

Q

What are the three options used to increase the flexibility of a support vector machine model in Model Studio?

Answer

A

Penalty, kernel, and tolerance

Question 23

Q

What is the penalty term?

Answer

A

The penalty is a term that accounts for misclassification errors in model optimization.

Question 24

Q

What is tolerance?

Answer

A

The tolerance value balances the number of support vectors and model accuracy.

Question 25

Q

Which of the following machine learning models is the easiest to interpret?

a. decision tree
b. neural network
c. support vector machine

Answer

A

a. decision tree

Decision trees are highly interpretable because they are based on English rules, which are rules that use Boolean logic.

Question 26

Q

What does the Penalty value do?

Answer

A

The Penalty value balances model complexity and training error.
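
In the standard soft-margin formulation, this balance appears as an objective of the form ½‖w‖² + C × (sum of slack values), where C plays the role of the penalty. The sketch below assumes that formulation and uses made-up values for `w` and the slacks; it only shows how a larger penalty makes training errors dominate the objective.

```python
def soft_margin_objective(w, slacks, penalty):
    """Complexity term plus penalty-weighted training error."""
    complexity = 0.5 * sum(wi * wi for wi in w)
    training_error = sum(slacks)
    return complexity + penalty * training_error


w = [1.0, 1.0]            # illustrative normal vector
slacks = [0.0, 2.0, 1.5]  # illustrative constraint violations

print(soft_margin_objective(w, slacks, penalty=1.0))   # 4.5
print(soft_margin_objective(w, slacks, penalty=10.0))  # 36.0
```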

Question 27

Q

What is the risk associated with a larger Penalty value?

Answer

A

A larger Penalty value creates a more robust model at the risk of overfitting the training data.

Question 28

Q

What does the Tolerance value do?

Answer

A

The Tolerance value balances the number of support vectors and model accuracy.

Question 29

Q

What is the consequence of too large a Tolerance value?

Answer

A

A Tolerance value that is too large creates too few support vectors.

Question 30

Q

What is the consequence of a Tolerance value that is too small?

Answer

A

A Tolerance value that is too small overfits the training data.

Question 31

Q

What does an intersecting slope in an ICE Plot indicate?

Answer

A

An Intersecting slope indicates that there is an interaction between the plot variable and one or more complementary variables.
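
A toy comparison illustrates why crossing curves point to an interaction. Both models below are hypothetical: in `interaction_model` the effect of the plot variable `x` depends on the complementary variable `z`, while in `additive_model` it does not.

```python
def interaction_model(x, z):
    """Effect of x flips sign depending on z (an interaction)."""
    return z + x * (1 - 2 * z)


def additive_model(x, z):
    """Effect of x is the same for every z (no interaction)."""
    return x + z


def ice_curve(model, z, grid):
    """Vary x over the grid while holding one observation's z fixed."""
    return [model(x, z) for x in grid]


def slope(curve, grid):
    return (curve[-1] - curve[0]) / (grid[-1] - grid[0])


grid = [0.0, 0.5, 1.0]
for model in (interaction_model, additive_model):
    slopes = [slope(ice_curve(model, z, grid), grid) for z in (0, 1)]
    print(model.__name__, slopes)
# interaction_model [1.0, -1.0]
# additive_model [1.0, 1.0]
```

With the interaction, the two ICE curves have different slopes and cross at x = 0.5; without it, the curves are parallel.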

Question 32

Q

Why is it useful to look among clusters for different relationships between the groups (or levels) of the categorical variable and the target when evaluating an ICE plot of a categorical input?

Answer

A

Significant differences in these relationships indicate group effects.

Question 33

Q

Where is the largest possible margin of error in a maximum-margin hyperplane?

Answer

A

This hyperplane has the largest possible margin of error on its positive and negative sides.

Question 34

Q

What does the autoencoder method on the Feature Extraction node do?

Answer

A

The Autoencoder method builds a neural network that uses the inputs to reconstruct the inputs.

Question 35

Q

How is an autoencoder network different from an MLP network?

Answer

A

An autoencoder network is like an MLP network except that its output layer is duplicated from the input layer.

Question 36

Q

When you scale the input variables for a binary target using support vector machines, what happens to the inputs?

Answer

A

Values are scaled to range from 0 to 1.
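
A generic min-max rescaling produces this 0-to-1 range. The sketch below is an assumption about the general technique, not the node's exact implementation; in particular, the handling of a constant input is an arbitrary choice made here.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant input carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([10.0, 20.0, 40.0])
print(scaled)
```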

Question 37

Q

How are missing values for class variables handled when “Use missing” is specified for a Support Vector Machine node?

Answer

A

SVMs treat missing values as a separate category.
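
Treating missing as its own category can be pictured as adding one more level before the class variable is encoded. The sketch below is illustrative only: the label `_MISSING_` and the one-hot encoding are assumptions, not the software's internal representation.

```python
def encode_with_missing(values, missing_label="_MISSING_"):
    """One-hot encode a class variable, giving missing values their own level."""
    filled = [missing_label if v is None else v for v in values]
    levels = sorted(set(filled))
    rows = [[1 if v == level else 0 for level in levels] for v in filled]
    return levels, rows


levels, rows = encode_with_missing(["red", None, "blue", "red"])
print(levels)  # ['_MISSING_', 'blue', 'red']
print(rows)    # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```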

Question 38

Q

What is the only Global Interpretability plot available in Model Studio?

Answer

A

Partial Dependence plots. They are based on an aggregation across all observations, so they provide global interpretability.
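
The aggregation can be sketched as an average: for each value of the plot variable, predict for every observation with that value substituted in, then take the mean. The model and data below are hypothetical and kept to two inputs for brevity.

```python
def model(x, z):
    """Hypothetical fitted model with plot variable x and complementary variable z."""
    return 2.0 * x + z


def partial_dependence(model, z_values, grid):
    """Average the prediction over all observations at each grid value of x."""
    return [sum(model(x, z) for z in z_values) / len(z_values) for x in grid]


z_values = [0.0, 1.0, 2.0]  # complementary-variable values of three observations
pd_curve = partial_dependence(model, z_values, grid=[0.0, 1.0])
print(pd_curve)  # [1.0, 3.0]
```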

Question 39

Q

Which model interpretability tools can be used to help interpret a machine learning model for a single observation?

Answer

A

Local Interpretable Model-Agnostic Explanations (LIME) plots, Kernel SHAP (Shapley) plots

Question 40

Q

Where does the Open Source Code node execute?

Question 41

Q

Which assessment measure should be used to determine the champion when predicting an interval target?

Answer

A

Average Squared Error
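
Average squared error is the mean of the squared differences between actual and predicted values. A minimal sketch with made-up numbers:

```python
def average_squared_error(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 2.0]
predicted = [2.5, 5.0, 4.0]
ase = average_squared_error(actual, predicted)  # (0.25 + 0 + 4) / 3
print(ase)
```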

Question 42

Q

Which assessment measure should be used to determine the champion for a decision focused model?

Answer

A

Misclassification Rate
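
Misclassification rate is the share of observations whose predicted class differs from the actual class. The sketch below assumes a binary target coded 0/1 and a probability cutoff of 0.5; both the cutoff and the sample data are illustrative.

```python
def misclassification_rate(actual, probabilities, cutoff=0.5):
    """Fraction of observations assigned to the wrong class at the given cutoff."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    errors = sum(1 for a, p in zip(actual, predicted) if a != p)
    return errors / len(actual)


actual = [1, 0, 1, 1, 0]
probabilities = [0.9, 0.4, 0.3, 0.8, 0.6]
rate = misclassification_rate(actual, probabilities)
print(rate)  # 0.4
```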

Question 43

Q

Which assessment measure would you use to determine the champion for a model used for ranking?

Answer

A

The ROC Index or Gini Coefficient
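
Both measures describe how well the model orders events ahead of non-events. The ROC index equals the proportion of event/non-event pairs in which the event receives the higher score (ties count as half), and the Gini coefficient is 2 × ROC index − 1. A small sketch with illustrative data:

```python
def roc_index(actual, scores):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [s for a, s in zip(actual, scores) if a == 1]
    non_events = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def gini(actual, scores):
    return 2.0 * roc_index(actual, scores) - 1.0


actual = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.3, 0.8, 0.6]
print(roc_index(actual, scores), gini(actual, scores))
```

Here four of the six event/non-event pairs are ordered correctly, so the ROC index is 4/6 and the Gini coefficient is 1/3.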