Machine Learning with Viya® 3.4 Lesson 5: Support Vector Machines (SVM) and Additional Topics Flashcards

1
Q

What is a dot product?

A

A dot product is a way to multiply two vectors that results in a scalar, or single number. It is an element-by-element multiplication followed by a sum of the products.
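
As a small illustration of the definition above, here is a minimal Python sketch. The function name `dot` and the example vectors are made up for illustration and are not from the course material.

```python
def dot(u, v):
    """Multiply two equal-length vectors element by element, then sum the products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

# 1*4 + 2*5 + 3*6 = 4 + 10 + 18
result = dot([1, 2, 3], [4, 5, 6])
print(result)  # 32
```

The result is a single number no matter how many elements the vectors have.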

2
Q

How is a support vector machine constructed in order to avoid the curse of dimensionality?

A

By using only the observations closest to the separating hyperplane

3
Q

How does using only the observations closest to the separating hyperplane avoid the curse of dimensionality?

A

By limiting the number of data points in the solution.

4
Q

What kind of information is in the Training Results table in an SVM run?

A

The Training Results table shows the parameters for the final Support Vector Machine model such as the number of support vectors and the bias.

5
Q

Where can you find the average square error on the VALIDATE partition?

A

In the Fit Statistics table on the Assessment tab.

6
Q

Where can you view the misclassification matrix?

A

The Output Window

7
Q

What are the two constraints used to solve for optimization in a support vector machine?

A

If the target variable equals 1, then H must be greater than or equal to 1. If the target is -1, then H must be less than or equal to -1.
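
The two constraints can be checked mechanically. The sketch below assumes H is the usual linear classifier (normal vector dotted with the inputs, plus a bias); the toy points, the target coding, and the candidate values of `w` and `b` are hypothetical.

```python
def H(x, w, b):
    """Classifier value: dot product of the normal vector w with x, plus the bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def satisfies_constraints(points, targets, w, b):
    """True if H >= 1 for every target of +1 and H <= -1 for every target of -1."""
    for x, y in zip(points, targets):
        h = H(x, w, b)
        if y == 1 and h < 1:
            return False
        if y == -1 and h > -1:
            return False
    return True

# Hypothetical toy data: two inputs, binary target coded as +1 / -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
targets = [1, 1, -1, -1]

ok = satisfies_constraints(points, targets, w=(0.5, 0.5), b=-1.0)
too_flat = satisfies_constraints(points, targets, w=(0.1, 0.1), b=0.0)
print(ok, too_flat)
```

The second candidate fails because its H values for the +1 points are smaller than 1, even though their sign is correct.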

8
Q

What’s a term for describing data points that are not linearly separable?

A

soft margin hyperplane

9
Q

What do you need to do when you encounter a soft margin hyperplane?

A

Account for errors that the separating hyperplane might make

10
Q

TRUE or FALSE: When the data are not linearly separable, the process of optimizing the location of the hyperplane must account for classification errors.

A

TRUE: When the data are not linearly separable, the hyperplane will misclassify some data points. In this situation, the process of optimizing the location of the hyperplane must account for these classification errors.

11
Q

What is a kernel function?

A

A kernel function operates as a dot product in a higher dimension (that is, in a feature space), but it is applied to the raw data.
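
To make this concrete, the following sketch compares a degree-2 polynomial kernel applied to raw two-input data with an explicit dot product in a three-dimensional feature space. The choice of kernel, the mapping `phi`, and the example points are assumptions for illustration; the card does not name a specific kernel.

```python
import math

def dot(u, v):
    """Element-by-element multiplication followed by a sum."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit map from two raw inputs to a three-dimensional feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed on the raw inputs only."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
in_feature_space = dot(phi(x), phi(z))  # transform first, then dot product
on_raw_data = poly_kernel(x, z)         # no transformation needed
print(in_feature_space, on_raw_data)
```

Both values agree up to floating-point rounding, which is why the kernel can stand in for the dot product in the higher-dimensional space.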

12
Q

Suppose you are modeling data with a binary target and three inputs. The data are linearly separable. How many possible solutions exist that classify the target?

A

An infinite number of solutions can classify the binary target when the data are linearly separable.

13
Q

What type of target variable is supported in a support vector machine in Model Studio?

A

Support vector machines are used exclusively with binary targets in Model Studio.

14
Q

What are the elements of a classifier model for a Support Vector Machine?

A

The classifier model (H) has two elements: a normal vector and a bias term

15
Q

What is the maximum-margin hyperplane in a two-dimensional input space?

A

The exact center of the thickest line that touches the innermost values of one target outcome and the innermost values of the other target outcome.

16
Q

What are support vectors?

A

Support vectors are the points in the data that are closest to the maximum-margin hyperplane.
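
A rough sketch of how the closest points could be picked out once a hyperplane is known. The points, and the values of `w` and `b` (taken here to be the maximum-margin hyperplane for this toy data), are hypothetical.

```python
import math

def distance_to_hyperplane(x, w, b):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    h = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(h) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical toy data and an assumed maximum-margin hyperplane for it.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
w, b = (0.5, 0.5), -1.0

distances = [distance_to_hyperplane(p, w, b) for p in points]
closest = min(distances)

# Keep every point that sits at the minimum distance (one or more per class).
support_vectors = [p for p, d in zip(points, distances) if math.isclose(d, closest)]
print(support_vectors)
```

Only the points kept in `support_vectors` determine where the hyperplane sits; moving the other points slightly would not change it.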

17
Q

In support vector machines, finding the separating hyperplane is an optimization problem with constraints that involve the values of the binary target.

a. True
b. False

A

A: True

Solving for the support vector machine is actually an optimization problem with two constraints. The first constraint is based on a target value of +1, and the second constraint is based on a target value of -1.

18
Q

How is a feature space constructed?

A

A feature space is constructed by applying a nonlinear transformation to data so that linear separation exists in this higher-dimensional space
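
A minimal sketch of this idea with one input. The data, the transformation (adding the square of the input as a second dimension), and the separating hyperplane are all hypothetical choices made for illustration.

```python
def to_feature_space(x):
    """Nonlinear transformation: one raw input becomes two features (x, x squared)."""
    return (x, x * x)

# On the raw number line, the +1 points lie on both sides of the -1 points,
# so no single cut point separates the two classes.
xs = [-2.0, -1.0, 1.0, 2.0]
targets = [1, -1, -1, 1]

features = [to_feature_space(x) for x in xs]

# Assumed hyperplane in the feature space: x squared = 2.5.
w, b = (0.0, 1.0), -2.5
predictions = [1 if w[0] * f[0] + w[1] * f[1] + b > 0 else -1 for f in features]
print(predictions == targets)
```

A straight line in the two-dimensional feature space separates the classes even though no cut point does so on the original axis.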

19
Q

What is a kernel function?

A

A kernel function is a math trick used to avoid having to calculate dot products on transformed data.

20
Q

What information is provided in the Local Interpretable Model-Agnostic Explanation (LIME) plot when the Model Interpretability feature is used in Model Studio?

A

A LIME plot creates a localized linear regression model around a particular observation based on a perturbed sample set of data.
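
The general idea can be sketched for a single input as follows. This is a simplified illustration, not how Model Studio implements LIME: the `black_box` model, the Gaussian perturbation, the proximity weighting, and all parameter values are assumptions.

```python
import math
import random

def black_box(x):
    """Hypothetical nonlinear model being explained."""
    return x * x

def local_linear_fit(model, x0, n_samples=500, spread=0.5, width=0.5, seed=0):
    """Fit a proximity-weighted straight line to the model's predictions near x0."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, spread) for _ in range(n_samples)]  # perturbed sample
    ys = [model(x) for x in xs]                                   # model predictions
    ws = [math.exp(-((x - x0) ** 2) / width ** 2) for x in xs]    # nearer points weigh more
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    return slope, my - slope * mx

slope, intercept = local_linear_fit(black_box, x0=3.0)
print(slope)  # a local effect estimate; should be near the model's slope at x0
```

The fitted slope describes the model's behavior only around the chosen observation, which is what makes the explanation local.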

21
Q

How is the Input Relative Importance table that appears in the results calculated when the Model Interpretability feature is used in Model Studio?

A

The Input Relative Importance table is calculated by depth-one decision trees using each input to estimate the predicted values of the model being interpreted.

22
Q

What are the three options used to increase the flexibility of a support vector machine model in Model Studio?

A

Penalty, kernel, and tolerance

23
Q

What is the penalty term?

A

The penalty is a term that accounts for misclassification errors in model optimization.

24
Q

What is tolerance?

A

The tolerance value balances the number of support vectors and model accuracy.

25
Q

Which of the following machine learning models is the easiest to interpret?

a. decision tree
b. neural network
c. support vector machine

A

a. Decision tree. Decision trees are highly interpretable because they are based on English rules, which are rules that use Boolean logic.

26
Q

What does the Penalty value do?

A

The Penalty value balances model complexity and training error.

27
Q

What is the risk associated with a larger Penalty value?

A

A larger Penalty value creates a more robust model at the risk of overfitting the training data.

28
Q

What does the Tolerance value do?

A

The Tolerance value balances the number of support vectors and model accuracy.

29
Q

What is the consequence of too large a Tolerance value?

A

A Tolerance value that is too large creates too few support vectors.

30
Q

What is the consequence of a Tolerance value that is too small?

A

A Tolerance value that is too small overfits the training data.

31
Q

What does an intersecting slope in an ICE Plot indicate?

A

An intersecting slope indicates that there is an interaction between the plot variable and one or more complementary variables.

32
Q

Why is it useful to look among clusters for different relationships between the groups (or levels) of the categorical variable and the target when evaluating an ICE plot of a categorical input?

A

Significant differences in these relationships indicate group effects.

33
Q

Where is the largest possible margin of error in a maximum-margin hyperplane?

A

The maximum-margin hyperplane has the largest possible margin of error on its positive and negative sides.

34
Q

What does the autoencoder method on the Feature Extraction node do?

A

The Autoencoder method builds a neural network that uses the inputs to reconstruct the inputs.

35
Q

How is an autoencoder network different from an MLP network?

A

An autoencoder network is like an MLP network except that its output layer is duplicated from the input layer.

36
Q

When you scale the input variables for a binary target using support vector machines, what happens to the inputs?

A

Values are scaled to range from 0 to 1.
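
One common way to obtain a 0-to-1 range is min-max scaling. The sketch below assumes that approach; the card states only the resulting range, and the function name and sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 15, 30])
print(scaled)  # [0.0, 0.5, 0.25, 1.0]
```

Putting inputs on a shared scale keeps an input with large raw values from dominating the dot products.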

37
Q

How are missing values for class variables handled when “Use missing” is specified for a Support Vector Machine node?

A

SVMs treat missing values as a separate category.

38
Q

What is the only Global Interpretability plot available in Model Studio?

A

Partial Dependence (PD) plots. They are based on an aggregation across all observations, so they provide global interpretability.

39
Q

Which model interpretability tools can be used to help interpret a machine learning model for a single observation?

A

Local Interpretable Model-Agnostic Explanations (LIME) plots, Kernel SHAP (Shapley) plots

40
Q

Where does the Open Source Code node execute?

A

CAS

41
Q

Which assessment measure should be used to determine the champion when predicting an interval target?

A

Average Squared Error
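
For an interval target, average squared error is the mean of the squared differences between actual and predicted values. A minimal sketch with made-up numbers:

```python
def average_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Squared errors are 1, 0, and 4, so the average is 5/3.
ase = average_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(ase)
```

Lower values indicate predictions that sit closer to the actual target values.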

42
Q

Which assessment measure should be used to determine the champion for a decision focused model?

A

Misclassification Rate
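
Misclassification rate is the share of observations whose decision differs from the actual target. A minimal sketch with hypothetical labels:

```python
def misclassification_rate(actual, decided):
    """Fraction of observations where the decision does not match the actual target."""
    if len(actual) != len(decided):
        raise ValueError("inputs must have the same length")
    wrong = sum(1 for a, d in zip(actual, decided) if a != d)
    return wrong / len(actual)

# Two of the five decisions are wrong.
rate = misclassification_rate([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
print(rate)  # 0.4
```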

43
Q

Which assessment measure would you use to determine the champion for a model used for ranking?

A

The ROC Index or Gini Coefficient
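
Both measures depend only on how well the model orders the observations. The sketch below computes the ROC index as the share of event/non-event pairs that are ranked correctly (ties count one half) and derives the Gini coefficient as 2 × ROC index − 1. The targets and scores are made up for illustration.

```python
def roc_index(targets, scores):
    """Share of (event, non-event) pairs where the event has the higher score."""
    events = [s for t, s in zip(targets, scores) if t == 1]
    non_events = [s for t, s in zip(targets, scores) if t == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5  # tied pairs count as half
    return concordant / (len(events) * len(non_events))

targets = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = roc_index(targets, scores)   # 8 of 9 pairs are ordered correctly
gini = 2 * auc - 1
print(auc, gini)
```

Because only the ordering of the scores matters, rescaling the scores without changing their order leaves both measures unchanged.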