Machine Learning with SAS Viya Flashcards
What is the learning rate?
The learning rate is a training parameter that controls the size of weight and bias changes in learning of the training algorithm.
What method is commonly used to train neural networks?
Neural networks are often trained by weight decay methods.
How is a neural network optimized?
An error function for a neural network, with the goal to minimize this error function.
What is weight decay?
Weight decay refers to the weight penalties used to keep the weights small, close to zero, or zero.
What does the L1 regularization term hyperparameter do?
L1 norm: Penalizes the absolute value of the weight. Tends to drive some weights to exactly zero.
What does the L2 regularization term hyperparameter do?
L2 norm: Penalizes the square value of the weight (which explains the 2 in the name). Tends to drive all the weights to smaller values.
What is the momentum term?
The momentum term is a fraction of the last update vector and is added to the new weight value inorder to prevent a rapid change in direction during the search.
How does the momentum term parameter help optimize a neural network?
The momentum termensures that the values of the weights do not oscillate around the true minima, but rather converge to it
What are some disadvantages when using support vector machines?
Support vector machines tend to be black boxes that provide results that are difficult to interpret.
What are some of the advantages of using support vector machines?
Support vector machines are extremely flexible and can automatically discover any relationship between two variables.
What were support vector machines originally developed for?
Like decision trees, support vector machines
were originally developed simply to classify outcomes.
In other words, the model makes decision predictions
instead of ranks or estimates.
What are support vector machines used for in Model Studio?
Support vector machines
are used exclusively with binary targets, to provide decisions, ranks,
and probability estimates in Model Studio.
What does it mean when a binary outcome is “linearly separable”?
It is possible to draw a line that perfectly
separates the target outcomes into two classes (without errors) when an outcome is linearly separable. That line is the support vector machine.
What are some other terms for a support vector machine?
Classifier, classifier model, or classification rule
What is the term for the classifier when there are three dimensions in a support vector machine model?
Plane