Introduction Linear models and search Flashcards
Describe a perceptron
Each input feature is multiplied by its own weight, the products are summed, and a bias parameter is added to the result. Without an activation function it is a linear model, and thresholding its output gives a linear classifier.
How can we add a nonlinearity to a perceptron?
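A minimal sketch of the description above, in plain Python. The names `perceptron` and `classify`, the threshold at zero, and the example numbers are illustrative assumptions, not part of the flashcard.

```python
def perceptron(features, weights, bias):
    """Weighted sum of the features plus the bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(features, weights, bias):
    """Linear classifier: predict class 1 when the output is positive (assumed threshold)."""
    return 1 if perceptron(features, weights, bias) > 0 else 0


weights, bias = [2.0, -1.0], 0.5                  # made-up parameters
print(perceptron([1.0, 3.0], weights, bias))      # 2*1 + (-1)*3 + 0.5 = -0.5
print(classify([1.0, 3.0], weights, bias))        # 0
```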
Use an activation function like the logistic sigmoid
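As a rough illustration, the logistic sigmoid can be applied to the weighted sum so that the output is squashed into the range (0, 1). The function names here are assumptions for the example, and the code is not hardened against overflow for very large negative inputs.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_perceptron(features, weights, bias):
    """Perceptron whose weighted sum is passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


print(sigmoid(0.0))  # 0.5
```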
What is a softmax activation used for?
Multiclass classification. The outputs are all positive and sum to 1, so they can be read as class probabilities.
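A small sketch of softmax in plain Python, to make the "sum to 1" property concrete. Subtracting the maximum score before exponentiating is a common stability trick added here as an assumption; it does not change the result.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into positive values that sum to 1."""
    m = max(scores)                               # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])                  # made-up scores for three classes
print(probs, sum(probs))
```

The largest score receives the largest probability, and the ordering of the scores is preserved.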
What is stochastic gradient descent?
It is a variant of gradient descent: instead of using the entire dataset to compute the gradient at each iteration, it randomly samples a small subset of the data (a mini-batch) and computes the gradient on that subset.
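The difference can be sketched on a toy problem. The one-weight model `y = w * x`, the squared-error loss, the synthetic data, and the batch size of 8 are all assumptions chosen for illustration.

```python
import random


def gradient(w, batch):
    """Gradient of the mean squared error of the model y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


random.seed(0)
data = [(x, 3.0 * x) for x in range(1, 101)]      # toy data, true weight is 3

w = 0.0
full_grad = gradient(w, data)                     # gradient descent: all 100 examples
batch_grad = gradient(w, random.sample(data, 8))  # SGD: a random subset of 8 examples
print(full_grad, batch_grad)
```

The mini-batch gradient is a noisy but much cheaper estimate of the full gradient; both point in the same direction here because every example agrees that `w` should increase.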
Can you give a summary of how to train a neural network?
Define a loss function
Work out the gradient of the loss with respect to the weights
Use (stochastic) gradient descent to improve the weights
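The three steps above can be sketched end to end on a one-feature linear model. The squared-error loss, the learning rate, the number of epochs, and the synthetic data are assumptions for the example, and the gradient is derived by hand rather than by a library.

```python
import random


def loss(w, b, x, y):
    """Step 1: a loss function (squared error for a single example)."""
    return (w * x + b - y) ** 2


def grad(w, b, x, y):
    """Step 2: gradient of the loss with respect to the weight and the bias."""
    err = w * x + b - y
    return 2 * err * x, 2 * err


def train(data, lr=0.05, epochs=200, seed=0):
    """Step 3: stochastic gradient descent, one example per update."""
    rng = random.Random(seed)
    examples = list(data)                         # copy so the caller's list is not shuffled
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(examples)
        for x, y in examples:
            dw, db = grad(w, b, x, y)
            w -= lr * dw
            b -= lr * db
    return w, b


# Toy data generated from y = 2x + 1
data = [(x / 10, 2.0 * (x / 10) + 1.0) for x in range(-10, 11)]
w, b = train(data)
print(w, b)
```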
Give the 3 backpropagation steps
Work out the derivative of each operation's output with respect to its inputs symbolically (the local derivatives)
Compute the global derivative by multiplying the local derivatives together (chain rule)
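A hedged illustration of local and global derivatives on a tiny computation, a sigmoid perceptron with squared-error loss. The model, the variable names, and the input values are assumptions picked for the example.

```python
import math


def forward_backward(w, b, x, y):
    """Return the loss and its derivatives with respect to w and b."""
    # Forward pass, one small operation at a time
    z = w * x + b
    a = 1.0 / (1.0 + math.exp(-z))
    loss = (a - y) ** 2

    # Local derivatives, each worked out symbolically for its own operation
    dloss_da = 2 * (a - y)
    da_dz = a * (1 - a)
    dz_dw = x
    dz_db = 1.0

    # Global derivatives: multiply the local derivatives along the path (chain rule)
    dloss_dw = dloss_da * da_dz * dz_dw
    dloss_db = dloss_da * da_dz * dz_db
    return loss, dloss_dw, dloss_db


loss, dw, db = forward_backward(w=0.3, b=-0.1, x=1.5, y=1.0)
print(loss, dw, db)
```

A quick sanity check for a derivation like this is to compare the result against a finite-difference estimate of the same derivative.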