Lecture 5 Flashcards
Feature extraction
starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps, and in some cases leading to better human interpretations.
Neural Network approach to construct a non-linear classifier
Uses a large number of simple units with fixed activation functions (Gaussian, sigmoid, polynomial basis functions);
optimization then adjusts the weights in linear combinations of these fixed functions
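A minimal sketch of the idea above: the basis functions stay fixed and only the linear coefficients are optimized. The data, number of centers, and width are illustrative assumptions, not from the lecture.

```python
import numpy as np

# Fit a linear combination of fixed Gaussian basis functions to 1-D data
# via least squares (illustrative sketch; sizes are assumptions).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

centers = np.linspace(0, 1, 10)   # fixed basis-function centers
width = 0.1                       # fixed basis-function width
Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

# Only the linear coefficients w are optimized; the basis functions never change.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.mean((Phi @ w - y) ** 2))   # small training error
```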
Artificial Neural Networks
Take inspiration from the brain
define functions that are computed by neurons (units)
and have both input and output units with hidden layers
What’s the relationship between the number of hidden layers and the network’s capacity?
The capacity of the network increases with more hidden units and more hidden layers
What are the components of neural networks
An input layer x (the independent variables);
an arbitrary number of hidden layers;
an output layer ŷ (the dependent variable);
a set of weights (coefficients) and biases at each layer;
and a choice of activation function for each layer
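The components listed above can be sketched in a few lines of NumPy; the layer sizes and random data here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)                           # input layer: independent variables

W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)    # weights and biases, hidden layer
W2, b2 = rng.standard_normal((1, 8)), np.zeros(1)    # weights and biases, output layer

relu = lambda z: np.maximum(z, 0.0)                  # activation function for the hidden layer

h = relu(W1 @ x + b1)                                # hidden layer activations
y_hat = W2 @ h + b2                                  # output layer: dependent variable
print(y_hat.shape)                                   # (1,)
```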
Activation functions
Are applied to the hidden units and introduce non-linearity
What are some popular activation functions?
Sigmoid, Tanh, and ReLU (Rectified Linear Unit)
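The three activations named above, written directly in NumPy (a sketch for reference; the sample inputs are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes to (-1, 1)

def relu(z):
    return np.maximum(z, 0.0)         # zero for negatives, identity otherwise

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # ≈ [0.119, 0.5, 0.881]
print(tanh(z))      # ≈ [-0.964, 0.0, 0.964]
print(relu(z))      # [0., 0., 2.]
```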
What are the components of neural network training?
A forward pass and a backward pass
forward pass
performs inference
backward pass
performs learning;
computes the gradient of the loss function with respect to the weights using the chain rule;
the weights and biases can then be adjusted to reduce the error
Back propagation
an efficient method for computing gradients needed to perform gradient-based optimization of the weights in a multi-layer network
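The forward pass, backward pass, and gradient step above can be sketched for a one-hidden-layer network with sigmoid activations and squared-error loss. All sizes, data, and the learning rate are illustrative assumptions, not the lecture's example.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((32, 3))                          # 32 examples, 3 features
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)      # toy binary targets

W1, b1 = 0.1 * rng.standard_normal((3, 5)), np.zeros(5)
W2, b2 = 0.1 * rng.standard_normal((5, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def forward():
    h = sigmoid(X @ W1 + b1)          # forward pass: inference
    return h, sigmoid(h @ W2 + b2)

h, y_hat = forward()
loss_before = np.mean((y_hat - y) ** 2)

# Backward pass: apply the chain rule layer by layer (back-propagation).
d_out = 2 * (y_hat - y) / len(X) * y_hat * (1 - y_hat)    # dLoss/d(pre-activation 2)
dW2, db2 = h.T @ d_out, d_out.sum(axis=0)
d_hid = (d_out @ W2.T) * h * (1 - h)                      # dLoss/d(pre-activation 1)
dW1, db1 = X.T @ d_hid, d_hid.sum(axis=0)

# Gradient step: adjust weights and biases to reduce the error.
lr = 0.1
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1
loss_after = np.mean((forward()[1] - y) ** 2)
print(loss_before, loss_after)        # the loss decreases after the step
```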
Deep neural network
a multilayer perceptron or a neural network with multiple hidden layers
Which of the following introduces non-linearity to a neural network?
Rectified Linear Unit (ReLU) function, Convolution function, or
Stochastic Gradient Descent
The Rectified Linear Unit (ReLU); it is a non-linear activation function.