Chapter 7: Artificial Neural Networks Flashcards
Give the three types of neural network models
single neuron model
single layer perceptron
multilayer perceptron
how is the output of a neuron created
apply the activation function to the weighted sum of the inputs plus the bias: output = φ( sum of (weights * inputs) + bias )
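The card above can be sketched in Python; the weight, input and bias values here are made-up illustrations, and sigmoid is chosen as the activation:

```python
import math

def sigmoid(v):
    # sigmoid activation: squishes v into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(weights, inputs, bias):
    # weighted sum of the inputs plus the bias,
    # passed through the activation function
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(v)

# example with made-up values: v = 0.5*1.0 + (-0.2)*2.0 + 0.1 = 0.2
out = neuron_output([0.5, -0.2], [1.0, 2.0], 0.1)
```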
what does an activation function do
squishes the amplitude range of the output signal
give some activation functions
identity
threshold
sigmoid/ tanh
ReLU
what is the identity function
no change φ(v) = v
what is the symbol for the activation function
phi φ
what is the threshold function
maps to {-1 , 1}
φ(v) = 1 if v >= 0
-1 if v < 0
what is the sigmoid function
maps to ( 0 , 1 )
φ(v) = 1 / ( 1 + exp(-v) )
give the tanh sigmoid function
maps to ( -1 , 1 )
φ(v) = ( exp(2v) - 1 ) / ( exp(2v) + 1 )
give the rectified linear unit function
maps to [ 0 , ∞ )
φ(v) = v if v >= 0
0 if v < 0
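The activation-function cards above can be collected into one Python sketch (function names are my own labels, not from the chapter):

```python
import math

def identity(v):
    # identity: no change, phi(v) = v
    return v

def threshold(v):
    # threshold: maps to {-1, 1}
    return 1 if v >= 0 else -1

def sigmoid(v):
    # sigmoid: maps into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

def tanh_sigmoid(v):
    # tanh sigmoid: maps into (-1, 1)
    return (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)

def relu(v):
    # rectified linear unit: maps to [0, inf)
    return v if v >= 0 else 0
```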
what does relu stand for
rectified linear unit
describe a single layer perceptron
one input layer
one output layer
what are hidden layers
layers between the input and output layers
they let the network represent more complex functions
what is a multi layer perceptron
also called a feed forward neural network
consists of at least three layers: input, one or more hidden, output
what are the hyperparameters in a neural network
the number of hidden layers
the number of neurons in each layer
what is training in neural networks
the process of finding the optimal settings for the weights
what are the two training methods for a neural network
Hebbian learning
gradient descent
what is Hebbian learning
calculate weights based on the values of the nodes at either end
Δwij = learning rate * xi * xj
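A minimal sketch of the Hebbian update from the card above; the node values and learning rate are made-up:

```python
def hebbian_update(w, x_i, x_j, learning_rate=0.1):
    # strengthen the weight in proportion to the product of the
    # activations of the two nodes at either end of the connection
    return w + learning_rate * x_i * x_j

# example with made-up node values: 0.0 + 0.1 * 1.0 * 0.5 = 0.05
w = hebbian_update(0.0, 1.0, 0.5)
```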
how is gradient descent different in a neural network
we can't set the derivative to 0 and solve analytically, so the weights are updated iteratively instead
what is the perceptron algorithm
train a single neuron by minimising the perceptron criterion using SGD.
The activation function is the identity, and the weights and bias are equivalent to the coefficient vector of a linear model
what is the perceptron criterion (describe)
the error function we minimise in the perceptron algorithm
give the equation of the perceptron criterion
O(w) = - sum over misclassified points i of yi (w^T x̃i)
how do we minimise the perceptron criterion
SGD weight update
give the equations for SGD weight update
Oi(w) = -yi w^T x̃i
derivative with respect to w = -yi x̃i
hence
w^(t+1) = w^(t) + learning rate * yi * x̃i
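The SGD weight update above can be sketched in Python; the toy data is made-up, and x̃ is the input with a leading 1 appended so the bias is folded into the weight vector:

```python
def perceptron_train(data, epochs=10, lr=1.0):
    # data: list of (x, y) pairs with labels y in {-1, +1}
    n = len(data[0][0]) + 1            # +1 for the bias term
    w = [0.0] * n
    for _ in range(epochs):
        for x, y in data:
            x_tilde = [1.0] + list(x)  # fold the bias into w
            v = sum(wi * xi for wi, xi in zip(w, x_tilde))
            if y * v <= 0:             # misclassified: apply the SGD update
                w = [wi + lr * y * xi for wi, xi in zip(w, x_tilde)]
    return w

# linearly separable toy data (made-up)
data = [([2.0], 1), ([3.0], 1), ([-2.0], -1), ([-3.0], -1)]
w = perceptron_train(data)
```

Note the update only fires on misclassified points, which is exactly the set the perceptron criterion sums over.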
what is a feed forward neural network
signals (and the error calculation) propagate front to end, from the input layer to the output layer, with no cycles
what is cross entropy loss
plug a logistic regression model into the prediction (output) layer
(e.g. softmax)
how do we regularise a neural network
add a regularisation term to the final optimisation objective function for training
give the equation for l2 regularisation
O(wnn) = loss(wnn) + lambda * 0.5 * || wnn ||₂²
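A sketch of the regularised objective from the card above; the loss value, weights and lambda are made-up:

```python
def l2_objective(loss, weights, lam):
    # regularised objective: loss plus lambda * 0.5 * squared L2 norm
    return loss + lam * 0.5 * sum(w * w for w in weights)

# example: 1.0 + 0.1 * 0.5 * (1 + 4) = 1.25
obj = l2_objective(1.0, [1.0, 2.0], 0.1)
```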
describe the steps of backpropagation
- calculate loss function (feed forward)
- expand to get a full equation for each output node
- calculate the sum-of-squares error at the output
- apply the chain rule to get the derivative of the error with respect to each weight, working backwards layer by layer
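The steps above can be sketched numerically for a single sigmoid neuron with squared error; the input, target and initial weights are made-up:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def backprop_single_neuron(x, target, w, b):
    # feed forward: compute the output and the squared error
    v = w * x + b
    out = sigmoid(v)
    loss = 0.5 * (out - target) ** 2
    # chain rule: dL/dw = dL/dout * dout/dv * dv/dw
    dloss_dout = out - target
    dout_dv = out * (1 - out)        # derivative of the sigmoid
    grad_w = dloss_dout * dout_dv * x
    grad_b = dloss_dout * dout_dv    # dv/db = 1
    return loss, grad_w, grad_b

# made-up values: v = 0, out = 0.5, loss = 0.5 * 0.25 = 0.125
loss, gw, gb = backprop_single_neuron(x=1.0, target=1.0, w=0.0, b=0.0)
```

In a multilayer perceptron the same chain rule is applied repeatedly, propagating the error derivatives backwards through each hidden layer.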