Week 2 Flashcards
Defining characteristic of DNN
More than 1 hidden layer
Advantages of SGD
Efficient for large samples (each step uses only a small mini-batch rather than the full data set)
Straightforward to implement numerically
Can be ‘controlled’, e.g. through the step size (learning rate) and batch size (see the sketch below)
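A minimal numerical sketch of mini-batch SGD on a linear least-squares problem (the data, learning rate and batch size below are illustrative assumptions, not from the lectures):

    # Mini-batch SGD sketch: each update uses a small random batch, not the full sample
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 3))                 # 'large' sample: 1000 points, 3 features
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=1000)

    w = np.zeros(3)                                # parameters to be optimised
    lr, batch = 0.1, 32                            # 'controls': learning rate and batch size
    for step in range(500):
        idx = rng.integers(0, len(X), batch)       # cheap step: gradient from 32 points only
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    print(w)                                       # close to w_true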
RNN
Recurrent neural networks sequentially feed their output back into the network
Connection from FNN to RNN
RNNs can be reduced to FNNs by UNFOLDING (unrolling) them over the time steps
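A small sketch of the unfolding idea (weights, sizes and the tanh activation are assumptions for illustration): running the recurrence for T steps computes exactly the same map as a depth-T feedforward composition with shared weights.

    # Recurrent form vs. unfolded (feedforward) form of the same computation
    import numpy as np

    rng = np.random.default_rng(1)
    W, U = rng.normal(size=(4, 4)), rng.normal(size=(4, 2))
    xs = rng.normal(size=(3, 2))                   # input sequence of length T = 3

    h = np.zeros(4)                                # recurrent: feed the state back in each step
    for x in xs:
        h = np.tanh(W @ h + U @ x)

    def layer(h_prev, x):                          # unfolded: one FNN layer per time step
        return np.tanh(W @ h_prev + U @ x)

    h_unfolded = layer(layer(layer(np.zeros(4), xs[0]), xs[1]), xs[2])
    print(np.allclose(h, h_unfolded))              # True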
First NN
McCulloch and Pitts (’43)
Perceptron & problems of
Rosenblatt ‘58
What started AI winter
‘69: Minsky and Papert showed that XOR cannot be represented by a single-layer perceptron
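A short argument behind the XOR claim (notation assumed here: the perceptron outputs 1 if w_1 x_1 + w_2 x_2 + b > 0 and 0 otherwise):
XOR requires f(1,0) = f(0,1) = 1 and f(0,0) = f(1,1) = 0, i.e.
    w_1 + b > 0,   w_2 + b > 0,   b <= 0,   w_1 + w_2 + b <= 0.
Adding the first two inequalities gives w_1 + w_2 + 2b > 0, hence w_1 + w_2 + b > -b >= 0, contradicting the last condition. So no single linear threshold unit computes XOR; one hidden layer fixes this.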
FULLY CONNECTED
If all entries of each weight matrix L_i in the NN are non-zero
Universal approximation property
Let g:R -> R be a measurable function such that:
a) g is not a polynomial function
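A standard form of the full statement (e.g. Leshno et al. ’93; the exact hypotheses and wording here may differ slightly from the lecture’s):
If g is not a polynomial (and is, say, locally bounded and piecewise continuous), then for every dimension d, every compact K in R^d, every continuous f: K -> R and every eps > 0 there exist n, weights w_j in R^d and scalars c_j, b_j with
    sup_{x in K} | f(x) - sum_{j=1}^{n} c_j g(w_j · x + b_j) | < eps,
i.e. one-hidden-layer networks with activation g are dense in C(K).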
Define FNN
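A standard definition, consistent with the L_i notation used in the FULLY CONNECTED card:
A feedforward neural network with k layers and activation function g is the map N: R^{d_0} -> R^{d_k},
    N(x) = (L_k ∘ g ∘ L_{k-1} ∘ ... ∘ g ∘ L_1)(x),
where each L_i(z) = W_i z + b_i is an affine map with weight matrix W_i in R^{d_i x d_{i-1}} and bias b_i in R^{d_i}, and g is applied componentwise. Information flows forward only; there are no feedback connections (unlike an RNN).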
Differences between (hyper)params
Hyper:
Set by hand
Features
Params:
Chosen by machine (weights and biases)
Optimised by SGD
Architecture of network
Hyperparameters and Activation Functions (things chosen by you)
Dense layer
Every unit is connected to every unit in the previous layer (all weight entries non-zero)
Number of parameters that characterise N
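One standard count, assuming the fully connected architecture above with layer widths d_0, d_1, ..., d_k (weights plus biases per layer):
    #params(N) = sum_{i=1}^{k} (d_{i-1} * d_i + d_i) = sum_{i=1}^{k} (d_{i-1} + 1) d_i
E.g. widths 3 -> 5 -> 2 give (3+1)*5 + (5+1)*2 = 32 parameters; adding one unit to the hidden layer adds d_0 + d_2 + 1 = 6 more.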
Adding units