Neural networks Flashcards
General introduction
- A neural network is a biologically-inspired model
- It is not a model of neurons
- A neural network is a network of simple functions, an alternative to a single very complex hypothesis
- There are several different types of NN: multi-layer perceptrons, radial basis function networks, convolutional NNs, recurrent NNs
Multi-layer perceptron (MLP): task
- non-linear classification tasks
- it can be used to infer a target function by composing AND/OR operations at different levels
- by adding layers, complex functions can be approximated
Implementation of OR operation with a perceptron
OR(x1,x2) = sign(x1 + x2 + 1.5)
w = [1.5 1 1]' x = [1 x1 x2]'
Implementation of AND operation with a perceptron
AND(x1,x2) = sign(x1 + x2 - 1.5)
w = [-1.5 1 1]' x = [1 x1 x2]'
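A minimal sketch of the two perceptrons above, assuming the logical inputs x1, x2 take values in {-1, +1} so that sign() reproduces Boolean OR/AND; the weight vectors are exactly the ones on the cards.

```python
import numpy as np

def perceptron(w, x1, x2):
    # sign(w . [1, x1, x2]) with the bias folded into w[0]
    x = np.array([1.0, x1, x2])
    return int(np.sign(w @ x))

w_or  = np.array([ 1.5, 1.0, 1.0])   # OR(x1,x2)  = sign(x1 + x2 + 1.5)
w_and = np.array([-1.5, 1.0, 1.0])   # AND(x1,x2) = sign(x1 + x2 - 1.5)

for a in (-1, 1):
    for b in (-1, 1):
        print(a, b, "OR:", perceptron(w_or, a, b), "AND:", perceptron(w_and, a, b))
```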
What are feed-forward neural networks?
- instead of the perceptron's hard threshold, these NNs use the activation function θ(s) = tanh(s) (forward pass sketched after this list)
- input layer with d(0)+1 nodes
- one or more hidden layers
- hidden layer l with d(l)+1 nodes
- output layer with d(L) = 1 node
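A minimal sketch of one forward pass under these conventions, with θ(s) = tanh(s) applied at every layer; the 2 → 3 → 1 architecture and the random weights are illustrative assumptions, not from the card.

```python
import numpy as np

def forward(x, weights):
    # x: raw input vector of length d(0)
    # weights[l]: matrix mapping layer l (plus its bias node) to layer l+1,
    #             shape (d(l)+1, d(l+1))
    a = x
    for W in weights:
        a_with_bias = np.concatenate(([1.0], a))  # prepend the constant bias node
        a = np.tanh(a_with_bias @ W)              # signal s, then θ(s) = tanh(s)
    return a                                      # output layer has d(L) = 1 node

# Example: 2 -> 3 -> 1 architecture with random weights
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 3)), rng.standard_normal((4, 1))]
print(forward(np.array([0.5, -0.2]), weights))
```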
Universal approximation theorem
- Theorem by Cybenko
For any ε > 0, if f(x) is continuous on X, there exists a neural network h with one hidden layer such that
|h(x) - f(x)| < ε, for any x ∈ X
How to train a NN
Application of the gradient method
Ein(w) = 1/N * sum(n=1,N) (yn - h(xn, w))^2
w(t+1) = w(t) - η * ∇Ein(w(t))
∇Ein(w(t)) = ∂Ein(w)/∂w | w=w(t) = ∂Ein/∂h * ∂h/∂w | w=w(t)
∂Ein/∂h = -2/N * sum(n=1,N) (yn - h(xn, w))
(easy to compute)
∂h/∂w | w=w(t) is the problem!
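A minimal sketch of the update rule w(t+1) = w(t) - η ∇Ein(w(t)). The hypothesis h, its gradient, and the toy linear-regression data are illustrative assumptions; for a real NN the hard part is exactly ∂h/∂w, which is the subject of the next card.

```python
import numpy as np

def Ein(w, X, y, h):
    # In-sample error: (1/N) * sum_n (y_n - h(x_n, w))^2
    return np.mean((y - h(X, w)) ** 2)

def gradient_descent(X, y, h, grad_Ein, w0, eta=0.05, steps=2000):
    # Batch gradient descent: w(t+1) = w(t) - eta * grad Ein(w(t))
    w = w0.copy()
    for _ in range(steps):
        w = w - eta * grad_Ein(w, X, y, h)
    return w

# Toy example with a linear hypothesis h(x, w) = x . w, whose gradient of Ein
# has the closed form -(2/N) * X.T @ (y - X @ w)
h = lambda X, w: X @ w
grad = lambda w, X, y, h: -2.0 / len(y) * X.T @ (y - h(X, w))
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])
print(gradient_descent(X, y, h, grad, w0=np.zeros(2)))  # converges to approx. [1, 1]
```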
How to differentiate h wrt w
1) numerical approach: finite differences (sketched below)
- complexity O(Q^2)
- Q: number of weights
2) backpropagation
- complexity O(Q)
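A minimal sketch of the finite-difference route, assuming Ein is available as a function of the weight vector w; the quadratic test function is just a toy check. Each of the Q weights needs its own perturbed evaluations of Ein (each already O(Q) for a network), which is why this costs O(Q^2) overall, whereas backpropagation delivers the same gradient in O(Q).

```python
import numpy as np

def numerical_gradient(Ein, w, eps=1e-6):
    # Central finite differences: perturb each weight in turn and re-evaluate Ein
    g = np.zeros_like(w)
    for q in range(len(w)):
        e = np.zeros_like(w)
        e[q] = eps
        g[q] = (Ein(w + e) - Ein(w - e)) / (2 * eps)
    return g

# Toy check: gradient of sum(w^2) is 2*w, so this prints approx. [2, -4, 6]
print(numerical_gradient(lambda w: np.sum(w ** 2), np.array([1.0, -2.0, 3.0])))
```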
Characteristics of deep neural networks
- A single hidden layer can approximate any target function (Cybenko)
- However, the use of many layers better mimics human learning
- Advantage in terms of interpretability
- More parameters to tune -> a lot of data is required
- Pre-trained networks can be used as layers of a bigger network (transfer learning)