8+9 Flashcards
What is the main limitation of perceptrons? What is a way to fix this?
A perceptron splits the input space into two half-spaces, so if the decision boundary needs to be non-linear a single perceptron cannot represent it (XOR is the classic example). Making the network multi-layered gets around this.
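A minimal sketch of the fix (the weights below are hand-picked assumptions for illustration, not learned): XOR is not linearly separable, but two threshold units feeding a third compute it.

```python
import numpy as np

def step(x):
    return (x >= 0).astype(int)  # hard-limiting threshold unit

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: unit 1 computes OR, unit 2 computes AND
# (hand-picked weights, not learned).
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])

# Output layer: OR AND NOT(AND) = XOR.
W2 = np.array([1.0, -1.0])
b2 = -0.5

h = step(X @ W1.T + b1)  # hidden-layer outputs
y = step(h @ W2 + b2)    # network output
print(y)                 # [0 1 1 0] -- XOR, impossible for one perceptron
```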
What is a layer? What is a single-layer network good for?
A number of neurons connected to the same inputs (each with its own weights) forms a layer. A single-layer network is good for multi-class classification.
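In code, a layer is just a matrix-vector product plus biases; a sketch (the sizes and the sigmoid choice here are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)       # 4 inputs shared by every neuron in the layer
W = rng.normal(size=(3, 4))  # 3 neurons, each with its own row of 4 weights
b = rng.normal(size=3)       # one bias per neuron

activity = W @ x + b                   # one activity value per neuron
output = 1 / (1 + np.exp(-activity))   # e.g. a sigmoid activation
print(output.shape)                    # (3,) -- one output per neuron
```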
How does error backpropagation work?
Take the error associated with the output and distribute it amongst the units that provided its input; the most strongly connected inputs take the most blame. This continues back until the input layer is reached. Once all the blame has been distributed, each neuron changes its weights and biases to reduce its share of the blame.
Can multi-layer networks be trained with a hard-limiting function?
No, they need a sigmoid activation function (or another continuous function with a non-zero derivative).
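For intuition, a short sketch of the sigmoid and its derivative (the identity sigma' = sigma(1 - sigma) is standard); a hard limiter has derivative 0 almost everywhere, so no gradient could flow back:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_deriv(x):
    s = sigmoid(x)
    return s * (1 - s)  # identity: sigma'(x) = sigma(x) * (1 - sigma(x))

print(sigmoid_deriv(0.0))  # 0.25, the sigmoid's maximum slope
```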
What are all the steps in the backpropagation algorithm?
1. Make an initial guess for all parameter values.
2. Forward phase: compute the network's output, saving the output at each layer.
3. Compute the cost from the network output and the desired output.
4. Backpropagate the blame for the error through the network and update the parameters.
5. Go back to step 2 (see the sketch after this list).
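A minimal end-to-end sketch of these steps, training a tiny network on XOR (the network size, learning rate, and iteration count are assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

# Step 1: initial guess for all parameter values.
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)
lr = 0.5  # assumed learning rate

for _ in range(10000):
    # Step 2: forward phase, saving each layer's output.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Step 3: cost from the network output vs the desired output (MSE here).
    cost = np.mean((y - t) ** 2)

    # Step 4: backpropagate the blame and update the parameters.
    d_y = (y - t) * y * (1 - y)        # blame at the output layer
    d_h = (d_y @ W2.T) * h * (1 - h)   # blame passed back to the hidden layer
    W2 -= lr * h.T @ d_y
    b2 -= lr * d_y.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
    # Step 5: the loop returns to the forward phase.

print(np.round(y.ravel(), 2))  # should approach [0, 1, 1, 0]
```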
What are the main activation functions?
Sigmoid (outputs in (0, 1)), tanh (outputs in (-1, 1)), linear, and the rectified linear unit (ReLU: the identity where the value is not less than 0, and 0 otherwise).
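The four side by side, as a sketch:

```python
import numpy as np

def sigmoid(x): return 1 / (1 + np.exp(-x))  # squashes into (0, 1)
def tanh(x):    return np.tanh(x)            # squashes into (-1, 1)
def linear(x):  return x                     # identity
def relu(x):    return np.maximum(0, x)      # x if x >= 0, else 0

x = np.array([-2.0, 0.0, 2.0])
for f in (sigmoid, tanh, linear, relu):
    print(f.__name__, f(x))
```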
Which output neuron should be chosen for classification problems?
The one with the greatest confidence, i.e. the highest output value.
What is the cross-entropy cost?
A cost function better suited than mean squared error (MSE) to classification problems, because it penalises confident wrong answers far more heavily.
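A sketch of why, using the standard formula C = -sum_i t_i * log(y_i) for target t and output y (the prediction values below are made up):

```python
import numpy as np

t = np.array([0.0, 1.0, 0.0])  # one-hot target: true class is the second
y = np.array([0.7, 0.2, 0.1])  # a confidently wrong prediction (made up)

mse = np.mean((y - t) ** 2)             # ~0.38
cross_entropy = -np.sum(t * np.log(y))  # ~1.61, a much sharper penalty
print(mse, cross_entropy)
```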
What is the softmax activation function?
Each output neuron's exponentiated activity is normalised by the total across all output neurons, producing a probability distribution over the classes.
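A sketch of softmax, y_i = exp(a_i) / sum_j exp(a_j), with the usual max-subtraction trick for numerical stability (the activity values are made up); note that argmax of the result gives the most confident neuron, tying back to the classification question above:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))  # subtract max for numerical stability
    return e / e.sum()

activities = np.array([2.0, 1.0, 0.1])  # made-up output-layer activities
probs = softmax(activities)
print(probs, probs.sum())  # a probability distribution summing to 1
print(np.argmax(probs))    # index of the most confident output neuron
```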
What is a feed-forward neural network? What about a recurrent neural network?
A feed-forward network is a directed acyclic graph, meaning the internal state depends only on the current input. A recurrent neural network includes cycles, allowing the model to exhibit dynamic temporal behaviour.
What does an SRN (simple recurrent network) involve?
A tanh activation function in the hidden layer and a softmax in the output layer. It is a recurrent network: the hidden layer feeds back into itself.
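A sketch of one SRN forward pass over a made-up sequence (the layer sizes and random weights are arbitrary assumptions); the W_hh term is the cycle that feeds the hidden layer back into itself:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 4, 2             # assumed layer sizes
W_xh = rng.normal(size=(n_hid, n_in))    # input -> hidden
W_hh = rng.normal(size=(n_hid, n_hid))   # hidden -> hidden (the cycle)
W_hy = rng.normal(size=(n_out, n_hid))   # hidden -> output

h = np.zeros(n_hid)                      # hidden state carried across time
for x in rng.normal(size=(5, n_in)):     # a made-up 5-step input sequence
    h = np.tanh(W_xh @ x + W_hh @ h)     # tanh hidden layer, fed back
    y = softmax(W_hy @ h)                # softmax over output classes
    print(y)
```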