Perceptron Flashcards
Who came up with the perceptron and what year?
Frank Rosenblatt, in 1957.
What does the perceptron model do?
The perceptron model takes an input vector, aggregates it as a weighted sum, and returns 1 only if that sum exceeds some threshold; otherwise it returns 0.
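A minimal sketch of that computation (the function and the hand-picked AND-gate weights below are illustrative, not taken from any particular library):

```python
import numpy as np

def perceptron(x, w, threshold):
    """Weighted sum of the inputs, followed by a hard threshold."""
    weighted_sum = np.dot(w, x)
    return 1 if weighted_sum > threshold else 0

# Example: with these hand-picked weights the unit behaves like an AND gate,
# one of the linearly separable functions a single perceptron can implement.
w = np.array([1.0, 1.0])
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(np.array(x), w, threshold=1.5))
```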
What can a single perceptron do?
A single perceptron can only implement linearly separable functions. It takes both real-valued and Boolean inputs and associates a set of weights with them, along with a bias (threshold).
How does the perceptron model allow you to perform pattern recognition?
It allows you to find the weight vector w that can perfectly classify the positive and negative inputs in your data. We initialize w with some random vector, then iterate over all the examples in the data, P ∪ N (both positive and negative examples). If an input x belongs to P, the dot product w.x should be greater than or equal to 0; if x belongs to N, the dot product must be less than 0. Whenever an example violates its condition, w is adjusted; in essence, the perceptron uses the dot product for pattern recognition.
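A sketch of that procedure, assuming the classic perceptron update rule (add x to w for a misclassified positive example, subtract x for a misclassified negative one); the helper name and toy data are made up for illustration:

```python
import numpy as np

def train_perceptron(P, N, epochs=100):
    """Find w such that w.x >= 0 for every x in P and w.x < 0 for every x in N."""
    w = np.random.randn(len(P[0]))     # initialize w with some random vector
    for _ in range(epochs):
        converged = True
        for x in P:                    # positive examples: want w.x >= 0
            if np.dot(w, x) < 0:
                w = w + x
                converged = False
        for x in N:                    # negative examples: want w.x < 0
            if np.dot(w, x) >= 0:
                w = w - x
                converged = False
        if converged:                  # every example satisfies its condition
            return w
    return w

# Toy linearly separable data, with the bias folded in as a constant 1 input.
P = [np.array([1.0, 2.0, 1.0]), np.array([2.0, 3.0, 1.0])]
N = [np.array([-1.0, -2.0, 1.0]), np.array([-2.0, -1.0, 1.0])]
w = train_perceptron(P, N)
```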
What, in particular, is the linear perceptron not, and why?
Linear models like the perceptron with a Heaviside (step) activation function are not universal function approximators, because there are functions they cannot represent.
What are linear models limited to?
Linear models can only learn to approximate functions for linearly separable datasets.
What do linear classifiers do and why does this limit them?
They find a hyperplane that separates the positive class from the negative class; if no hyperplane exists that can separate the classes, the problem is not linearly separable.
This limits them because many problems are not linearly separable.
What is an example of a non-linearly separable problem?
XOR gate
What are the input/output target pairs for the XOR gate?
{p1 = [0, 0], t1 = 0}, {p2 = [0, 1], t2 = 1}, {p3 = [1, 0], t3 = 1}, {p4 = [1, 1], t4 = 0}
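A quick way to see that no single perceptron can fit these four pairs (assuming a decision rule of the form w1*x1 + w2*x2 ≥ b for output 1):
- p2 = [0, 1] with t2 = 1 requires w2 ≥ b, and p3 = [1, 0] with t3 = 1 requires w1 ≥ b
- p1 = [0, 0] with t1 = 0 requires 0 < b, so b > 0
- p4 = [1, 1] with t4 = 0 requires w1 + w2 < b, but the first two conditions give w1 + w2 ≥ 2b > b, a contradiction
So no choice of weights and threshold works: XOR is not linearly separable.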
Who published what paper in what year that acted as a solution to the perceptron problem?
In 1986, Rumelhart, Hinton, and Williams published the paper “Learning representations by back-propagating errors”, introducing the concepts of backpropagation and hidden layers and, in effect, giving birth to multilayer perceptrons (MLPs).
What are the key components of MLPs?
- Backpropagation, a procedure to repeatedly adjust the weights so as to minimize the difference between actual output and desired output
- Hidden layers, which are neuron nodes stacked in between inputs and outputs, allowing neural networks to learn more complicated features (such as XOR logic); see the sketch after this list
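A minimal sketch of those two components working together: a tiny network with one hidden layer, trained by backpropagation to learn XOR. The layer size, learning rate, and iteration count are illustrative choices, not prescribed by the original paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR inputs and target outputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 4 units; weights start random, biases start at zero.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

lr = 1.0
for step in range(20000):
    # Forward pass: input layer -> hidden layer -> output layer.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the output error back through the layers.
    err = y - T                               # error signal (gradient of 0.5 * squared error w.r.t. y)
    d_out = err * y * (1 - y)                 # delta at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)      # delta at the hidden layer

    # Gradient-descent updates of weights and biases.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0)

print(np.round(y, 2))   # typically approaches [0, 1, 1, 0]
```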
What do MLPs consist of?
MLPs are composed of an input layer that receives the signal, an output layer that makes a decision or prediction about the input, and, in between those two, an arbitrary number of hidden layers that are the true computational engine of the MLP.
What does MLP training involve?
Training involves adjusting the parameters, or the weights and biases, of the model in order to minimize error. Backpropagation is used to make those weight and bias adjustments relative to the error, and the error itself can be measured in a variety of ways, including by root mean squared error (RMSE).
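For instance, a brief sketch of RMSE as the error measure (the arrays here are made-up outputs, purely for illustration):

```python
import numpy as np

y_pred = np.array([0.1, 0.9, 0.8, 0.2])           # hypothetical network outputs
y_true = np.array([0.0, 1.0, 1.0, 0.0])           # desired outputs
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))   # root mean squared error, about 0.158
```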
When does training of an MLP end?
The network keeps playing that game of ping-pong (forward passes and backward error propagation) until the error can go no lower. This state is known as convergence.
What shared properties do perceptrons have with neurons?
Both are cells with adjustable-strength synaptic inputs of competing excitatory and inhibitory influences that are summed and compared against a threshold. If the threshold is exceeded, the cell fires; if not, the cell does not fire.