Perceptron Flashcards

1
Q

Who came up with the perceptron, and in what year?

A

Frank Rosenblatt, in 1957.

2
Q

What does the perceptron model do?

A

The perceptron model takes an input, aggregates it (a weighted sum), and returns 1 only if the aggregated sum exceeds some threshold; otherwise it returns 0.
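
A minimal sketch of that behavior in Python (the threshold and weight values below are arbitrary choices for illustration):

    # Perceptron forward pass: weighted sum compared against a threshold.
    def perceptron(inputs, weights, threshold=0.5):
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        return 1 if weighted_sum > threshold else 0

    print(perceptron([1, 0], [0.4, 0.4]))  # 0: sum 0.4 does not exceed 0.5
    print(perceptron([1, 1], [0.4, 0.4]))  # 1: sum 0.8 exceeds 0.5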

3
Q

What can a single perceptron do?

A

A single perceptron can only be used to implement linearly separable functions. It takes both real and Boolean inputs and associates a set of weights with them, along with a bias (threshold).
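
For example, AND is linearly separable, so a single perceptron can implement it; the weights and bias in this sketch are one hand-picked choice, not the only one:

    # A single perceptron computing Boolean AND (weights/bias hand-picked).
    def and_gate(x1, x2, w1=1.0, w2=1.0, bias=-1.5):
        return 1 if w1 * x1 + w2 * x2 + bias >= 0 else 0

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and_gate(a, b))  # reproduces the AND truth table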

4
Q

How does the perceptron model allow you to perform pattern recognition?

A

It allows you to find the w vector that can perfectly classify the positive and negative inputs in your data. We initialize w with some random vector, then iterate over all the examples in the data, P ∪ N (both positive and negative examples). If an input x belongs to P, we want the dot product w·x to be greater than or equal to 0, and if x belongs to N, we want w·x to be less than 0; whenever the sign is wrong, w is adjusted. Basically, the perceptron uses the dot product for pattern recognition.
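
A minimal NumPy sketch of that loop, assuming P and N are lists of input vectors and the data is linearly separable (otherwise the loop never terminates):

    import numpy as np

    # Find w with w.x >= 0 for every x in P and w.x < 0 for every x in N.
    def train(P, N, dim, seed=0):
        w = np.random.default_rng(seed).normal(size=dim)  # random initial w
        converged = False
        while not converged:
            converged = True
            for x in P:
                if np.dot(w, x) < 0:     # positive example misclassified
                    w = w + x
                    converged = False
            for x in N:
                if np.dot(w, x) >= 0:    # negative example misclassified
                    w = w - x
                    converged = False
        return w

    P = [np.array([1.0, 1.0]), np.array([2.0, 1.0])]      # made-up positives
    N = [np.array([-1.0, -1.0]), np.array([-1.0, -2.0])]  # made-up negatives
    print(train(P, N, dim=2))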

5
Q

What in particular is the linear perceptron not (and why)?

A

Linear models like the perceptron with a Heaviside activation function are not universal function approximators; they cannot represent some functions.

6
Q

What are linear models limited to?

A

Linear models can only learn to approximate the functions underlying linearly separable datasets.

7
Q

What do linear classifiers do and why does this limit them?

A

Linear classifiers find a hyperplane that separates the positive classes from the negative classes; if no such hyperplane exists, the problem is not linearly separable.

This limits them because many problems are not linearly separable.

8
Q

What is an example of a non-linearly separable problem?

A

XOR gate

9
Q

What are the input/output target pairs for the XOR gate?

A

{p1 = [0, 0], t1 = 0}, {p2 = [0, 1], t2 = 1}, {p3 = [1, 0], t3 = 1}, {p4 = [1, 1], t4 = 0}
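
A quick brute-force sketch (the weight grid is an arbitrary choice) confirms that no single threshold unit reproduces these targets:

    import itertools

    # The XOR input/target pairs from this card.
    pairs = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    # Try every (w1, w2, b) on a coarse grid of candidate values.
    grid = [v / 2 for v in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
    solutions = [
        (w1, w2, b)
        for w1, w2, b in itertools.product(grid, repeat=3)
        if all((w1 * x1 + w2 * x2 + b >= 0) == bool(t) for (x1, x2), t in pairs)
    ]
    print(solutions)  # [] -- XOR is not linearly separable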

10
Q

Who published what paper in what year that acted as a solution to the perceptron problem?

A

In 1986, Rumelhart, Hinton, and Williams published the paper “Learning representations by back-propagating errors”, introducing the concepts of backpropagation and hidden layers, and thereby giving birth to multilayer perceptrons (MLPs).

11
Q

What are the key components of MLPs?

A
  1. Backpropagation, a procedure to repeatedly adjust the weights so as to minimize the difference between actual output and desired output;
  2. Hidden layers, which are neuron nodes stacked in between inputs and outputs, allowing neural networks to learn more complicated features (such as XOR logic).
12
Q

What do MLPs consist of?

A

MLPs are composed of an input layer to receive the signal, an output layer that makes a decision or prediction about the input, and, in between those two, an arbitrary number of hidden layers that are the true computational engine of the MLP.
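
A minimal NumPy sketch of that structure, with one hidden layer trained by backpropagation to learn XOR (layer size, learning rate, and iteration count are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # input layer
    T = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input  -> hidden
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output

    for _ in range(5000):
        H = sigmoid(X @ W1 + b1)        # forward pass: hidden layer
        Y = sigmoid(H @ W2 + b2)        # forward pass: output layer
        dY = (Y - T) * Y * (1 - Y)      # backpropagate output-layer error
        dH = (dY @ W2.T) * H * (1 - H)  # backpropagate hidden-layer error
        W2 -= 0.5 * H.T @ dY            # gradient-descent updates
        b2 -= 0.5 * dY.sum(axis=0)
        W1 -= 0.5 * X.T @ dH
        b1 -= 0.5 * dH.sum(axis=0)

    print(Y.round(2).ravel())  # close to [0, 1, 1, 0] (depends on the seed)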

13
Q

What does MLP training involve?

A

Training involves adjusting the parameters, or the weights and biases, of the model in order to minimize error. Backpropagation is used to make those weight and bias adjustments relative to the error, and the error itself can be measured in a variety of ways, including by root mean squared error (RMSE).
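
A minimal sketch of the RMSE measurement (the sample outputs below are made-up numbers):

    import numpy as np

    # Root mean squared error between actual and desired outputs.
    def rmse(actual, desired):
        return np.sqrt(np.mean((np.asarray(actual) - np.asarray(desired)) ** 2))

    print(rmse([0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0]))  # ~0.158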

14
Q

When does training of an MLP end?

A

The network keeps up this back-and-forth between forward passes and backpropagated weight adjustments until the error can go no lower. This state is known as convergence.

15
Q

What shared properties do perceptrons have with neurons?

A

Cells with adjustable-strength synaptic inputs of competing excitatory and inhibitory influences that are summed and compared against a threshold. If the threshold is exceeded, the cell fires; if not, the cell does not fire.

16
Q

What was the original perceptron based on?

A

The original Perceptron was conceived as a model for the eye. Patterns to be recognized, or classified, are presented to a retina, or layer of sensory cells. Connections from the sensory cells to a layer of associative cells perform certain (perhaps random, perhaps feature detecting) transformations on the sensory pattern. The associative cells then act on a response cell through synapses, or weights, of various strengths. The firing, or failure to fire, of the response cell performs a classification or recognition on the set of input patterns presented to the retina.

17
Q

The Perceptron shows a rudimentary ability to

A

learn

18
Q

If a Perceptron is given a set of input patterns and is told which patterns belong in class 1 and which in class 0

A

the Perceptron, by adjusting its weights, will gradually make fewer and fewer wrong classifications and (under certain rather restrictive conditions) eventually will classify or recognize every pattern in the set correctly. The weights usually are adjusted according to an algorithm similar to the following.

  1. If a pattern is incorrectly classified in class 0 when it should be in class 1, increase all the weights coming from association cells that are active.
  2. If a pattern is incorrectly classified in class 1 when it should be in class 0, decrease all the weights coming from association cells that are active.
  3. If a pattern is correctly classified, do not change any weights.
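
A sketch of one training sweep implementing the three rules (here the "active" inputs are simply the nonzero components of the pattern, and the learning rate is an arbitrary choice):

    import numpy as np

    # patterns: input vectors; targets: 0 or 1 class labels.
    def train_epoch(weights, bias, patterns, targets, lr=1.0):
        for x, t in zip(patterns, targets):
            y = 1 if np.dot(weights, x) + bias >= 0 else 0
            if y == 0 and t == 1:    # rule 1: should be class 1 -> increase
                weights += lr * x    # only active (nonzero) inputs change
                bias += lr
            elif y == 1 and t == 0:  # rule 2: should be class 0 -> decrease
                weights -= lr * x
                bias -= lr
            # rule 3: correctly classified -> no weight changes
        return weights, bias
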
19
Q

Four features of this algorithm are common to all Perceptron training algorithms, and are essential to successful pattern recognition by any Perceptron-type device

A

(1) Certain selected weights are to be increased, others decreased.
(2) The average total amount of increase equals the total amount of decrease.
(3) The desired classification, together with the pattern being classified, governs the selection of which weights are varied and in which direction.
(4) The adjustment process terminates when learning is complete.

20
Q

What are the limitations of the Perceptron in the nervous system?

A

The Perceptron works quite well on many simple pattern sets, and if the sensory-association connections are judiciously chosen, it even works on some rather complex pattern sets. For patterns of the complexity likely to occur in the nervous system, however, the simple Perceptron appears to be hopelessly inadequate. As the complexity of the input pattern increases, the probability that a given Perceptron can recognize it goes rapidly to zero.

21
Q

What are the limitations of the perceptron model (3)?

A
  • One-layer feed-forward network (nonrecurrent);
  • Only capable of learning solution of linearly separable problems;
  • Its learning algorithm (delta rule) does not work with networks of more than one layer.
22
Q

Input is typically

A

a feature vector x multiplied by weights w and added to a bias b: y = w · x + b.

23
Q

In what way is the perceptron a linear classifier?

A

It classifies input by separating two categories with a straight line (a hyperplane in higher dimensions).