Module 1: Perceptrons & MLPs Flashcards
What is a neural network?
A parametric model that takes data as input and generates an output by applying several consecutive operations on the data.
What determines how the parameters of a neural network are adjusted?
The difference between the generated output and the expected output
What is a perceptron?
The basic building block of the neural network; A neural network that works by taking several binary inputs, x1, x2, …, xn, and produces a single binary output.
How is the output of a perceptron computed, and what is this called?
It is determined by whether the weighted sum of the inputs is below or above some threshold value. The function that makes this decision is referred to as the activation function.
What is the step activation function, and how does it work?
A step activation function is an activation function that checks whether the weighted sum of the inputs (inputs * weights, plus the bias) is greater than 0. If the sum > 0, the neuron outputs 1; otherwise it outputs 0.
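A minimal sketch of a step-activated perceptron, assuming the bias is added to the weighted sum (the function names and numbers are illustrative, not from the cards):

```python
def step(weighted_sum):
    # Output 1 when the weighted sum is positive, else 0.
    return 1 if weighted_sum > 0 else 0

def perceptron_output(inputs, weights, bias):
    # Weighted sum of the inputs, plus the bias term.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(s)

# With weights 0.5 and bias -0.7, this perceptron behaves like an AND gate:
print(perceptron_output([1, 1], [0.5, 0.5], -0.7))  # -> 1
print(perceptron_output([0, 1], [0.5, 0.5], -0.7))  # -> 0
```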
What is the role of the bias term?
It acts as a threshold for the activation function: it determines how easily the neuron activates, independently of the inputs.
How does adjusting the bias term (w0) affect the step activation function?
Increasing w0 (bias > 0) shifts the weighted sum upward, making activation (output 1) more likely. Decreasing w0 (bias < 0) shifts the weighted sum downward, making activation (output 1) less likely.
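A tiny demo of the effect of the bias on a step activation: the same inputs and weights, with two different bias terms (the numbers are illustrative):

```python
def step(s):
    # Step activation: 1 if the sum is positive, else 0.
    return 1 if s > 0 else 0

inputs, weights = [1, 1], [0.3, 0.3]
weighted_sum = sum(x * w for x, w in zip(inputs, weights))  # 0.6

# A larger bias pushes the sum above 0, so activation is more likely:
print(step(weighted_sum + 0.5))  # bias = +0.5 -> 1
# A smaller (negative) bias pulls the sum below 0, so activation is less likely:
print(step(weighted_sum - 1.0))  # bias = -1.0 -> 0
```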
What are the components of a perceptron and a multi-layer perceptron?
A perceptron has a weighted input layer and an output layer. A multi-layer perceptron has an input layer, at least one hidden layer, and an output layer.
What type of classifier is a perceptron? How does it work?
A perceptron is a linear classifier. It works by drawing a hyperplane that divides the input space into two regions.
In training a perceptron, what is the error?
The difference between the expected output and the produced output.
What is the error used for in training a perceptron?
To update the weights (the network's parameters).
What is the learning rate?
A hyper-parameter that defines the pace at which the network updates its parameters/ learns.
(Smaller values = slower learning, a higher chance of getting stuck at a local minimum, but better convergence to a minimum; larger values = faster learning and the possibility of escaping local minima, but may cause oscillations.)
What formulas define the perceptron update rule?
The change for a weight = learning rate * error * input. It is applied by adding the change to the current weight: new weight = current weight + change.
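A minimal sketch of the update rule in a training loop, shown on an AND gate (the learning rate, epoch count, and zero initialization are illustrative assumptions):

```python
def step(s):
    # Step activation: 1 if the sum is positive, else 0.
    return 1 if s > 0 else 0

def train(samples, lr=0.1, epochs=20):
    w = [0.0, 0.0]  # weights, initialized to zero for simplicity
    b = 0.0         # bias term (w0)
    for _ in range(epochs):
        for x, target in samples:
            out = step(x[0] * w[0] + x[1] * w[1] + b)
            error = target - out        # expected output - produced output
            w[0] += lr * error * x[0]   # change = learning rate * error * input
            w[1] += lr * error * x[1]
            b += lr * error             # the bias input is always 1
    return w, b

and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(and_data)
```

After training, the learned weights and bias classify all four AND inputs correctly.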
What are the limitations of a perceptron?
Can only separate space linearly
Explain the limitations of a perceptron using an XOR logical gate, and a possible solution.
No single straight line can separate the XOR points, so a single perceptron cannot learn it.
A possible solution: combine 2 perceptrons (i.e., add a hidden layer). Any input between the 2 lines they draw belongs to one class, and the rest to the other.
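A sketch of the two-line solution with hand-picked weights (the line positions 0.5 and 1.5 are illustrative): two hidden perceptrons each draw one line, and an output perceptron activates only for points between them.

```python
def step(s):
    # Step activation: 1 if the sum is positive, else 0.
    return 1 if s > 0 else 0

def xor(x1, x2):
    # Hidden perceptron 1: activates when the point is above the line x1 + x2 = 0.5
    h1 = step(x1 + x2 - 0.5)
    # Hidden perceptron 2: activates when the point is below the line x1 + x2 = 1.5
    h2 = step(1.5 - x1 - x2)
    # Output perceptron: activates only when BOTH hidden units fire,
    # i.e. the point lies between the two lines.
    return step(h1 + h2 - 1.5)

print(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1))  # -> 0 1 1 0
```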