Lecture 4 Lecture Notes Flashcards
How do McCulloch and Pitts neurons function?
They sum the firing of incoming neurons multiplied by synapse weights and fire if the sum exceeds a threshold.
What does a perceptron consist of?
A layer of sensory (input) neurons connected by weighted synapses to motor (output) neurons.
What is the main learning rule for perceptrons?
Adjust weights based on the difference between actual and desired outputs.
What is the significance of the bias unit in a perceptron?
It allows the perceptron to create any dividing line needed for classification.
What boolean functions can perceptrons learn?
- AND
- OR
- NAND
- NOR
What is a limitation of perceptrons?
They can only learn linearly separable boolean functions.
Which boolean function cannot be learned by a simple perceptron?
XOR.
What is the equation used in perceptrons for output calculation?
o_i = step(Σ_j W_ij x_j)
What is a common value for the learning rate in perceptrons?
0.1.
What happens if the output neuron incorrectly produces a 1?
Decrease the weights on its active inputs.
What happens if the output neuron incorrectly produces a 0?
Increase the weights on its active inputs.
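The perceptron rule on these cards (increase weights on a false 0, decrease on a false 1, both folded into w += α(y − o)x) can be sketched in a few lines of Python. The AND data, learning rate, and epoch count here are illustrative choices, not from the notes.

```python
import random

# Minimal perceptron sketch; the bias unit is a constant-1 input.
random.seed(0)

def step(u):
    return 1 if u > 0 else 0

def train_perceptron(samples, alpha=0.1, epochs=100):
    weights = [random.uniform(-0.5, 0.5) for _ in range(3)]  # bias + 2 inputs
    for _ in range(epochs):
        for inputs, target in samples:
            x = [1] + list(inputs)  # prepend the bias unit
            output = step(sum(w * xi for w, xi in zip(weights, x)))
            # error > 0: a false 0, so increase weights on active inputs;
            # error < 0: a false 1, so decrease them
            error = target - output
            weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
    return weights

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_data)
```

Swapping in the XOR truth table leaves the final weights misclassifying at least one input no matter how many epochs are run, which is the linear-separability limitation described below.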
For the XOR function, what are the desired outputs for the inputs (0,1) and (1,0)?
1’s
For the XOR function, what are the desired outputs for the inputs (0,0) and (1,1)?
0’s
What type of functions can perceptrons learn?
Boolean functions which are linearly separable
For a 2-variable input, how can the 1’s and 0’s be divided?
With a straight line
For a 3-variable input, how can the 1’s and 0’s be divided?
With a plane
What did Minsky and Papert predict about advanced forms of perceptrons?
They were unlikely to escape the problem of linear separability
What was the impact of Minsky’s reputation on the field of neural networks?
It wiped out the entire field for over a decade
Can multilayer neural networks learn functions beyond linear separable ones?
Yes; with enough hidden units they can approximate any continuous function
What is the perceptron considered in terms of classification?
A binary classification algorithm
What does a neural network do beyond classification?
Learns a continuous function from one multidimensional space to another
What are the two generalizations made to the single-layer perceptron?
- Change the step function to a differentiable function
- Define a formal learning algorithm in terms of gradient descent
Why must the initial weights of a multilayer neural network be small random values?
If all weights are 0, the network cannot learn
What is the delta rule used for in neural networks?
Modifying weights based on their contribution to the final outcome
What is the sigmoid function defined as?
s(u) = 1 / (1 + e^(-bu))
What does the parameter ‘b’ affect in the sigmoid function?
The slope of the curve
What is the derivative of the sigmoid function?
s’(u) = b·s(u)(1 - s(u)), which reduces to s(u)(1 - s(u)) when b = 1
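The sigmoid and its derivative can be checked numerically; note that with the slope parameter b included, the derivative picks up a factor of b (the familiar s(u)(1 − s(u)) form is the b = 1 case). The test value 0.3 is arbitrary.

```python
import math

# Sigmoid with slope parameter b, as defined in the notes.
def sigmoid(u, b=1.0):
    return 1.0 / (1.0 + math.exp(-b * u))

def sigmoid_deriv(u, b=1.0):
    s = sigmoid(u, b)
    return b * s * (1.0 - s)  # for b = 1 this is s(u)(1 - s(u))

# Compare against a central-difference numerical derivative.
eps = 1e-6
numeric = (sigmoid(0.3 + eps) - sigmoid(0.3 - eps)) / (2 * eps)
```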
What do multilayer neural networks allow for in terms of function learning?
Learning any continuous, differentiable function
What is the role of the hidden layer in a multilayer neural network?
It allows for more complex function learning
How is the output of a neuron in a multilayer network defined?
o_i = s(Σ_j W_ij h_j)
What is the purpose of the backpropagation learning rule?
To adjust weights based on the error in output
What does the error metric E represent?
E = 1/2 Σ(y_i - o_i)²
What happens if you increase the number of neurons in the hidden layer too much?
The network may overfit and not generalize well
What is the goal of training a neural network?
To minimize the error metric
True or False: Backpropagation guarantees convergence for all input cases.
False
What effect does lowering the learning rate have on a neural network?
It takes longer to learn but increases the chances of finding the global optimum
What is the overall procedure for training a backpropagation neural network?
- Pick an input/expected output vector pair
- Present the input to the network
- Read the network’s output
- Modify each weight using the delta learning rule
- Repeat
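The steps above can be sketched as a plain-Python training loop on XOR. The layer sizes, learning rate, and epoch count are illustrative, and (as the notes warn) convergence to zero error is not guaranteed, so the sketch only checks that the error metric decreases.

```python
import math
import random

# Two-layer sigmoid network trained by backpropagation (sketch).
random.seed(0)

def s(u):
    return 1.0 / (1.0 + math.exp(-u))

n_in, n_hid, n_out, alpha = 2, 3, 1, 0.5
V = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
W = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]

def forward(x):
    h = [s(sum(V[j][k] * x[k] for k in range(n_in))) for j in range(n_hid)]
    o = [s(sum(W[i][j] * h[j] for j in range(n_hid))) for i in range(n_out)]
    return h, o

def error(data):  # E = 1/2 Σ (y_i - o_i)^2, summed over all patterns
    return sum(0.5 * sum((y[i] - forward(x)[1][i]) ** 2 for i in range(n_out))
               for x, y in data)

xor = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
before = error(xor)
for _ in range(2000):
    for x, y in xor:
        h, o = forward(x)
        # Output-layer deltas: (y - o) o (1 - o)
        do = [(y[i] - o[i]) * o[i] * (1 - o[i]) for i in range(n_out)]
        # Hidden-layer deltas must use W *before* it is updated.
        dh = [h[j] * (1 - h[j]) * sum(do[i] * W[i][j] for i in range(n_out))
              for j in range(n_hid)]
        for i in range(n_out):
            for j in range(n_hid):
                W[i][j] += alpha * do[i] * h[j]
        for j in range(n_hid):
            for k in range(n_in):
                V[j][k] += alpha * dh[j] * x[k]
```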
What is the significance of generalization in neural networks?
It allows the network to apply learned rules to unseen inputs
What is the relationship between the delta rule and gradient descent?
The delta rule moves weights in the direction that reduces error based on the gradient
What property must the error metric have so that E = 0 means zero error?
It must be of a form, such as the sum of squared errors, that is zero exactly when every output matches its target
What happens when the network is caught in a suboptimum solution?
It may fail to converge for some inputs
What is the backpropagation learning rule used for?
To update weights in a neural network based on the error between predicted and actual outputs
The rule helps in minimizing the error during training by adjusting weights accordingly.
What is the formula for updating the weights ΔW_ij?
ΔW_ij = α (y_i - o_i) o_i (1 - o_i) h_j
Where α is the learning rate, y_i is the target output, o_i is the actual output, and h_j is the output of hidden unit j.
What does ΔV_jk depend on in the backpropagation algorithm?
The value of W_ij
This means that W_ij should not be changed until after ΔV_jk is computed.
How is the weight update ΔW_ij expressed in matrix form?
ΔW = α ((y - o) ⊙ o ⊙ (1 - o)) h^T
Here ⊙ denotes element-wise multiplication, and the final product with h^T is an outer product.
What is the purpose of the stopping criterion δ in the error backpropagation algorithm?
To stop the algorithm when the outputs of the neural network aren’t changing significantly anymore
This criterion helps avoid unnecessary iterations once convergence is achieved.
What is the learning rate α used for in weight updates?
It controls how much the weights are adjusted during each update
Smaller values are typically preferred to ensure stable convergence.
What is a Hopfield network used for?
To simulate associative memory
It allows the retrieval of stored patterns based on partial or noisy inputs.
What does Hebb’s Rule state about synaptic strength?
If neuron A fires and neuron B fires in response, the strength of the synapse between them increases
This rule is foundational to understanding learning mechanisms in neural networks.
What is the difference between Hebb’s Rule and the Delta rule?
Hebb’s Rule updates weights based on correlation between nodes, while the Delta rule updates based on the error of the output node
This reflects different learning strategies in neural networks.
What happens if weights in a Hopfield network are set to learn multiple patterns?
The weights are set to the sum of correlations for all items to be stored
This allows the network to retrieve the closest memorized vector when presented with an input.
What is the capacity of a Hopfield network with N neurons?
Approximately 0.138N vector patterns
Exceeding this limit leads to degradation in performance.
What is the first step in the error backpropagation algorithm?
Initialize the matrices V and W with small random values centered at 0
Proper initialization is critical for effective training.
Fill in the blank: The output of a two-layer neural network is computed using the formula _______.
o = s(W h)
Where h is the output vector of the hidden layer and W is the weight matrix.
True or False: The Hopfield network is a fully-connected, feedforward neural network.
False
Hopfield networks are recurrent, meaning they allow connections between neurons that can loop back on themselves.
What is the learning rule for weights in a Hopfield network?
W_ij = (1/N) Σ_p x_i^(p) x_j^(p)
This rule sums the correlations of all input patterns to determine weight strength.
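The Hebbian storage rule and the recall process can be sketched with ±1 units. The two stored patterns here are illustrative (and orthogonal, so recall is clean); the recall loop sweeps the neurons in index order, which is one common update schedule.

```python
# Hopfield network sketch: Hebbian storage and recall with +/-1 units.
N = 8
patterns = [[1, -1, 1, -1, 1, -1, 1, -1],
            [1, 1, 1, 1, -1, -1, -1, -1]]

# W_ij = (1/N) * sum over patterns of x_i x_j; no self-connections.
W = [[0.0] * N for _ in range(N)]
for p in patterns:
    for i in range(N):
        for j in range(N):
            if i != j:
                W[i][j] += p[i] * p[j] / N

def recall(x, sweeps=5):
    # Repeatedly set each neuron to the sign of its weighted input.
    x = list(x)
    for _ in range(sweeps):
        for i in range(N):
            x[i] = 1 if sum(W[i][j] * x[j] for j in range(N)) >= 0 else -1
    return x

# First stored pattern with one bit flipped: recall repairs it.
noisy = [-1, -1, 1, -1, 1, -1, 1, -1]
```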
What happens when a Hopfield network stores more than its capacity?
Degradation occurs, causing similar learned patterns to mix together
This is analogous to how human memory degrades under information overload.
What type of algorithm does a Hopfield network use?
Nearest-neighbor algorithm
What is competitive learning in neural networks?
Neurons compete to categorize inputs based on proximity
The closest neurons to the input become the strongest.
What is the structure of a simple competitive learning network?
Input neurons feed into output neurons, which are self-connected and have inhibitory connections to each other
What does the output of the neurons in a competitive network depend on?
The sum of their inputs weighted by edge weights
What occurs when one output neuron dominates in a competitive network?
It is designated as the winner, while the outputs of other neurons decrease
What is the learning rule for the competitive network?
Weights are adjusted only for the winning output neuron based on the input
Fill in the blank: A neuron that never gains enough strength to win is known as a _______.
Dead unit
What is leaky learning in neural networks?
Updating weights of both winner and loser neurons, albeit to a lesser degree for losers
What is lateral inhibition?
Output neurons inhibit each other to compete for dominance
This concept is derived from neuroscience and helps in pattern recognition.
What is feature mapping in competitive networks?
A type of lateral inhibition that only applies to nearest neighbors of the output neuron
What does a self-organizing map do in neural networks?
Clusters inputs into spatially related categories
What is Kohonen’s algorithm?
A neural network that clusters input data into output without lateral inhibition
What is the initial weight setting in Kohonen’s network?
Small random values
How does Kohonen’s network determine the winning neuron?
By selecting the neuron with the largest output value
What does the neighborhood function L(i, i*) in Kohonen’s algorithm represent?
The proximity of neuron i to the winning neuron i*
It is similar to a bell curve centered at i*.
What happens to the weights of output neurons in Kohonen’s network based on their proximity to the winner?
Weights are modified more for neurons close to the winner and less for those further away
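A sketch of Kohonen's algorithm with a 1-D line of output neurons and a Gaussian neighborhood function L(i, i*) centered on the winner. Here the winner is chosen as the neuron whose weight vector is closest to the input (for normalized weights this matches choosing the largest output); the sizes, learning rate, and σ are all illustrative.

```python
import math
import random

# Self-organizing map sketch: 1-D output line, Gaussian neighborhood.
random.seed(0)
n_out, n_in, alpha, sigma = 10, 2, 0.3, 2.0
W = [[random.random() for _ in range(n_in)] for _ in range(n_out)]

def winner(x):
    # Winning neuron i*: the one whose weight vector is closest to x.
    return min(range(n_out),
               key=lambda i: sum((W[i][k] - x[k]) ** 2 for k in range(n_in)))

def update(x):
    istar = winner(x)
    for i in range(n_out):
        # Neighborhood function: bell curve centered at the winner i*.
        L = math.exp(-((i - istar) ** 2) / (2 * sigma ** 2))
        for k in range(n_in):
            # Neurons near the winner move strongly toward x, distant ones barely.
            W[i][k] += alpha * L * (x[k] - W[i][k])

for _ in range(200):
    update([random.random(), random.random()])
```

Because each step moves a weight by a convex fraction toward an input in [0, 1], the weight vectors stay inside the input range while spreading out over it.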