lecture 5 - ANNs Flashcards

1
Q

Santiago Ramon y Cajal

A

found that the brain is not a single continuous network, but is made up of discrete units called neurons

2
Q

structure of neurons

A

is related to their function in information processing: mapping inputs to outputs

3
Q

neural network in the brain

A
  • neurons don’t work in isolation, but are connected, forming a network
  • the input of one neuron is the output of another
4
Q

visual information in the brain

A
  • information flows through different levels of networks
  • in the visual system this creates a hierarchy
  • lower levels are closer to the retina, higher levels are closer to movement-output or memory
5
Q
  • biological intelligence → artificial intelligence
  • can we copy the architecture of biological intelligence to build information-processing systems?
A
  • real neurons are too varied and too complicated to copy fully
  • a good model is simple, but retains the basic characteristics of information processing
  • this abstraction ignores the biological complexities but keeps the essence of how neurons map inputs to outputs
6
Q

how signals arrive and are processed in neurons

A
  • signals arrive in dendrites
  • inputs can be excitatory (encourage firing) or inhibitory (suppress activity)
  • this leads to EPSPs and IPSPs
  • the neuron integrates all incoming EPSPs and IPSPs in the soma to decide whether to fire an action potential
7
Q

excitatory post-synaptic potentials (EPSPs)

A

increase the likelihood that the neuron will fire an action potential

8
Q

inhibitory post-synaptic potentials (IPSPs)

A

decrease the likelihood that the neuron will fire an action potential

9
Q

dendritic mechanisms

A
  1. spatial summation
  2. temporal summation
  3. excitation vs inhibition
  4. attenuation
  5. integration at the soma
10
Q

dendritic mechanism: spatial summation

A
  • EPSPs combine across space
  • meaning that multiple EPSPs from different synapses on the dendritic tree can combine as they travel toward the soma
11
Q

dendritic mechanism: temporal summation

A
  • EPSPs combine across time
  • EPSPs from the same synapse can combine if they arrive in quick succession
12
Q

dendritic mechanism: excitation vs inhibition

A
  • EPSPs and IPSPs interact in the dendritic tree
  • so, IPSPs can cancel EPSPs
13
Q

dendritic mechanism: attenuation

A
  • potential changes attenuate as they travel from dendrites to soma
  • potentials lose strength due to the physical properties of the dendrites (e.g., resistance).
  • the farther the synapse is from the soma, the weaker its signal when it arrives.
14
Q

dendritic mechanism: integration at the soma

A
  • action potentials are initiated at the soma
  • the soma integrates all incoming signals (spatially and temporally summed EPSPs and IPSPs).
  • if the combined signal reaches a threshold, the neuron generates an action potential, which travels down the axon to communicate with other neurons
15
Q

What is the core computational principle that the neuron implements?

A
  • input-output transform
  1. takes multiple incoming signals (inputs) from dendrites
  2. processes (sums) these signals in the soma
  3. produces an output (an action potential) that travels down the axon to other neurons
16
Q

what do we throw away to model neurons

A
  1. dynamics
  2. temporal integration
  3. spatial complexity
17
Q

similarity between a biological neuron and perceptron

A

both transform input to output

18
Q

what does a perceptron do

A
  1. takes real valued inputs
  2. scales each input by a (synaptic) weight
  3. integrates the inputs by calculating the sum of the weighted inputs (results in a dot product)
  4. passes this through an activation function that compares activation (dot product) to a threshold θ
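A minimal sketch of these four steps in Python (the inputs, weights, and threshold below are illustrative values, not from the lecture):

```python
import numpy as np

def perceptron(x, w, theta):
    """Weighted sum of the inputs compared against a threshold theta."""
    a = np.dot(w, x)              # steps 2-3: scale each input and sum (dot product)
    return 1 if a > theta else 0  # step 4: threshold activation

# illustrative example with two real-valued inputs
x = np.array([0.8, 0.3])
w = np.array([1.0, 1.0])
print(perceptron(x, w, theta=0.5))  # -> 1, because 1.1 > 0.5
```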
19
Q

perceptron in one sentence

A

computes a weighted sum of inputs and compares it to a threshold to produce a binary output

20
Q

perceptron threshold

A
  • if the weighted sum of the input exceeds the threshold, the output is 1
  • otherwise it is 0
21
Q

perceptron weights

A

control how important each input is

22
Q

perceptron bias

A

adjusts the flexibility or strictness of the decision boundary (threshold)

23
Q

similarity between a weight and a synapse

A

the value of a weight is analogous to the strength of a synapse in the dendritic tree of a biological neuron

24
Q

similarity between an activation function and a neuronal mechanism

A

the decision to fire or not

25
Q

what type of model is the perceptron

A
  • a classifier that decides whether some pattern of inputs is present or not
  • gives a binary output and can only solve linearly separable problems
26
Q

what determines the decision of a perceptron

A

the weights (parameters)

27
Q

similarity between a perceptron and a receptive field

A
  • a biological neuron fires if there is a specific orientation (pattern) in the input
  • a perceptron fires when there is a specific pattern present in the input data
28
Q

possible functions of a perceptron

A
  1. boolean OR function
  2. boolean AND function
  3. emphasize one input over another
29
Q

boolean OR function

A
  • either x_1 or x_2 or both are activated
  • we set w_0 = θ
  • a low θ (0.5) ensures any positive input triggers a response
30
Q

boolean AND function

A
  • both x_1 and x_2 are activated
  • we set w_0 = θ
  • a higher θ (1.5) ensures a trigger only when both inputs are positive (stricter)
31
Q

emphasize one over the other

A
  • when you want to know what happens in x_1 but not necessarily in x_2
  • we set w_0 = θ and adjust the other weights
  • increase the weight of the more critical input while keeping a suitable threshold
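Assuming the thresholded perceptron sketched earlier, the OR, AND, and "emphasize one input" cases from the last three cards can be written out with the θ values given; the emphasis weights below are an illustrative choice:

```python
import numpy as np

def perceptron(x, w, theta):
    return 1 if np.dot(w, x) > theta else 0

inputs = [np.array(p) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]

# OR: low threshold, so any positive input triggers a response
print([perceptron(x, np.array([1, 1]), theta=0.5) for x in inputs])      # [0, 1, 1, 1]

# AND: higher threshold, so only both inputs together exceed it
print([perceptron(x, np.array([1, 1]), theta=1.5) for x in inputs])      # [0, 0, 0, 1]

# emphasize x_1: a larger weight on x_1 lets it cross the threshold alone
print([perceptron(x, np.array([2.0, 0.5]), theta=1.5) for x in inputs])  # [0, 0, 1, 1]
```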
32
Q

multidimensional perceptron

A
  • a perceptron that handles inputs with more than two dimensions
  • e.g., a 10×10 image gives 100 inputs x_i
33
Q

How is a weight matrix structured in a multidimensional perceptron?

A
  • The weight matrix W_ji maps inputs to outputs.
  • For example, in a 10x10 image with 26 outputs, W_ji would be a 26×100 matrix where each weight determines the contribution of a pixel to a specific output.
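A minimal sketch of this example (100 pixel inputs, 26 outputs), assuming the output activations are simply a matrix-vector product:

```python
import numpy as np

n_inputs, n_outputs = 100, 26             # 10x10 image, 26 output neurons (e.g. letters)
W = np.random.randn(n_outputs, n_inputs)  # W_ji: weight from pixel i to output neuron j
x = np.random.rand(n_inputs)              # a flattened 10x10 image

a = W @ x            # activations of all 26 output neurons at once
print(a.shape)       # (26,)
print(W[3, 57])      # W_{3,57}: weight from input pixel 57 to output neuron 3
```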
34
Q

What does W_{3,57} represent in the weight matrix?

A

It is the weight connecting input pixel 57 to output neuron 3 (which could represent, for example, the letter “C”).

35
Q

training a (multidimensional) perceptron

A
  • done through gradient descent
  • adjusts weights by minimizing a cost function like MSE that quantifies how wrong the model is
36
Q

gradient descent: activation function

A
  • sigmoid function
  • because it is smooth and differentiable, mapping inputs to a range between 0 and 1, making it suitable for calculating gradients during optimization
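A sketch of the sigmoid and its derivative; the derivative g′(a) = g(a)(1 − g(a)) is what the update rule in the following cards uses:

```python
import numpy as np

def sigmoid(a):
    """Smooth, differentiable squashing of the activation into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_deriv(a):
    """g'(a) = g(a) * (1 - g(a)), used when computing the gradient."""
    g = sigmoid(a)
    return g * (1.0 - g)
```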
37
Q

What is the goal of gradient descent in training a perceptron?

A

To minimize the error E by adjusting weights w_ji in the direction that reduces the error.

38
Q

How is the change in weight Δw_ji calculated during gradient descent?

A
  • change in weight w_ji (connecting input i to output j) = −(learning rate) × (derivative of the activation function at a_j) × (error at the output neuron) × (input from the connected neuron)
  • Δw_ji = −η ⋅ g′(a_j) ⋅ (y_j − t_j) ⋅ x_i
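A small worked example of this formula; the learning rate, activation, target, and input are made-up numbers for illustration:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

eta = 0.1           # learning rate (eta)
x_i = 1.0           # input from the connected neuron
a_j = 0.5           # activation (weighted sum) at output neuron j
y_j = sigmoid(a_j)  # output, about 0.62
t_j = 1.0           # target

g_prime = y_j * (1.0 - y_j)              # g'(a_j), about 0.24
dw = -eta * g_prime * (y_j - t_j) * x_i  # about +0.009
print(dw)  # positive: the output was too low, so the weight is increased
```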
39
Q

gradient descent: a weight from j to i is updated, proportional to

A
  1. η: step size
  2. (y-t): how wrong the output of the neuron was
  3. g’(a): how much a weight change will affect the output
  4. x: whether there was any input at all
40
Q

gradient descent: increasing or decreasing the weight

A
  • if y_j > t_j: decrease the weight
  • if y_j < t_j: increase the weight
41
Q

gradient descent: delta rule

A
  • the local error contribution
  • δ_j = g′(a_j) ⋅ (y_j − t_j)
  • similar to the error term used in both reinforcement learning and fitting algorithms
42
Q

gradient descent: update recipe

A
  1. present the input, compute the output y = g(w·x)
  2. compare the output to the target to compute the error
  3. the ‘local contribution’ to the error is the δ of the node: δ_j = g′(a_j) ⋅ (y_j − t_j)
  4. use δ_j and x_i to update the weight slightly: Δw_ji = −η ⋅ δ_j ⋅ x_i
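The recipe as a complete training loop, under the same assumptions (sigmoid activation, a single output neuron, and the OR function as made-up training data; the bias is handled as a constant input x_0 = 1):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# made-up training data: the OR function, with a constant bias input x_0 = 1
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
T = np.array([0, 1, 1, 1], dtype=float)

w = np.zeros(3)
eta = 0.5

for epoch in range(2000):
    for x, t in zip(X, T):
        a = np.dot(w, x)                 # 1. present input, compute activation
        y = sigmoid(a)                   #    output y = g(w·x)
        delta = y * (1.0 - y) * (y - t)  # 3. local error: delta = g'(a) * (y - t)
        w -= eta * delta * x             # 4. delta_w = -eta * delta * x

print(np.round([sigmoid(np.dot(w, x)) for x in X], 2))  # outputs approach [0, 1, 1, 1]
```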
43
Q

perceptron problem

A

A perceptron cannot solve problems that are not linearly separable, such as the Boolean XOR function

44
Q

what is XOR

A
  • the output is 1 only when one input is 1 and the other is 0.
  • when both inputs are 0 or both are 1, the output is 0.
45
Q

Why can’t perceptrons solve the XOR problem?

A
  • XOR requires separating data points in a way that a single line cannot achieve
  • e.g., (0,0) and (1,1) belong to one class (y=0), while (0,1) and (1,0) belong to the other class (y=1)
  • these points cannot be divided into two groups by a single straight line, because the two classes sit at diagonally opposite corners
46
Q

What is the solution to the perceptron’s inability to solve XOR?

A
  • adding a hidden layer that introduces additional intermediate nodes
  1. e.g., h1 = ‘either input active’ (OR) and h2 = ‘both inputs active’ (AND), with the output responding to h1 but not h2; this transforms the input into a space where XOR is linearly separable (a hand-wired sketch follows below)
  2. adding a hidden layer with alternative configurations also works (e.g., h1 responds to x1 but not x2, h2 responds to x2 but not x1, and the output responds to either h1 or h2)
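A hand-wired sketch of configuration 1 (h1 ≈ OR, h2 ≈ AND, output fires for h1 but not h2); the particular weights and thresholds are illustrative choices, not the only ones that work:

```python
def step(a, theta):
    return 1 if a > theta else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2, 0.5)    # hidden node 1: "either" (OR)
    h2 = step(x1 + x2, 1.5)    # hidden node 2: "both" (AND)
    return step(h1 - h2, 0.5)  # output: responds to h1 but not h2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))  # outputs 0, 1, 1, 0
```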
47
Q

What is the role of intermediate nodes in hidden layers?

A

They transform inputs into a higher-dimensional feature space where the problem becomes linearly separable.

48
Q

Multi-layer networks

A
  • are universal function approximators, not just classifiers
  • with the right configuration (architecture and weights) they can approximate any input-output mapping
  • the problem now is to find the right configuration