Lecture 9 - Adaptive Control Flashcards
most ____ skills improve with practice
sensorimotor
Proof that VOR learns to adjust
VOR restores retinal-image stability when vision is altered by lenses
- magnifying glasses: VOR grows stronger in ~30 min
- minifying glasses: VOR grows weaker
- glasses that flip the world upside down: VOR takes about a week to adjust (VOR changes direction)
what happens to saccades when eye muscle is damaged?
saccades miss target and drift
if eye muscle damage isn’t too severe, what happens to saccades?
neural adaptation restores saccade accuracy and eliminates drift, even though the muscle is still damaged
controller must know plant well, but…
we do not have to be born with accurate knowledge of our plant
- the plant changes throughout life
- the controller learns it
How do control networks adjust themselves to improve performance?
- error-driven learning
- Learning by perturbation
- Gradient-descent learning
controllers learn the properties of their plants based on…
sensory feedback (learn from trying, examples…)
what is the aim of learning?
minimize average error (aka risk, aka expected loss)
error (e) =
y - y*
loss (L) =
|e|²/2
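A minimal Python sketch of these two definitions (the variable names y and y_star are assumptions, not from the lecture):

```python
# Error and loss for one trial: e = y - y*, L = |e|^2 / 2.
def error(y, y_star):
    return y - y_star

def loss(e):
    return abs(e) ** 2 / 2

e = error(1.3, 1.0)   # actual output 1.3, desired 1.0
print(loss(e))        # 0.045
```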
What does the learner want?
minimize the error e and the average loss E(L)
want both of these to go to 0
risk depends on…
probabilities of different situations (not every input has the same loss)
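A toy illustration of risk as probability-weighted loss; the situations and numbers here are invented:

```python
# Risk = E(L): weight each situation's loss by how often it occurs.
situations = [  # (probability, typical error in that situation)
    (0.7, 0.1),  # common situation, small error
    (0.3, 1.0),  # rare situation, large error
]
risk = sum(p * abs(e) ** 2 / 2 for p, e in situations)
print(risk)  # 0.7 * 0.005 + 0.3 * 0.5 = 0.1535
```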
how does the brain estimate risk?
Learning by perturbation
learning by perturbation
make small changes to the weights and accept the ones that reduce error;
wi ← wi + η(randomi - 0.5)
If |epert| < |e|, then the perturbed weights are accepted
each decision to accept or reject perturbed weights is based on…but…
a single input z, but over time the neuron samples many inputs and the OVERALL risk is reduced
What is the learning algorithm that samples many weights and calculates the error every time?
weight perturbation
What happens when η is too big? Too small?
big: learn faster, not accurate (can’t zero in)
small: learn slower, very accurate (steps are small)
weight perturbation formula
wi ← wi + η(randomi - 0.5)
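A runnable sketch of weight perturbation for a single linear neuron; the target plant, η, and trial count are invented for illustration:

```python
import random

eta = 0.1
w = [0.0, 0.0]           # adjustable weights
w_true = [0.5, -0.3]     # hypothetical plant the neuron should match

def output(weights, z):
    return sum(wi * zi for wi, zi in zip(weights, z))

for trial in range(5000):
    z = [random.uniform(-1, 1) for _ in w]       # one sampled input
    y_star = output(w_true, z)                   # desired output
    e = output(w, z) - y_star                    # current error
    # perturb: wi <- wi + eta * (random_i - 0.5)
    w_pert = [wi + eta * (random.random() - 0.5) for wi in w]
    e_pert = output(w_pert, z) - y_star
    if abs(e_pert) < abs(e):                     # accept only if error shrank
        w = w_pert

print(w)  # drifts toward w_true; each accept uses one input, risk falls overall
```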
Gradient-descent learning
find the minimum of the loss by stepping downhill on a plot of loss versus the weights
equation for gradient-descent learning
Δw = -ηez
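A matching Widrow-Hoff sketch for the same kind of linear neuron, stepping each weight by Δw = -ηez (setup values are invented):

```python
import random

eta = 0.1
w = [0.0, 0.0]
w_true = [0.5, -0.3]     # hypothetical plant to match

for trial in range(1000):
    z = [random.uniform(-1, 1) for _ in w]
    y_star = sum(wt * zi for wt, zi in zip(w_true, z))
    e = sum(wi * zi for wi, zi in zip(w, z)) - y_star
    w = [wi - eta * e * zi for wi, zi in zip(w, z)]   # delta_w = -eta * e * z

print(w)  # reaches w_true in far fewer trials than perturbation
```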
What is Widrow-Hoff learning?
Gradient-descent learning
Advantage of Widrow-Hoff learning
computes which way to go instead of guessing (as perturbation learning does)
Why is WH learning faster?
exploits knowledge of the network: the neuron is linear, so the error is a linear function of the neuron’s output signal
learning rate of WH depends on …
η
too big: unstable, may overstep low ground
too small: slow
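A one-dimensional illustration of the η trade-off; the loss L = w²/2 and the η values are made up for the demo:

```python
# Gradient step for L = w^2 / 2 is w <- w - eta * w.
for eta in (0.1, 1.9, 2.1):
    w = 1.0
    for _ in range(20):
        w -= eta * w
    print(eta, w)
# eta = 0.1: creeps slowly toward 0
# eta = 1.9: oscillates across the minimum but still shrinks
# eta = 2.1: unstable; each step oversteps the low ground and error grows
```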
how to learn nonlinear functions?
- weight perturbation (but too slow!!)
- error-backpropagation (generalized WH)
backprop
gradient-descent algorithm that trains layered networks of non-linear cells;
each cell in each layer computes its own ∂L/∂w and sends info upstream;
given appropriate signals from all its downstream cells, a neuron can compute its own ∂L/∂w
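A minimal backprop sketch: one tanh hidden layer and a linear output cell trained by gradient descent on L = e²/2. The network size, target function, and constants are invented:

```python
import math, random

eta, n_in, n_hid = 0.1, 2, 4
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
W2 = [random.uniform(-1, 1) for _ in range(n_hid)]

def train_step(z, y_star):
    global W2
    h = [math.tanh(sum(w * x for w, x in zip(row, z))) for row in W1]
    y = sum(w2j * hj for w2j, hj in zip(W2, h))
    e = y - y_star                              # dL/dy for L = e^2/2
    for j in range(n_hid):
        back = e * W2[j] * (1 - h[j] ** 2)      # signal sent upstream to cell j
        for i in range(n_in):
            W1[j][i] -= eta * back * z[i]       # cell j's own dL/dw
    W2 = [w2j - eta * e * hj for w2j, hj in zip(W2, h)]

for trial in range(5000):
    z = [random.uniform(-1, 1), random.uniform(-1, 1)]
    train_step(z, z[0] * z[1])                  # a nonlinear target function
```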
argument against backprop
takes precise communication between layers to compute ∂L/∂w for synapses deep in the network → possibly not possible in the brain?
what came about from backprop being too complex?
shallow learning
shallow learning
- linear output cells
- outer layer of neurons is adjustable
- all other synapses are frozen (weights don’t change)
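A sketch of one plausible reading of shallow learning: a large random hidden layer whose synapses stay frozen, with Widrow-Hoff acting only on the linear output weights. Sizes and the target function are invented:

```python
import math, random

eta, n_in, n_hid = 0.05, 2, 50
# frozen random hidden synapses (never updated)
W1 = [[random.uniform(-2, 2) for _ in range(n_in)] for _ in range(n_hid)]
w_out = [0.0] * n_hid        # only the outer layer is adjustable

def hidden(z):
    return [math.tanh(sum(w * x for w, x in zip(row, z))) for row in W1]

for trial in range(5000):
    z = [random.uniform(-1, 1), random.uniform(-1, 1)]
    h = hidden(z)
    e = sum(wo * hj for wo, hj in zip(w_out, h)) - z[0] * z[1]
    w_out = [wo - eta * e * hj for wo, hj in zip(w_out, h)]  # WH on output only
```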
curse of dimensionality
number of cells & synapses rises exponentially with the dimension of the function’s input space
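A back-of-envelope illustration: if each input dimension is covered with 10 bins, the number of cells needed to tile the input space grows as 10^d:

```python
for d in (1, 2, 3, 6, 10):
    print(d, 10 ** d)   # 10, 100, 1000, 1000000, 10000000000
```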
how to cope with curse of dimensionality?
- have lots of synapses (seeds, spores, sperm, immune system, granule cells in the cerebellum)
- prior knowledge