Adaptation in embodiment: The role of learning in intelligence Flashcards
What is adaptation in embodied agents? Why do we need it?
An embodied agent is an intelligent agent that interacts with the environment through a physical body within that environment.
Adaptation is the agent’s ability to learn from past experience and adjust its behaviour accordingly.
Baby example: Learning from falling
How can we add/what techniques can we use for adaptation in embodied agents?
Use rewards: Positive and negative rewards are the main way to teach a system how to perform an action without explicitly telling it how to do it.
Example: A baby’s progression of learning: Babies don’t know how to walk. There is no direct feedback (possible reward signals: a smile or a clap). The main feedback comes when the baby falls (pain) => goal = avoid pain => learn how not to fall (i.e., how to walk).
What are the basic paradigms for learning?
- Supervised learning
- Unsupervised Learning
- Reinforcement Learning
What is supervised/unsupervised/reinforcement learning? Can you give an example (in either biological or artificial agents)?
- Supervised Learning:
○ Relies on a “teacher” = Labeled data
○ Map labelled data to known output
○ Regression and classification
- Unsupervised Learning:
○ Unlabeled data
○ Identify patterns in data
○ Data clustering
- Reinforcement Learning:
○ Relies on the “designer”: No pre-defined data
○ Learn to minimise a cost function via trial-and-error
○ Video game AI, sensorimotor skill learning
Example: Reinforcement Learning: Baby:
○ Doesn’t know how to walk. No direct feedback (possible reward signal: a smile or a clap). The main feedback is when it falls (pain) => goal = avoid pain => learn how not to fall (i.e., how to walk).
○ Positive and negative rewards are the main way to teach a system how to perform an action without explicitly telling it how to do it.
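A minimal sketch of this idea in code, using a tiny Q-learning-style update on a made-up two-action task (the actions, rewards and constants are illustrative assumptions, not from the lecture):

```python
import random

# Reward-driven trial-and-error learning on a toy "stand vs. fall" task.
q = {"stand": 0.0, "fall": 0.0}    # estimated value of each action
alpha, epsilon = 0.1, 0.2          # learning rate and exploration rate (assumed values)

for trial in range(200):
    # explore occasionally, otherwise pick the action currently believed best
    action = random.choice(list(q)) if random.random() < epsilon else max(q, key=q.get)
    reward = 1.0 if action == "stand" else -1.0   # falling hurts, standing is rewarded
    q[action] += alpha * (reward - q[action])     # nudge the estimate towards the reward

print(q)   # the agent comes to prefer "stand" purely from reward feedback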
What is a neuron? What does it do?
A basic biological learning unit.
Neurons receive electrical signals as inputs from other neurons, integrate them together and send resultant electrical signals as outputs to other neurons.
What is a synapse? What is its role?
A structure that permits a neuron to pass information to another neuron (the two neurons are not in direct physical contact).
It controls the flow of the electrical signal from the pre-synaptic neuron to the post-synaptic neuron (synapses = where learning happens).
What is Hebbian Learning? What are the drawbacks of hebbian learning?
Hebbian Learning:
○ Mathematical model of correlation-based learning: when the input and output neurons are active together, the connection between them is strengthened (“neurons that fire together, wire together”).
○ Weight, w, represents the synapse
○ Learning = update/change the weight term, w
○ Most fundamental idea in ML: the input is multiplied by a scalar (the weight) and forwarded to the next neuron.
Drawbacks:
○ The change in synaptic weight is always positive => the weight only ever increases => unstable (value overflow => what was learned is lost)
○ Unsupervised algorithm - no way to know when to stop
○ Only considers scalar terms, meaning we do not consider how the signals change over time (the dynamics of the signal are ignored)
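A minimal sketch of the plain Hebbian rule and its instability, with assumed values for the learning rate and signals:

```python
import numpy as np

# Plain Hebbian rule: delta_w = eta * x * y. The update is never negative,
# so the weight only grows -- the instability described in the drawbacks above.
eta = 0.1                     # learning rate (assumed)
w = 0.5                       # single synaptic weight
for step in range(500):
    x = 1.0                   # pre-synaptic activity (kept constant for simplicity)
    y = w * x                 # post-synaptic activity
    w += eta * x * y          # Hebbian update: always >= 0, so w never decreases
print(w)                      # w has exploded; with enough steps it would overflow
```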
Why is Hebbian learning unstable?
- Unsupervised => don’t know when to stop the algorithm
- Weight always increases => overflow of stored value (value becomes too large to store in a variable on a computer)
What is differential Hebbian learning? Why does it use derivatives?
- Model to capture timing information between input and output.
- Uses the derivatives of the signals to detect the temporal difference between them (the derivative tells the rate of change)
- By considering the dynamics as well as the statics => can capture what we see in biological neurons as a learning rule.
Models:
- Kosko model: symmetric in time. Problem: it only considers derivatives, so it cannot tell which signal came first (it loses the timing information).
- Porr-Wörgötter model: uses the scalar term (the input) and the dynamics (the derivative of the output). Asymmetric in time, so it is more accurate than the Kosko model: the weight change is positive when the timing difference is positive, and negative when the timing difference is negative.
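A hedged sketch contrasting the two rules: the Kosko-style update multiplies the two derivatives (symmetric in time), while the Porr-Wörgötter-style update multiplies the scalar input by the derivative of the output (asymmetric in time). The signal shapes and constants below are illustrative assumptions:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)
dt = t[1] - t[0]
x = np.exp(-((t - 0.40) ** 2) / 0.002)   # input pulse (earlier)
y = np.exp(-((t - 0.45) ** 2) / 0.002)   # output pulse (slightly later)

dx = np.gradient(x, dt)
dy = np.gradient(y, dt)

eta = 1e-4
# Kosko: product of derivatives -> swapping x and y gives the same result (symmetric in time)
dw_kosko = eta * np.sum(dx * dy) * dt
# Porr-Wörgötter style: scalar input times derivative of output -> positive when the input
# precedes the output, negative when it lags behind (asymmetric in time)
dw_pw = eta * np.sum(x * dy) * dt

print(dw_kosko, dw_pw)
```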
What is a perceptron? How does it work?
- A basic artificial learning unit
- Most basic neuron you can design in neural networks
- No timing information, no cross-correlation, no dynamics => as a learning algorithm, the perceptron only works on static data (it does not consider dynamics, unless you explicitly feed it the derivative of a signal as an input)
- How does it work:
○ Each input goes through one synapse (weight) => weighted input
○ Integration: sum of weighted inputs => value after integration = weighted sum
○ Thresholding: pass the activation value, z, through an activation function (e.g., sigmoid, ReLU)
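A minimal forward pass following the three steps above; the weights, inputs, bias and the choice of a sigmoid activation are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.1, 0.4, -0.3])   # one weight per synapse
b = 0.05                         # bias term (assumed)

z = np.dot(w, x) + b             # integration: weighted sum
v = sigmoid(z)                   # thresholding: activation function applied to z
print(v)
```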
What is gradient descent?
- Update each weight in proportion to its “contribution” to the squared error E between correct output t and current output v
○ Contribution is defined/measured as the amount of change in error for a given change of the weight (the derivative of the error with respect to the weight).
- Goal: reach the minimum (lowest possible) error
- Process:
○ Calculate gradient
○ Update weights
- Supervised learning: to determine the error, we need to know the correct output for each possible input in advance.
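One gradient-descent step for a single sigmoid perceptron with squared error E = ½(t − v)², written out as a sketch (the names x, w, t and eta are assumed for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])     # input
w = np.array([0.1, 0.4, -0.3])     # current weights
t = 1.0                            # correct (target) output, known in advance -> supervised
eta = 0.5                          # learning rate (assumed)

v = sigmoid(np.dot(w, x))          # current output
# dE/dw_i = -(t - v) * v * (1 - v) * x_i   (chain rule through the sigmoid)
grad = -(t - v) * v * (1.0 - v) * x
w = w - eta * grad                 # move each weight against its contribution to the error
```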
How is learning in a perceptron different than learning in a biological neuron?
- Perceptron: Learning happens when the input weights are changed/updated.
○ The perceptron as a learning algorithm only works on static data (it does not consider dynamics, unless you specify the derivative as the input instead; no timing information).
- Neurons: Learning in neurons is driven by correlated firing and the timing of these firings (timing = elapsed time between firings)
○ The order of the neurons’ spikes determines whether the synapse gets stronger or weaker (a binary outcome)
○ δt>0⇒ Synapse gets stronger
○ δt<0⇒ Synapse gets weaker
○ The elapsed time between the firings of two neurons determines how much the connection between them (the synapse) will be strengthened/weakened
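A hedged sketch of a spike-timing-dependent weight update matching the rule above: δt > 0 strengthens the synapse, δt < 0 weakens it, and the size of the change shrinks as the elapsed time grows. The constants A and tau are assumed:

```python
import numpy as np

def stdp_update(dt, A=0.01, tau=20.0):
    """Weight change for a pre/post spike pair separated by dt (ms); A, tau are assumed."""
    if dt > 0:                        # pre fires before post -> strengthen
        return A * np.exp(-dt / tau)
    else:                             # post fires before pre -> weaken
        return -A * np.exp(dt / tau)

for dt in (5.0, 20.0, -5.0, -20.0):
    print(dt, stdp_update(dt))        # larger |dt| -> smaller change in either direction
```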
What is an artificial/deep neural network? What are their advantages and disadvantages?
- Network of perceptrons arranged in layers
- Fully connected architecture: every neuron in one layer is connected to every neuron in the adjacent layers
- Mapping: connect input to output
- ANNs are universal approximators:
○ Never give an exact solution = At best, give a good approximation
○ Only thing network cares about is input and output
○ Black box systems: we don’t know what happens inside the system and we don’t care, as long as the mapping between the input and output is correct.
- Advantages:
○ Given enough training data, they can learn any mapping between input and output if a solution exists.
- Disadvantages:
○ The biggest bottleneck in neural networks is generating enough training data
○ Need data that consist of correct input and correct output.
○ Needs a lot of time to be trained compared to the human brain. Example: self-driving cars vs. human drivers.
- Learning in ANNs/DNNs: use gradient descent to update the weights
a. Calculate the squared error at the output neurons and propagate it back to the neurons in the hidden layers
b. Update the weights of all output neurons and all hidden-layer neurons
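A hedged sketch of steps (a) and (b) for a tiny fully connected network (one hidden layer, sigmoid activations) trained by gradient descent on the XOR mapping. The layer sizes, learning rate and epoch count are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # correct (target) outputs

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # input -> hidden weights (fully connected)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # hidden -> output weights
eta = 1.0                                        # learning rate (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    H = sigmoid(X @ W1 + b1)           # hidden layer activations
    V = sigmoid(H @ W2 + b2)           # network output
    # (a) error at the output, propagated back to the hidden layer
    dV = (V - T) * V * (1 - V)
    dH = (dV @ W2.T) * H * (1 - H)
    # (b) update all weights (and biases) by gradient descent
    W2 -= eta * H.T @ dV; b2 -= eta * dV.sum(axis=0)
    W1 -= eta * X.T @ dH; b1 -= eta * dH.sum(axis=0)

print(np.round(V, 2))   # should approximate the XOR targets (exact values depend on the seed)
```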