College 6: Social Learning (Reinforcement learning) Flashcards
probabilistic outcomes
the same stimulus or action does not always lead to the same outcome
R (reward)
reward, also called outcome (positive or negative feedback)
V (value)
the expected outcome of a stimulus or action
PE Delta
prediction error: the difference between the actual reward and the expected reward, PE = R - V
alpha
learning rate: the speed of learning, i.e. how strongly each prediction error updates the expected value
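The cards above (R, V, PE, alpha) fit together in a simple value-update rule. The sketch below is a minimal illustration, assuming the common Rescorla-Wagner form V <- V + alpha * (R - V); the function name, the 80% reward probability and the two alpha values are hypothetical choices for illustration, not taken from the lecture.

```python
import random


def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Return the expected value V after each observed reward R."""
    v = v0
    values = []
    for r in rewards:
        pe = r - v          # prediction error: PE = R - V
        v = v + alpha * pe  # move V toward R by a fraction alpha of the PE
        values.append(v)
    return values


# Hypothetical probabilistic outcome: the same stimulus is rewarded
# on roughly 80% of trials (assumed value, for illustration only).
rng = random.Random(0)
rewards = [1.0 if rng.random() < 0.8 else 0.0 for _ in range(200)]

slow = rescorla_wagner(rewards, alpha=0.05)  # low learning rate
fast = rescorla_wagner(rewards, alpha=0.5)   # high learning rate
print(round(slow[-1], 2), round(fast[-1], 2))
```

With a low alpha, V changes slowly and averages over many trials; with a high alpha, V tracks the most recent outcomes and fluctuates more.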
dopamine
studied in monkeys and dogs
In Pavlovian conditioning (a dog being offered food), dopamine responds to expectations rather than to the actual outcome. Dopamine activity increases when the dog is offered unexpected food.
Once the association is learned, the dopamine neurons spike when the bell rings.
When the expected food is not given, dopamine activity drops.
pavlovian conditioning
associated more with the ventral striatum
operant/instrumental conditioning
associated more with the dorsal striatum
social reinforcement learning
Studied by Jones: learning from the feedback of others (i.e. social feedback)
Cohen
Research question: is adolescence associated with unique changes in prediction error signals and expected values?
Probabilistic reinforcement task in the brain scanner in which participants saw abstract stimuli and had to classify them as northern or eastern.
children 8-12
adolescents 14-19
adults 25-30
predictable stimuli (83%)
random stimuli (50%)
with a reward magnitude
striatal prediction error signals peaked in adolescence - more dorsally located
value signals in the mPFC were highest in children
outcome:
- value and PE follow different developmental trajectories
- larger striatal PEs might reflect a greater effect of positive outcomes in adolescents
- might suggest that children learn less efficiently (stronger responses to reward but no PE signal)
- limitation: they studied only positive prediction errors and did not look at learning speed
van den Bos
Two stimuli per pair, for example an apple (A) and a pear (B): choosing the apple gives positive feedback 80% of the time, choosing the pear gives positive feedback 20% of the time.
Stimulus pair AB
A 80%
B 20%
Stimulus pair CD
C 70%
D 30%
the negative learning rate decreases with age: children learn faster from negative feedback than both adolescents and adults, and adolescents learn faster than adults
for positive learning rates it is the other way around
outcome: connectivity between VS (prediction error) and mPFC (reward/outcome) was stronger for positive than for negative prediction errors.
This connectivity became stronger with age
people who learned faster from negative feedback showed less connectivity between VS and mPFC
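The separate learning rates for positive and negative feedback in this card can be sketched as a value update that picks alpha depending on the sign of the prediction error. This is only an illustration under stated assumptions: the names `update_value`, `learn_values`, `alpha_pos` and `alpha_neg` are hypothetical, the starting value of 0.5 and the alpha values are arbitrary, and every stimulus simply receives feedback on every trial (no choice rule is modelled), which is a simplification of the actual task.

```python
import random


def update_value(v, reward, alpha_pos, alpha_neg):
    """One update with separate learning rates for positive and negative PEs."""
    pe = reward - v                            # prediction error: R - V
    alpha = alpha_pos if pe >= 0 else alpha_neg
    return v + alpha * pe


def learn_values(probs, alpha_pos, alpha_neg, n_trials=200, seed=0):
    """Learn the value of each stimulus from probabilistic feedback."""
    rng = random.Random(seed)
    values = {name: 0.5 for name in probs}     # assumed neutral starting value
    for _ in range(n_trials):
        for name, p in probs.items():
            reward = 1.0 if rng.random() < p else 0.0  # positive or negative feedback
            values[name] = update_value(values[name], reward, alpha_pos, alpha_neg)
    return values


# Feedback probabilities from the card: pairs AB (80/20) and CD (70/30).
probs = {"A": 0.8, "B": 0.2, "C": 0.7, "D": 0.3}

# Hypothetical learner with a higher negative than positive learning rate.
print(learn_values(probs, alpha_pos=0.1, alpha_neg=0.3))
```

A learner with a relatively high `alpha_neg` lowers a stimulus's value sharply after a single negative outcome, which is one way to read "learning faster from negative feedback".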
Jones study
Social decision making, social status and sensitivity to social cues change during adolescence, so social reinforcement learning could be important to study in this period.
Participants aged 8-25 completed a personal survey, and peers could give positive feedback either to the participant or to someone else. The researchers manipulated the feedback.
children and adults slowed down their responses when the continuous peer (100% feedback) suddenly stopped giving notes > higher positive learning rates
adolescents showed more activity in the putamen and SMA when receiving positive notes (positive feedback), regardless of their expectations
» adolescents may be more tuned to action-outcome learning
» period of sensitivity towards peer feedback
limitations:
- these brain regions have many more functions
- reaction time to a wink of the left or right eye was used as the measure of learning, but this wasn't necessarily linked to the positive notes
- used only positive feedback, not negative
- other forms of social learning weren't taken into account
- a follow-up study is currently being done
ventral striatum
reward magnitude
reward predictive stimulus
greater reactivity when a reward is different than expected
before learning: VS responds at the time of the reward
after learning: VS responds at the time of the predictive stimulus
- pavlovian conditioning
- learning from feedback
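The shift of the response from the time of the reward (before learning) to the time of the predictive stimulus (after learning) can be reproduced with a small temporal-difference (TD) simulation. TD learning is not named in these cards; it is swapped in here as a standard textbook model of this shift. The function name, the number of time steps, alpha and the assumption that the cue arrives unpredictably (so the pre-cue value stays 0) are all illustrative choices.

```python
def simulate_td(n_trials=200, n_steps=5, alpha=0.2, reward=1.0):
    """Return (PE at cue onset, PE at reward time) for every trial."""
    # values[t] = expected future reward at time step t of a trial;
    # step 0 is the predictive stimulus (cue), the last step delivers the reward.
    values = [0.0] * n_steps
    history = []
    for _ in range(n_trials):
        # Cue onset: transition from the unpredictable pre-cue state
        # (assumed value 0) to the cue state.
        pe_cue = values[0] - 0.0
        pe_reward = 0.0
        for t in range(n_steps):
            r = reward if t == n_steps - 1 else 0.0
            next_value = values[t + 1] if t + 1 < n_steps else 0.0
            pe = r + next_value - values[t]   # TD prediction error
            values[t] += alpha * pe
            if t == n_steps - 1:
                pe_reward = pe                # PE at the time of the reward
        history.append((pe_cue, pe_reward))
    return history


history = simulate_td()
print("first trial:", history[0])
print("last trial: ", tuple(round(x, 3) for x in history[-1]))
```

On the first trial the prediction error sits entirely at the reward; after many trials it sits at the cue instead, and the fully predicted reward no longer produces an error.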
dorsal striatum
responds to reward, especially early in learning and when there is a mapping between actions and outcomes
learning from reward
operant/instrumental conditioning
OFC (part of mPFC)
associated with outcome value
value signals in the mPFC are higher in children