14. Interpreting the Theory Flashcards
features of rewarding stimuli
Wolfram Schultz (1997, 1998)
- midbrain dopamine release represents a reward prediction error
- rewarding stimuli have 3 basic functions (2 important):
1. rewards elicit approach and consummatory behaviour and serve as goals of voluntary behaviour - interpret ongoing behaviour, change priorities of behavioural actions, and bias future behaviour
2. rewards have positive reinforcing effects, which increase the frequency and intensity of behaviour leading to such objects
3. rewards induce subjective feelings of pleasure (hedonia) and positive emotional states
learning happens when rewards occur unpredictably and slows as a reward becomes more predictable
- reward driven learning depends on the “error” between prediction of the reward and its occurrence (“coming back for more”)
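The slowing of learning as a reward becomes predicted can be sketched with a simple delta-rule update. This is only an illustration: the update rule (Rescorla-Wagner style), the function name `simulate_learning` and the learning rate `alpha = 0.3` are assumptions, not part of the notes.

```python
def simulate_learning(n_trials=10, reward=1.0, alpha=0.3):
    """Return the prediction error on each trial for a cue that is always rewarded.

    alpha is a hypothetical learning rate chosen for illustration.
    """
    prediction = 0.0  # no reward is expected before learning
    errors = []
    for _ in range(n_trials):
        error = reward - prediction   # reward occurred - reward predicted
        errors.append(error)
        prediction += alpha * error   # the size of the update scales with the error
    return errors


errors = simulate_learning()
# The error is largest on the first (unpredicted) reward and shrinks on every trial,
# so each later update is smaller: learning slows as the reward becomes predicted.
```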
Raster Plots
- represents the firing of neurons in time
- horizontal row = trial number (stimulus presentation)
- each dot/bar represents a neuron firing
- trials are aligned with stimulus onset
- electrodes placed at the level of the VTA (where DA neurons fire) record action potentials (APs)
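How a raster is built from recorded spikes can be sketched as follows. The function name `align_to_onset`, the window and all spike and onset times (in ms) are made-up toy values for illustration, not data from any study.

```python
def align_to_onset(spike_times_ms, onset_times_ms, window_ms=(-500, 1000)):
    """Build one raster row per trial, with spike times relative to stimulus onset."""
    rows = []
    for onset in onset_times_ms:
        # Keep only the spikes that fall inside the window around this trial's onset.
        row = [t - onset for t in spike_times_ms
               if window_ms[0] <= t - onset <= window_ms[1]]
        rows.append(row)
    return rows


# Toy recording: 5 spikes and 2 stimulus presentations (times in ms).
spikes = [900, 1200, 1250, 3100, 3300]
onsets = [1000, 3000]
raster = align_to_onset(spikes, onsets)
# Each inner list is one horizontal row of the raster; 0 marks stimulus onset.
```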
learning
= comparing received reward with expected reward
- if an action produces an unpredicted negative outcome, it is unlikely to be repeated
- no further learning takes place when a reward is predicted by sensory cues
- fits both operant and classical conditioning
- but is this reflected in DA neurons?
extracellular electrophysiological activity in monkeys
Schultz et al. (1997) = raster plots
- recorded from midbrain DA neurons (VTA and Substantia nigra [SN]), recording extracellular activity whilst monkeys performed behavioural tasks
- naive monkey required to touch lever following presentation of a light
- touching the lever delivers juice
- after a few days, monkey learns to reach for lever as soon as the light comes on
- used raster plots of VTA recordings (to measure DA neuron APs)
Schultz et al (1997) - before learning
- drop of liquid occurs which isn’t predicted
- positive reward prediction error = DA neurons (in VTA) are activated; raster plots show a burst of DA firing after the reward
Schultz et al (1997) - after learning
- conditional stimulus predicts a reward
- no reward prediction error at the time of the reward; the DA neuron fires after the reward-predicting stimulus instead (CS = light)
- DA neurons do not fire after the presentation of a reward (juice)
Schultz et al (1997) - after learning (no reward)
- there is tonic firing of DA neurons
- if the reward fails to occur there is a negative reward prediction error
- DA neurons fire after CS (reward predicting stimulus - light)
- but their firing is depressed at the time the reward would have been given
dopamine response =
reward occurred - reward predicted
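The subtraction above can be applied to the three situations in the Schultz et al. (1997) task. This is a minimal sketch: coding a reward as 1.0 and no reward as 0.0 is an assumption made for illustration, and `reward_prediction_error` is a hypothetical helper name.

```python
def reward_prediction_error(reward_occurred, reward_predicted):
    """dopamine response = reward occurred - reward predicted"""
    return reward_occurred - reward_predicted


cases = {
    # before learning: juice arrives but was not predicted
    "unpredicted reward": reward_prediction_error(1.0, 0.0),
    # after learning: the light predicts the juice and the juice arrives
    "fully predicted reward": reward_prediction_error(1.0, 1.0),
    # after learning: the light predicts the juice but none is given
    "predicted reward omitted": reward_prediction_error(0.0, 1.0),
}
# Positive error = activation, zero = no change, negative = depression of firing.
```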
Schultz et al (1997) results
- the reward prediction error (RPE) acts as a teaching signal that is used to correct inaccurate predictions
- unpredicted reward (or something that's better than expected) = positive prediction error = behaviour is strengthened
- presentation as predicted = no prediction error = no new learning (DA fires after light - reward predicting stimulus)
- omission of predicted outcome = negative prediction error = extinction of behaviour
RPE and Blocking
- conditioning to a new CS is impaired if it is presented together with a second CS that has already been associated with the reward
- CS1 = fully predicts arrival of the reward
- CS2 = no change in RPE = no learning
- a blocked stimulus does not become associated with reward but an unblocked stimulus does
- once an association has been made between reward and CS: when the predicted reward isn't presented there is a depression in DA firing; when the reward is given there is no DA response after it
- RPE = DA release = Learning (same thing when there is no reward)
- Enomoto et al (2011)
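Blocking can be sketched with the same kind of error-driven update, where all cues present on a trial share one prediction error. The update rule (Rescorla-Wagner style), the function name `train`, the learning rate and the control cue labels CS3 and CS4 are assumptions for illustration, not taken from the notes.

```python
def train(weights, cues, n_trials, reward=1.0, alpha=0.3):
    """Update the associative strength of each presented cue from a shared error."""
    for _ in range(n_trials):
        prediction = sum(weights[c] for c in cues)  # combined prediction of all cues shown
        error = reward - prediction                 # reward prediction error
        for c in cues:
            weights[c] += alpha * error
    return weights


# Blocking: CS1 is trained alone first, then CS1 and CS2 are presented together.
blocked = train({"CS1": 0.0, "CS2": 0.0}, ["CS1"], 50)
blocked = train(blocked, ["CS1", "CS2"], 50)

# Control (unblocked): two new cues are presented together from the start.
control = train({"CS3": 0.0, "CS4": 0.0}, ["CS3", "CS4"], 50)
# CS1 already predicts the reward, so the error is near zero and CS2 gains almost nothing;
# in the control there is an error to learn from, so CS4 does gain strength.
```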
Enomoto et al (2011)
- stepwise transfer
- if multiple actions are needed for the reward the DA response suggests that each bit is learnt at a time
- midbrain neurons transfer the right response to other stimuli (e.g. CS2)
RPE and risk taking
- many naturalistic rewards have elements of risk
- Stauffer et al (2014) (gambling)
Stauffer et al (2014)
- risk in lab measured with binary gambles
- monkey cued to make reward based decision
1. 50:50 gamble large:small amount of juice
2. guaranteed smaller amount of juice
- when risks are low monkeys gamble for big rewards, when risks are high they take small rewards
- midbrain DA response the same, successful gambles produce more positive RPE
- higher risk gambles are associated with:
1. high midbrain DA activity associated with cue
2. reduced midbrain DA activity at time of reward
- evolutionary advantage to gamble when risk is low but not when it is high
- risk management = encoded by a reduction of DA activity after a high-risk gamble
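Why a successful gamble gives a more positive RPE can be sketched by comparing each outcome with the gamble's expected value. The juice amounts (arbitrary units), the function name `gamble_outcome_rpes` and the simplification that value equals juice volume are all assumptions for illustration, not figures from Stauffer et al. (2014).

```python
def gamble_outcome_rpes(large, small, p_large=0.5):
    """Return the expected value of a binary gamble and the RPE for each outcome."""
    expected = p_large * large + (1 - p_large) * small
    return {
        "expected": expected,
        "win": large - expected,   # reward occurred - reward predicted, large outcome
        "loss": small - expected,  # reward occurred - reward predicted, small outcome
    }


# Hypothetical 50:50 gamble between 4.0 and 2.0 units of juice.
rpes = gamble_outcome_rpes(large=4.0, small=2.0)
# Winning the gamble gives a positive error, losing it gives a negative one.
```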
what are higher risk gambles associated with
1. high midbrain DA activity associated with cue
2. reduced midbrain DA activity at time of reward