Reward Processing Flashcards
What is reward prediction error?
The difference between the reward obtained and the reward predicted; it is non-zero when the outcome is unpredicted or surprising
What are the consequences of prediction error?
Positive = new learning; zero = no new learning; negative = extinction
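The three cases above can be sketched with a simple delta-rule update. This is only an illustration: the cards do not name a learning rule, and the function names and the learning rate `alpha` are assumptions made for the example.

```python
def prediction_error(obtained, predicted):
    """Reward prediction error: reward obtained minus reward predicted."""
    return obtained - predicted


def update_prediction(predicted, obtained, alpha=0.5):
    """Move the prediction toward the obtained reward by a fraction alpha.

    alpha is a hypothetical learning rate chosen for illustration only.
    """
    return predicted + alpha * prediction_error(obtained, predicted)


# Positive error (reward better than predicted): prediction rises (new learning).
print(update_prediction(predicted=0.0, obtained=1.0))
# Zero error (reward fully predicted): prediction unchanged (no new learning).
print(update_prediction(predicted=1.0, obtained=1.0))
# Negative error (predicted reward omitted): prediction falls (extinction).
print(update_prediction(predicted=1.0, obtained=0.0))
```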
Neurons in which brain region produce dopamine, which helps us encode reward?
Ventral tegmental area (in the midbrain)
Which parts of the frontal lobes are associated with reward processing?
OFC, medial and dorsal PFC
Which regions does the dopaminergic pathway travel through?
Between midbrain, striatum and cortex (ventral tegmentum; OFC; PFC; basal ganglia + nucleus accumbens + globus pallidus in striatum; substantia nigra; down to brain stem)
Dopamine neurons (involved with reward prediction and error detection) project specifically to medial temporal cortex, dlPFC, premotor cortex, parietal cortex, OFC, striatum, and amygdala. What is each of these regions associated with?
Medial temporal - reward detection/prediction; dlPFC, premotor and parietal - goal representation; OFC - relative reward value/reward expectation; striatum - reward detection/goal representation; amygdala - emotions/conditioned effects
What occurs in the brain when performing a sequence of steps leading to a rewarding outcome, such as the process of making a coffee?
Dopamine concentration rises progressively through the sequence; approaching the outcome carries an inherent reward, even though the process may not be rewarding in itself
Compare what happens at the synapse with cocaine versus methamphetamine
Cocaine blocks the dopamine transporter, hindering reuptake into the pre-synaptic neuron; dopamine concentration rises in the synaptic cleft, leading to a continuous feeling of reward; Meth enters the terminal buttons and gets mixed in with the dopamine, passes across the membrane with it and blocks the transporter on the pre-synaptic membrane, so it also blocks reuptake
When measuring the activity of a ventral tegmental neuron in a monkey brain, we see an increase in action potentials straight after a reward (apple) is obtained, and no activity when the monkey touches only a wire and gets no reward. What happens when the reward is delayed?
During a picture discrimination task, the monkey has learnt that the reward is delivered after 1 sec; when the reward is delayed, the burst of activity is delayed too; the timing of the burst of action potentials follows when the reward is delivered (temporally precise)
What was discovered when measuring a putamen neuron’s activity in a monkey brain based on expectation during learning, and what does this suggest?
The monkey executed a movement to obtain a reward, and activity increased during the build-up period (anticipation); the same effect appeared when the monkey was rewarded for inhibiting a movement; when the movement was unrewarded, there was initial trial and error (neurons generating signals) until the monkey learnt there was no reward and the anticipatory signals disappeared; suggests anticipation shapes behaviour
Describe how activity of an orbitofrontal neuron in a monkey’s brain was associated with relative reward preference
During a pattern discrimination task, the neuron showed less activation for reward B (apple) when compared to A (raisin); but activation for reward B increased (to as much as it had been for A) when B was compared to reward C (cereal); shows that preference is coded relative to the alternative on offer
Compare the caudate activity in a monkey’s brain when making an uncued movement for reward vs. non-reward
A large increase in activity in anticipation of reward, during movement onset and at reward onset, but no activity for unrewarded movement
Kristjansson et al. investigated how reward influences “pop-out” during a visual search, where Ps had to determine whether the target (green or red diamond) had the top or bottom missing. What were the conditions and results?
Green target = high reward (10 points 75% of time/1 point 25% of time); red target = low reward (1 point 75% of time/10 points 25% of time); RTs much faster for high reward condition; evidence of priming effects
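To see how far apart the two conditions are, the average points per trial can be worked out from the schedule above. A minimal sketch; the helper name `expected_points` is an assumption made for the example, and the calculation is not taken from the study itself.

```python
def expected_points(outcomes):
    """Expected value of a list of (points, probability) pairs."""
    return sum(points * prob for points, prob in outcomes)


# Schedule from the card: green is mostly 10 points, red is mostly 1 point.
green = expected_points([(10, 0.75), (1, 0.25)])
red = expected_points([(1, 0.75), (10, 0.25)])

print(green, red)
```

On average the green target is worth 7.75 points per trial and the red target 3.25, even though either colour can pay 1 or 10 on any single trial.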
In Kristjansson et al.’s visual search task, what was found when looking at priming effects from preceding to current trial, and what does this suggest?
Largest priming effect when high reward was likely (green target) and high received; smallest priming effect when low reward was likely (red target) and low received; intermediate, and similar to each other, when high was likely but low received, or low likely but high received; shows that both what we expect to happen and what actually happens influence behaviour
What happened in Kristjansson et al.’s study when they reversed the reward contingencies 3 times per 200-trial block (red - high; green - low), and what does this show about human behaviour?
Opposite effects; after the point of change in reward schedule, RTs got faster for red targets and slower for green targets; shows how behaviour can track reward contingencies even when Ps are unaware of the change; many of our behaviours are motivated implicitly
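One way to picture implicit tracking of a reversing schedule is a running value estimate per colour that is nudged toward the current average payoff on every trial. This is a hedged sketch, not the study's analysis: the delta-rule learner, the learning rate `alpha`, the evenly spaced reversals (every 50 trials), and updating both colours on every trial are all assumptions made for illustration.

```python
def track_values(n_trials=200, reversal_every=50, alpha=0.2):
    """Track a value estimate per target colour across reward reversals.

    Assumptions (not from the study): reversals are evenly spaced, both
    colours are updated every trial, and alpha is an arbitrary learning rate.
    """
    # Average points per trial under the high and low reward schedules.
    high = 0.75 * 10 + 0.25 * 1
    low = 0.75 * 1 + 0.25 * 10

    values = {"green": 0.0, "red": 0.0}
    green_is_high = True
    history = []

    for trial in range(n_trials):
        # Reverse the contingencies at trials 50, 100 and 150 (3 reversals).
        if trial > 0 and trial % reversal_every == 0:
            green_is_high = not green_is_high

        mean = {
            "green": high if green_is_high else low,
            "red": low if green_is_high else high,
        }
        # Delta-rule update: move each estimate toward the current payoff.
        for colour in values:
            values[colour] += alpha * (mean[colour] - values[colour])

        history.append(dict(values))

    return history


history = track_values()
print(history[49])  # end of the first phase, before any reversal
print(history[99])  # end of the second phase, after the first reversal
```

In this sketch the estimate for green is higher before the first reversal and the estimate for red is higher after it, mirroring the direction of the RT changes described on the card.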