Task 4: Reward Flashcards

1
Q

A1: Describe the structures that make up the reward system

Hint: basically motivational/limbic loop

A

Cortex: ACC, mOFC, vmPFC

BG: ventral Striatum

Midbrain: SNr/GPi (limbic territory)

Thalamus: mediodorsal nucleus (MD), VAmc, VLm

2
Q

A1: Emphasize the functional similarity of the limbic loop with the motor loop

A

Between the loops:

  • Limbic loop: S-O (select favoured object based on expected reward)
  • Motor loop: S-A (appropriate motor programme)
  • -> both integrate some stimulus information
  • -> both involve prioritizing/resolving competition between different options
  • -> both include feedback signal via AMY & SNc which is necessary for learning
3
Q

A1: Apply the Ballot Box Model to Motivation
Keywords: responses to rewards; addiction

A

Ballot Box model: Voting for a motivation

Ventral striatum –> direct/indirect pathway vote –> “vote” is moved on to cognitive & motor loop

E.g. NA is more responsive to monetary rewards than cognitive rewards

  • Money –> NA activates direct pathway more (pro) and indirect less (anti)
  • Cognitive reward –> NA activates indirect pathway more (anti) and direct less (pro)
  • -> ventral striatum votes for preferred rewarding objects
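
Toy sketch of the voting idea (not from the source material): each option gets "pro" votes from the direct pathway and "anti" votes from the indirect pathway, and the option with the highest net vote is passed on. The function name and all vote values are hypothetical, chosen only to mirror the money vs. cognitive reward example.

```python
def net_vote(direct, indirect):
    """Net support for an option: direct-pathway (pro) minus indirect-pathway (anti) votes."""
    return direct - indirect


# Hypothetical vote strengths, for illustration only.
votes = {
    "money": {"direct": 0.8, "indirect": 0.2},
    "cognitive reward": {"direct": 0.3, "indirect": 0.7},
}

net = {option: net_vote(v["direct"], v["indirect"]) for option, v in votes.items()}
winner = max(net, key=net.get)  # option that is passed on to the other loops
print(winner)
```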

E.g. VMStriatum is important during drug acquisition

  • When drug behaviour still has to be “voted for”/is not habitual
  • Enables more direct habitual behaviour later on
4
Q

A2: Explain the pathway & functional role of PHASIC dopamine in the reward system.

A

DA system:
Medial forebrain bundle (MFB) -> VTA -> NA

Stimulation in this network leads to increased self-stimulation (with high vigor, priority over vital goals & in reference to previous experience)
-> motivation energizes, directs behaviour & enables learning

75-80% of cells signal prediction error –> best learning signal
- short-latency PHASIC bursts to unpredicted rewards & predictive cues

Learning in the BG

  • LTP in striatum depends on input & DA modulation
  • > Cortical input –> environmental info
  • > Midbrain input (DA) -> prediction error/adjustments needed
  • -> stronger transmission when both inputs occur together
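
Minimal sketch of the "both inputs together" idea, assuming the weight change is simply the product of cortical input and the DA signal (a simplification, not stated in the source material). `update_weight` and `learning_rate` are hypothetical names.

```python
def update_weight(weight, cortical_input, dopamine_error, learning_rate=0.1):
    """Return the cortico-striatal weight after one trial.

    The change is the product of cortical input and the DA signal,
    so it is zero whenever either of the two is absent.
    """
    return weight + learning_rate * cortical_input * dopamine_error


w = 0.5
both = update_weight(w, cortical_input=1.0, dopamine_error=1.0)
cortex_only = update_weight(w, cortical_input=1.0, dopamine_error=0.0)
dopamine_only = update_weight(w, cortical_input=0.0, dopamine_error=1.0)
print(both, cortex_only, dopamine_only)
```

Only the trial with both inputs strengthens the connection; the other two leave the weight unchanged.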
5
Q

A2: Clarify the differences between phasic & tonic DA

A

Tonic DA:
Purpose
- Modulation of motor decisions

Projection

  • SNc -> striatum -> both pathways -> thalamus
  • via ambient, sustained extracellular DA concentration

Timing

  • Constant
  • regulated by DA reuptake, control of DA synthesis/release/presynaptic influences from other NTs

Phasic DA:
Purpose
- Signalling reward prediction error

Projection

  • MFB -> VTA -> NA
  • via synaptic transmission

Timing

  • 2 timepoints
    • > Cue (reward predicting stimulus)
    • > Reward (receiving it or not)
6
Q

A3: Discuss the various properties of the phasic DA signal

A

Midbrain DA-mediated signals

  • signals pure reward value of objects independent of its specific features
  • Coded as prediction error

Reward prediction: develops from new positive reinforcers that get associated with preceding neutral stimuli, which become reward-predicting cues

Reward prediction error: difference between predicted & obtained reward

7
Q

A3: Describe the phasic DA signal in various learning paradigms:
- Regular reward prediction

A
  • Unpredicted reward -> positive prediction error -> ventral Striatum activation
  • Fully predicted reward -> no error -> no activation
  • Predicted but no reward -> negative prediction error -> no ventral striatum activity -> depression
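
Sketch of the three cases as arithmetic, using prediction error = obtained reward minus predicted reward. The reward is coded as 1 and no reward as 0; these units are an assumption for illustration.

```python
def prediction_error(obtained, predicted):
    """Reward prediction error: obtained reward minus predicted reward."""
    return obtained - predicted


cases = {
    "unpredicted reward": prediction_error(obtained=1.0, predicted=0.0),
    "fully predicted reward": prediction_error(obtained=1.0, predicted=1.0),
    "predicted reward omitted": prediction_error(obtained=0.0, predicted=1.0),
}

for name, error in cases.items():
    print(name, error)
```

The sign of the error matches the three bullets: positive (activation), zero (no change), negative (depression).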
8
Q

A3: Describe the phasic DA signal in various learning paradigms:
- Blocking Paradigm

A

If reward is already fully predicted, new cue will not be associated with the reward/will not elicit activity
-> it doesn’t add value

New cue -> no reward -> no error -> no DA response
New cue -> reward -> positive error -> DA activation

Old+new cue elicits same response as old cue alone
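
Sketch of blocking with a Rescorla-Wagner-style update, in which all cues present on a trial share one prediction error. The choice of this model, the cue names, the learning rate and the trial counts are assumptions for illustration, not taken from the source material.

```python
def train(strengths, cues, reward, trials, learning_rate=0.2):
    """Update the strengths of the present cues from their shared prediction error."""
    for _ in range(trials):
        prediction = sum(strengths[c] for c in cues)
        error = reward - prediction
        for c in cues:
            strengths[c] += learning_rate * error
    return strengths


# Blocking: the old cue is trained first, then old + new cue together.
blocked = {"old": 0.0, "new": 0.0}
train(blocked, ["old"], reward=1.0, trials=50)
train(blocked, ["old", "new"], reward=1.0, trials=50)

# Control: old + new cue are trained together from the start.
control = {"old": 0.0, "new": 0.0}
train(control, ["old", "new"], reward=1.0, trials=50)

print(blocked["new"], control["new"])
```

In the blocking group the reward is already fully predicted when the new cue appears, so there is almost no error left for the new cue to learn from; in the control group both cues share the prediction.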

9
Q

A3: Describe the phasic DA signal in various learning paradigms:
- Conditioned Inhibition

A

Already established cue is paired with new stimulus & NO reward is given
-> new cue becomes conditioned inhibitory cue (predicts NO reward)

Inhibitor -> no reward -> no error
Inhibitor + predictor -> no reward -> no error (despite predictor!)
Inhibitor -> reward -> positive error
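
Sketch of conditioned inhibition with the same kind of shared-error update. Two assumptions are made here that are not in the source material: the inhibitor acquires a negative strength, and the summed prediction is floored at "no reward" (0), so the inhibitor alone predicts nothing rather than "less than nothing". All names and parameter values are hypothetical.

```python
def predict(strengths, cues):
    """Summed prediction of the present cues, floored at 0 (assumption)."""
    return max(0.0, sum(strengths[c] for c in cues))


def trial(strengths, cues, reward, learning_rate=0.2, learn=True):
    """Return the prediction error; optionally update the present cues."""
    error = reward - predict(strengths, cues)
    if learn:
        for c in cues:
            strengths[c] += learning_rate * error
    return error


strengths = {"predictor": 0.0, "inhibitor": 0.0}
for _ in range(200):
    trial(strengths, ["predictor"], reward=1.0)               # cue alone is rewarded
    trial(strengths, ["predictor", "inhibitor"], reward=0.0)  # compound is not

# Test trials without further learning.
inhibitor_no_reward = trial(strengths, ["inhibitor"], 0.0, learn=False)
compound_no_reward = trial(strengths, ["predictor", "inhibitor"], 0.0, learn=False)
inhibitor_reward = trial(strengths, ["inhibitor"], 1.0, learn=False)
print(inhibitor_no_reward, compound_no_reward, inhibitor_reward)
```

Under these assumptions the three test trials reproduce the pattern above: no error, no error despite the predictor, and a positive error when the inhibitor is followed by reward.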

10
Q

A3: Describe features of phasic DA signal:

  • graded response
  • context relation
  • range adaptation
  • time sensitivity
  • successive learning
A

Graded response: Partial error -> smaller error response

Context: Context (probability of reward) defines prediction error
-> Context (desert vs. at home) changes activation despite same reward magnitude (glass of water)

Range adaptation: absolute reward magnitude has little influence on the prediction error signal
-> Large changes in reward magnitude (1€ vs. 100€) don’t change activation much

Time-sensitivity: neurons also code for timing of reward
-> change of timing -> depression at old time -> positive error/activation at new timing

Learning: error signals slowly disappear over time/over successive learning trials
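
Sketch of two of these features. Successive learning is modelled by moving the prediction a fraction of the error towards the reward on every trial; range adaptation is modelled by dividing the error by the spread of the possible rewards. Both modelling choices, and all names and values, are assumptions for illustration.

```python
import statistics


def errors_over_trials(reward=1.0, trials=20, learning_rate=0.3):
    """Return the prediction error on each trial of a constant, cued reward."""
    prediction = 0.0
    errors = []
    for _ in range(trials):
        error = reward - prediction
        errors.append(error)
        prediction += learning_rate * error
    return errors


def scaled_error(reward, possible_rewards):
    """Prediction error divided by the spread of the possible rewards."""
    expected = statistics.mean(possible_rewards)
    spread = statistics.pstdev(possible_rewards)
    return (reward - expected) / spread


errors = errors_over_trials()
small_win = scaled_error(1, [0, 1])      # winning 1€ when 0€ or 1€ is possible
large_win = scaled_error(100, [0, 100])  # winning 100€ when 0€ or 100€ is possible
print(errors[0], errors[-1], small_win, large_win)
```

The error shrinks on every trial, and after scaling the 1€ and 100€ wins produce the same signal.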

11
Q

A3: Describe how phasic DA cells respond to different types of stimuli

A

Neutral stimuli -> normal reward prediction, no particular response without associated reward

Aversive stimuli -> opposite effect to rewards -> punishment conditioning
Cue -> aversive stimulus/punishment -> positive error (something appeared) -> decreased activation
-> this DA response ~5-10 times slower
-> similar to reward-absence cue
- sometimes initial or rebound activation

12
Q

A3: Describe how DA cells respond to uncertainty of rewards

A

Risk-averse people -> uncertainty reduces reward value
Risk-seeking people -> uncertainty increases reward value

Over 1/3 of DA neurons have slow/sustained/moderate activity between cue and reward (during period of uncertainty)
- highest if probability of reward = 0.5 (lowest certainty)

This activity is distinct from DA activation to rewards & cues

  • Uncertainty signal -> low DA concentrations -> stimulate high-affinity D2 receptors (tonic)
  • Reward signal -> high DA concentrations -> stimulate low-affinity D1 receptors (phasic)
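
Sketch of why p = 0.5 is the point of lowest certainty, assuming uncertainty is measured as the variance of an all-or-nothing reward (this measure is an assumption, not stated in the source material).

```python
def reward_variance(p, magnitude=1.0):
    """Variance of a reward of size `magnitude` delivered with probability p."""
    return magnitude ** 2 * p * (1 - p)


probabilities = [0.0, 0.25, 0.5, 0.75, 1.0]
variances = [reward_variance(p) for p in probabilities]
most_uncertain = max(probabilities, key=reward_variance)
print(variances, most_uncertain)
```

The variance is zero when the reward is certain (p = 0 or p = 1) and largest at p = 0.5.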
13
Q

A4: Describe the range of stimuli that can elicit phasic DA

1) Primary rewards (PRs)
2) Secondary rewards (SRs)

A

1) PRs
- biological/evolutionary basis
- e.g. food, drink, sex
- positive reinforcers without any learning needed

2) SRs
- cues/associated with primary reinforcers
- e.g. light associated with juice
- Generalized conditioned reinforcers (e.g. money)
- > reinforcing in multiple contexts

-> in real life the distinction is often more gradual

14
Q

A4: Describe the ventral striatum (vS)/NA response to Primary & Secondary reinforcers

A

vS/NA
-> responds to PRs in classical & instrumental conditioning
-> responds more to SRs than PRs!
-> possibly more activated by surprising rewards: these require error/learning -> more easily detected
-> possible differences in subjective value given to SRs vs. PRs

15
Q

A4: Describe the range of stimuli that can elicit phasic DA & the response in vS/NA
3) Social rewards

A

3) Social rewards
- anywhere on PR-SR spectrum
- e.g. erotic images, smiling faces

vS/NA
-> responds to facial attractiveness/gaze/partners
-> similar activation by monetary/SRs & social rewards
-> higher activation than to PRs
-> incorporates positive social cues + behaviour in complex social tasks
(cooperation, norms, altruism; e.g. being observed during a charity donation)

16
Q

A4: Describe the range of stimuli that can elicit phasic DA & the response in vS/NA
4) Cognitive feedback

A

4) Cognitive feedback
- Info on performance rather than reward
- vS/NA activation is differentiated depending on task type -> induce different motivation states

  • Form of social approval (extrinsic) -> can work similar to money (as generalized conditioned reinforcer)
    - > Monetary reward task –> activity correlated with extrinsic motivation
  • Guides skill acquisition (intrinsic) -> partially biological component
    - > Cognitive feedback task –> activity correlated with intrinsic motivation
17
Q

A4: Describe the range of stimuli that can elicit phasic DA & the response in vS/NA
5) Indirect learning

A

5) Indirect Learning
- Learning from observation/instruction; social learning

vS/NA only active in observer if

  • > Actions of observed person have direct implications for observer
  • > Observer and confederate are similar
  • > Indirect learning takes place by observing outcome of confederate
  • > Observing task-relevant stimulus-stimulus associations
18
Q

A4: In reward selection tasks, why is the actual reward often somewhat different from the reward promised before? And why is this not necessary in a complex cognitive task, where the reward is feedback information?
Keyword: RPE

A

vS is possibly more activated by surprising rewards because they require RPE/learning, which makes them more detectable
-> we need surprising (different to what is predicted) stimuli to measure something in striatum

Reward selection tasks

  • Primary rewards often given with 100% certainty/fully predictable
    • > no error -> no DA firing
  • If we want to measure activation, we need to change the predictability

Complex cognitive tasks (cognitive feedback)
- Reward depends on how well subject performed -> not 100% predictable anyway
Uncertainty about result -> always some prediction error -> always some neuronal activity
-> we don’t need to change the rewards because uncertainty exists in these tasks

19
Q

A5: Explain the role of phasic DA in addiction & self-stimulation as studied in Willuhn et al. (2012):
- Methods

A

Cocaine:
- slows DA reuptake in striatum (coming from midbrain DA neurons)

Methods Willuhn et al. (2012):

1) Measured activity in VMS & DLS in rats over 3 weeks
- Nose poke into active port
- > cocaine + light + tone + 20s timeout after (only tone + light)
- Nose poke into inactive port
- > no response

2) Intra-DLS infusion of DA receptor antagonist
3) Lesion in VMS

20
Q

A5: Explain the role of phasic DA in addiction & self-stimulation as studied in Willuhn et al. (2012):
- Results

A

1) 3 week measurement
- Nose pokes
- > stable active responding, no increase
- > decreased inactive responding
- > increased ratio (active:inactive) in 2nd & 3rd week vs 1st

  • VMS: early DA release
  • > significant increase in phasic DA after active nose pokes (vs. inactive)
  • > amplitude decreases in 2nd & 3rd week
  • DLS: long-term DA release
  • > significant increase in phasic DA after active nose pokes (vs. inactive)
  • > only in 2nd & 3rd week, not 1st

2) DA receptor antagonist infusion in DLS
- increased active nose pokes at all timepoints
- > not attributable to conditioned DA signal (only late should be affected then)
- > DLS may contribute early on already
- > possible role in tonic DA rather than phasic DA

  • increased inactive nose pokes at late timepoint
  • > reversed effect on response ratio

3) VMS lesion
- no general suppression of DA transmission in DLS (as may have been expected)
- Selective effect on task-related signaling
- > VMS activity is required for developing conditioned DA signaling in DLS which regulates drug-taking responses

21
Q

A5: Summarize the role of phasic DA in addiction & self-stimulation as studied in Willuhn et al. (2012):
- Discussion

A

There is a hierarchy of DA in striatum when developing response-reward associations:
- VMS receives limbic inputs -> enables DA signaling in DLS (sensorimotor)

1) VMS
- Motivation of taking drug -> limbic loop
- phase of feedback-based learning
- decreased activation in 2nd & 3rd week

2) DLS
- Behavioural addiction to drug -> motor loop
- feedback is no longer necessary
- increased activation in 2nd & 3rd week

22
Q

A5: How do phasic & tonic DA interact in drug addiction?

A

DA antagonist –> stopped phasic DA but behaviour was affected in all weeks –> must be effect on tonic DA

High phasic DA -> high tonic DA
High tonic DA -> less phasic DA
–> explains self-facilitating addiction spiral

23
Q

A6: Discuss the relationship between reward, motivation & habits

A

1) Motivation:
- Urge to obtain a particular goal -> wanting
- Exploring environment, begins as recreational behaviour
- VMS
- Goal –> e.g. I want to get high

2) Reward
- Individual learns which cues are related to reward
- E.g. If I see John, he can give me cocaine which will get me high

3) Habits
- After some time: cue-action relation without motivation/goal
- Exploiting environment, habitual/compulsive drug use
- DLS
- No more goal, just response to conditioned stimuli –> e.g. I have to find John to get cocaine