Week 6 (Chapter 47, 48) Flashcards

1
Q

What is thought to be at the centre of reinforcement learning in the mammalian brain?

A

The midbrain dopaminergic system

2
Q

What are the uses for a theoretical framework for reinforcement learning?

A
  • aid in the interpretation of neurophysiology data
  • guide the design of future studies

3
Q

The Rescorla-Wagner model formalized the idea that _______ is needed to drive learning

A

An error between the actual and predicted outcome
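
The error-driven update can be sketched for a single cue. This is a minimal illustration, not the full Rescorla-Wagner model (which sums associative strength over all cues present); the function name, the learning rate `alpha`, and the starting value `v0` are assumptions chosen for the example.

```python
def rescorla_wagner(outcomes, alpha=0.1, v0=0.0):
    """Single-cue sketch: return the prediction before each trial and the per-trial errors."""
    v = v0                      # current predicted outcome (associative strength)
    predictions, errors = [], []
    for r in outcomes:
        delta = r - v           # prediction error: actual minus predicted outcome
        predictions.append(v)
        errors.append(delta)
        v += alpha * delta      # the error, scaled by the learning rate, drives learning
    return predictions, errors


# Fifty trials in which the cue is always followed by a reward of size 1.
preds, errs = rescorla_wagner([1.0] * 50, alpha=0.2)
```

The error is largest on the first trial and shrinks toward zero as the prediction approaches the actual outcome, at which point learning stops.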

4
Q

What is the learning theory originally formulated for computer learning?

A

Temporal difference (TD) learning

5
Q

Reward expectation reduces dopamine reward responses in a purely _____ fashion

A

Subtractive

6
Q

A classic adaptation of TD learning to a biological circuit model utilizes a _____

A

Complete serial compound (CSC) feature representation
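
A CSC representation gives every time step after cue onset its own feature, so the value at each step is a single learned weight. The sketch below runs TD(0) over such a representation. The trial length, reward time, learning rate, and discount factor are assumptions for illustration, and there is no pre-cue state, so the response to the cue itself is not modelled.

```python
import numpy as np


def td_csc(n_trials=200, n_steps=10, reward_step=8, alpha=0.1, gamma=1.0):
    """TD(0) with a complete serial compound: one weight per post-cue time step."""
    w = np.zeros(n_steps)                      # value estimate for each time step
    deltas = np.zeros((n_trials, n_steps))     # TD error at every step of every trial
    for trial in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == reward_step else 0.0
            v_t = w[t]                         # one-hot feature, so V(t) is just w[t]
            # Value is taken to be zero once the trial has ended.
            v_next = w[t + 1] if t + 1 < n_steps else 0.0
            delta = r + gamma * v_next - v_t   # TD prediction error
            w[t] += alpha * delta
            deltas[trial, t] = delta
    return w, deltas


w, deltas = td_csc()
```

On the first trial the error at the reward step is 1; after training it is close to zero, because the reward is fully predicted by the value carried forward from earlier time steps.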

7
Q

Dopamine neurons show a negative prediction error ______

A

At the time an expected reward should have arrived, when that reward is omitted

8
Q

Dopamine responses to reward are suppressed only at _____

A

The time of the expected reward

9
Q

What happens when the timing of a reward is shifted?

A

A larger dopamine response

10
Q

Cues followed by late rewards result in _____ dopamine responses than cues followed by early rewards

A

smaller

11
Q

What TD model was proposed to explain the shortcomings of the CSC TD model?

A

The microstimulus model

12
Q

How does the microstimulus TD model differ from the CSC TD model?

A

The microstimulus model is able to account for the longer dip of dopamine responses upon reward omission

13
Q

In 1998, Hollerman & Schultz discovered that when monkeys were given a reward earlier than expected, ______

A

There was a large dopamine response upon the reward, but no negative response at the time of the usual reward

14
Q

What is one possible modification to the CSC TD model after Hollerman & Schultz’s discovery?

A

An animal is in one of two states: ISI (interstimulus interval) when expecting a reward, and ITI (intertrial interval) when not. When one state is activated, the other is deactivated

15
Q

Semi-Markov dynamics imply that time spent in a state is _____ and is defined by ______

A
  • probabilistic
  • a probability distribution called a ‘dwell time distribution’
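
As a rough illustration of semi-Markov dynamics, the sketch below alternates between two states and draws the time spent in each from its own dwell-time distribution. The state labels, the choice of Gaussian and exponential distributions, and all parameter values are assumptions made for the example, not taken from any specific model.

```python
import random


def simulate_states(n_trials, seed=0):
    """Alternate between two states, sampling how long each one lasts."""
    rng = random.Random(seed)
    timeline = []
    for _ in range(n_trials):
        # Hypothetical ISI dwell time: roughly 2 s, floored at 0.1 s to stay positive.
        isi = max(0.1, rng.gauss(2.0, 0.3))
        # Hypothetical ITI dwell time: exponential with a mean of 5 s.
        iti = rng.expovariate(1 / 5.0)
        timeline.append(("ISI", isi))
        timeline.append(("ITI", iti))
    return timeline


timeline = simulate_states(5)
```

Each entry pairs a state with a sampled duration, so the time spent in a state is a random draw rather than a fixed number of steps.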

16
Q

The ______ model accounts for Hollerman & Schultz’s findings

A

Belief-state TD model

17
Q

According to the belief-state TD model, uncertainty should ______

A

Dramatically affect how reward expectation evolves over time

18
Q

The CSC TD is model-____, while the belief-state TD is model-____

A
  • free
  • based

19
Q

What is a shortcoming of the belief-state model?

A

If an animal is hungry, food-based rewards would have higher state values than drink-based rewards. This preference is not explicitly learned, so the belief-state model does not account for it

20
Q

Dopamine neurons code for ______

A

Reward prediction error

21
Q

The magnitude of dopamine prediction error responses scales with _____, integrating them into a biological teaching signal for ______

A
  • reward size, probability, and delay
  • utility

22
Q

Rewards drive learning as _____

A

positive reinforcement

23
Q

What brain areas are specialized for processing rewards and reward-related behaviours?

A
  • dopaminergic midbrain
  • orbitofrontal cortex
  • amygdala
  • ventral striatum
24
Q

Dopamine neurons reside in the _____ and send signals to the ____ and _____

A
  • midbrain
  • basal ganglia
  • frontal cortex
25
Q

What do increases in dopamine neuron activity indicate?

A

The outcome was better than predicted, and the preceding behaviour should be repeated or invigorated

26
Q

What does the magnitude of dopamine activity indicate?

A

The degree to which behaviours should be updated

27
Q

Dopamine teaching signals reflect the same values used for _______

A

Economic decisions

28
Q

The R-W model refers to association between _____

A

The conditioned stimulus and unconditioned stimulus

29
Q

The TD model consists of an explicit ______ that reflects ______

A
  • value function
  • reward expectation through time

30
Q

Dopamine responses show a _____ relationship to reward amount

A

positive monotonic

31
Q

Reward responses are _____ as reward probability gets larger

A

Diminished

32
Q

What is expected utility?

A

The utility of each possible outcome multiplied by its probability, summed over all outcomes
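
A worked example may help. The sketch below computes expected utility for a 50/50 gamble between 0 and 4 units of reward, using a square-root utility function. Both the gamble and the utility function are assumptions chosen to keep the arithmetic simple.

```python
import math


def expected_utility(outcomes, probabilities, utility):
    """Probability-weighted sum of the utility of each outcome."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * utility(x) for x, p in zip(outcomes, probabilities))


# 0.5 * sqrt(0) + 0.5 * sqrt(4) = 1.0
eu = expected_utility([0.0, 4.0], [0.5, 0.5], math.sqrt)
```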

33
Q

What two reward parameters are critical to separate objective factors from subjective values?

A

Timing and risk

34
Q

What is temporal discounting?

A

The subjective value of a reward decreases as its delay increases, so people tend to prefer rewards sooner rather than later
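
Two commonly used discount functions, exponential and hyperbolic, can be sketched as follows. The function names and the discount parameters `rate` and `k` are assumptions for illustration; the flashcards do not say which form applies.

```python
import math


def exponential_discount(amount, delay, rate=0.1):
    """Subjective value falls by a constant proportion per unit of delay."""
    return amount * math.exp(-rate * delay)


def hyperbolic_discount(amount, delay, k=0.1):
    """Subjective value falls steeply at short delays and more slowly at long ones."""
    return amount / (1.0 + k * delay)


now = hyperbolic_discount(10.0, 0.0)      # no delay: full value
later = hyperbolic_discount(10.0, 10.0)   # delayed: value is halved when k * delay = 1
```

Under either function, the same reward is worth less the longer the wait, which is why the sooner reward is preferred.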

35
Q

What is the economic definition of risk?

A

Statistical variance in outcome distributions
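
Risk in this sense can be computed directly from a gamble's outcomes and probabilities. The example values below are assumptions: a 50/50 gamble between 0 and 4 units, compared with a sure 2 units that has the same mean.

```python
def outcome_variance(outcomes, probabilities):
    """Variance of a discrete outcome distribution."""
    mean = sum(p * x for x, p in zip(outcomes, probabilities))
    return sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probabilities))


risky = outcome_variance([0.0, 4.0], [0.5, 0.5])   # same mean as the sure option, spread out
safe = outcome_variance([2.0], [1.0])              # a certain outcome carries no risk
```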

36
Q

Utility is defined in economics as _____

A

Subjective value derived from choice behaviour

37
Q

What is a certainty equivalent?

A

The single, certain reward amount that has the same utility as a gamble
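
To make this concrete, the sketch below finds the certainty equivalent of a 50/50 gamble between 0 and 4 units under a concave square-root utility function. The gamble, the utility function, and its inverse are assumptions for illustration.

```python
import math


def certainty_equivalent(outcomes, probabilities, utility, inverse_utility):
    """Sure amount whose utility equals the expected utility of the gamble."""
    eu = sum(p * utility(x) for x, p in zip(outcomes, probabilities))
    return inverse_utility(eu)


outcomes, probabilities = [0.0, 4.0], [0.5, 0.5]
ce = certainty_equivalent(outcomes, probabilities, math.sqrt, lambda u: u ** 2)
ev = sum(p * x for x, p in zip(outcomes, probabilities))   # expected value of the gamble
```

Here the certainty equivalent (1.0) is below the expected value (2.0): with a concave utility function, a sure amount smaller than the gamble's average payoff is valued as much as the gamble, which is the signature of risk aversion.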

38
Q

Standard utility functions are _____, because most humans are _____

A
  • concave
  • risk-averse

39
Q

Optogenetic stimulation of dopamine neurons shows that ________

A

Dopamine activations teach animals what to choose