Week 4: Learning Flashcards

1
Q

Reinforcement learning (RL)

A

Reinforcement is anything that increases the likelihood that a response will occur; reinforcement learning is learning by trial and error

2
Q

Reinforcement learning (goals)

A

Maximize the occurrence and consumption of rewards, minimize the occurrence and consumption of punishment

3
Q

Types of reward

A

Primary/secondary, positive/negative

4
Q

Timeline of reinforcement learning, conditioning, behaviourism

A

Law of effect (Thorndike), classical conditioning (Pavlov), instrumental behaviour and conditioning (Skinner), exponential learning (Hull), learning rule (Rescorla-Wagner)

5
Q

Law of effect

A

Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation

6
Q

Law of effect (experiment)

A

Cat in a box, mouse maze

7
Q

Instrumental conditioning

A

Associations are formed between states and actions; once learned they are outcome-independent (habit)

8
Q

States

A

Stimuli or context

9
Q

Classical (Pavlovian) conditioning

A

Associations are formed between states and outcomes, so that the response originally triggered by the unconditioned stimulus comes to occur after a conditioned stimulus

10
Q

Unconditioned response

A

Reflexive behaviour, involuntary actions, innate behaviour, e.g. salivation, freezing

11
Q

Behaviourist paradigm

A

Behaviour is generated through reinforcement/conditioning (learned associations), there is no learning/behaviour without reinforcement

12
Q

Notion of reinforcing outcome

A

Reward, objective property of what would reinforce the behaviour

13
Q

Aim of behaviour

A

Maximizing the occurrence of reward by trial-and-error

14
Q

Properties of conditioning and reinforcement learning

A

Contiguity, contingency

15
Q

Contiguity

A

The reward must closely follow in time after the stimulus-response events

16
Q

Contingency

A

The stimulus-response events must increase the probability of getting the reward
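
Contingency can be made concrete as a difference in reward probability with and without the response. The sketch below uses that difference (often written ΔP) as one possible measure; the function name and the trial format are illustrative assumptions, not part of the card.

```python
def contingency(trials):
    """Estimate contingency as P(reward | response) - P(reward | no response).

    `trials` is a list of (responded, rewarded) boolean pairs.
    This trial format is an assumption made for illustration.
    """
    with_response = [rewarded for responded, rewarded in trials if responded]
    without_response = [rewarded for responded, rewarded in trials if not responded]
    if not with_response or not without_response:
        raise ValueError("need trials both with and without the response")
    p_with = sum(with_response) / len(with_response)
    p_without = sum(without_response) / len(without_response)
    return p_with - p_without


# Reward follows the response on 3 of 4 trials, but only 1 of 4 trials without it.
example_trials = (
    [(True, True)] * 3 + [(True, False)]
    + [(False, True)] + [(False, False)] * 3
)
delta_p = contingency(example_trials)
```

A positive value means the response raises the probability of reward; a value of zero means the reward is just as likely without the response, so there is no contingency to learn.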

17
Q

Blocking-paradigm

A

There is no learning about a new stimulus when the outcome (US) is already completely predicted; learning and association are proportional to surprise

18
Q

Rescorla-Wagner model

A

Describes changes in associative strength (V) between one or several signals (CS) and the subsequent stimulus (US). Higher associative strength leads to a higher likelihood of triggering the UR
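
The update is usually written as ΔV = αβ(λ − ΣV), where λ is the maximum associative strength the US supports and ΣV is the summed strength of all CSs present on the trial. A minimal sketch follows; the function name, the dictionary layout and the parameter values are assumptions chosen for illustration.

```python
def rescorla_wagner_update(V, present, lam, alpha=0.5, beta=1.0):
    """Return updated associative strengths after one trial.

    V       : dict mapping each CS name to its current associative strength
    present : list of CS names shown on this trial
    lam     : asymptote supported by the US (1.0 if the US occurs, 0.0 if not)
    alpha   : salience of the CS (illustrative value)
    beta    : learning rate tied to the US (illustrative value)
    """
    prediction = sum(V[cs] for cs in present)  # summed prediction of all present CSs
    error = lam - prediction                   # surprise on this trial
    updated = dict(V)
    for cs in present:
        updated[cs] = V[cs] + alpha * beta * error
    return updated


# Acquisition: a single CS ("light") is paired with the US on three trials.
V = {"light": 0.0}
history = []
for _ in range(3):
    V = rescorla_wagner_update(V, ["light"], lam=1.0)
    history.append(V["light"])
```

With these values the strength rises by half of the remaining error on each trial, so the learning curve is negatively accelerated: large steps early, smaller steps as the US becomes predicted.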

19
Q

Error-prediction model

A

Error corresponds to surprise (which explains the blocking paradigm); the prediction error is the mismatch between the predicted and the actual outcome
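
To see why a prediction-error rule produces blocking, the sketch below simulates a blocking experiment with a Rescorla-Wagner-style update. The stimulus names "A" and "B", the single learning rate and the trial counts are assumptions for illustration only.

```python
def train(V, present, lam, n_trials, rate=0.2):
    """Apply a prediction-error update in place for n_trials trials."""
    for _ in range(n_trials):
        error = lam - sum(V[cs] for cs in present)  # surprise on this trial
        for cs in present:
            V[cs] += rate * error
    return V


def blocking_experiment(pretrain_trials=50, compound_trials=50):
    """Phase 1: A alone predicts the US. Phase 2: A and B together predict the US."""
    V = {"A": 0.0, "B": 0.0}
    train(V, ["A"], lam=1.0, n_trials=pretrain_trials)
    train(V, ["A", "B"], lam=1.0, n_trials=compound_trials)
    return V


blocked = blocking_experiment()                   # A is pre-trained, so B is blocked
control = blocking_experiment(pretrain_trials=0)  # no pre-training, A and B share the strength
```

In the blocked group the US is already predicted by A when B is introduced, so the error is close to zero and B gains almost no strength. In the control group the error is large at the start of compound training and B ends up with roughly half of the total strength.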

20
Q

Schultz et al.

A

Recorded midbrain (SN, VTA) dopamine neurons in a monkey brain during classical conditioning

21
Q

Dickinson & Balleine

A

Investigated goal-directed learning in rats

22
Q

Model-free reinforcement learning

A

Habitual/Pavlovian conditioning, instrumental goal-directed behaviour

23
Q

Habitual/Pavlovian conditioning (model-free learning)

A

Inflexible associations between states and actions or outcomes; after learning, the association no longer depends on the response-outcome contingency or on the properties of the outcome

24
Q

Instrumental goal-directed behaviour (model-free learning)

A

Flexible associations between states, actions and outcomes, sensitive to the outcome value

25
Q

Model-based goal-directed learning

A

Purposeful behaviour

26
Q

Purposeful behaviour (model-based learning)

A

Rats can learn structures and/or cognitive maps without reinforcement

27
Q

Model-free goal-directed behaviour

A

Flexible associations between states, actions and outcomes, sensitive to the outcome value, forward planning

28
Q

Model-based goal-directed behaviour

A

A model of transitions (state, action, next state) that can be used to evaluate actions against the available outcomes, e.g. by backward induction
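
Backward induction can be sketched on a small deterministic maze: values are worked out from the outcomes back towards the start, using only the transition model and the current outcome values. The maze layout, state names and function name below are hypothetical and chosen for illustration.

```python
def backward_induction(transitions, rewards, horizon):
    """Plan over a deterministic transition model.

    transitions : dict state -> {action: next_state}; terminal states map to {}
    rewards     : dict state -> value of the outcome received on arriving there
    horizon     : number of steps to look ahead
    Returns (value, policy) for `horizon` steps remaining.
    """
    value = {s: 0.0 for s in transitions}
    policy = {s: None for s in transitions}
    for _ in range(horizon):
        new_value = {}
        for s, actions in transitions.items():
            best_action, best_value = None, 0.0
            for action, nxt in actions.items():
                q = rewards.get(nxt, 0.0) + value[nxt]  # outcome now plus value afterwards
                if best_action is None or q > best_value:
                    best_action, best_value = action, q
            new_value[s] = best_value
            policy[s] = best_action
        value = new_value
    return value, policy


# Hypothetical two-step maze: food is reached by turning left twice.
maze = {
    "start": {"left": "junction_L", "right": "junction_R"},
    "junction_L": {"left": "food", "right": "empty"},
    "junction_R": {"left": "empty", "right": "empty"},
    "food": {},
    "empty": {},
}
value, policy = backward_induction(maze, {"food": 1.0}, horizon=2)

# After outcome devaluation the same model is re-evaluated without new learning trials.
devalued_value, _ = backward_induction(maze, {"food": 0.0}, horizon=2)
```

Because the values are recomputed from the model, changing the outcome value immediately changes the value of the start state. That sensitivity to outcome value is what separates this kind of planning from a cached state-action habit.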