Week 4: Learning Flashcards by Lizzy Da Rocha Bazilio

Reinforcement learning (RL)

Reinforcement is anything that increases the likelihood that a response will occur; reinforcement learning is learning by trial and error

Reinforcement learning (goals)

Maximize the occurrence and consumption of rewards, minimize the occurrence and consumption of punishment

Types of reward

Primary/secondary, positive/negative

Timeline of reinforcement learning, conditioning, behaviourism

Law of effect (Thorndike), classical conditioning (Pavlov), instrumental behaviour and conditioning (Skinner), exponential learning (Hull), learning rule (Rescorla-Wagner)

Law of effect

Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation

Law of effect (experiment)

Cat in a box, mouse maze

Instrumental conditioning

Associations are formed between states and actions; once learned, they are outcome-independent (habit)

States

Stimuli or context

Classical (Pavlovian) conditioning

Associations are formed between states and outcomes, so that a conditioned stimulus comes to trigger the response that was originally unconditioned

Unconditioned response

Reflexive behaviour, involuntary actions, innate behaviour, e.g. salivation, freezing

Behaviourist paradigm

Behaviour is generated through reinforcement/conditioning (learned associations), there is no learning/behaviour without reinforcement

Notion of reinforcing outcome

Reward: the objective property of an outcome that reinforces the behaviour

Aim of behaviour

Maximizing the occurrence of reward by trial-and-error

Properties of conditioning and reinforcement learning

Contiguity, contingency

Contiguity

The reward must follow the stimulus-response events closely in time

Contingency

The stimulus-response events must increase the probability of getting the reward
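
One common way to put a number on contingency is the difference between the probability of reward with and without the response. The sketch below is only an illustration: the function name `delta_p` and the trial counts are assumptions, not part of the course material.

```python
def delta_p(rewarded_with, total_with, rewarded_without, total_without):
    """Contingency as P(reward | response) - P(reward | no response).

    All four arguments are hypothetical trial counts.
    """
    p_with = rewarded_with / total_with
    p_without = rewarded_without / total_without
    return p_with - p_without

# Positive value: responding raises the chance of reward (contingent).
contingent = delta_p(8, 10, 2, 10)
# Zero: reward is equally likely with or without the response (no contingency).
non_contingent = delta_p(5, 10, 5, 10)
```

A value above zero means the response increases the probability of reward, which is the condition this card describes.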

Blocking paradigm

There is no learning when the stimulus is completely predicted; learning and association are proportional to surprise

Rescorla-Wagner model

Describes changes in the associative strength (V) between one or several signals (CS) and the subsequent stimulus (US). Higher associative strength leads to a higher likelihood of triggering the UR
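
The model is commonly written as ΔV = α(λ − V): on each trial, V moves towards the maximum strength the US supports (λ) by a fraction α of the remaining gap. The sketch below simulates a single CS; the function name and the parameter values (`alpha=0.3`, `lam=1.0`) are assumptions chosen for illustration.

```python
def rescorla_wagner(n_trials, alpha=0.3, lam=1.0, v0=0.0):
    """Return the associative strength V after each CS-US pairing.

    alpha: learning rate (assumed value), lam: maximum strength
    supported by the US (assumed value), v0: initial strength.
    """
    v = v0
    history = []
    for _ in range(n_trials):
        error = lam - v        # gap between what happened and what was predicted
        v += alpha * error     # the update is proportional to that gap
        history.append(v)
    return history

curve = rescorla_wagner(10)
```

Because the gap shrinks on every trial, `curve` rises quickly at first and then levels off below `lam`.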

Error-prediction model

Prediction error corresponds to surprise, the mismatch between the expected and the received outcome (explains the blocking paradigm)
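
A small simulation can show how a prediction-error rule produces blocking. This is a sketch under assumptions: cue names `A` and `B`, the learning rate, and the trial counts are made up, and all cues present on a trial share one summed prediction.

```python
def train(v, cues, n_trials, alpha=0.3, lam=1.0):
    """Update the strengths in dict v for the cues present on each trial."""
    for _ in range(n_trials):
        prediction = sum(v[c] for c in cues)  # summed prediction of present cues
        error = lam - prediction              # surprise on this trial
        for c in cues:
            v[c] += alpha * error             # every present cue shares the error
    return v

def blocking_experiment(pretrain_trials=50, compound_trials=50):
    v = {"A": 0.0, "B": 0.0}
    train(v, ["A"], pretrain_trials)        # phase 1: A alone predicts the US
    train(v, ["A", "B"], compound_trials)   # phase 2: A and B presented together
    return v

blocked = blocking_experiment()                    # A pretrained, so B is blocked
control = blocking_experiment(pretrain_trials=0)   # no pretraining
```

In the blocked group the US is already fully predicted by A when B is introduced, so the error is near zero and B gains almost no strength; in the control group A and B split the association.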

Schultz et al.

Recorded from midbrain (SN, VTA) dopamine neurons in monkeys during classical conditioning

Dickinson & Balleine

Investigated goal-directed learning in rats

Model-free reinforcement learning

Habitual/Pavlovian conditioning, instrumental goal-directed behaviour

Habitual/Pavlovian conditioning (model-free learning)

Inflexible associations between states and actions or outcomes; after learning, the association no longer depends on the response-outcome contingency or on the properties of the outcome

Instrumental goal-directed behaviour (model-free learning)

Flexible associations between states, actions and outcomes, sensitive to the outcome value

Model-based goal-directed learning

Purposeful behaviour

Purposeful behaviour (model-based learning)

Rats can learn structures and/or cognitive maps without reinforcement

Model-free goal-directed behaviour

Flexible associations between states, actions and outcomes, sensitive to the outcome value, forward planning

Model-based goal-directed behaviour

Model of the transitions between states, actions and subsequent states; the model can then be used to evaluate actions against the available outcomes (backward induction)
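
Backward induction can be sketched on a toy task: values are read off at the final outcomes and passed back through the transition model to the starting state. Everything in this example (state names, actions, outcome values) is hypothetical.

```python
# Hypothetical transition model: transitions[state][action] -> next state
transitions = {
    "start": {"left": "room1", "right": "room2"},
    "room1": {"left": "small_food", "right": "nothing"},
    "room2": {"left": "nothing", "right": "large_food"},
}
# Hypothetical values of the final outcomes
outcome_values = {"small_food": 1.0, "large_food": 3.0, "nothing": 0.0}

def state_value(state):
    """Value of a state, worked backwards from the outcomes it can reach."""
    if state in outcome_values:
        return outcome_values[state]
    return max(state_value(nxt) for nxt in transitions[state].values())

def best_action(state):
    """Pick the action whose successor state has the highest value."""
    options = transitions[state]
    return max(options, key=lambda action: state_value(options[action]))

choice_before = best_action("start")   # heads for the large food
outcome_values["large_food"] = 0.0     # devalue that outcome
choice_after = best_action("start")    # the choice changes without new trials
```

Because the choice is recomputed from the model, it shifts as soon as an outcome is devalued, which is what sensitivity to outcome value means here.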

(28 cards)