W8 Learning 2: Operant conditioning Flashcards
Action outcome framework
Classical and operant conditioning are experimental paradigms that have led to a highly influential framework for associative learning.
Classical conditioning (Pavlov)
Stimulus-response associations. Involves the pairing of two stimuli: the conditioned stimulus (CS) and the unconditioned stimulus (US). The US is associated with a hardwired response. Through conditioning, that response becomes associated with the CS. The CS and US can be temporally separated or overlapping.
Operant (instrumental) Conditioning
The US is contingent on the animal's behaviour (e.g., it only occurs when a lever has been pressed), so an action is needed. Action-outcome association: the action determines the outcome.
It goes beyond hard-wired unconditioned responses and incorporates more complex behaviour.
Learning of action-outcome associations
‘Response’ (in operant conditioning): pressing a lever, opening a door, pushing a button etc.
Operant behaviour: under stimulus control, so that the action can be a response to a certain stimulus/situation
The outcome can be a ‘reinforcement’ or a ‘punishment’
Action => Outcome
Law of Effect
“… responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation, whereas responses that produce a typically unpleasant outcome are less likely to occur again in the situation” (Thorndike, 1911)
Action is driven by reward (pleasant outcome).
Skinner Box (Operant chamber)
Allows for a variety of operant conditioning paradigms.
Lights and speakers: stimuli that prompt an action
Lever for responses
Food dispenser – appetitive stimuli/rewards: outcomes (reward)
Electrified grid – aversive stimuli/punishment: outcomes (punishment)
Used with rodents, which respond very well to these paradigms.
Skinner’s Terminology: Reinforcer
an event that increases the likelihood of the action
Skinner’s Terminology: Punishment
an event that decreases the likelihood of the action (it discourages you from doing something again).
Skinner’s terminology: Positive
Something has been introduced
Skinner’s Terminology: Negative
Something has been removed.
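The four terminology cards above combine into a 2 x 2 grid: positive or negative (something introduced or removed) crossed with reinforcer or punishment (action becomes more or less likely). A minimal sketch of that grid follows; the function name `classify_consequence` and its string arguments are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus: str, effect: str) -> str:
    """Name a consequence using Skinner's terminology.

    stimulus: "added" (positive) or "removed" (negative)
    effect:   "increases" (reinforcement) or "decreases" (punishment)
    Both argument vocabularies are assumptions made for this sketch.
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[effect]
    return f"{sign} {kind}"


# Food delivered after a lever press, and lever pressing goes up:
print(classify_consequence("added", "increases"))
# Shock switched off after a lever press, and lever pressing goes up:
print(classify_consequence("removed", "increases"))
```

Note that "negative" does not mean "bad" here: negative reinforcement still increases the behaviour, because something aversive is removed.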
Punishment
Decreases Behaviour
Less beneficial than Reinforcement
Temporary changes in behaviour – based on coercion
Creates negative/adversarial relationship
When the person who provides punishment leaves, the unwanted behaviour returns
Reinforcement
Increases Behaviour
More beneficial than punishment
More likely to result in long-term changes in behaviour
Creates positive relationship with the person providing reinforcement
Classical conditioning: partial reinforcement
Intersperse trials in which the CS is not followed by the US. This is done randomly so that the CS is followed by the US with a certain probability (here 75%). Slows down both acquisition and extinction learning.
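A minimal sketch of how such a trial sequence could be generated, assuming each trial is an independent random draw. The function name `partial_reinforcement_trials` and its parameters are hypothetical, and the fixed seed is there only to make the sequence reproducible.

```python
import random


def partial_reinforcement_trials(n_trials: int, p_us: float, seed: int = 0) -> list[bool]:
    """Return one boolean per trial: True if the CS is followed by the US.

    Assumes independent trials, each paired with probability p_us.
    """
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [rng.random() < p_us for _ in range(n_trials)]


trials = partial_reinforcement_trials(n_trials=20, p_us=0.75)
print(trials)
print("CS-US pairings:", sum(trials), "of", len(trials))
```

With `p_us=1.0` every trial is paired (continuous reinforcement), and with `p_us=0.0` none are, which corresponds to extinction trials.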
Partial reinforcement: reinforcement schedules
Responses are sometimes reinforced and sometimes not.
Slower initial learning, but greater resistance to extinction
Because reinforcement does not follow every behaviour, it takes longer for the learner to notice that the reward has stopped, so extinction is slower.
Fixed ratio
Behaviour is reinforced after a specific number of responses (e.g. giving a child a sweet after they read 5 pages of a book).
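As a rough illustration of the fixed ratio idea, the sketch below marks which responses earn a reinforcer when every `ratio`-th response is rewarded. The function name `fixed_ratio_schedule` is hypothetical, and the sketch assumes the response count is never reset.

```python
def fixed_ratio_schedule(n_responses: int, ratio: int) -> list[bool]:
    """Return one boolean per response: True if that response is reinforced.

    Under a fixed ratio schedule, every `ratio`-th response is reinforced.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    # Responses are counted from 1, so the 5th, 10th, ... are reinforced for ratio=5.
    return [count % ratio == 0 for count in range(1, n_responses + 1)]


# A child reads 10 pages and gets a sweet after every 5th page.
pages = fixed_ratio_schedule(n_responses=10, ratio=5)
print(pages)
print("sweets earned:", sum(pages))
```

A ratio of 1 reinforces every response, which is continuous rather than partial reinforcement.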