Chapter 5: Operant Conditioning Flashcards
outcomes are ________ of an animal’s behaviour
consequences
Why is instrumental behaviour emitted?
Because it is effective in producing a particular consequence
Operant response?
a response defined by the effect it produces in the environment (goal-directed)
E. L. Thorndike
Puzzle box
– Hungry animal placed in box
– Food visible to animal
– Measured how long it took to escape the box on successive trials
Thorndike measured latency to get out of box on successive trials
Law of effect
If a response made in the presence of a stimulus is followed by a satisfying event, the association between the stimulus (S) and the response (R) is strengthened;
if the response is followed by an annoying event, the association is weakened
Standard tasks for examining instrumental behaviour: Discrete trial procedure
– Instrumental response produced once per trial
– Each training trial ends with removal of the animal from the apparatus
ex. Operant Devices
Mazes
Inspired by observing animals in nature (burrows of rats)
measure: Running speed, Latency, Correct choices
Standard tasks for examining instrumental behaviour: Free-operant procedure
– Animals remain in apparatus and can make many responses
– No intervention by the experimenter
– Developed by BF Skinner
– Need a unit to measure behaviour:
•Operant response: defined by the effect that the response produces on the environment
– Response rates are often the behavioural measure
Standard tasks for examining instrumental behaviour: Magazine training
– How you establish the operant response
– Animal must be shown how to “use” the experimental set up
– You need a series of training steps
– Involves classical conditioning
Examining instrumental behaviour
• Shaping
– Sequence of training steps
1) Light always occurs with food delivery
2) Light and tone occur with food delivery
3) Tone occurs with food delivery
• Light does not occur anymore
shaping takes advantage of response variability
Burrhus Frederic Skinner
shaping a pigeon to turn
Response Variability
Behaviour starts off highly variable
Successful variations are maintained;
unsuccessful variations are not
appetitive stimulus
pleasant outcome
aversive stimulus
unpleasant outcome
instrumental responses can
result in a stimulus
turn off a stimulus
Reinforcer
Reinforcer: An event that follows a behaviour and increases that behaviour
Punisher:
Punisher: An event that follows a behaviour and decreases that behaviour
Reinforcement:
increases responding
Punishment
decreases responding
positive
add something (appetitive or aversive)
negative
remove something (appetitive or aversive)
positive punishment
response-outcome contingency: response produces an aversive stimulus
result: decrease in response rate
positive reinforcement
response-outcome contingency: response produces an appetitive stimulus
result: increase in response rate
Negative reinforcement
response-outcome contingency: response eliminates or prevents the occurrence of an aversive stimulus
result: increase in response rate
Negative punishment
omission training
response-outcome contingency: response eliminates or prevents the occurrence of an appetitive stimulus
result: decrease in response rate
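The four response-outcome contingencies above form a 2 x 2 table (stimulus type by whether the response produces or removes it). A minimal Python sketch of that table follows; the function name and the string labels "produces" and "removes" are illustrative assumptions, not terms from the flashcards.

```python
def classify_contingency(stimulus: str, response_effect: str) -> tuple:
    """Return (procedure name, expected change in response rate).

    stimulus: "appetitive" or "aversive"
    response_effect: "produces" (positive) or "removes" (negative);
    "removes" covers both eliminating and preventing the stimulus.
    """
    table = {
        ("appetitive", "produces"): ("positive reinforcement", "increase"),
        ("aversive", "produces"): ("positive punishment", "decrease"),
        ("aversive", "removes"): ("negative reinforcement", "increase"),
        ("appetitive", "removes"): ("negative punishment (omission training)", "decrease"),
    }
    return table[(stimulus, response_effect)]
```

For example, classify_contingency("aversive", "removes") gives negative reinforcement with an increase in response rate.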
Omission training (DRO treatment) for Self-Injurious Behaviour
Reinforced with attention when not performing self-injurious behaviour
Reinforcer vs. Punisher is all based on the __________
target behaviour
Instrumental Conditioning: The Players
- Instrumental response
- Outcome of response (reinforcer)
- Response-reinforcer relation
Instrumental response
• Thorndike: “stamping in” of S-R association
• Skinner: “reinforcement” or strengthening of behaviour
• This language suggests that instrumental conditioning necessarily results in stereotyped behaviours
• Led to studies of Response Variability
Response Variability
Method:
– Pigeons in operant chamber
– Peck two response keys a total of 8 times
– No restriction on distribution of pecks between the two keys
– BUT, pattern of left/right pecks on a given trial had to be different from that on the previous 50 trials
– Only “novel” patterns reinforced
Summary:
• Response variability can be increased by reinforcement
• In the absence of reinforcement for variability, responding becomes more stereotyped
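The novelty rule in the method above (reinforce a pattern only if it differs from every pattern in the previous 50 trials) can be sketched in Python. This is a hedged illustration: the function names are hypothetical, and counting every trial in the history, reinforced or not, is an assumption the flashcards do not state.

```python
from collections import deque


def make_variability_schedule(lag: int = 50, pecks_per_trial: int = 8):
    """Return a function that decides whether each trial is reinforced."""
    history = deque(maxlen=lag)  # holds only the most recent `lag` patterns

    def run_trial(pattern: str) -> bool:
        # A pattern is a string such as "LLRRLRLR": one letter per key peck.
        if len(pattern) != pecks_per_trial or set(pattern) - {"L", "R"}:
            raise ValueError("pattern must be 8 pecks, each 'L' or 'R'")
        reinforced = pattern not in history  # novel relative to recent trials
        history.append(pattern)  # assumption: every trial enters the history
        return reinforced

    return run_trial
```

Calling run_trial twice in a row with the same pattern reinforces the first trial but not the second, which is what pushes responding toward variability.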
Relevance or Belongingness
• Remember the bright-noisy water Classical Conditioning experiment with rats?
– (taste with sickness, audiovisual with pain)
• Belongingness - Proposed by Thorndike
– Evolutionary history makes certain responses belong with certain reinforcers
– Pawing at door vs. yawning
The outcome or reinforcer
• (1) Quality and quantity
• (2) Previous experience with other reinforcers (shift in quality or quantity)
– Similar to the Rescorla-Wagner (R-W) model
• Larger than expected reinforcer (or US) supports excitatory conditioning
• Smaller than expected reinforcer (or US) supports inhibitory conditioning
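The comparison with R-W (the Rescorla-Wagner model) can be made concrete with its standard update rule, ΔV = αβ(λ − V), where V is what the animal currently expects and λ is what the reinforcer actually supports. A minimal Python sketch; the function name and the learning-rate value 0.2 are illustrative assumptions, not from the flashcards.

```python
def rw_update(v: float, lam: float, alpha_beta: float = 0.2) -> float:
    """Change in associative strength on one trial: alpha*beta*(lambda - V).

    v: current expectation (associative strength)
    lam: strength supported by the reinforcer actually received
    alpha_beta: combined learning-rate parameter (assumed value)
    """
    return alpha_beta * (lam - v)


# Reinforcer larger than expected: positive change (excitatory conditioning)
upshift = rw_update(v=0.5, lam=1.0)
# Reinforcer smaller than expected: negative change (inhibitory conditioning)
downshift = rw_update(v=0.5, lam=0.2)
```

The sign of the result depends only on whether the reinforcer is larger or smaller than expected, which matches the two bullets above.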
Manipulation of Quality vs. Quantity
• Independent groups of rats given flavoured water reinforcement
• More responses when reward is greater in quantity and/or quality
Response-reinforcer relation
Temporal relation
Causal relation
Temporal relation
Time between response and outcome
– Reinforcement is most effective if it happens immediately after the target response
Causal relation
The extent to which the response is necessary and sufficient for the occurrence of the reinforcer
– Reinforcement is most effective if it only happens after the target response
Effects of Temporal Contiguity
- Immediate reinforcement preferred
- Delays over 0.5 s can hinder performance
- Recent research suggests: can use delays up to 30 s
Effects of Delayed Reinforcement
Delay between response and food reinforcement in rats
No learning in 64-sec delay condition
How to overcome delay in reinforcement?
1. Conditioned (secondary) reinforcer
2. Marking procedure
Conditioned (secondary) reinforcer
A stimulus that becomes an effective reinforcer because of its association with a primary or unconditioned reinforcer (e.g., food)
helps overcome delay in reinforcement
Marking procedure
Instrumental response is immediately followed by a distinctive event that makes the instrumental response more memorable
helps overcome delay in reinforcement
Response–reinforcer contingency:
The relation of a response to a reinforcer, defined in terms of the probability of getting reinforced for making the response as compared to the probability of getting reinforced in the absence of the response
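One common way to express this comparison is as a difference between the two probabilities. The Python sketch below assumes that difference form; the flashcard only says the probabilities are "compared", and the function name is hypothetical.

```python
def contingency(p_reinf_given_response: float, p_reinf_given_no_response: float) -> float:
    """Difference between P(reinforcer | response) and P(reinforcer | no response).

    +1.0: reinforcer occurs only after the response (perfect positive contingency)
     0.0: reinforcer is equally likely with or without the response
    """
    for p in (p_reinf_given_response, p_reinf_given_no_response):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_reinf_given_response - p_reinf_given_no_response
```

For example, contingency(1.0, 0.0) is 1.0, while contingency(0.5, 0.5) is 0.0, meaning the response has no effect on whether the reinforcer arrives.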
Skinner’s Superstition Experiment
Pigeons given food at fixed time intervals, regardless of their behaviour (time-contingent reinforcement), exhibit “superstitious” behaviours
Behavioural Momentum
Reinforcers do more than raise response rates: they also strengthen the tendency to persist in a behaviour
Reinforcers: primary vs. secondary
Primary: US
Secondary: CS (e.g., money)
Reinforcers: intrinsic vs. extrinsic
Intrinsic: comes from inside the individual
Extrinsic: comes from outside (e.g., money)