8.3 Flashcards
Operant conditioning
association between a behaviour and its consequences
Distinguish between operant conditioning and classical conditioning
- OC = association between a behaviour and its consequences, whereas CC = association between 2 stimuli
- OC generally works best for VOLUNTARY behaviours, CC for INVOLUNTARY behaviours
Conditioned reinforcer: how does the reinforcer gain value?
a reinforcer that gains value from its association with other items of value
What are the two types of consequences?
reinforcement: increases the tendency that the goal behaviour* will occur again
- positive
- negative
punishment: decreases the tendency that a behaviour will occur again
- positive
- negative
*there is usually a goal behind the consequence, e.g. promoting safe driving
Positive reinforcement
something added to increase the tendency the behaviour will occur again
=> rewarded with a gift card
Negative reinforcement
something taken away to increase the tendency the behaviour will occur again
=> fastening the seat belt takes away the buzzing sound
Positive punishment
something added to decrease tendency that a behaviour will occur again (e.g. speeding)
=> a speeding ticket adds a fine
Negative punishment
something taken away to decrease tendency that a behaviour will occur again
=> take away their license
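The four consequence types above follow a 2x2 rule: added vs taken away, and increase vs decrease. A minimal sketch of that rule, assuming a made-up helper name (classify_consequence) and made-up string labels that are only for illustration:

```python
def classify_consequence(stimulus: str, effect: str) -> str:
    """Name the consequence type.

    stimulus: "added" or "taken away" (what happens after the behaviour)
    effect:   "increase" or "decrease" (change in the tendency of the behaviour)
    These labels are illustrative assumptions, not standard terminology codes.
    """
    sign = {"added": "positive", "taken away": "negative"}[stimulus]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[effect]
    return f"{sign} {kind}"


# The driving examples from the cards above, written as (stimulus, effect) pairs.
driving_examples = {
    "gift card for safe driving": ("added", "increase"),
    "buzzing stops when seat belt is fastened": ("taken away", "increase"),
    "fine for speeding": ("added", "decrease"),
    "license taken away": ("taken away", "decrease"),
}

labels = {name: classify_consequence(*pair) for name, pair in driving_examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Positive/negative only says whether something is added or taken away; reinforcement/punishment only says whether the behaviour goes up or down.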
Extinction: OC vs CC
OC: the behaviour is no longer associated with a reward or punishment
CC: new learning overriding the old learning
Partial reinforcement
reinforcement of a DESIRED behaviour on some occasions, but not others
Fixed interval
- requires waiting a specific amount of time, e.g. salary once a month
- no matter the effort, still being paid at the end of the month
=> response rate starts out low, but increases as the time for reinforcement gets near = scalloped response pattern
Fixed ratio (FR)
- reinforcement after a fixed number of responses (not time)
=> produces high rates of responding, but responding declines immediately after the reinforcement is received (post-reinforcement pause)
Variable interval
- reinforcement after an unpredictable amount of time
=> steady response rate
Variable ratio
- reinforcement after an unpredictable number of responses
(e.g. slot machine)
=> high and steady response rate
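The four schedules above differ only in what triggers the reinforcer: a count of responses (ratio) or elapsed time (interval), fixed or unpredictable. A rough sketch of the delivery rules, assuming one response per time unit, made-up function names, and uniform random draws for the "variable" schedules (all illustrative choices). It models only when the reinforcer is delivered, not how the learner's response rate changes. The interval schedules here reinforce the first response after the wait is over, which is the usual textbook definition.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (average mean_n)."""
    def draw():
        return rng.randint(1, 2 * mean_n - 1)  # assumed distribution

    count, target = 0, draw()

    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, draw()
            return True
        return False

    return respond


def fixed_interval(interval):
    """Reinforce the first response made once `interval` time units have passed."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Reinforce the first response after an unpredictable wait (average mean_interval)."""
    def draw():
        return rng.uniform(0, 2 * mean_interval)  # assumed distribution

    last, wait = 0.0, draw()

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, draw()
            return True
        return False

    return respond


# One response per time unit, t = 1..20.
fr = fixed_ratio(5)
fr_rewards = [t for t in range(1, 21) if fr(t)]      # [5, 10, 15, 20]
fi = fixed_interval(7)
fi_rewards = [t for t in range(1, 21) if fi(t)]      # [7, 14]

rng = random.Random(0)
vr = variable_ratio(5, rng)
vr_count = sum(vr(t) for t in range(1, 1001))        # varies with the seed
vi = variable_interval(5, rng)
vi_count = sum(vi(t) for t in range(1, 1001))        # varies with the seed
print(fr_rewards, fi_rewards, vr_count, vi_count)
```

On the ratio schedules, responding faster earns reinforcers sooner. On the interval schedules, extra responses before the wait is over earn nothing, which matches the "no matter the effort" point in the fixed interval card.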
Shaping (method of successive approximations): what do the reinforcers do?
reinforcers guide behaviour towards closer and closer approximations of the desired behaviour
- balance between too little and too much reinforcement
Shaping: learner does some ___ of ____. They receive ____ for that action.
sort, action
a consequence
Shaping: procedure uses only ______
reinforcement
=> rewards
=> reinforce behaviours u want to see w reward, but ignore the ones u don’t want (no punishment)
Shaping: in order to shape a behaviour, u need to start somewhere. u raise the bar each time and _____ until desired behaviour
nudge them closer each time until desired behaviour
=> training a tiger to jump thru a fire hoop (unnatural behaviour)
- reward for stepping thru hoop on the ground
- only reward jumping thru the hoop slightly above ground
- only reward jumping thru the hoop w a bit of fire
and so on… a little more complicated each time
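The hoop example can be sketched as a loop: reward only attempts that meet the current criterion, then raise the bar a little. This is a toy model under loud assumptions: the function name (shape), the hoop heights in cm, and the way the learner's "skill" moves after a reward are all invented for illustration, not a model from the course.

```python
import random


def shape(target, step, rng, max_trials=10_000):
    """Toy shaping loop (method of successive approximations).

    target: final criterion (e.g. hoop height in cm, hypothetical unit)
    step:   how much the bar is raised after each reinforced attempt
    Returns a list of (trial number, criterion that was met and rewarded).
    """
    criterion = step   # start somewhere easy
    skill = 0.0        # learner's typical performance (assumed model)
    history = []
    for trial in range(1, max_trials + 1):
        # Behaviour varies around the learner's current skill.
        attempt = skill + rng.uniform(-2 * step, 2 * step)
        if attempt >= criterion:
            # Reinforce: the rewarded behaviour becomes the new baseline.
            skill = attempt
            history.append((trial, criterion))
            if criterion >= target:
                return history
            criterion = min(target, criterion + step)  # raise the bar
        # Otherwise the attempt is ignored: no reward, no punishment.
    return history


history = shape(target=100, step=20, rng=random.Random(1))
criteria_met = [criterion for _, criterion in history]
print(criteria_met)  # criteria rewarded in order, lowest to highest
```

The step size is where the balance between too little and too much reinforcement shows up: a big step means the learner rarely meets the criterion and goes unrewarded for long stretches, while a tiny step means lots of rewards for very slow progress.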