PSYC 361 MT2: OPERANT CONDITIONING (1) Operant Methods & Theories Of Rewards Flashcards
3 Key Elements in Instrumental Learning
1) Environment
2) Instrumental Behaviour
3) Consequence
Instrumental Learning Influences
- Timing of reward delivery
- Rules of reward delivery
- Type of rewards
- Other stimuli associated with rewards
Thorndike: The Law of Effect
- Devised puzzle box to study learning
- Connection formed between the stimulus (S; the lever) & the response (R; pressing it), strengthened over many trials as the cat pressed the lever
- Learning = incremental, not insightful
The Law of Effect- SITUATION
Responses that produce satisfying effect in a situation = more likely to occur again in that situation
Responses that produce discomforting effect in a situation = less likely to occur again in that situation
The Law of Effect- STIMULUS
If a response in the presence of a stimulus is followed by a satisfying event, the association between S & R is STRENGTHENED
If a response in the presence of a stimulus is followed by an annoying event, the association between S & R is WEAKENED
Behaviourism: SKINNER
- Studied learning from a behaviourist perspective
- Coined “Operant” = OPERAtes on the environmeNT
(Instrumental conditioning & operant conditioning interchangeable)
Operant Conditioning
Reinforcement: behaviour INCREASES when it produces an APPETITIVE stimulus
Punishment: behaviour DECREASES when it produces an AVERSIVE stimulus
Skinner: stimuli as reinforcers & punishers
- Reward vs. Reinforcer: a stimulus with attractive & motivational properties vs. a stimulus that strengthens (facilitates) the behaviour it follows
Operant Conditioning: +/- CONTINGENCY
Positive: action leads to presentation of stimulus
Negative: action leads to removal of stimulus
Schedules for Reinforcement
Rules for when & how frequently reinforcers are delivered
continuous reinforcement schedule (CRF): every response = reinforcer delivery
partial reinforcement schedule (PRF): only some responses are reinforced; varieties: ratio vs interval, fixed vs variable
Schedules for Reinforcement: PARTIAL REINFORCEMENT SCHEDULE (PRF)
Ratio Schedule: reinforcers delivered based on # of times response occurs
Interval Schedule: reinforcer delivered for the first response after a set amount of time has elapsed
Fixed vs Variable (see the sketch after this card):
- Fixed = the required # of responses / amount of time is constant & predictable
- Variable = the overall average is known, but the # of responses / amount of time for each reinforcer delivery is unpredictable
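As a rough illustration of how these rules differ, here is a minimal Python sketch (not from the course) simulating the four partial reinforcement schedules for a subject that responds once per time step. The class names and schedule parameters (ratio of 5, mean interval of 10 time steps) are arbitrary illustrative choices.

```python
import random

class FixedRatio:
    """FR-n: reinforce every n-th response (time is ignored)."""
    def __init__(self, n=5):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True          # reinforcer delivered
        return False

class VariableRatio:
    """VR-n: reinforce after a varying number of responses averaging n."""
    def __init__(self, n=5):
        self.n = n
        self.target = random.randint(1, 2 * n - 1)
        self.count = 0
    def respond(self, t):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = random.randint(1, 2 * self.n - 1)
            return True
        return False

class FixedInterval:
    """FI-t: reinforce the first response after t time steps have elapsed."""
    def __init__(self, interval=10):
        self.interval, self.available_at = interval, interval
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False

class VariableInterval:
    """VI-t: like FI, but the required interval varies around a mean of t."""
    def __init__(self, interval=10):
        self.interval = interval
        self.available_at = random.randint(1, 2 * interval - 1)
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + random.randint(1, 2 * self.interval - 1)
            return True
        return False

# Subject responds once per time step; count reinforcers under each schedule.
schedules = {"FR": FixedRatio(), "VR": VariableRatio(),
             "FI": FixedInterval(), "VI": VariableInterval()}
for name, sched in schedules.items():
    earned = sum(sched.respond(t) for t in range(1, 101))
    print(f"{name}: {earned} reinforcers in 100 responses")
```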
Fixed vs Variable: RESPONDING PATTERNS TO SCHEDULES
VR: steady & robust responding (typically produces the strongest responding)
FR: post-reinforcement pause & ratio run
VI: steady & stable responding
FI: fixed-interval scallop
Extinction
Conditioned response diminishes due to lack of reinforcement; the rate of extinction depends on the previous reinforcement schedule (e.g., slot machines operate on VR, which makes responding highly resistant to extinction)
- learning process; actions no longer produce rewards
- Adaptive: saves energy by reducing unnecessary behaviour
Primary vs Secondary Reinforcers
Primary: often biologically essential (food, water)
Secondary: stimuli previously paired with a primary reinforcer become reinforcing in their own right, aka Conditioned Reinforcers (lever, clicker, voucher)
4 Different Functions of Secondary (Conditioned) Reinforcers
- Reinforcing the learning of new responses
- Establishing & maintaining schedules of reinforcement
- Maintaining behaviour during extinction
- Mediating delays between response & delivery of reinforcement
Timing of Reinforcer Delivery
Temporal Contiguity: how soon reinforcer follows response
Immediate reinforcer delivery = max learning
Delays in reinforcer delivery discount its reinforcing effect
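One common way this discounting is modelled (an assumption added here, not stated in the notes) is the hyperbolic form V = A / (1 + kD): a reinforcer of size A delayed by D time units is worth less the longer the delay. A minimal sketch, with an arbitrary discounting rate k:

```python
# Hyperbolic delay discounting sketch: V = A / (1 + k * D).
# The formula and the k value are illustrative assumptions, not course material.

def discounted_value(amount, delay, k=0.1):
    """Subjective value of a reinforcer of size `amount` delayed by `delay`."""
    return amount / (1 + k * delay)

# The same reinforcer loses subjective value as the delay grows.
for delay in (0, 1, 5, 10, 30):
    print(f"delay={delay:>2}  value={discounted_value(10, delay):.2f}")
```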