7.3 - Operant conditioning: Reinforcements from the Environment Flashcards
Operant conditioning
Learning in which the consequences of an organism's behavior determine whether the behavior will be repeated in the future (active behaviors)
Instrumental behaviors
Behaviors that require an organism to do something, such as solve a problem or manipulate elements of the environment
Law of effect
Behaviors followed by a satisfying state of affairs tend to be repeated; those that produce an unpleasant state of affairs are less likely to be repeated
Operant behavior
Behavior that an organism performs that has an impact on the environment
- The environment responds by providing events that either strengthen the behavior (reinforce) or make it less likely to occur (punish)
Reinforcer
Any stimulus or event that increases the likelihood of the behavior that led to it
Punisher
Any stimulus or event that decreases the likelihood of the behavior that led to it
Positive vs negative
Does not mean good vs bad; positive means a stimulus is presented, negative means a stimulus is removed
Positive reinforcement
Stimulus is presented; increases likelihood of behavior
Negative reinforcement
Stimulus is removed; increases likelihood of behavior
Positive punishment
Stimulus is administered; decreases likelihood of behavior
Negative punishment
Stimulus is removed; decreases likelihood of behavior
1) If I present food and a behavior continues, this is ...
Positive reinforcement
2) If I turn on an electric shock and the behavior stops, this is ...
Positive punishment
3) If I turn off an electric shock and the behavior continues, this is ...
Negative reinforcement
4) If I remove food and a behavior stops, this is ...
Negative punishment
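The four example cards above follow a two-by-two rule: whether the stimulus is presented or removed gives "positive" or "negative", and whether the behavior becomes more or less likely gives "reinforcement" or "punishment". A minimal Python sketch of that rule follows; the function name `classify_consequence` and the string labels it accepts are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant-conditioning category for one consequence.

    stimulus_change: "presented" or "removed" (hypothetical labels)
    behavior_change: "increases" or "decreases" (hypothetical labels)
    """
    # Positive/negative describes what happens to the stimulus, not good/bad.
    sign = {"presented": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes what happens to the behavior.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# The four example cards, in order.
examples = [
    ("presented", "increases"),  # food presented, behavior continues
    ("presented", "decreases"),  # shock turned on, behavior stops
    ("removed", "increases"),    # shock turned off, behavior continues
    ("removed", "decreases"),    # food removed, behavior stops
]
for stimulus, behavior in examples:
    print(classify_consequence(stimulus, behavior))
```

Keeping the two questions in separate lookups mirrors the point of the cards: the sign and the effect on behavior are independent of each other.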
Why is reinforcement better than punishment for learning a desired behavior?
Punishment signals bad behavior but does not promote learning about the desired behavior
Primary reinforcer
Satisfies biological needs
Secondary reinforcers
Associated with primary reinforcers through classical conditioning; e.g. money, police lights
Effectiveness of the reinforcer or punisher
- Depends on the amount of time between the occurrence of a behavior and the reinforcer/punisher; the more time that passes, the less effective the reinforcer/punisher is
Learning takes place in contexts: three-term contingency
In the presence of a discriminative stimulus (e.g. classmates in a Starbucks drinking coffee), a response (e.g. joking comments about prof) produces a reinforcer (e.g. laughter among classmates)
What about extinction?
Different from extinction in classical conditioning because it depends on how often reinforcement is received
Schedules of reinforcement are important
Interval schedules: based on the time interval between reinforcements
Ratio schedules: based on the ratio of responses to reinforcements
Fixed interval
Reinforcers are presented at fixed time periods, provided the appropriate response has been made; e.g. every 2 minutes; studying only right before an exam
Variable interval
Behavior is reinforced based on an average amount of time since the last reinforcement; e.g. winning averages out to once an hour, but not at the same time within each hour
Fixed ratio
Reinforcement is delivered after a specific number of responses has been made; e.g. every 20th response
Variable ratio
Delivery of reinforcement is based on an average number of responses; e.g. slot machines - pay out every 100 pulls on average, but could be on the 3rd pull or the 80th pull
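The four schedules above can be compared side by side with a small simulation. This is a rough Python sketch under stated assumptions, not a standard model: the function names are hypothetical, the variable-ratio schedule is approximated by reinforcing each response with probability 1/mean, and the variable-interval schedule draws each waiting time from an exponential distribution with the given mean.

```python
import random


def fixed_ratio(n_responses: int, ratio: int) -> list[int]:
    """Response numbers that earn a reinforcer when every `ratio`-th response pays."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses: int, mean_ratio: int, rng: random.Random) -> list[int]:
    """Assumption: each response is reinforced with probability 1/mean_ratio,
    so reinforcers arrive after `mean_ratio` responses on average."""
    return [i for i in range(1, n_responses + 1) if rng.random() < 1 / mean_ratio]


def fixed_interval(response_times: list[float], interval: float) -> list[float]:
    """Reinforce the first response made once `interval` minutes have
    passed since the last reinforcer (or since time 0)."""
    reinforced, last = [], 0.0
    for t in sorted(response_times):
        if t - last >= interval:
            reinforced.append(t)
            last = t
    return reinforced


def variable_interval(response_times: list[float], mean_interval: float,
                      rng: random.Random) -> list[float]:
    """Like fixed_interval, but the required wait is redrawn after each
    reinforcer. Assumption: waits are exponential with mean `mean_interval`."""
    reinforced, last = [], 0.0
    wait = rng.expovariate(1 / mean_interval)
    for t in sorted(response_times):
        if t - last >= wait:
            reinforced.append(t)
            last = t
            wait = rng.expovariate(1 / mean_interval)
    return reinforced


rng = random.Random(0)  # seeded so repeated runs give the same draws
print(fixed_ratio(60, 20))                                    # [20, 40, 60]
print(fixed_interval([0.5, 1.0, 2.5, 3.0, 5.0, 5.5], 2.0))    # [2.5, 5.0]
print(len(variable_ratio(10_000, 100, rng)))                  # roughly 100 payouts
print(variable_interval([float(m) for m in range(1, 61)], 10.0, rng))
```

In the fixed schedules the reinforced responses are fully predictable, while in the variable schedules only the long-run average is fixed, which matches the slot-machine example on the card above.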
Continuous reinforcement
Reinforcement after each response
Intermittent reinforcement
Only some of the responses are followed by a reinforcement
Intermittent reinforcement effect
Behaviors reinforced intermittently are more resistant to extinction than behaviors reinforced continuously; e.g. playing a slot machine
Shaping
Learning that results from the reinforcement of small steps toward the final desired behavior; rewarding successive approximations