Chapter 10 - Schedules of Reinforcement Flashcards
A rule describing the delivery of reinforcers for a behaviour
Schedule of reinforcement
The distinctive rate and pattern of responding associated with a particular reinforcement schedule
Schedule effects
A reinforcement schedule in which a behaviour is reinforced each time it occurs
Continuous reinforcement CRF
Example: a rat receives food every time it presses a lever
Any of several reinforcement schedules in which a behaviour is sometimes reinforced
Intermittent schedule
Also called partial reinforcement
A reinforcement schedule in which every nth performance of a behaviour is reinforced
A behaviour is reinforced when it has occurred a fixed number of times
Fixed ratio schedule FR
Example: every third lever press is reinforced
What response pattern does a fixed ratio schedule typically generate?
Animals on fixed ratio schedules perform at a high rate, often punctuated by short pauses after reinforcement
A pause in responding following reinforcement; associated primarily with FI and FR schedules
Post-reinforcement pauses
The rate at which a behaviour occurs once it has resumed following reinforcement
Run rate
Increasing the ratio of lever presses to reinforcers does not change how rapidly a rat presses once it has resumed lever pressing, but does increase the length of the breaks the rat takes after each reinforcement
A reinforcement schedule in which, on average, every nth performance of a behaviour is reinforced
Variable ratio schedule, VR
Example: in a VR5 schedule, reinforcement may occur after 2 to 10 lever presses, but the overall average will be one reinforcement for every five presses
Most gambling games are based on variable ratio schedules in which the payoffs occur after a variable and unpredictable number of responses
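The VR contingency above can be sketched in code. A minimal, hypothetical Python implementation (the class name and the uniform draw around the mean are assumptions, not from the text) makes the arithmetic concrete: reinforcement comes after a varying number of responses whose long-run average equals the schedule value.

```python
import random

class VariableRatioSchedule:
    """Sketch of a VR schedule (hypothetical implementation): each
    reinforcer is delivered after a randomly chosen number of responses
    whose long-run average equals `mean_ratio`."""

    def __init__(self, mean_ratio=5, spread=3, seed=None):
        self.rng = random.Random(seed)
        self.mean_ratio = mean_ratio
        self.spread = spread
        self._next_requirement()

    def _next_requirement(self):
        # Draw the required response count uniformly around the mean,
        # e.g. 2..8 for a VR 5 with spread 3 (average = 5).
        self.required = self.rng.randint(self.mean_ratio - self.spread,
                                         self.mean_ratio + self.spread)
        self.count = 0

    def respond(self):
        """Record one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self._next_requirement()
            return True
        return False
```

Over many responses, presses per reinforcer converge on the mean ratio (here, one reinforcer per five presses on average), even though any single requirement is unpredictable.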
What response pattern does a variable ratio schedule typically generate?
Produces steady performance at run rates similar to comparable FR schedules. If post-reinforcement pauses occur, they usually appear less often and are shorter than in a comparable FR schedule. Post-reinforcement pauses are strongly influenced by the size of the average ratio and by the lowest ratio
A reinforcement schedule in which a behaviour is reinforced the first time it occurs following a specified interval since the last reinforcement
Fixed interval schedule, FI
Example: a pigeon on an FI 5” schedule will have food delivered into its tray the first time it pecks the disk, but for the next five seconds, disc pecking produces no reinforcement. Then, at the end of the five second interval, the very next disc peck is reinforced
Example: baking in the oven, waiting for a bus, studying
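The FI contingency described above reduces to a simple rule: a response is reinforced only if the specified interval has elapsed since the last reinforcement. A minimal sketch in Python (class and method names are hypothetical):

```python
class FixedIntervalSchedule:
    """Sketch of an FI schedule: the first response made after
    `interval` seconds have elapsed since the last reinforcement is
    reinforced; responses during the interval earn nothing."""

    def __init__(self, interval=5.0):
        self.interval = interval
        # No reinforcement yet, so the very first response is reinforced.
        self.last_reinforcement = float("-inf")

    def respond(self, t):
        """Record a response at time `t`; return True if reinforced."""
        if t - self.last_reinforcement >= self.interval:
            self.last_reinforcement = t
            return True
        return False
```

On an FI 5” schedule, a peck at t = 0 is reinforced, pecks during the next five seconds are not, and the first peck at or after t = 5 is reinforced again, matching the pigeon example.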
Identify the response pattern that a fixed interval schedule typically generates
Produces post-reinforcement pauses and a scalloped cumulative record.
A reinforcement schedule in which a behaviour is reinforced the first time it occurs following an interval since the last reinforcement, with the interval varying around a specified average
Variable interval schedule, VI
In a VI 5” schedule, the average interval between reinforced pecks is five seconds.
Examples, human hunters lying in wait for game.
Identify the response pattern that a variable interval schedule typically generates
Produces high, steady run rates, higher than FI schedules, but not as high as comparable FR and VR schedules
A reinforcement schedule in which reinforcement is contingent on the continuous performance of a behaviour for a fixed period of time
Fixed duration schedule, FD
Example: a child who is required to practise the piano for half an hour; at the end of practice, provided the child has practised the entire time, he receives a reinforcer such as a cookie
A reinforcement schedule in which reinforcement is contingent on the continuous performance of a behaviour for a period of time, with the length of time varying around an average
Variable duration, VD
Example: for the child who is practising piano, any given session might end after 30 minutes, 55 minutes, 20 minutes, or 10 minutes. On average, the student will practise for half an hour before receiving the milk and cookies, but there is no telling when the reinforcers will appear
A reinforcement schedule in which a behaviour is reinforced only if a specified period of time has elapsed since the last performance of that behaviour
Differential reinforcement of low rate, DRL
Example: a rat might receive food for pressing a lever, but only if five seconds have elapsed since the last lever press. The interval begins each time the behaviour is performed.
In a DRL 5” schedule, a pigeon that pecks a disk receives reinforcement only if five seconds have elapsed since the last disc peck
Identify the response pattern that a differential-reinforcement-of-low-rate schedule typically generates
Produces extremely low rates of behaviour. Sometimes results in the performance of a series of behaviours that are quite irrelevant to reinforcement, and this behaviour may be superstitious
A form of differential reinforcement in which a behaviour is reinforced only if it occurs at least a specified number of times in a given period.
Differential reinforcement of high rate, DRH
Example: a pigeon might be required to peck a disc five times in a 10 second period. If it pecks fewer than five times during that period, it receives nothing
Identify the response pattern that a differential reinforcement of high rate schedule typically generates
Can produce extremely high rates of behavior, higher than any other schedule
A reinforcement schedule in which reinforcement is delivered independently of behaviour at fixed intervals
Fixed time schedule, FT
Example: in an FI 10” schedule, a pigeon may receive food after a 10 second interval, but only if it pecks a disk. Whereas in an FT 10” schedule, the pigeon receives food every 10 seconds whether it pecks the disk or not
Not common outside the laboratory. Unemployment compensation and welfare payments come close to meeting the definition of fixed time schedules
A reinforcement schedule in which reinforcement is delivered at varying intervals regardless of what the organism does
Variable time schedule, VT
The only difference between VT schedules and FT schedules is that in VT schedules the reinforcer is delivered at intervals that vary about some average
Have been used to establish superstitious behaviour. Example: periodically a sport fisherman gets lucky, and the use of a particular lure, kind of bait, or method of casting is coincidentally reinforced
The procedure of gradually increasing the number of responses required for reinforcement
Stretching the ratio
Also referred to as thinning the schedule
With the procedure known as stretching the ratio, successive approximations of the desired behaviour are reinforced; it is essentially the same shaping process used to shape any new behaviour
The experimenter gradually shapes persistence in this way
Example: card sharks and pool hustlers sometimes let their competitors win frequently during the early stages of play and then gradually win more and more of the games
Disruption of the pattern of responding due to stretching the ratio of reinforcement too abruptly or too far
Ratio strain
Example: workers who complain about being overworked and under paid and who shirk their responsibilities
The density or frequency of a reinforcement schedule is a continuum. What extremes are on either end?
At one extreme we find continuous reinforcement, an FR 1 schedule in which every single occurrence of a behaviour is reinforced. At the other extreme we find extinction, a schedule in which a behaviour is never reinforced
The tendency of a behaviour to be more resistant to extinction following partial reinforcement than following continuous reinforcement.
Partial reinforcement effect, or PRE
In a rat lever-pressing study, the thinner the reinforcement schedule before extinction, the greater the number of lever presses during extinction
Why does the author state that the partial-reinforcement effect is paradoxical?
It is paradoxical since the law of effect implies that the unreinforced lever presses that occur during an intermittent schedule should weaken the tendency to press, not make it stronger.
The proposal that the PRE occurs because it is harder to discriminate between intermittent reinforcement and extinction than between continuous reinforcement and extinction
The discrimination hypothesis
Example: it takes longer to discriminate between extinction and an FR 30 schedule than it does to discriminate between extinction and an FR 1 schedule
The proposal that the PRE occurs because non-reinforcement is frustrating and during intermittent reinforcement frustration becomes an S+ for responding
Frustration hypothesis
The thinner the reinforcement schedule during training, the higher is the level of frustration when the rat finally receives food
The proposal that the PRE occurs because the sequence of reinforced and non-reinforced behaviours during intermittent reinforcement becomes an S+ for responding during extinction
Sequential hypothesis
Attributes the PRE to differences in the sequence of cues during training
Extinction proceeds rapidly after continuous reinforcement because an important cue for performing is missing
The thinner the reinforcement schedule, the more resistant the rat will be to extinction, since a long stretch of non-reinforced lever pressing has become the cue for continued pressing. In other words, the rat performs in the absence of reinforcement because, in the past, long strings of non-reinforced presses have reliably preceded reinforcement
The proposal that the PRE is due to differences in the definition of a behaviour during intermittent and continuous reinforcement
Response unit hypothesis
In an FR 2 schedule, where one press does nothing, but two presses produces food, we should not think of this as press-failure, press-reward, but rather as press-press-reward. The unit of behaviour being reinforced is two lever presses.
When responses are defined in terms of the units required for reinforcement, the total number of responses during extinction declines as the reinforcement schedule gets thinner. Behaviour on intermittent reinforcement only seems to be more resistant to extinction because we have failed to take into account the response units required for reinforcement
A complex reinforcement schedule in which two or more simple schedules alternate, with each schedule associated with a particular stimulus
Multiple schedule
Abbreviation: MULTI in front of the different schedules
Example: a pigeon is reinforced for pecking on an FI 10” schedule when a red light is on, but on a VR 10 schedule when a yellow light is on. The two reinforcement schedules alternate, with the changes indicated by changes in the colour of the light
A complex reinforcement schedule in which two or more simple schedules, neither associated with a particular stimulus, alternate
Mixed schedule
MIX FI 10” VR 10 schedule: disc pecking might be reinforced on an FI 10” schedule for 30 seconds and then on a VR 10 schedule for 60 seconds, but there is no clear indication that the schedule has changed
A complex reinforcement schedule that consists of a series of simple schedules, each of which is associated with a particular stimulus, with reinforcement delivered only on completion of the last schedule in the series
Chain schedule
A complex reinforcement schedule that consists of a series of simple schedules, with reinforcement delivered only on completion of the last schedule in the series. The simple schedules are not associated with different stimuli
Tandem schedule
A complex reinforcement schedule in which reinforcement is contingent on the behaviour of two or more individuals
Cooperative schedule
Two pigeons receive food by pecking a disk when the two of them have pecked a total of 20 times. One might peck the disc at the rate of 10 times a minute while the other pecks at 40 times a minute. As soon as the total number of pecks reaches 20, they each receive a few pieces of grain
A complex reinforcement schedule in which two or more simple schedules are available at the same time
Concurrent schedule
A pigeon may have the option of pecking a red disc on a VR 50 schedule, or pecking a yellow disc on a VR 20 schedule. The concurrent schedule involves a choice
The principle that, given the opportunity to respond on two or more reinforcement schedules, the rate of responding on each schedule will match the reinforcement available on each schedule
Matching law
Why is it a poor strategy to switch back-and-forth between two different ratio schedules of reinforcement? Or, why is it a good strategy to stay with one ratio schedule?
It makes sense to identify the more reinforcing schedule as quickly as possible and remain loyal to it. Switching back and forth between two ratio schedules is pointless; the task is to discriminate which schedule is denser and stay with it
Why is it a good strategy to switch between two different interval schedules of reinforcement?
There will be periods during which lever pressing is useless. Some of this time could be spent pressing the lever on the other schedule. It therefore makes sense for the animal to devote most of its effort to the more dense schedule but occasionally press the lever on the thinner schedule
What is Herrnstein’s formula that predicts choice in a two-choice situation? Describe each of the terms in the equation, and explain the overall meaning of the equation
Ba/(Ba + Bb) = rA/(rA + rB)
Ba and Bb represent two behaviours, behaviour A and behaviour B, and rA and rB represent the reinforcement rates for behaviours A and B, respectively.
This equation is merely a reformulation of the matching law
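As a check on the arithmetic, the two-choice formula can be written as a one-line function (a sketch; the function name is an assumption):

```python
def matching_law(rA, rB):
    """Herrnstein's matching law for two alternatives: the predicted
    proportion of responses allocated to behaviour A equals the
    proportion of total reinforcement obtained from A."""
    return rA / (rA + rB)
```

If schedule A yields reinforcers at twice the rate of schedule B, the law predicts that two-thirds of responses go to A; equal rates predict an even split.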
What is Herrnstein’s formula that predicts choice in a multiple-choice situation? Describe each of the terms in the equation, and explain the overall meaning of the equation
Ba/(Ba + Bo) = rA/(rA + rO)
Ba represents the particular behaviour we are studying, and Bo represents all other behaviours; rA represents the reinforcers available for Ba, and rO represents the reinforcers available for all other behaviours
This formula has less predictive value than the formula for the two-choice situation, because it is not possible to specify all the behaviours that may occur, nor all the reinforcers those acts may produce
Reminds us that behaviour is a function of the reinforcers available for any behaviour that might occur, not merely the reinforcers available for the behaviour that interests us at the moment
Cite and recognize original examples of the matching law describing human behaviour
A farmer may devote most available farmland to a crop that produces a nice profit under typical weather conditions, while planting a smaller area with a less profitable crop that does well under adverse weather conditions.
When we spend more time at a high-paying job than at a low-paying one
When college students devote more time to a five-credit course than to a one-credit course, since a high grade in the former pays better than a high grade in the latter
What type of reinforcement schedule maintains gambling behavior? Explain how and why “early wins” and “near misses” can lead to compulsive gambling. If you were the owner of a gambling enterprise, how could you use this knowledge to make more money?
The payoff in most games of chance resembles variable ratio schedules of reinforcement, and such schedules can produce high rates of behaviour that are highly resistant to change
Momentary variations in schedules that produce early wins and near misses can lead to compulsive gambling: the person is reinforced early on, and near misses resemble genuine wins closely enough to help sustain responding
If I owned a gambling enterprise, I would put my machines on a variable ratio schedule, and have people win during their first few bets, and after that have them almost win every once in a while
The use of reinforcement schedules, among other techniques, to study economic principles
Experimental or behavioural economics
Describe the ways in which reinforcement schedules have been studied in experimental or behavioural economics
Economists know that when the price of a luxury item rises, the consumption of that item declines, but when the price of an essential item, such as food, rises, there is little change in consumption. The same phenomenon has been demonstrated in rats. Rats will work for psychoactive drugs, a luxury, but increases in the price of the drug (the number of lever presses required for a dose) usually result in decreased consumption; yet large increases in the price of food, an essential, do not lower consumption substantially
Psychiatric patients who earn tokens for performing various tasks and exchange them for cigarettes and other items are given a choice between activities for which tokens are available (e.g., doing laundry) and other activities for which reinforcers are available (e.g., watching TV). The distribution of the tokens among patients resembled the distribution of wealth in the United States population: those in the top 20% held a total of 41% of all tokens, while those in the bottom 20% held only 7%.
Explain how Goldberg and Cheney set up an experimental analogue for malingering. What implications does this animal study have for human behavior?
Tested the idea that operant behaviour associated with chronic pain may be maintained by reinforcement after the pain has ceased. Pairs of rats worked on a cooperative schedule; exposing one rat to a mild electric shock produced an abrupt reduction in the amount of work done by the rat in pain. These rats continued to press the lever, but at a much lower rate than during baseline.
After the shock was terminated, the rats continued to work at a lower pace, only gradually increasing their share of the workload. This is interesting because the slower rate of work reduced the amount of food both rats received. Although the partner rat could take up the slack, it necessarily took longer to reach the 50 lever presses required for reinforcement when one rat did little work
Suggests that there is good reason to believe that people may malinger if others are willing to press the lever more often to make up for someone who appears to be hurting. Malingering may occur even though everyone, including the malingerer, loses by it
What criticisms have been made of research in reinforcement schedules?
Argue that the schedules of reinforcement studied in the laboratory are artificial constructions not found in the real world
Complain that schedules research generally produces trivial findings
Complain that reinforcement schedules reveal considerably more about rats and pigeons than they do about people.
Why is examining the effects of schedules preferable to explaining behaviour in terms of personality characteristics and state of mind?
These kinds of explanations merely name the behavior to be explained. Identifying the kinds of reinforcement schedules that produce these behaviours is a considerable advance
What other advantages accrue from research into schedules of reinforcement
When the goal is to discover rules that describe the way the environment affects behaviour, it is difficult, if not impossible, to discover such rules unless the experimenter simplifies the environment, as experimenters do in research into schedules of reinforcement
Allows us to answer questions that might be difficult to answer otherwise
Gives us a more scientific way of accounting for differences in behaviour
Studies with humans sometimes reveal patterns of behaviour different from those obtained with animals; the only reason may be that human subjects often receive instructions about what they are to do, and instructions have a powerful effect on human behaviour
Provides a very good way of testing the effects of variables on behaviour
How can reinforcement schedules be used as a baseline to study the effects of different independent variables on behavior?
Example: researchers trained rats to press a lever to gain access to an exercise wheel and later administered cocaine in varying amounts 10 minutes prior to the test sessions. The cocaine had no detectable influence on the pattern of behaviour until the researchers reached a dosage of 16 mg per kilogram of body weight. At this level, the scalloped pattern of the FI schedule began to deteriorate. In much the same way, researchers have used schedules as a basis for comparing the effects of alcohol and cocaine on human performance. Schedule performance can provide a baseline for evaluating the effects of toxins, diet, sleep deprivation, exercise, brain stimulation, and many other variables
Provide and recognize original examples of learning in which no new behaviour is acquired
Example: an increase in the rate of behavior.
A pigeon that turns in counterclockwise circles at the rate of three or four a minute may learn to make 10 or 15 turns a minute
Example: a reduction in the rate of behaviour
A bird that turns counterclockwise 10 times a minute can learn to make one turn a minute
Example: a change in the pattern of performance as well as the rate
The cook learns to avoid opening the oven in the first few minutes but to check on the cookies more and more often during the last few minutes of baking time