Schedules of Reinforcement Flashcards

1
Q

How can someone learn without acquiring a new behaviour?

A

Sometimes learning consists of discovering when to apply a previously learned response, or how many times to apply it.

2
Q

Define a schedule of reinforcement

A

A rule describing the delivery of reinforcement.

3
Q

Define schedule effects

A

pattern and rate of performance produced by a particular reinforcement schedule.

4
Q

Define continuous reinforcement (CRF) schedules

A

a behaviour is reinforced every time it occurs

5
Q

What reinforcement schedule provides the most rapid learning of a new behaviour?

A

CRF or continuous reinforcement

6
Q

Define an intermittent reinforcement schedule.

A

when reinforcement occurs on some occasions but not others

7
Q

Define a fixed-ratio (FR #) schedule of reinforcement

A

A behaviour is reinforced after it has occurred a fixed number of times; the # gives the required number of responses (e.g., on FR 5 every fifth response is reinforced). The # can be changed as the student learns.

8
Q

What is the run rate of a reinforcement schedule?

A

rate at which the organism performs once it has resumed work after reinforcement

9
Q

Does the post-reinforcement pause affect the run rate?

A

No. The pause might be long, but once the animal resumes responding it performs the behaviour many times in quick succession.

10
Q

Define a variable-ratio (VR #) schedule of reinforcement

A

The number of responses required for reinforcement varies around an average.

E.g., on VR 5, the reinforcer might be given after anywhere from 2 to 10 lever pulls, but on average after every 5th pull.
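This isn't from the card, but the VR idea can be sketched as a short simulation. The function name, the uniform 2-to-8 requirement range, and the run length are all illustrative assumptions; a real VR schedule only needs the requirement to *average* out to the programmed ratio.

```python
import random

def vr_schedule(mean_ratio=5, responses=10_000, seed=0):
    """Count reinforcers earned over a run of responses on a VR schedule.

    The number of responses required for each reinforcer is drawn from a
    uniform range whose mean equals mean_ratio (an illustrative choice).
    """
    rng = random.Random(seed)
    reinforcers = 0
    since_last = 0
    required = rng.randint(2, 2 * mean_ratio - 2)  # e.g., 2..8 for VR 5
    for _ in range(responses):
        since_last += 1
        if since_last >= required:   # requirement met: deliver the reinforcer
            reinforcers += 1
            since_last = 0
            required = rng.randint(2, 2 * mean_ratio - 2)  # draw a new requirement
    return responses / reinforcers   # mean responses per reinforcer

print(vr_schedule())  # close to 5, the programmed average
```

Because only the *average* requirement is fixed, the learner cannot predict which response will pay off, which is what produces the characteristically high, steady response rates.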

11
Q

Define a fixed-interval (FI) schedule of reinforcement.

A

Behaviour is reinforced the first time it occurs after a constant interval.

12
Q

Define a variable-interval (VI) schedule of reinforcement.

A

Behaviour is reinforced the first time it occurs after an interval whose length varies around some average, such as an average of 5 seconds.

13
Q

Identify the response pattern that a variable-interval schedule typically generates.

A

Produces high, steady run rates - higher than FI, but not as high as FR and VR.

14
Q

Define a fixed-duration (FD) schedule of reinforcement.

A

Reinforcement is contingent on the continuous performance of a behaviour for some period of time.

15
Q

Define a variable-duration (VD) schedule of reinforcement.

A

The required period of performance varies around some average.
E.g., practicing an instrument

16
Q

Define a differential-reinforcement-of-low-rate (DRL) schedule

A

The learner is reinforced for going a certain period of time without performing the behaviour.

17
Q

Identify the response pattern that a differential-reinforcement-of-low-rate schedule typically generates.

A
  • This produces extremely low rates of behaviour.
  • It can also produce behaviours that are irrelevant to reinforcement: things the learner happens to do around the time of reinforcement, sometimes a whole series of them, may start being performed more frequently.
18
Q

Define a differential-reinforcement-of-high-rate (DRH) schedule

A

The behaviour must be performed a minimum number of times in a given period.

19
Q

What reinforcement schedule produces the highest rates of behaviour?

A

DRH

20
Q

What schedules are best for dealing with problem behaviour?

A
  • DRH (increasing desirable behaviour)
  • DRL (decreasing undesirable behaviour)

21
Q

Define a fixed-time (FT) schedule of reinforcement

A
  • The reinforcer is delivered after a given period of time without regard to behaviour.
  • They resemble FI schedules, but no behaviour is required for reinforcement.
22
Q

Which reinforcement schedules are examples of non-contingent reinforcement?

A

FT and VT

23
Q

Define a variable-time schedule of reinforcement.

A
  • Reinforcement is delivered periodically at irregular intervals without regard for the organism’s behaviour.
  • The only difference between VT and FT is that the VT reinforcer is delivered at intervals that vary about some average.
24
Q

What are variable-time schedules of reinforcement used for?

A

To establish superstitious behaviours

25
Q

What reinforcement schedules are common in nature?

A

VR, VI

26
Q

What is thinning of a reinforcement schedule?

A

Provide steady reinforcement and then gradually reduce it (aka stretching the ratio). This is done through shaping.

27
Q

What is ratio strain?

A

When the ratio is stretched too quickly/far and the tendency to perform the behaviour breaks down.

28
Q

The density (or frequency) of a reinforcement schedule is a continuum. What extremes are on either end?

A
  • One extreme is continuous reinforcement: an FR 1 schedule, in which every single occurrence of the behaviour is reinforced.
  • The other extreme is extinction: a schedule in which the behaviour is never reinforced.
29
Q

Define the partial-reinforcement effect (PRE)

A

The tendency of behaviour that has been maintained on an intermittent schedule to be more resistant to extinction than behaviour that has been on a continuous reinforcement schedule.

30
Q

Why is the partial-reinforcement effect paradoxical?

A

The law of effect implies that the unreinforced lever presses that occur during an intermittent schedule should weaken the tendency to press, not make it stronger.

31
Q

Describe the discrimination hypothesis of PRE

A

Extinction takes longer after intermittent reinforcement because it is harder to discriminate between extinction and an intermittent schedule than between extinction and continuous reinforcement. This hypothesis has not held up well empirically.

32
Q

Describe the frustration hypothesis of PRE

A

Nonreinforcement of a previously reinforced behaviour is frustrating, and anything that reduces that frustration is negatively reinforcing. On a continuous schedule, frustration can be reduced simply by abandoning the behaviour, so extinction is quick. On an intermittent schedule, however, the learner often performs the behaviour while frustrated and is then reinforced, so frustration itself becomes a cue for responding. During extinction, the frustration produced by nonreinforcement therefore keeps evoking the behaviour, making it slow to terminate.

33
Q

Describe the sequential hypothesis of PRE

A

The thinner the schedule, the more resistant the learner will be to extinction since a long stretch of nonreinforced behaviours has become the cue for performing the behaviour. In the past, long strings of nonreinforcement reliably preceded reinforcement.

34
Q

Describe the response unit hypothesis of PRE

A

When responses are defined in terms of the units required for reinforcement (e.g., on FR 5, five presses count as one response unit), the total number of response units during extinction actually declines as the schedule gets thinner. The apparent advantage of intermittent reinforcement disappears because we are no longer counting each response, but the unit of responses required for reinforcement. On this view the PRE doesn't truly exist: behaviour only seems more resistant to extinction because we have failed to take the response units into account. There is empirical support for this hypothesis.

35
Q

Define a multiple (MULT) schedule of reinforcement.

A

A behaviour is under the influence of 2+ simple schedules, each associated with a particular stimulus. The schedules alternate and there is an indicator that the schedule has changed (e.g., a light changing colour).

36
Q

Define a mixed (MIX) schedule of reinforcement.

A

Same as a multiple schedule except there are no stimuli associated with the change in reinforcement contingencies. There is no clear indication that the schedule has changed.

37
Q

Define a co-operative schedule of reinforcement

A

Reinforcement is dependent on the behaviour of 2+ individuals. The reinforcement that one subject gets is partly dependent on the behaviour of another subject.

38
Q

Define a concurrent schedule of reinforcement.

A

2+ schedules available at once. The subject has a choice of which schedule it wants to participate in.

39
Q

Why are concurrent schedules of reinforcement used for studying behaviour that involves a choice?

A

They’re used to study behaviour that involves choice because that schedule type requires the subject to decide which schedule to participate in. This can be especially useful with animals because they cannot verbalize reasoning behind their choices. The main interest is in the effect that the schedules have on the choices made, not the cognitions of the animal. This becomes particularly interesting when the alternatives are both reinforced and the only difference is the relative frequency of reinforcement. Most likely, the animal will go back and forth between the choices, but eventually settle on the task with the richer reinforcement schedule.

40
Q

What is the matching law?

A

The relative frequency of each behaviour equals the relative frequency of reinforcement available; or that the distribution of behaviours matches the availability of reinforcement. This is used to describe the effort devoted to each of two reinforcement schedules in a concurrent schedule.
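Written out (using the conventional symbols rather than anything from the card, so treat the notation as an assumption), the matching law for two concurrent schedules is:

```latex
\frac{B_1}{B_1 + B_2} = \frac{r_1}{r_1 + r_2}
```

where \(B_1\) and \(B_2\) are the rates of the two behaviours and \(r_1\) and \(r_2\) are the rates of reinforcement obtained from each.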

41
Q

Why is it a poor strategy to switch back and forth between two different ratio schedules of reinforcement?

A

On ratio schedules reinforcement depends only on the number of responses, so responses made on the leaner schedule are wasted. The subject does best by identifying the schedule with the most reinforcement and sticking with it, not participating in the other schedule at all.

42
Q

Why is it a good strategy to switch between two different interval schedules of reinforcement?

A

On interval schedules, reinforcement becomes available with the passage of time, so while the animal works on one schedule a reinforcer may become available on the other. Occasionally switching therefore picks up reinforcers that would otherwise be missed. The animal ends up working hard on the schedule with the most reinforcement while periodically sampling the leaner one.

43
Q

How do humans make use of matching law?

A

Humans benefit from being able to match behaviours to reinforcement.
E.g., a college student putting more time into a five-credit course than a one-credit course, because the five-credit course contributes more to the student's grade point average.

44
Q

What is the meaning behind Herrnstein’s formulas?

A

Each schedule receives responding in proportion to the reinforcement obtainable from it (i.e., the matching law).

45
Q

Explain how and why “early wins” and “near misses” can lead to compulsive gambling

A

Early wins (receiving a win/reward/reinforcer early into the gambling period) and near misses (making it look like the gambler was close to winning, but not quite) provide a reinforcement history that can encourage compulsive gambling because the person has experienced the “feel-good” feeling and wants it again, so they keep trying despite losing more than they ever won.

46
Q

Describe the ways in which reinforcement schedules have been studied in experimental (or behavioural) economics.

A
  • Experimental (or behavioural) economics is a newer field that draws on parallels between research on reinforcement schedules and economic behaviour.
  • It was found in rats that the more they had to work (i.e., press the lever) for luxury items (i.e., psychoactive drugs), the less they consumed, whereas for necessities like food, consumption did not differ regardless of the amount of work involved. This effect holds true in economics when work is replaced with cost.
  • Psychiatric patients were given tokens for performing various tasks; the tokens could be exchanged for candy, cigarettes, etc. Patients then chose (naturally) between activities that yielded tokens (i.e., making the bed, doing laundry) and those that yielded other reinforcers (watching television, sleeping). This is analogous to people choosing between working for pay and various leisure activities. Interestingly, the top 20% of patients earned 41% of all tokens whereas the bottom 20% held only 7%. This distribution of wealth closely approximated that of the general USA population at the time.
47
Q

Explain how Goldbert and Cheney set up an experimental analogue for malingering.

A

They tested the idea that operant behaviour associated with chronic pain may be maintained by reinforcement after the pain has ceased. They trained rats to press a lever on an FR 45 schedule, then switched to a cooperative schedule in which each rat had to press at least 5 times, with a combined total of 50 presses required for reinforcement. This meant that one rat could do far less work, provided its partner did more. During the baseline period both rats worked hard, but when one rat was mildly shocked (analogous to a person with chronic mild pain), there was an abrupt reduction in that rat's work. It continued to work, but far less than during baseline. In the next stage the shock was removed, yet the rat continued at the slower pace, only gradually increasing its share of the workload. The slower rate reduced the amount of food both rats received and made it take longer to earn. The study suggests that people may malinger if others are willing to pick up the slack for someone who appears to be hurting, even though everyone loses out as a result.

48
Q

What criticisms have been made of research in reinforcement schedules?

A
  • schedules of reinforcement studied in the lab are artificial constructions not found in the real world (i.e., lack external validity)
  • studies produce trivial findings
  • they don’t explain much about human behaviour since most studies are done with animals (although there is evidence to suggest that both animals and people are shaped by reinforcement contingencies)
49
Q

Why is examining the effects of schedules preferable to explaining behaviour in terms of personality characteristics and states of mind?

A

It goes beyond merely naming the behaviour (i.e., laziness, compulsive gambling, addicted to cigarettes) and provides an explanation based on one’s reinforcement history

50
Q

What advantages accrue from research into schedules of reinforcement?

A
  • allowing us to answer questions that might otherwise be difficult to answer such as whether amount or frequency of reinforcement is more important, which indicates what motivates people.
  • give us a more scientific way of accounting for differences in behaviour (i.e., rooted in reinforcement history for each individual)
  • good way of testing the effects of variables on behaviour, such as those of cocaine and alcohol on performance
51
Q

How can reinforcement schedules be used as a baseline to study the effects of different independent variables on behaviour?

A
  • They generate consistent, stable patterns of performance (aka steady states) that can be compared to performance when other variables are introduced.
  • Good for evaluating the effects of toxins, diet, sleep deprivation, exercise, brain stimulation and many other variables.