Chapter 5: Operant Reinforcement Flashcards

1
Q

The statement that behaviour is a function of its consequences. So called because the strength of a behaviour depends on its past effects on the environment. Implicit in the law is the notion that operant learning is an active process, since it is usually the behaviour of the organism that, directly or indirectly, produces the effect

A

Law of effect

2
Q

What did Thorndike speculate about reinforcement’s neural effect? What is this view called?

A

Thorndike speculated that reinforcement strengthened the bonds or connections between neurons, a view that became known as connectionism

3
Q

How does reinforcement of a response give that response momentum? Explain Nevin’s use of the metaphor of momentum to describe the effects of reinforcement

A

Just as a heavy ball rolling down a hill is less likely than a light ball to be stopped by an obstruction in its path, behaviour that has been reinforced many times is more likely to persist when obstructed in some way, for example when one confronts a series of failures

4
Q

In what way did Thorndike’s work depart from previous conceptions of the learning process? How did Page and Neuringer show that randomness is a reinforceable property of behavior?

A

Philosophers had long debated the role of hedonism, the tendency to seek pleasure and avoid pain, in behaviour. But Thorndike was the first person to show that behaviour is systematically strengthened or weakened by its consequences. Prior to Thorndike, learning was thought to be primarily a matter of reasoning; Thorndike shifted our attention from inside the organism to the external environment

Even the randomness of behaviour can be modified with reinforcement. Page and Neuringer provided reinforcers to pigeons for a series of eight key pecks, but only when the series was different from each of the previous 50 sequences. Under these conditions, the key-peck patterns became almost truly random
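
For illustration only, here is a minimal Python sketch of this kind of “lag” contingency. It assumes the criterion is simply that the current eight-response sequence must differ from each of the previous 50 sequences; the key labels, function name, and parameters are hypothetical and are not part of Page and Neuringer’s actual procedure.

    from collections import deque
    import random

    def lag_variability_session(n_trials=200, seq_len=8, lag=50, seed=1):
        """Deliver a 'reinforcer' only when a sequence of key pecks differs
        from each of the previous `lag` sequences."""
        rng = random.Random(seed)
        recent = deque(maxlen=lag)       # memory of the last `lag` sequences
        reinforced = 0
        for _ in range(n_trials):
            sequence = tuple(rng.choice("LR") for _ in range(seq_len))  # left/right pecks
            if sequence not in recent:   # novel relative to the last 50: reinforce
                reinforced += 1
            recent.append(sequence)      # remember the sequence either way
        return reinforced

    print(lag_variability_session())     # number of reinforced trials out of 200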

5
Q

Describe the essential components of a Skinner box. How did the Skinner box get its name?

A

Skinner designed an experimental chamber in which a food magazine could automatically drop a few pellets of food into a tray. After a rat became accustomed to the noise of the food magazine and readily ate food from the tray, Skinner installed a lever; when the rat pressed the lever, food fell into the tray. Under these conditions, the rate of lever pressing increased dramatically

Clark Hull, a psychologist at Yale University, dubbed the chamber the Skinner box; Skinner himself preferred the term operant chamber

6
Q

Any procedure in which a behaviour becomes stronger or weaker depending on its consequences. Also called instrumental learning

A

Operant learning

7
Q

How does operant conditioning differ from Pavlovian conditioning?

A

Operant learning is not S-R learning; the principal behaviour involved is not reflexive and is often complex. In operant learning the organism acts on the environment and changes it, and the change that’s produced strengthens or weakens that behavior. Whereas the organism undergoing Pavlovian conditioning may be described as passive, in operant learning the organism is necessarily active

8
Q

The procedure of providing consequences for a behaviour that increase or maintain the strength of that behavior.

A

Reinforcement

9
Q

Name the three essential features of reinforcement

A

  1. The behaviour must have a consequence
  2. The behaviour must increase in strength or occur more often
  3. The increase in strength must be the result of the consequence

10
Q

A reinforcement procedure in which a behaviour is followed by the presentation of, or an increase in the intensity of, a stimulus. Sometimes called reward training, although the term reward is problematic

A

Positive reinforcement

11
Q

A reinforcement procedure in which a behaviour is followed by the removal of, or a decrease in the intensity of, a stimulus. Sometimes called escape training

A

Negative reinforcement

12
Q

Because what reinforces behaviour in negative reinforcement is escaping from an aversive stimulus, this procedure is also called

A

Escape training

13
Q

An operant training procedure in which performance of a behaviour defines the end of the trial

A

Discrete trials procedure

Example: each time a cat escapes from a box, that marks the end of the trial

14
Q

An operant training procedure in which a behaviour may be repeated any number of times

A

Free operant procedure

Example: placing a rat in an operant chamber equipped with a lever. Pressing the lever might cause a bit of food to fall into a tray, but the rat is free to return to the lever and press it again and again

15
Q

Explain why scientists often simplify problems to study them. What are the advantages and disadvantages of this approach?

A

Laboratory researchers simplify problems so they can identify functional relationships between independent and dependent variables. If the relations so identified are valid, they will enable the researcher to predict and control the phenomenon in future experiments. They will also lead to hypotheses about how real-world problems may be solved

16
Q

Compare and contrast operant and Pavlovian conditioning. Describe the parallel Skinner drew between natural selection and reinforcement

A

The most important difference is that in Pavlovian conditioning one stimulus, the US, is contingent on another stimulus, the CS, whereas in operant learning, a stimulus, the reinforcing or punishing consequence, is contingent on a behavior.

The two also usually involve different kinds of behaviour: Pavlovian conditioning typically involves involuntary or reflexive behaviour, whereas operant learning usually involves voluntary behaviour.

Skinner likened operant learning to the process of natural selection: useful behaviours “survive,” and others “die out”

17
Q

Any reinforcer that is not dependent on another reinforcer for its reinforcing properties

A

Primary reinforcer or unconditioned reinforcer

Examples: food, water, sexual stimulation, weak electrical stimulation of certain brain tissues, relief from heat and cold, and certain drugs. Primary reinforcers are powerful but probably play a limited role in human learning, and they are relatively few in number

18
Q

Any reinforcer that has acquired its reinforcing properties through its association with other reinforcers.

A

Secondary reinforcer or conditioned reinforcer

Examples: praise, recognition, smiles, and positive feedback. These reinforcers are secondary to or are derived from other reinforcers

19
Q

What four advantages do conditioned (secondary) reinforcers have over unconditioned or primary reinforcers? What key disadvantage do conditioned reinforcers have?

A

  1. Primary reinforcers lose much of their reinforcing value very quickly, whereas conditioned reinforcers sometimes become less effective with repeated use, but this occurs much more slowly
  2. It is often much easier to reinforce behaviour immediately with conditioned reinforcers than with primary reinforcers
  3. Conditioned reinforcers are often less disruptive
  4. Conditioned reinforcers can be used in many different situations

Main disadvantage of conditioned reinforcers: their effectiveness depends on their association with primary reinforcers. Primary reinforcers are much more resilient

20
Q

Any secondary reinforcer that has been paired with several different reinforcers

A

Generalized reinforcers

Example: money

21
Q

In operant training, the procedure of reinforcing successive approximations of the desired behaviour

A

Shaping

Shaping makes it possible to train, in a few minutes, behaviour that might otherwise never occur spontaneously

22
Q

What five factors are responsible for the effective use of shaping?

A
  1. Reinforce small steps
  2. Provide immediate reinforcement
  3. Provide small reinforcers
  4. Reinforce the best approximation available
  5. Back up when necessary
23
Q

Explain how adults often unwittingly shape undesirable behaviour in children

A

Tantrums are typically the product of shaping. A tired parent may give in to a child’s repeated requests just to “shut him up.” On the next occasion, the parent may resist giving in to the child’s usual demands, and the child might respond by becoming louder or crying; the parent yields to avoid causing a scene. On a subsequent occasion, determined to regain control, the parent may refuse to comply when the child cries or shouts, but gives in when the child produces bugle-like wails

The parent gradually demands more and more outrageous behaviour for reinforcement, and the child obliges, eventually engaging in full-fledged tantrums

24
Q

A series of related behaviors, the last of which produces reinforcement

A

Behaviour chain

Example: competing on the balance beam where the person must perform a number of acts in a particular sequence

25
Q

Training an animal or person to perform a behaviour chain

A

Chaining

26
Q

What is the first step in a chaining procedure?

A

Breaking the task down into its component elements, a procedure called task analysis

27
Q

What are the two types of chaining procedures?

A

Backward chaining and forward chaining

28
Q

A chaining procedure in which training begins with the last link in the chain and adds preceding links in reverse order

A

Backward chaining

Example: first training a rat to drop a marble down a tube, then training it to carry the marble to the tube and drop it, then moving on to the next preceding link, and so on. The chain is never performed backward; the parts of the chain are always performed in their proper sequence. The procedure is backward only in the sense that links in the chain are added from back to front

29
Q

A chaining procedure in which training begins with the first link in the chain and adds subsequent links in order

A

Forward chaining
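
As a rough illustration of the two training orders, here is a short Python sketch. The task analysis and step names are invented for the example, not taken from the text.

    # Hypothetical task analysis for a simple three-link chain.
    steps = ["pick up marble", "carry marble to tube", "drop marble in tube"]

    def backward_chaining_order(steps):
        """Train the last link first, then add preceding links one at a time.
        Each stage still performs its links in the normal forward order."""
        return [steps[i:] for i in range(len(steps) - 1, -1, -1)]

    def forward_chaining_order(steps):
        """Train the first link first, then add subsequent links in order."""
        return [steps[:i] for i in range(1, len(steps) + 1)]

    print(backward_chaining_order(steps))
    # [['drop marble in tube'],
    #  ['carry marble to tube', 'drop marble in tube'],
    #  ['pick up marble', 'carry marble to tube', 'drop marble in tube']]
    print(forward_chaining_order(steps))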

30
Q

What reinforces each link of a behaviour chain?

A

Each link in the chain is reinforced by the opportunity to perform the next step in the chain.

The external reinforcer at the end of the chain is crucial; without it, the chain is not likely to be performed

31
Q

Describe the concept of contingency in reinforcement. How and why does it influence reinforcement?

A

Where operant learning is concerned, the word contingency refers to the degree of correlation between a behaviour and its consequences. The rate at which learning occurs varies with the degree to which a behaviour is followed by a reinforcer

32
Q

Describe the concept of contiguity in reinforcement. How and why does it influence reinforcement?

A

Contiguity refers to the interval between a behaviour and its reinforcing consequence. Contiguity has a powerful effect on the rate of operant learning: in general, the shorter this interval, the faster learning occurs

One reason that immediate consequences produce better results is that a delay allows time for other behaviour to occur; that behaviour, and not the appropriate one, is then reinforced. However, learning can occur despite reinforcement delays if the delayed reinforcer is regularly preceded by a particular stimulus

33
Q

How does the size of a reinforcer affect the effectiveness of a reinforcer?

A

Other things being equal, a large reinforcer is more effective than a small one. The relation between reinforcer size, sometimes referred to as reinforcer magnitude, and learning is not, however, linear. In general, the more you increase the reinforcer size, the less benefit you get from the increase

34
Q

How do task characteristics affect the effectiveness of a reinforcer?

A

Certain qualities of the behaviour being reinforced affect the ease with which it can be strengthened. For instance, behaviour that depends on smooth muscles and glands is harder to modify through operant procedures than is behaviour that depends on skeletal muscles

35
Q

How does deprivation level affect the effectiveness of a reinforcer?

A

The effectiveness of food, water, and warmth as reinforcers varies with the extent to which an organism has been deprived of these things. In general, the greater the level of deprivation, the more effective the reinforcer

Deprivation is less important where secondary reinforcers are concerned

36
Q

How does previous learning history affect the effectiveness of a reinforcer?

A

Example: much of the difference between fast- and slow-learning schoolchildren disappears when both groups have similar learning histories.

37
Q

How do competing contingencies affect the effectiveness of a reinforcer?

A

The effects of reinforcing a behaviour will be very different if the behaviour also produces punishing consequences or if reinforcers are simultaneously available for other kinds of behaviour

38
Q

In operant training, the procedure of withholding the reinforcers that maintain a behaviour

A

Extinction

39
Q

What is the immediate effect of an extinction procedure?

A

An abrupt increase in the rate of the behaviour placed on extinction

40
Q

A sudden increase in the rate of behaviour during the early stages of extinction

A

Extinction burst

41
Q

What effects does operant extinction have on behavioural variability? On aggression?

A

Behavioural variability: the organism “tries something else,” often a variation of the previously reinforced behaviour. We can make use of this phenomenon during shaping: after repeatedly reinforcing an approximation of the desired behaviour, we can withhold reinforcement. This increases the variability of the behaviour, which makes it likely that a better approximation of the goal behaviour will appear. When it does, it can be reinforced

Aggression: extinction also often increases the frequency of emotional behaviour. For example, when lever pressing no longer produces food, rats may bite the lever or attack another animal if one is present

42
Q

How long does it normally take to extinguish a behavior?

A

One extinction session is often not enough, even if it lasts for several hours and involves hundreds or even thousands of unreinforced acts. The longer the interval between extinction sessions, the greater the recovery of the behaviour

43
Q

Describe spontaneous recovery with respect to operant extinction. What conditions facilitate spontaneous recovery?

A

After one extinction session, the rate of the previously reinforced behaviour declines and finally stabilizes at or near its pre-training level. Extinction appears to be complete; however, if the animal or person is later put back into the training situation, the extinguished behaviour occurs again. This reappearance of a previously extinguished behaviour is called spontaneous recovery

44
Q

The reappearance during extinction of a previously reinforced behaviour

A

Resurgence

Example: a pigeon is trained to peck a disk and then this behaviour is extinguished. Now some new behaviour, such as wing flapping, is reinforced. When this behaviour is then put on extinction, wing flapping declines, but the bird may begin to peck the disk again

45
Q

How can resurgence help to explain some instances of regression?

A

Regression is the tendency to return to more primitive, infantile modes of behaviour. When a behaviour no longer produces the consequences we like, we may revert to a form of behaviour that had been reinforced in similar situations in the past. The reversion may very well be unconscious; that is, the person probably cannot specify the learning history that produced it.

46
Q

What conditions are responsible for the rate of extinction?

A

The number of times the behaviour was reinforced before extinction, the effort the behaviour requires, and the size of the reinforcer used during training

47
Q

The author claims that reinforcement and extinction are parallel procedures, but that they do not have equal effects. Explain.

A

One nonreinforcement does not cancel out one reinforcement. Behaviour is usually acquired rapidly and extinguished slowly

48
Q

Describe Thorndike’s work in which he tried to separate the effects of reinforcement from those of practice

A

He tried to draw a four-inch line with his eyes closed over and over again, for a total of 3,000 attempts, yet there was no improvement. He also performed this experiment with students, who, without feedback, did not improve. When he allowed them to open their eyes after each attempt to see the results of their efforts, there was a marked improvement. He concluded that practice is important only insofar as it provides the opportunity for reinforcement.

49
Q

The theory of reinforcement that attributes a reinforcer’s effectiveness to the reduction of a drive

A

Hull’s Drive-reduction theory

50
Q

Theory of reinforcement that considers reinforcers to be behaviours rather than stimuli and that attributes a reinforcer’s effectiveness to its probability relative to other behaviours

A

Premack’s relative value theory

51
Q

What are the advantages and disadvantages of Premack’s relative value theory?

A

Advantages: it is strictly empirical; no hypothetical concepts, such as drive, are required. An event is reinforcing simply because it provides the opportunity to engage in a preferred behaviour

Disadvantages: secondary reinforcers are troublesome, because the theory does not explain why the word “yes,” for example, is reinforcing.
Also, low-probability behaviour will reinforce high-probability behaviour if the participant has been prevented from performing the low-probability behaviour for some time

52
Q

The observation that high-probability behaviour reinforces low-probability behaviour

A

Premack principle

Example: if a rat shows a stronger inclination to drink than to run in an exercise wheel, drinking can be used to reinforce running. To get a drink, the rat had to run; the result was that the time spent running increased. Drinking reinforced running
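
A tiny, purely illustrative Python sketch of the principle follows; the baseline figures are invented. Given the baseline probability of each behaviour, the higher-probability behaviour is the one predicted to reinforce a lower-probability behaviour.

    # Hypothetical baseline session: proportion of time spent on each behaviour.
    baseline = {"drinking": 0.30, "running": 0.10}

    def premack_reinforcers(baseline, target):
        """Behaviours with a higher baseline probability than the target;
        by the Premack principle, these should reinforce the target."""
        return [b for b, p in baseline.items() if p > baseline[target]]

    print(premack_reinforcers(baseline, "running"))   # ['drinking']
    print(premack_reinforcers(baseline, "drinking"))  # []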

53
Q

The theory of reinforcement that says a behaviour is reinforcing to the extent that the organism has been deprived, relative to its baseline frequency, of performing that behavior. Also called equilibrium theory

A

Response deprivation theory
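
One common way to express the idea is sketched below with invented numbers (an illustrative assumption, not a formula from the text): the contingent behaviour is predicted to act as a reinforcer when the schedule holds it below its baseline ratio to the instrumental behaviour.

    def deprives_contingent_behaviour(baseline_contingent, baseline_instrumental,
                                      allowed_contingent, required_instrumental):
        """True if the schedule allows less of the contingent behaviour, per unit of
        instrumental behaviour, than occurred at baseline, i.e. the organism is
        deprived of the contingent behaviour relative to its baseline level."""
        return (allowed_contingent / required_instrumental) < (
            baseline_contingent / baseline_instrumental)

    # Hypothetical baseline: 30 drinks and 10 wheel turns per session.
    # Schedule: 1 drink allowed per 2 wheel turns required.
    print(deprives_contingent_behaviour(30, 10, 1, 2))  # True: drinking should reinforce running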

54
Q

What are the advantages and disadvantages of response deprivation theory?

A

Disadvantages: it also has trouble explaining secondary reinforcers like the word yes

Advantages: works well enough for many reinforcers

55
Q

The author claims that negative reinforcement often starts out as escape responding and ends up as avoidance responding. Provide an original example illustrating this transition

A

Example: when a bell sounds, a cat placed in a room is blasted with cold air. At first the cat escapes by moving out of the path of the cold air; eventually, just hearing the bell makes the cat move before the cold air blows, so the escape response has become an avoidance response

56
Q

A form of negative reinforcement in which the subject first learns to escape, and then to avoid, an aversive

A

Escape-avoidance learning

57
Q

Explain why avoidance responding has been considered a puzzling phenomenon

A

Escaping from an aversive stimulus is reinforcing and not puzzling, but performing an act that merely avoids the aversive stimulus is puzzling: it seems to mean that something that did not happen serves as a reinforcer

58
Q

The view that avoidance and punishment involve two procedures: Pavlovian and operant learning

A

Two-process theory

59
Q

What is the evidence for and against two-process theory?

A

Problems: even when the signal for shock loses its aversiveness, the avoidance response persists.
Another problem has to do with the failure of avoidance behaviours to extinguish

60
Q

An escape-avoidance training procedure in which no stimulus regularly precedes the aversive stimulus. Also called unsignalled avoidance

A

Sidman avoidance procedure

There is no signal, such as a light or a tone, correlated with the impending shock

61
Q

The view that avoidance and punishment involve only one procedure: operant learning

A

One-process theory

Whereas two-process theorists say that the absence of shock cannot reinforce behaviour (something that does not happen cannot be a reinforcer), one-process theory says that something does happen: there is a reduction in overall exposure to shock, and this reduction is reinforcing.
It deals with the resistance of avoidance behaviour to extinction by noting that an animal or person can be made to stop performing an unnecessary avoidance behaviour if both the behaviour and its aversive consequence are prevented from occurring

62
Q

Describe Thorndike’s puzzle box apparatus and his use of it to study the behaviour of cats and other animals. Why did he study animal behavior? What were his findings? What conclusions did he draw from these findings?

A

Thorndike studied animal intelligence by studying animal learning. He would place a hungry cat in a puzzle box and put food in plain view but out of reach. The box had a door that could be opened by some simple act, such as pulling a wire loop or stepping on a treadle. The cat would begin by performing a number of ineffective acts, but eventually it would pull the loop or step on the treadle, the door would fall open, and the cat would make its way to freedom and food. With each succeeding trial, the animal made fewer ineffective movements until, after many trials, it would immediately pull the loop or step on the treadle and escape.

He concluded that a given behaviour typically has one of two kinds of consequences, or effects: a satisfying state of affairs or an annoying state of affairs. He later called this relationship between behaviour and its consequences the law of effect