Operant Reinforcement Flashcards

1
Q

Describe Thorndike’s puzzle box apparatus and his use of it to study the behaviour of cats (and other animals).

A

He would place a hungry cat in the box and put food in plain view but out of reach. To get out, the cat had to open the box door by pulling a wire loop or stepping on a treadle. At first the cat would try a number of ineffective acts, such as trying to claw its way out. Eventually the cat would perform the required act and be let out. With each trial, the cat made fewer ineffective actions until it performed the required act immediately.

2
Q

Why did Thorndike study animal behaviour?

A

Thorndike studied animal behaviour as a way of measuring animal intelligence.

3
Q

Define Thorndike’s law of effect.

A

The strength (frequency, durability, etc.) of a behaviour depends on the consequences the behaviour has had in the past; behaviour is a function of its consequence.

4
Q

What did Thorndike speculate about reinforcement’s neural effect? What is this view called?

A
  • He speculated that reinforcement strengthened bonds/connections between neurons
  • this was called connectionism.
5
Q

How does reinforcement of a response give that response momentum, according to Nevin?

A

Behaviour that has been reinforced many times is more likely to persist when “obstructed” in some way, such as when one confronts a series of failures.

6
Q

In what way did Thorndike’s work depart from previous conceptions of the learning process?

A

He was the first person to show that behaviour is systematically strengthened or weakened by its consequences.

7
Q

How did Page and Neuringer show that randomness is a reinforceable property of behaviour?

A

The researchers reinforced randomness by providing reinforcement only for sequences of a pigeon’s key pecks that differed from the previous 50 sequences. Eventually the sequences became nearly random.
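
The rule described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual apparatus or software: the function name `run_lag_schedule`, the string encoding of peck sequences (such as "LLRR" for left and right keys), and the small `lag` used in the example are all assumptions made here.

```python
from collections import deque


def run_lag_schedule(sequences, lag=50):
    """Return one boolean per sequence: True where it earns reinforcement.

    A sequence is reinforced only if it differs from every one of the
    previous `lag` sequences (whether or not those were reinforced).
    """
    recent = deque(maxlen=lag)  # sliding window of the last `lag` sequences
    outcomes = []
    for seq in sequences:
        outcomes.append(seq not in recent)  # novel within the window?
        recent.append(seq)                  # oldest entry drops out automatically
    return outcomes


# Hypothetical session with a short window of 2 instead of 50.
trials = ["LLRR", "LRLR", "LLRR", "RRRR"]
print(run_lag_schedule(trials, lag=2))  # [True, True, False, True]
```

Under this rule a bird that repeats a recent sequence goes unreinforced, so the only way to keep earning food is to keep varying.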

8
Q

Describe the essential components of a Skinner box.

A

It was designed so that a food magazine would automatically drop a few pellets of food into a tray. After a rat became accustomed to the noise of the food magazine and readily ate from the tray, Skinner installed a lever so that food would fall when the rat pressed it. The rat’s use of the lever then increased dramatically.

9
Q

What is operant conditioning?

A
  • Procedures whereby behaviour is strengthened or weakened by its consequences.
  • Behaviour can be said to operate on the environment.
10
Q

How does operant conditioning differ from Pavlovian conditioning?

A

It is not stimulus-response learning: the principal behaviour involved is not reflexive and is often complex. Instead, the organism acts on the environment and changes it, and those changes in turn alter the strength of the behaviour.

11
Q

What other names refer to operant conditioning?

A

instrumental/operant learning

12
Q

Define reinforcement.

A

Procedure of providing consequences for a behaviour that increase/maintain the strength of that behaviour

13
Q

Name the three essential features of reinforcement.

A

1) the behaviour must have a consequence
2) the behaviour must increase in strength
3) the increase in strength must be the result of the consequence.

14
Q

What is the contingency square?

A

The contingency square is a model showing how positive/negative reinforcement and positive/negative punishment relate to the addition/removal of stimuli, and how each affects the strength of behaviour.
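
One way to picture the square is as a two-by-two lookup table. The sketch below assumes the labels "added"/"removed" for what happens to the stimulus and "strengthened"/"weakened" for what happens to the behaviour; the function name `classify_procedure` is made up for this illustration.

```python
def classify_procedure(stimulus_change, behaviour_change):
    """Look up the cell of the contingency square.

    stimulus_change: "added" or "removed" (what follows the behaviour)
    behaviour_change: "strengthened" or "weakened" (effect on the behaviour)
    """
    square = {
        ("added", "strengthened"): "positive reinforcement",
        ("removed", "strengthened"): "negative reinforcement",
        ("added", "weakened"): "positive punishment",
        ("removed", "weakened"): "negative punishment",
    }
    return square[(stimulus_change, behaviour_change)]


print(classify_procedure("removed", "strengthened"))  # negative reinforcement
```

Note that "positive" and "negative" refer only to adding or removing a stimulus, not to whether the outcome is pleasant.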

15
Q

What are the two basic types of reinforcement?

A

1) positive - behaviour is followed by the appearance of, or increase in the intensity of, a stimulus (called a positive reinforcer) to strengthen the behaviour that precedes it
2) negative - behaviour is followed by the removal of, or a decrease in the intensity of, a stimulus (called a negative reinforcer) that the individual would ordinarily avoid, which strengthens the behaviour that precedes it

16
Q

What is escape training?

A

A negative reinforcement procedure in which the behaviour is reinforced by escape from an aversive stimulus.

17
Q

What is the difference between escape and avoidance?

A
  • In escape, an organism’s response terminates an aversive stimulus.
  • In avoidance, the response prevents or postpones a consequence.
  • E.g., putting on sunglasses (a response) when you (the organism) are already squinting terminates the glare of the sun (an aversive stimulus): this is escape. Putting on sunglasses before stepping outside does not terminate any glare, but it prevents the glare from occurring: this is avoidance.
18
Q

Define discrete-trial procedure (Thorndike)

A

The trial ends when the participant exhibits a given behaviour.

19
Q

Define free operant procedure (Skinner)

A

The behaviour may be repeated any number of times.

20
Q

Explain why scientists often simplify problems to study them.

A

By simplifying a problem, the researcher can manipulate variables and reliably determine their effects, that is, identify functional relationships between independent and dependent variables. This enables prediction and control of the phenomenon in future experiments, and leads to hypotheses about how real-world problems may be solved. However, such simplification may seem unnatural and superficial, so researchers still need to conduct field experiments to corroborate findings that were observed in isolation (i.e., establish external validity).

21
Q

Compare and contrast operant and Pavlovian conditioning.

A
  • The most important difference is that Pavlovian conditioning occurs when one stimulus (US) is contingent on another stimulus (CS), whereas in operant conditioning, a stimulus (reinforcer or punisher) is contingent on a behaviour.
  • Pavlovian conditioning typically involves reflexive behaviours whereas operant conditioning usually involves voluntary behaviour
22
Q

Can operant and Pavlovian conditioning co-occur?

A

Yes, such as what happened with Little Albert, because the loud noise followed the action of him reaching for the rat. This co-occurrence can make the distinction between types of conditioning arbitrary. It is likely that when one type of learning occurs, so does the other.

23
Q

Describe the parallel Skinner drew between natural selection and reinforcement.

A

He said that reinforcement acts as natural selection for the evolution of one’s individual behaviours, in that reinforced behaviours persist or “survive” and the others are lost or “die.”

24
Q

What are primary (unconditioned) reinforcers?

A

Reinforcers that are innately reinforcing and are not dependent on their association with other reinforcers.
E.g., food, water, sexual stimulation, neural stimulation, relief from hot/cold and certain drugs.

25
Q

What are secondary (conditioned) reinforcers?

A

Secondary reinforcers are dependent on their association with (or are derived from) other reinforcers, such as praise, recognition, smiles and positive feedback.

26
Q

What four advantages do secondary reinforcers have over primary reinforcers?

A

1) They hold their reinforcing value longer. Primary reinforcers lose much of their value very quickly because they fulfil physical needs that are easily satisfied, after which they are less rewarding to the individual.
2) They are easier to provide immediate reinforcement with.
3) They are less disruptive because they take less time.
4) They can be used in many different situations (not only when the individual is deprived in some way).

27
Q

What key disadvantage do secondary reinforcers have?

A

Usually secondary reinforcers are weaker than primary reinforcers because their effectiveness depends on their association with primary reinforcers

28
Q

What are generalized reinforcers?

A

Reinforcers that have been paired with many different kinds of reinforcers (can be used in a wide variety of ways)

29
Q

What is shaping?

A

It is the reinforcement of successive approximations of a desired behaviour. It makes it possible to train, in a few minutes, behaviour that never occurs spontaneously.

30
Q

What five factors are responsible for the effective use of shaping?

A

1) reinforce small steps (don’t ask too much of the learner)
2) provide immediate reinforcement
3) provide small reinforcers
4) reinforce the best approximation available
5) back up when necessary

31
Q

How do adults unintentionally shape undesirable behaviour in children?

A

A tired parent might give in to a child’s repeated requests just to quiet the child, but this reinforces the child’s tantrum. If the parent holds out a little longer each time and gives in only when the child escalates, the parent is in effect demanding more and more outrageous behaviour for the same result.

32
Q

What is a behaviour chain?

A

A connected sequence of behaviour (usually acts are done in a particular order)

33
Q

What is a chaining procedure?

A

Training an animal or person to perform a behaviour chain

34
Q

What is the first step of a chaining procedure?

A

break a task down into its components (aka task analysis).

35
Q

What reinforces each link of a behaviour chain?

A
  • any tasks that are not readily performed are first shaped.
  • each link is reinforced by the opportunity to perform the next step in the chain
  • there is a final (primary) reinforcer at the last link, and without this the chain would break down
36
Q

What are the two types of chaining procedures?

A

1) forward chaining - the trainer reinforces performance of the first link in the chain until it is performed without hesitation, then reinforces only when the first two links are performed, and so forth.
2) backward chaining - like forward chaining, but starting with the last link and working backwards to the procedure’s first element (but is still performed in the forward order)

37
Q

Describe the concept of contingency in reinforcement.

A

It’s the degree of correlation between a behaviour and its consequence. The degree to which learning occurs varies with the degree to which a behaviour is followed by a reinforcer. A perfect correlation between the behaviour and the consequence makes for easy learning because the consequence is predictable. Inconsistent consequences, however, make it more difficult to discern which behaviour is the correct one.
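
As a rough worked example, contingency can be put into numbers by comparing how often the reinforcer follows the behaviour with how often it arrives without the behaviour. The index below (the difference between the two conditional probabilities) is one common way to quantify contingency and is an assumption here, not something these cards specify; the function name and the trial encoding are also hypothetical.

```python
def contingency_index(trials):
    """Return P(reinforcer | behaviour) - P(reinforcer | no behaviour).

    `trials` is a list of (behaved, reinforced) boolean pairs.
    1.0 means the reinforcer follows the behaviour and only the behaviour;
    0.0 means the reinforcer is equally likely either way.
    """
    with_behaviour = [r for b, r in trials if b]
    without_behaviour = [r for b, r in trials if not b]
    if not with_behaviour or not without_behaviour:
        raise ValueError("need trials both with and without the behaviour")
    p_with = sum(with_behaviour) / len(with_behaviour)
    p_without = sum(without_behaviour) / len(without_behaviour)
    return p_with - p_without


# Perfect contingency: reinforced every time the behaviour occurs, never otherwise.
perfect = [(True, True)] * 4 + [(False, False)] * 4
print(contingency_index(perfect))  # 1.0
```

A learner facing an index near zero gets food about as often for doing nothing as for responding, which is the inconsistent case the answer describes.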

38
Q

Describe the concept of contiguity in reinforcement.

A

It’s the delay between a behaviour and its reinforcing consequence. The shorter this interval, the faster learning occurs, because a short interval leaves no time for another behaviour to occur (which could then be reinforced instead of the desired one).

39
Q

How do the following conditions affect the effectiveness of a reinforcer?:

(a) size of the reinforcer
(b) task characteristics
(c) deprivation level
(d) previous learning history
(e) competing contingencies.

A

(a) size of the reinforcer - Smaller reinforcers given frequently produce faster learning than larger reinforcers given less frequently, but if given equally often, the larger reinforcer has a stronger impact. The more one increases the reinforcer size, however, the less benefit each increase brings. Identifying the learner’s preferred reinforcer can also improve the effectiveness of the procedure.
(b) task characteristics - tasks that are overly difficult will not be strengthened as easily
(c) deprivation level - the effectiveness of food and water as reinforcers varies with the extent to which an organism has been deprived of food and water, with it being more effective when more deprived
(d) previous learning history - learning histories impact the speed of learning, with experience making it faster
(e) competing contingencies - competing contingencies means that the behaviour also prompts punishment or reinforcers are simultaneously available for other behaviours, which complicates and slows learning

40
Q

What is an operant extinction procedure?

A

withholding the consequences that reinforce a behaviour

41
Q

What is the immediate effect of an extinction procedure?

A

abrupt increase in the behaviour, aka extinction burst.

42
Q

What effects does operant extinction have on behavioral variability?

A

Extinction increases behavioural variability because the organism typically tries variations of the behaviour to regain reinforcement. This can help shaping because it encourages the organism to try new things which might be a better approximation of the goal.

43
Q

What effect does extinction have on aggression?

A

It increases aggression, due to the frustration of having reinforcement withheld.

44
Q

How long does it normally take to extinguish a behaviour?

A

The time varies, but it usually takes many extinction sessions to extinguish a behaviour, and the closer together the sessions are, the less likely the behaviour is to return. Behaviour tends to extinguish more slowly than it was acquired.

45
Q

What is spontaneous recovery? What causes it?

A

The reappearance of previously extinguished behaviour. Something in the environment triggers the behaviour, and it is usually reinforced, which brings it back into the individual’s routine/life.

46
Q

What is resurgence?

A

The reappearance, during extinction, of a previously reinforced behaviour.

47
Q

How can resurgence help to explain some instances of regression?

A

Regression is the tendency to return to more primitive, infantile modes of behaviour. Something such as a tantrum may have had good results with a parent when the person was a child, so when behaving properly is not being reinforced (i.e., is on extinction), the person unconsciously reverts to a behaviour that was reinforced in the past.

48
Q

What conditions are responsible for the rate of extinction?

A

The number of times the behaviour was reinforced before extinction, the effort the behaviour requires, and the size of the reinforcer used during training

49
Q

Reinforcement and extinction are parallel procedures, but they do not have equal effects - Explain.

A

One non-reinforcement does not cancel out one reinforcement. One reinforcement per 60 trials can still be effective in some circumstances. This demonstrates that behaviour is acquired rapidly and extinguished slowly.

50
Q

What did Thorndike conclude when he tried to separate the effects of reinforcement and practice?

A

He concluded that practice is only important insofar as it provides the opportunity for reinforcement.

51
Q

Describe drive-reduction theory. Why is it problematic?

A
  • People behave because of motivational states called drives.
  • A reinforcer is a stimulus that reduces one or more drives (e.g., food for hunger, water for thirst, oxygen, sleep, sex, etc).
  • This theory is problematic, however, because it only accounts for primary reinforcers, not secondary reinforcers.
52
Q

Describe relative value theory.

A

It treats reinforcers as behaviour instead of stimuli. Rather than food itself being a reinforcer, the act of eating would be. Different kinds of behaviour have different values relative to one another at any given moment, and these relative values determine the reinforcing properties of a behaviour. To assess a behaviour’s value, one should measure the amount of time a participant engages in it when given a choice (tying into preference of reinforcers).

53
Q

What are the pros and cons of relative value theory?

A
  • Pro: it doesn’t depend on physiological drives or the distinction between primary and secondary reinforcers. It is strictly empirical.
  • Con: it doesn’t account for why secondary reinforcers are effective.
  • Con: it doesn’t account for how low-probability behaviour can reinforce high-probability behaviour when the participant has been prevented from performing the low-probability behaviour for a while.
54
Q

Describe response deprivation theory of reinforcement. Why is it problematic?

A
  • Behaviour becomes reinforcing when the organism is prevented from engaging in it at its normal frequency. The access to the behaviour determines its value to the participant.
  • This theory still has issues with explaining why secondary reinforcers are effective.
55
Q

Explain why avoidance responding has been puzzling.

A

The absence of something happening is the reinforcer, but countless things are not happening at any given moment. To say that things that don’t happen explain things that do happen is a theoretical problem.

56
Q

Describe the two-process theory of avoidance.

A

Two kinds of learning are involved in avoidance: Pavlovian and operant. Through Pavlovian conditioning, a neutral stimulus that is paired with the unpleasant stimulus comes to elicit fear; through operant conditioning, the response that escapes this feared stimulus is reinforced, and in the process the unpleasant stimulus is avoided entirely. According to this theory there is no such thing as avoidance, only escape.

57
Q

What is the evidence for and against the two-process theory of avoidance?

A
  • This theory yields logical, testable predictions, but not all tests support the theory.
  • There is evidence that the CS loses its aversiveness while the avoidance behaviour persists.
  • Avoidance behaviours often fail to extinguish, which the theory does not predict.
58
Q

What is a Sidman avoidance procedure?

A

The shock is not preceded by a tone or other signal, and the rat can delay the regular shocks for 15 sec by pressing a lever. If it presses the lever a second time before the end of the period, it earns another 15 sec delay. By pressing the lever regularly, the rat can completely avoid being shocked. Because the rat learns to do this without a light, tone or other cue, there was no aversive stimulus to escape, and according to two-process theory escape from an aversive stimulus is what is supposed to reinforce avoidance behaviour.
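
The timing rule can be made concrete with a small simulation. This is a sketch under stated assumptions, not a reproduction of the original experiment: the 15 sec postponement comes from the card above, but the 5 sec gap between unpostponed shocks, the function name `shock_times` and its parameter names are assumptions chosen for illustration.

```python
def shock_times(press_times, session_length, response_shock=15.0, shock_shock=5.0):
    """Return the times (in seconds) at which shocks are delivered.

    response_shock: each lever press postpones the next shock to this many
                    seconds after the press (15 s on the card above).
    shock_shock:    gap between shocks when the rat does not press
                    (assumed value, not given in the source).
    """
    presses = sorted(press_times)
    shocks = []
    i = 0
    next_shock = shock_shock  # first shock is due if the rat never presses
    while True:
        # Every press made before the pending shock resets the timer.
        while i < len(presses) and presses[i] < next_shock:
            next_shock = presses[i] + response_shock
            i += 1
        if next_shock > session_length:
            break
        shocks.append(next_shock)
        next_shock += shock_shock
    return shocks


print(shock_times([], 20))           # no presses: [5.0, 10.0, 15.0, 20.0]
print(shock_times([3, 17, 31], 45))  # regular pressing: []
```

The second call shows the point of the card: a rat that presses at least once every 15 sec receives no shocks at all, even though nothing signals when a shock is due.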

59
Q

Describe the one-process theory.

A

Avoidance involves only operant learning. Both escape and avoidance behaviours are reinforced by a reduction in aversive stimulation. The lack of extinction can be explained by the action being continually reinforced because the individual is continually avoiding the aversive stimuli (explaining how nothing can reinforce something). To extinguish the behaviour, one must stop the avoidance behaviour and its aversive consequences from occurring (explaining how extinction works with negative reinforcement).