Chapter 4 - Section 2 Operant Conditioning Flashcards

1
Q

What are operant/instrumental responses?

A

Actions operating on the world to produce some effect.

They function like instruments, or tools, to bring about some change in the environment. (p. 115)

2
Q

What is operant/instrumental conditioning?

A

The process by which people or other animals learn to make operant responses.

A learning process by which the effect, or consequence, of a response influences the future rate of production of that response. (p. 115)

3
Q

17: How did Thorndike train cats to escape from a puzzle box? How did this research contribute to Thorndike’s formulation of the law of effect?

A

Thorndike (1898) designed the puzzle box so that a cat placed inside could open the door by stepping on a tab that pulled down on a string hanging from the ceiling; this in turn pulled up the bolt and allowed the door to fall forward.

Thorndike came to view learning as a trial-and-error process, through which an individual gradually becomes more likely to make responses that produce beneficial effects.

According to Thorndike, the stimulus situation (being inside the puzzle box) initially elicits many responses, some more strongly than others, but the satisfying consequence of the successful response (stepping on the tab) causes that response to be more strongly elicited on successive trials. (p. 116)

4
Q

What is the law of effect (Thorndike, 1898)?

A

Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation. (p. 116)

5
Q

18: How did Skinner’s method for studying learning differ from Thorndike’s, and why did he prefer the term ‘reinforcement’ to Thorndike’s ‘satisfaction’?

A

The advantage of Skinner’s (1938) apparatus is that the animal, after completing a response and experiencing its effect, is still in the box and free to respond again. With Thorndike’s puzzle boxes and similar apparatuses such as mazes, the animal has to be put back into the starting place at the end of each trial.

Skinner (like Watson, a behaviorist) proposed the term ‘reinforcer’ as a replacement for such words as ‘satisfaction’ and ‘reward’, to refer to a stimulus change that follows a response and increases the subsequent frequency of that response. Skinner preferred this term because it makes no assumptions about anything happening in the mind; it merely refers to the effect that the presentation of the stimulus has on the animal’s subsequent behavior. (p. 117)

6
Q

What does ‘operant response’ mean?

A

Skinner coined the term to refer to any behavioral act that has some effect on the environment. (p. 117)

7
Q

What does ‘operant conditioning’ mean?

A

Skinner coined the term to refer to the process by which the effect of an operant response changes the likelihood of the response’s recurrence.

Thus, in a typical experiment with a Skinner box, pressing the lever is an operant response, and the increased rate of lever pressing that occurs when the response is followed by a pellet of food exemplifies operant conditioning. (p. 117)

8
Q

What was the difference between Skinner’s and Watson’s beliefs?

A

Unlike Watson, Skinner did not consider the stimulus-response reflex to be the fundamental unit of all behavior.

Thorndike’s work provided Skinner with a model of how nonreflexive behaviors could be altered through learning. (p. 116)

9
Q

19: What is some evidence that people can be conditioned to make an operant response without awareness of the conditioning process? How is this relevant for understanding how people acquire motor skills?

A

An illustration is found in an experiment in which adults listened to music over which static was occasionally superimposed. Unbeknownst to the subjects, they could turn off the static by making an imperceptibly small twitch of the left thumb.

(…)

We constantly learn finely tuned muscle movements as we develop skill at riding a bicycle, hammering nails, making a jump shot, or any other activity. The probable reinforcers are, respectively, the steadier movement on the bicycle, the straight downward movement of the nail, and the “swoosh” when the ball goes through the hoop; but often we do not know just what we are doing differently to produce these good effects. (p. 118)

10
Q

What is the fundamental difference between Skinner’s theory of operant conditioning and Pavlov’s theory of classical conditioning?

A

In operant conditioning the individual emits, or generates, a behavior that has some effect on the environment, whereas in classical conditioning a stimulus elicits a response from the organism. (p. 118)

11
Q

In what sense is Skinner’s theory of learning a selectionist theory, like Darwin’s theory of evolution?

A

Animals (including humans) emit behaviors, some of which get reinforced (selected) by the environment.

Those that get reinforced increase in frequency in those environments, and those that do not get reinforced decrease in frequency in those environments.

Skinner’s adoption of selectionist thinking is a clear demonstration of the application of Darwin’s ideas to changes, not over evolutionary time, but over the lifetime of an individual. (p. 118)

12
Q

20: How can we use operant conditioning to get an animal to do something that it currently doesn’t do?

A

By a technique called (operant) shaping: successively closer approximations to the desired response are reinforced until the desired response finally occurs and can be reinforced.

Animal trainers often use the process of rewarding gradual approximations to the desired behavior. (p. 119)
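
The logic of shaping can be made concrete in a short simulation. The sketch below is a toy illustration, not the textbook’s procedure: the numeric “response” scale, the Gaussian variability, and all parameter values are invented for the example. The reinforcement criterion starts loose and is raised each time the animal meets it, so only successively closer approximations keep earning reinforcement.

```python
import random

# Toy model: the "response" is a number in [0, 1], where 1.0 stands for
# the full target behavior. The reinforcement criterion starts loose and
# is raised after each reinforced trial, so only successively closer
# approximations keep earning reinforcement. (All values are invented.)
TARGET = 1.0

def shape(trials=500, seed=0):
    rng = random.Random(seed)
    criterion = 0.1   # at first, any rough approximation is reinforced
    best = 0.0        # closest approximation reinforced so far
    for _ in range(trials):
        # The animal's behavior varies around its best reinforced attempt.
        response = min(TARGET, max(0.0, best + rng.gauss(0, 0.1)))
        if response >= criterion:                      # close enough: reinforce
            best = max(best, response)
            criterion = min(TARGET, criterion + 0.05)  # raise the bar
    return best

print(f"Closest approximation after shaping: {shape():.2f}")
```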

13
Q

21: In what ways is extinction in operant conditioning similar to extinction in classical conditioning?

A

An operantly conditioned response declines in rate and eventually disappears if it no longer results in a reinforcer.

Rats stop pressing levers if no food pellets appear.

The absence of reinforcement of the response and the consequent decline in response rate are both referred to as ‘extinction’.

Just as in classical conditioning, extinction in operant conditioning is not true “unlearning” of the response. Passage of time following extinction can lead to spontaneous recovery of responding, and a single reinforced response following extinction can lead the individual to respond again at a rapid rate. (p. 119)

14
Q

What is partial reinforcement?

A

A particular response produces a reinforcer only some of the time. (p. 119)

15
Q

What is continuous reinforcement?

A

A response is always reinforced. (p. 119)

16
Q

What is extinction?

A

The response is never reinforced. (p. 119)

17
Q

22: How do the four types of partial reinforcement schedules differ from one another, and why do responses generally occur faster to ratio schedules than to interval schedules?

A
  • Fixed-ratio schedule: a reinforcer occurs after every nth response, where n is some whole number greater than 1.
  • Variable-ratio schedule: like a fixed-ratio schedule except that the number of responses required before reinforcement varies unpredictably around some average.
  • Fixed-interval schedule: a fixed period of time must elapse between one reinforced response and the next. Any response occurring before that time elapses is not reinforced.
  • Variable-interval schedule: like a fixed-interval schedule except that the period that must elapse before a response will be reinforced varies unpredictably around some average.

Ratio schedules (whether fixed or variable) produce reinforcers at a rate that is directly proportional to the rate of responding, so, not surprisingly, such schedules typically induce rapid responding. On interval schedules, by contrast, responding faster does not make reinforcers available any sooner, so response rates are generally slower.
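
The contrast among the four schedules, and the reason ratio schedules sustain faster responding, can be seen in a short simulation. This is a hypothetical sketch, not anything from the textbook: the function names, the fixed response rate, and all parameter values are invented. Each schedule is a function that decides, response by response, whether reinforcement is delivered; the ratio schedules pay off in proportion to how much the animal responds, while the interval schedules are capped by the clock.

```python
import random

rng = random.Random(0)

def fixed_ratio(n):
    """Reinforce every nth response."""
    count = 0
    def check(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return check

def variable_ratio(mean_n):
    """Like fixed-ratio, but the required count varies around mean_n."""
    count, required = 0, rng.randint(1, 2 * mean_n - 1)
    def check(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return check

def fixed_interval(period):
    """Reinforce the first response after a fixed period has elapsed."""
    next_at = period
    def check(t):
        nonlocal next_at
        if t >= next_at:
            next_at = t + period
            return True
        return False
    return check

def variable_interval(mean_period):
    """Like fixed-interval, but the required wait varies around the mean."""
    next_at = rng.uniform(0, 2 * mean_period)
    def check(t):
        nonlocal next_at
        if t >= next_at:
            next_at = t + rng.uniform(0, 2 * mean_period)
            return True
        return False
    return check

# 200 responses, one every 0.5 time units (100 time units in all).
# The ratio schedules yield about 40 reinforcers; the interval schedules
# yield about 20, no matter how fast the animal responds.
for name, schedule in [("FR-5", fixed_ratio(5)), ("VR-5", variable_ratio(5)),
                       ("FI-5", fixed_interval(5.0)), ("VI-5", variable_interval(5.0))]:
    reinforcers = sum(schedule(i * 0.5) for i in range(200))
    print(name, reinforcers)
```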

18
Q

23: How do variable-ratio and variable-interval schedules produce behavior that is highly resistant to extinction?

A

Rats and humans who have been reinforced on stingy variable schedules have experienced reinforcement after long, unpredictable periods of no reinforcement, so they have learned (for better or worse) to be persistent. (p. 120)

19
Q

What does ‘reinforcement’ mean in Skinner’s terminology?

A

Any process that increases the likelihood that a particular response will occur. (p. 120)

20
Q

What does positive reinforcement (in Skinner’s terminology) mean?

A

The arrival of some stimulus following a response makes the response more likely to recur. (p. 120)

21
Q

What does negative reinforcement (in Skinner’s terminology) mean?

A

It occurs when the removal of some stimulus following a response makes the response more likely to recur.

The stimulus in this case is called a negative reinforcer.

Electric shocks, loud noises, unpleasant company, scoldings, and everything else that organisms will work to get away from can be used as negative reinforcers. (p. 120-121)

22
Q

24: How does negative reinforcement differ from positive reinforcement?

A

‘Positive’ and ‘negative’ here do not refer to the direction of change in the response rate; that increases in either case.

Rather, the terms indicate whether the reinforcing stimulus appears (positive) or disappears (negative) as a result of the operant response.

Negative reinforcement is not the same as punishment; confusing the two is a common mistake among students of psychology. (p. 121)

23
Q

25: How does punishment differ from reinforcement, and how do the two kinds of punishment parallel the two kinds of reinforcement?

A

In Skinner’s terminology, punishment is the opposite of reinforcement. It is the process through which the consequence of a response decreases the likelihood that the response will recur.

In ‘positive punishment’, the arrival of a stimulus, such as electric shock for a rat or a scolding for a person, decreases the likelihood that the response will occur again.

In ‘negative punishment’, the removal of a stimulus, such as taking food away from a hungry rat or money away from a person, decreases the likelihood that the response will occur again.

Figure 4.13:
Reinforcement (whether positive, such as praise, or negative, such as the removal of pain) increases the response rate, and punishment (whether positive, such as a reprimand, or negative, such as the loss of computer privileges) decreases the response rate. The terms positive and negative refer to whether the stimulus arrives or is removed when the response is made.

The same stimuli that can serve as positive reinforcers when presented can serve as negative punishers when removed, and the same stimuli that can serve as positive punishers when presented can serve as negative reinforcers when removed. (p. 121)
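
The two-by-two scheme in Figure 4.13 can also be written down as a small lookup table. A minimal sketch with invented names, assuming we record only whether the stimulus arrives or is removed and whether the response rate rises or falls:

```python
# Skinner's 2x2 taxonomy of consequences. "Positive"/"negative" names
# whether the stimulus arrives or is removed; reinforcement/punishment
# names whether the response rate goes up or down. (Names are invented
# for illustration.)
CONSEQUENCES = {
    ("arrives", "increases"): "positive reinforcement",  # e.g., praise
    ("removed", "increases"): "negative reinforcement",  # e.g., pain goes away
    ("arrives", "decreases"): "positive punishment",     # e.g., a reprimand
    ("removed", "decreases"): "negative punishment",     # e.g., privileges lost
}

def classify(stimulus_change: str, effect_on_rate: str) -> str:
    return CONSEQUENCES[(stimulus_change, effect_on_rate)]

# Negative reinforcement is not punishment: the response rate goes UP.
assert classify("removed", "increases") == "negative reinforcement"
```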

24
Q

26: How can an animal be trained to produce an operant response only when a specific cue is present?

A

The essence of the procedure is to reinforce the animal’s response when a specific stimulus is present and to extinguish the response when the stimulus is absent.

Thus, to train a rat to respond to a tone by pressing a lever, a trainer would alternate between reinforcement periods with the tone on (during which the animal gets food pellets for responding) and extinction periods with the tone off.

The tone in this example is called a ‘discriminative stimulus’. (p. 122)

25
Q

27: How was discrimination training used to demonstrate that pigeons understand the concept of a tree?

A

Pigeons that had been trained to peck whenever a photo contained a tree, or part of a tree, pecked at such photos even when the trees shown were not green. The pigeons refrained from pecking when they saw a photo that did not include a tree, even if it contained green grass or leaves. (p. 123)

26
Q

28: Why might a period of reward lead to a subsequent decline in response rate when the reward is no longer available?

A

This decline is called the ‘overjustification effect’ because the reward presumably provides an unneeded extra justification for engaging in the behavior.

The result is that people come to regard the task as something that they do for an external reward rather than for its own sake - that is, as work rather than play.

When they come to regard the task as work, they stop doing it when they no longer receive a payoff for it, even though they would otherwise have continued to do it for fun.

Athletic games can be great fun for children, but when the focus is on winning trophies and pleasing parents and coaches, what was previously play can become work. (p. 124)

27
Q

29: How are Skinner’s techniques of operant conditioning being used to deal with “problem” behaviors?

A

The first thing one does in behavior analysis is to define some socially significant behaviors that are in need of changing.
Then, a schedule of reinforcement is implemented to increase, decrease, or maintain the targeted behavior. (p. 125)