Chapter 5- Operant Reinforcement Flashcards
The statement that behaviour is a function of its consequences. So called because the strength of a behaviour depends on its past effects on the environment. Implicit in the law is the notion that operant learning is an active process, since it is usually the behaviour of the organism that, directly or indirectly, produces the effect.
Law of effect
What did Thorndike speculate about reinforcement’s neural effect? What is this view called?
Thorndike speculated that reinforcement strengthened the bonds or connections between neurons, a view that became known as connectionism
How does reinforcement of a response give that response momentum? Explain Nevin’s use of the metaphor of momentum to describe the effects of reinforcement
Just as a heavy ball rolling down a hill is less likely than a light ball to be stopped by an obstruction in its path, behaviour that has been reinforced many times is more likely to persist when obstructed in some way, as, for example, when one confronts a series of failures
In what way did Thorndike’s work depart from previous conceptions of the learning process? How did Page and Neuringer show that randomness is a reinforceable property of behavior?
Philosophers had long debated the role of hedonism, the tendency to seek pleasure and avoid pain, in behavior. But Thorndike was the first person to show that behaviour is systematically strengthened or weakened by its consequences. Prior to Thorndike, learning was thought to be primarily a matter of reasoning; Thorndike shifted our attention from inside the organism to the external environment
Even the randomness of behaviour can be modified with reinforcement. Page and Neuringer provided reinforcers to pigeons for a series of eight key pecks, but only when the series of key pecks was different from the previous 50 sequences. Under these circumstances, the key peck patterns became almost truly random
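The reinforcement rule described above can be sketched in a few lines of Python. This is only an illustration of the contingency, not the authors' apparatus or software: the function name, the use of "L" and "R" to stand for pecks on two keys, and the choice to compare against every recent sequence (reinforced or not) are all assumptions made for the example.

```python
from collections import deque


def make_lag_contingency(lag=50, length=8):
    """Build a checker for a 'differ from the last `lag` sequences' rule.

    Hypothetical helper for illustration; `lag` and `length` default to
    the values given in the card (50 sequences, 8 pecks).
    """
    recent = deque(maxlen=lag)  # holds only the most recent `lag` sequences

    def earns_reinforcer(sequence):
        sequence = tuple(sequence)
        if len(sequence) != length:
            raise ValueError(f"expected {length} pecks, got {len(sequence)}")
        # Reinforce only if this exact sequence is absent from recent history.
        earned = sequence not in recent
        # Assumption: every completed sequence enters the history,
        # whether or not it was reinforced.
        recent.append(sequence)
        return earned

    return earns_reinforcer
```

For example, with the card's values, `check = make_lag_contingency()` followed by `check("LLRRLRLL")` earns a reinforcer the first time, but repeating the same eight pecks on the next trial does not. Under such a rule, repetitive patterns go unreinforced, which is why varied responding is favoured.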
Describe the essential components of a Skinner box. How did the Skinner box get its name?
Skinner designed an experimental chamber so that a food magazine could automatically drop a few pellets of food into a tray. After a rat became accustomed to the noise of the food magazine and readily ate food from the tray, Skinner installed a lever; when the rat pressed the lever, food fell into the tray. Under these conditions, the rate of lever pressing increased dramatically
Clark Hull, a psychologist at Yale University, dubbed the chamber the Skinner box. Skinner preferred the term operant chamber
Any procedure in which a behaviour becomes stronger or weaker depending on its consequences. Also called instrumental learning
Operant learning
How does operant conditioning differ from Pavlovian conditioning?
Operant learning is not S-R learning; the principal behaviour involved is not reflexive and is often complex. In operant learning the organism acts on the environment and changes it, and the change that’s produced strengthens or weakens that behavior. Whereas the organism undergoing Pavlovian conditioning may be described as passive, in operant learning the organism is necessarily active
The procedure of providing consequences for a behaviour that increase or maintain the strength of that behavior.
Reinforcement
Name the three essential features of reinforcement
A behaviour must have a consequence; the behaviour must increase in strength or occur more often; and the increase in strength must be the result of the consequence
A reinforcement procedure in which a behaviour is followed by the presentation of, or an increase in the intensity of, a stimulus. Sometimes called reward training, although the term reward is problematic
Positive reinforcement
A reinforcement procedure in which a behaviour is followed by the removal of, or a decrease in the intensity of, a stimulus. Sometimes called escape training
Negative reinforcement
Because what reinforces behaviour in negative reinforcement is escaping from an aversive stimulus, this procedure is also called
Escape training
An operant training procedure in which performance of a behaviour defines the end of the trial
Discrete trials procedure
Example: each time a cat escapes from a box, that marks the end of the trial
An operant training procedure in which a behaviour may be repeated any number of times
Free operant procedure
Example: placing a rat in an operant chamber equipped with a lever. Pressing the lever might cause a bit of food to fall into a tray, but the rat is free to return to the lever and press it again and again
Explain why scientists often simplify problems to study them. What are the advantages and disadvantages of this approach?
Laboratory researchers simplify problems so they can identify functional relationships between independent and dependent variables. If the relations so identified are valid, they will enable the researcher to predict and control the phenomenon in future experiments. They will also lead to hypotheses about how real-world problems may be solved
Compare and contrast operant and Pavlovian conditioning. Describe the parallel Skinner drew between natural selection and reinforcement
The most important difference is that in Pavlovian conditioning one stimulus, the US, is contingent on another stimulus, the CS, whereas in operant learning, a stimulus, the reinforcing or punishing consequence, is contingent on a behavior.
The two procedures also usually involve different kinds of behavior. Pavlovian conditioning typically involves involuntary or reflexive behavior, whereas operant learning usually involves voluntary behavior.
Skinner likened operant learning to the process of natural selection: useful behaviours “survive,” and others “die out”
Any reinforcer that is not dependent on another reinforcer for its reinforcing properties
Primary reinforcer or unconditioned reinforcer
Examples: food, water, sexual stimulation, weak electrical stimulation of certain brain tissues, relief from heat and cold, and certain drugs. Primary reinforcers are powerful, but they are relatively few in number and probably play a limited role in human learning
Any reinforcer that has acquired its reinforcing properties through its association with other reinforcers.
Secondary reinforcer or conditioned reinforcer
Examples: praise, recognition, smiles, and positive feedback. These reinforcers are secondary to or are derived from other reinforcers
What four advantages do conditioned (secondary) reinforcers have over unconditioned or primary reinforcers? What key disadvantage do conditioned reinforcers have?
- Primary reinforcers lose much of their reinforcing value very quickly; conditioned reinforcers sometimes become less effective with repeated use, but this occurs much more slowly
- It is often much easier to reinforce behaviour immediately with conditioned reinforcers than with primary reinforcers
- Conditioned reinforcers are often less disruptive
- Conditioned reinforcers can be used in many different situations
Main disadvantage of conditioned reinforcers: their effectiveness depends on their association with primary reinforcers. Primary reinforcers are much more resilient
Any secondary reinforcer that has been paired with several different reinforcers
Generalized reinforcer
Example: money
In operant training, the procedure of reinforcing successive approximations of the desired behaviour
Shaping
Shaping makes it possible to train, in a few minutes, behaviour that never occurs spontaneously
What five factors are responsible for the effective use of shaping?
- Reinforce small steps
- Provide immediate reinforcement
- Provide small reinforcers
- Reinforce the best approximation available
- Back up when necessary
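As a rough illustration of how some of these factors fit together, the sketch below treats behaviour as a single number (how close a response is to the goal) and moves a reinforcement criterion up in small steps, backing up after repeated misses. It is a toy model under stated assumptions, not a published procedure: the class name, the numeric scale, and the `step` and `patience` parameters are all hypothetical.

```python
class ShapingCriterion:
    """Toy model of a trainer's moving criterion during shaping."""

    def __init__(self, start, target, step=1.0, patience=3):
        self.start = start        # easiest approximation accepted
        self.target = target      # the desired final behaviour
        self.step = step          # size of each small step
        self.patience = patience  # consecutive misses tolerated before backing up
        self.criterion = start
        self.misses = 0

    def observe(self, response):
        """Return True if this response is reinforced (decided immediately)."""
        if response >= self.criterion:
            self.misses = 0
            # Reinforce, then raise the bar by one small step (never past the target).
            self.criterion = min(self.target, self.criterion + self.step)
            return True
        self.misses += 1
        if self.misses >= self.patience:
            # Back up when necessary: return to an easier approximation.
            self.criterion = max(self.start, self.criterion - self.step)
            self.misses = 0
        return False
```

A trainer object such as `ShapingCriterion(start=0, target=10)` would be fed one response at a time through `observe`; the criterion only ever moves by one `step`, which mirrors the advice to reinforce small steps and to back up when progress stalls.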
Explain how adults often unwittingly shape undesirable behaviour in children
Tantrums are typically the products of shaping. A tired parent may give in to a child’s repeated requests to “shut him up”. On the next occasion, the parent may resist giving in to the child’s usual demands, and the child might respond by becoming louder or crying. The parent yields to avoid causing a scene. On a subsequent occasion, determined to regain control, the parent may refuse to comply when the child cries or shouts, but gives in when the child produces bugle-like wails
The parent gradually demands more and more outrageous behaviour for reinforcement, and the child obliges, eventually engaging in full-fledged tantrums
A series of related behaviors, the last of which produces reinforcement
Behaviour chain
Example: competing on the balance beam where the person must perform a number of acts in a particular sequence