CHP 8 PRIMARY AND CONDITIONED REINFORCEMENT AND SHAPING Flashcards
Primary reinforcers
a consequence that functions as a reinforcer because it’s important in sustaining the life of the individual or the continuation of the species (helps us survive)
Conditioned reinforcers
consequences that function as reinforcers only after learning occurs (Pavlovian learning)
Why is Pavlovian learning essential to operant conditioning?
Pavlovian learning is necessary to transform a neutral consequence into a conditioned reinforcer
Two kinds of contingent reinforcement learning
Pavlovian learning and verbal learning
Token economy
created by Maxine Stitzer
a set of rules governing the delivery of response-contingent conditioned reinforcers (tokens, points, etc.) that may later be exchanged for one or more backup reinforcers
Pros of token economies
1) motivationally robust (motivation to earn tokens remains fairly constant)
2) nondisruptive (delivering a token during ongoing behavior interrupts the performance less than delivering the backup reinforcer itself would)
3) fair compensation (easy to assign different values to different behaviors)
4) portability (tokens are portable, which increases the probability that appropriate behavior will be reinforced)
5) delay-bridging (tokens given immediately bridge the delay/gap between good behavior and the backup reinforcer)
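The token-economy rules above can be illustrated with a minimal Python sketch. It is only one way to model the idea, not a standard implementation. The class name `TokenEconomy`, the behavior names, the token values, and the prices are all hypothetical, chosen for illustration.

```python
class TokenEconomy:
    """Toy model: behaviors earn tokens, tokens buy backup reinforcers."""

    def __init__(self, token_values, prices):
        # token_values: behavior -> tokens earned (hypothetical values)
        # prices: backup reinforcer -> cost in tokens (hypothetical values)
        self.token_values = dict(token_values)
        self.prices = dict(prices)
        self.balance = 0

    def reinforce(self, behavior):
        """Deliver tokens immediately after a target behavior."""
        earned = self.token_values.get(behavior, 0)  # unlisted behaviors earn nothing
        self.balance += earned
        return earned

    def exchange(self, backup):
        """Trade tokens for a backup reinforcer if the balance allows it."""
        price = self.prices[backup]
        if self.balance < price:
            return False
        self.balance -= price
        return True


economy = TokenEconomy(
    token_values={"homework": 3, "raised_hand": 1},
    prices={"sticker": 2, "extra_recess": 5},
)
economy.reinforce("homework")
economy.reinforce("raised_hand")
bought_recess = economy.exchange("extra_recess")  # not enough tokens yet
bought_sticker = economy.exchange("sticker")
```

Giving each behavior its own token value mirrors the "fair compensation" point, and listing several backup reinforcers is what would make the token a generalized conditioned reinforcer.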
Backup reinforcer
the reinforcer delivered after the conditioned reinforcer, which signals a delay reduction to the backup reinforcer's delivery
Principles of Effective Conditioned Reinforcers
1: Use an effective backup reinforcer (either a highly valued one, or use a token that can be exchanged for lots of different backup reinforcers)
2: Use a salient (noticeable) conditioned reinforcer
3: Use a conditioned reinforcer that signals a large delay reduction to the backup reinforcer
4: Make sure the conditioned reinforcer is not redundant
Generalized conditioned reinforcer
a conditioned reinforcer that signals a delay reduction to more than one backup reinforcer
Clickers and marking
A clicker produces a clicking sound (a marker) immediately after the target behavior, making it easier for the learner to recognize which behavior earned the reinforcer
Shaping
differential reinforcement of successive approximations to a terminal behavior (end behavior)
Flow
a state in which we feel immersed in a rewarding activity and lose track of time and self
3 response-reinforcer contingencies for flow to occur
1) “Goldilocks zone”
2) “proximal goals”
3) immediate task-relevant consequences
6 principles of effective shaping
1) objectively define the terminal behavior
2) determine along which dimension the learner’s current behavior falls short of the terminal behavior
3) when mapping out the sequence of successive approximations, ensure that each one is neither too easy nor too difficult
4) differential reinforcement: reinforce the current response approximation and extinguish everything else, including old response approximations
5) be sure the learner has mastered each response approximation before advancing to the next one
6) if the next approximation proves too difficult (extinction occurs), lower the reinforcement criterion until responding is earning reinforcers again
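Principles 5 and 6 above can be sketched as a simple decision rule in Python. This is an illustration only. The function name `update_step` and the thresholds `mastery=0.8` and `stall=0.2` are assumptions made for the example, not values taken from the flashcards.

```python
def update_step(step, outcomes, last_step, mastery=0.8, stall=0.2):
    """Pick which approximation to reinforce next.

    step      -- index of the current approximation (0 = easiest)
    outcomes  -- recent trials; True if the response met the current criterion
    last_step -- index of the terminal behavior
    mastery   -- assumed success rate that counts as mastery (principle 5)
    stall     -- assumed success rate that signals extinction (principle 6)
    """
    if not outcomes:
        return step  # no data yet, keep the current criterion
    success_rate = sum(outcomes) / len(outcomes)
    if success_rate >= mastery:
        return min(step + 1, last_step)  # mastered: advance toward the terminal behavior
    if success_rate <= stall:
        return max(step - 1, 0)  # too hard: lower the reinforcement criterion
    return step  # keep practicing the current approximation


advanced = update_step(2, [True] * 9 + [False], last_step=5)
lowered = update_step(2, [False] * 10, last_step=5)
held = update_step(2, [True, False] * 5, last_step=5)
```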
Percentile schedule of reinforcement
a simple automated training technique that incorporates the 6 principles of effective shaping (approximation #2 is made 60% harder than the first; each following approximation is recalculated by discarding the oldest data points and including the newest ones)
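The sliding-window idea behind a percentile schedule can be sketched in Python. The sketch assumes one common rank-based formulation: a response is reinforced when it beats a fixed percentage of the most recent responses. This may differ in detail from the rule on the card. The class name, the window size of 5, the 60 percent setting, and the choice to reinforce the very first response are all assumptions made for illustration.

```python
from collections import deque


class PercentileSchedule:
    """Reinforce a response that beats `percent` of the recent responses."""

    def __init__(self, window=5, percent=60):
        # deque(maxlen=...) discards the oldest observation automatically
        self.recent = deque(maxlen=window)
        self.percent = percent

    def criterion(self):
        """Value the next response must exceed, or None with no history."""
        if not self.recent:
            return None
        ranked = sorted(self.recent)
        n = len(ranked)
        # ceil(percent * n / 100) - 1, computed with integers only
        k = max(-(-self.percent * n // 100) - 1, 0)
        return ranked[k]

    def record(self, value):
        """Score one response; return True if it earns a reinforcer."""
        crit = self.criterion()
        reinforce = crit is None or value > crit  # assumption: first response is reinforced
        self.recent.append(value)
        return reinforce


schedule = PercentileSchedule(window=5, percent=60)
results = [schedule.record(v) for v in [1, 2, 3, 4, 5, 3, 6]]
```

Because the criterion is recomputed from the learner's own recent responses, it rises as performance improves and falls when performance slips, with no manual adjustment.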