Operant Conditioning Pt. 1 Flashcards by Damien lybrand

Classical vs Operant Conditioning

In classical conditioning the response occurs at the
end of the stimulus chain
For example:
Shock → Fear
Tone : Shock → Fear
Tone → Fear
Study of reflexive behaviors

Operant conditioning

Goal-oriented behavior
Consequences!
Organism learns to respond to environment in way that
produces (+) consequences & avoids (-) ones
Instrumental Conditioning – response is instrumental in producing the
consequence

Thorndike’s Law of Effect

If a response to a stimulus is followed by satisfaction,
the response is more likely to occur the next time
the stimulus is encountered, but an “unpleasant state of
affairs” makes the response less likely

Skinner

Didn’t like Thorndike’s emphasis on internal events
(satisfaction, annoyance)
Subcategories of behavior
Respondent – involuntary, reflexive (CC)
Operant – voluntary, controlled by consequences
Proposed that voluntary behaviors are controlled
by their consequences (rather than by preceding
stimuli)
Operant conditioning
The future probability of a behavior is affected by the
consequences of the behavior

Operant Behavior

The consequences that follow a certain response affect
the future probability (or strength) of the response
Press Lever → Food
Press Lever → Shock
Operant behaviors are emitted by the organism
(voluntary) not elicited by stimuli (CC)
Voluntariness may be an illusion
Operant behavior defined as a class of response
Exact response could be fast, slow, hard, soft, etc.
Easier to predict class of behavior than exact response

Operant Conditioning

Reinforcer
Anything that increases the likelihood that a
behavior will be repeated
Press Lever (R) → Food (SR)
Punisher
Decreases likelihood that a behavior will be repeated
Press Lever (R) → Shock (SP)
Reinforcement and punishment refer to
Process or procedure
Reinforcers and punishers refer to
Consequences

Consequences

The consequences of the behavior can either be
Appetitive: a consequence that the organism wants
Aversive: a consequence the organism wants to avoid
Reinforcer – increases behavior
Punisher – decreases behavior
Positive – adding stimulus
Negative – removing stimulus

Operant Conditioning

Positive reinforcers (SR+)
Rewards
Increase behavior
Negative reinforcers (SR-)
Removal of unpleasant stimuli
Increase behavior
Escape behavior
Response ends an aversive stimulus
Avoidance behavior
Response occurs before aversive stimulus is presented

Punishment

Presentation of an aversive stimulus (positive) or removal of a pleasant stimulus (negative)
* Reduces frequency of behavior
* Often confused with negative reinforcement
* Positive punishment: fines, shock, spanking
* Negative punishment: time out, loss of privileges

Negative
Reinforcement
DOES NOT =
Punishment

Reinforcement & Punishment

Positive Reinforcement: ADD something GOOD
Negative Reinforcement: TAKE AWAY something BAD
Positive Punishment: ADD something BAD
Negative Punishment: TAKE AWAY something GOOD
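
The four combinations on this card can be sketched as a small lookup. This is only an illustration: the function name `classify_contingency` and its string arguments are hypothetical and are not part of the original deck.

```python
def classify_contingency(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant procedure (hypothetical helper for illustration).

    stimulus_change: "add" (positive) or "remove" (negative)
    behavior_change: "increase" (reinforcement) or "decrease" (punishment)
    """
    # Positive/negative describes what happens to the stimulus.
    sign = {"add": "positive", "remove": "negative"}[stimulus_change]
    # Reinforcement/punishment describes what happens to the behavior.
    process = {"increase": "reinforcement", "decrease": "punishment"}[behavior_change]
    return f"{sign} {process}"


# Taking away something bad makes the behavior more likely.
print(classify_contingency("remove", "increase"))
# Taking away something good makes the behavior less likely.
print(classify_contingency("remove", "decrease"))
```

Keeping the two questions separate (what happened to the stimulus, what happened to the behavior) is what stops negative reinforcement from being confused with punishment.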

Finding a reinforcer

a) Consumable Reinforcers: What do you like to eat or drink?
1. What do you like to eat most?
2. What do you like to drink most?
b) Activity Reinforcers: What things do you like to do?
1. What do you like to do in your home or residence?
2. What do you like to do in your yard or courtyard?
3. What activities do you like to do in your neighborhood?
4. What passive activities (e.g., watching TV) do you like to do?
c) Manipulative Reinforcers: What kinds of games do you like?
d) Possessional Reinforcers: What kinds of things or objects do
you like to possess?
e) Social Reinforcers: What social rewards do you like?
1. What types of praise statements do you like to receive?
2. What type of physical contact do you enjoy (e.g., hugging)?

Operant Contingencies

Behavior modification is often more effective with
positive reinforcement than with punishment
Example
If attempting to stop a child’s tantrums, it is better
to positively reinforce behavior when the child is
not misbehaving than to punish the child when she
is misbehaving. The attention she receives during
the punishment might also be rewarding.

Immediate vs. delayed reinforcement

The more immediate the reinforcer, the stronger its
effect on behavior
Dickinson, Watt & Griffiths (1992)
Rats were trained to press a lever to obtain food
The delay between pressing the lever and obtaining food was varied between 2 & 64 seconds
An increase in the delay of food by just a few seconds resulted in considerably less responding
Responding almost ceased by 64 seconds
Results were initially interpreted as a memory deficit – i.e., the rats forgot which response produced the food
Subsequent studies have shown that rats have excellent memory. The problem is that the rats could not figure out which response produced the food
Reinforcement delay allowed rats time to engage in other behaviors

Long-delay reinforcements

Strengthened due to rules or instructions either from ourselves or others
Learn delayed consequences
E.g., working on my essay early will probably get me a better
grade than if I write it the night before. You won’t get the
consequence until you get your grade
Example 1
You are potty training your puppy. When it goes to the
bathroom outside, it is best to immediately reinforce
the behavior; delaying the reinforcement enables the
puppy to engage in some other inappropriate behavior,
which could inadvertently be reinforced by the delayed reward.
Example 2
May explain why people smoke cigarettes. Short-term
reinforcing properties of smoking (e.g., reduced anxiety) outweigh
the delayed reinforcing properties of not smoking (e.g., living longer)

Primary reinforcers

Do not require special training for their properties to be
reinforcing
Naturally appetitive reinforcers are those that are
necessary for the survival of the species (e.g., food,
water, sex)
Effectiveness is influenced by deprivation & satiation
Researchers in the 1950s accumulated evidence that
not all primary reinforcers were necessary for survival
(and this type of primary reinforcer was NOT influenced
by deprivation & satiation)
Psychological stimulation
Butler (1954)
Monkeys placed in enclosed cage with two wooden
doors
Blue door = can see experimental room for 30 seconds
(visual stimulation)
Yellow door = opaque screen (no sensory stimulation)
Monkeys continually pushed blue door (one subject
responded on every trial for 19 hours straight!!!)
Suggests that sensory stimulation is also a primary
reinforcer

Secondary reinforcers

Learned by being associated with some other
reinforcer
* Generalized reinforcers are associated with
multiple other reinforcers
* Examples: money, points to redeem for reward,
tickets to redeem for prizes, prestige/status, etc.

Wolfe (1936)
Trained 6 chimpanzees to place tokens (poker chips) in a
machine (“chimp-o-mat”) to obtain grapes & bananas, etc.
They had to operate a heavy lever to obtain tokens
Wolfe found that they would work as hard to obtain tokens
as they would to obtain direct access to grapes
Chimps would also hoard their chips (save them for later)
Blue token for 2 grapes, and white token for 1 grape –
chimps learned to value blue tokens more
When tested in pairs the dominant ape would push aside
subordinate ape to work lever; dominant ape would also
steal tokens from subordinate ape

Operant behaviors can also become reinforcing

Example
Aid workers who visit foreign countries in times of
crisis (R) are praised in the media for their work
(SR). Over time, the act of helping becomes
reinforcing even in the absence of external praise.

Strength of Backup Reinforcers
Variety of Backup Reinforcers
Simple Conditioned Reinforcer – paired with a single backup
reinforcer
Generalized Conditioned Reinforcer – paired with many
different kinds of backup reinforcers
Strength depends in part on number of different backup
reinforcers available for it.
Schedule of Pairing with Backup Reinforcer
More effective if the backup reinforcer does not follow each occurrence of the conditioned reinforcer
Extinction of the Conditioned Reinforcer
Must continue to pair conditioned reinforcer with backup
reinforcer, at least occasionally

Intrinsic and extrinsic reinforcement

Intrinsic reinforcement
Reinforcement is provided by the act of performing the
behavior
Example: quilting because you find it
satisfying/enjoyable
Extrinsic reinforcement
The reinforcement provided by the external
consequences of the behavior
Example: Child who cleans up his room in order to
receive praise from parents.
Example: Juggle balls 20 times in a row to receive 25
cents.

Natural reinforcer

Expected within certain settings

Contrived (artificial) reinforcer

Arranged to modify a behavior; not typical consequence

Natural reinforcers are more efficient

Contrived reinforcers used until natural ones can take over

Theories of reinforcement

In the effort to answer the question, “What makes
reinforcers work?”, researchers have developed
some theories.
Drive reduction theory
The Premack principle
Response deprivation hypothesis
Behavioral bliss point approach

Hull Drive Reduction

* If you are hungry and go looking for food and eat some, you will feel more comfortable because the hunger has been reduced.
* The desire to have the uncomfortable “hunger drive” reduced motivates you to seek out and eat the food.
* Biological needs (e.g., nutrients) lead to physiological drive states (e.g., hunger).
* A stimulus acts as a reinforcer to the extent that it is associated with a reduction in some physiological drive (e.g., hunger, thirst, sex).
* For example, food deprivation produces a hunger drive that makes the animal seek out food; when food is obtained, the hunger is reduced.
* Example: If a hungry rat in a T-maze turns right and finds food, the behavior of turning right is strengthened because it reduces the hunger drive.
* Animals will repeat behaviors that produced stimuli that reduce the drive.

Incentive motivation

* Sometimes, we just do things because they are FUN!
* When this happens, we can say that motivation is coming from some property of the reinforcer itself rather than from some kind of internal drive.
* Examples include playing games and sports, putting spices on food, etc.

Premack (1965, 1971) – Premack Principle

Premack (1959)
* Observed children with free access to a candy dispenser and a pinball machine to determine which behaviors were preferred.
* Some children preferred playing pinball to eating candy, whereas the reverse was true for other children.
* Premack found that he could increase the likelihood of children’s less preferred behavior by following it with the opportunity to engage in their more preferred behavior.
* For example, children who preferred candy could be conditioned to play pinball more if the reward was candy.
* Everyday example: “Bobby, you can read those comic books once you have mowed the grass!”
* He also demonstrated that the opportunity to perform the less preferred response (e.g., pinball) did not function to reinforce the more preferred behavior (e.g., eating candy).
* Premack also suggested that behavior preferences are not static. Preferences are influenced by:
* Response deprivation (depriving the organism of the opportunity to make a response increases the desire of the organism to make that response)
* Response satiation (the preference for a particular response can be decreased when the response has been allowed to occur)
* Response deprivation and response satiation experiments have shown that low probability behaviors can sometimes be used to reinforce high probability behaviors (e.g., Mazur, 1975).
* Limitation – need to know the probabilities of two behaviors to determine whether one can be used to reinforce the other.
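
The basic Premack comparison can be sketched in a few lines. This is a minimal illustration, not a model from the literature: the function name `can_reinforce` and the baseline numbers are hypothetical, and the sketch ignores the deprivation and satiation effects noted on this card.

```python
def can_reinforce(reinforcer_baseline: float, target_baseline: float) -> bool:
    """Basic Premack comparison (hypothetical helper for illustration).

    A behavior is expected to reinforce another only if it is the more
    probable of the two under free access.
    """
    return reinforcer_baseline > target_baseline


# Hypothetical free-access baselines (share of session time) for a child
# who prefers candy to pinball. These numbers are made up.
baseline = {"eat_candy": 0.7, "play_pinball": 0.3}

# Candy (more preferred) following pinball (less preferred): expected to work.
print(can_reinforce(baseline["eat_candy"], baseline["play_pinball"]))
# Pinball following candy eating: not expected to work.
print(can_reinforce(baseline["play_pinball"], baseline["eat_candy"]))
```

Both baselines have to be measured first, which is the limitation this card ends on.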

Response deprivation hypothesis

* A behavior can serve as a reinforcer when (1) access to the behavior is restricted and (2) its frequency falls below the preferred level.
* Example: A rat runs in a spinning wheel for 30 mins per day (its preferred duration of running). If running time is restricted (e.g., 10 mins per day), it is unable to reach its preferred duration for that activity (response deprivation). The rat will likely be willing to work (e.g., lever press) to obtain more time on the running wheel.
* Premack Principle – frequency of one behavior relative to another behavior
* Response deprivation – frequency of one behavior relative to its baseline
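
The running-wheel example reduces to comparing allowed access against the activity's own baseline. The sketch below assumes that simple reading of the hypothesis; the function names are hypothetical.

```python
def is_response_deprived(baseline_minutes: float, allowed_minutes: float) -> bool:
    """True when access is restricted below the preferred (baseline) level."""
    return allowed_minutes < baseline_minutes


def deprivation_deficit(baseline_minutes: float, allowed_minutes: float) -> float:
    """Minutes of the activity the organism is missing relative to baseline."""
    return max(0.0, baseline_minutes - allowed_minutes)


# Rat from the card: prefers 30 mins of running per day, allowed only 10.
print(is_response_deprived(30, 10))   # running should now work as a reinforcer
print(deprivation_deficit(30, 10))    # minutes short of the preferred level
```

Unlike the Premack comparison, only one behavior's baseline is needed here.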

Behavioral Bliss Point

* The Response Deprivation Hypothesis makes an assumption that there is an optimal or best level of behavior that a person or animal tries to maintain.
* If you could do ANYTHING at all you wanted to do, how would you distribute your time?
* This would tell you your “behavioral bliss point” for each activity or behavior.
* If there is free access to more than one behavior, the organism will distribute activities to maximize overall reinforcement.
* Example: A rat prefers to run in a spinning wheel for 30 mins per day and explore a maze for 1 hour per day (optimal level of reinforcement).
* If performance of one activity is contingent on performance of the other, the optimum distribution may be unattainable.
* Example: The rat is required to do 2 minutes of spinning wheel running to receive 1 minute of maze exploration. In this case the rat will redistribute its activities in such a way as to maximize overall reinforcement.
* In other words, if you can do anything you want, you will spend time on each thing you do in a way that will give you the most pleasure.
* Because real circumstances usually constrain what you can do, you can almost never achieve your “behavioral bliss point”.
* So you have to compromise by coming as close as you can, given your circumstances.
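
One way to make “as close as you can” concrete is to pick the point on the schedule line that lies nearest the bliss point. The sketch applies that idea to the rat example. The nearest-point (straight-line distance) rule and the equal weighting of a minute of each activity are assumptions added for illustration; the card itself does not specify how closeness is measured, and the function name is hypothetical.

```python
def closest_to_bliss(bliss_run: float, bliss_maze: float, run_per_maze: float):
    """Nearest point to the bliss point on the line run = run_per_maze * maze.

    Assumes closeness is plain Euclidean distance in minutes (an assumption).
    """
    # Orthogonal projection of (bliss_run, bliss_maze) onto the schedule line,
    # whose direction in (run, maze) coordinates is (run_per_maze, 1).
    maze = (bliss_run * run_per_maze + bliss_maze) / (run_per_maze ** 2 + 1)
    run = run_per_maze * maze
    return run, maze


# Bliss point: 30 mins running, 60 mins maze. Schedule: 2 mins running per 1 min maze.
run, maze = closest_to_bliss(bliss_run=30, bliss_maze=60, run_per_maze=2)
print(run, maze)
```

Under these assumptions the rat ends up running more than it prefers and exploring less than it prefers, which is the kind of compromise this card describes.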

Operant Conditioning Pt. 1 Flashcards

(29 cards)