Operant Conditioning Pt. 1 Flashcards

1
Q

Classical vs Operant Conditioning

A
  • In classical conditioning the response occurs at the
    end of the stimulus chain
  • For example:
  • Shock → Fear
  • Tone + Shock → Fear (tone paired with shock)
  • Tone → Fear
  • Study of reflexive behaviors
2
Q

Operant conditioning

A
  • Goal-oriented behavior
  • Consequences!
  • The organism learns to respond to the environment in a way that
    produces (+) consequences & avoids (-) ones
  • Instrumental Conditioning – response is instrumental in producing the
    consequence
3
Q

Thorndike’s Law of Effect

A
  • If a response to a stimulus is followed by satisfaction,
    the response is more likely to occur the next time
    the stimulus is encountered, but an “unpleasant
    state of affairs” makes the response less likely
4
Q

Skinner

A
  • Didn’t like Thorndike’s emphasis on internal events
    (satisfaction, annoyance)
  • Subcategories of behavior
  • Respondent – involuntary, reflexive (CC)
  • Operant – voluntary, controlled by consequences
  • Proposed that voluntary behaviors are controlled
    by their consequences (rather than by preceding
    stimuli)
  • Operant conditioning
  • The future probability of a behavior is affected by the
    consequences of the behavior
5
Q

Operant Behavior

A
  • The consequences that follow a certain response affect
    the future probability (or strength) of the response
  • Press Lever → Food
  • Press Lever → Shock
  • Operant behaviors are emitted by the organism
    (voluntary) not elicited by stimuli (CC)
  • Voluntariness may be an illusion
  • Operant behavior is defined as a class of responses
  • Exact response could be fast, slow, hard, soft, etc.
  • Easier to predict class of behavior than exact response
6
Q

Operant Conditioning

A
  • Reinforcer
  • Anything that increases the likelihood that a
    behavior will be repeated
  • Press Lever (R) → Food (SR)
  • Punisher
  • Decreases likelihood that a behavior will be repeated
  • Press Lever (R) → Shock (SP)
  • Reinforcement and punishment refer to the process or procedure
  • Reinforcers and punishers refer to the consequences themselves
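
A minimal sketch in Python of the contingencies on this card, assuming a simple linear update of response probability; the update rule, the 0.1 rate, and the function name are illustrative assumptions, not part of the lecture material.

    import random

    # Hypothetical linear update: a reinforcer (e.g., Press Lever -> Food) nudges
    # the response probability up; a punisher (e.g., Press Lever -> Shock) nudges
    # it down. The 0.1 rate is an arbitrary illustrative choice.
    def update_response_probability(p, consequence, rate=0.1):
        if consequence == "reinforcer":
            return min(1.0, p + rate * (1.0 - p))
        if consequence == "punisher":
            return max(0.0, p - rate * p)
        return p  # no consequence: probability unchanged

    p = 0.5  # initial probability of emitting the lever press
    for trial in range(10):
        if random.random() < p:  # the behavior is emitted on this trial ...
            p = update_response_probability(p, "reinforcer")  # ... and reinforced
        print(f"trial {trial + 1}: P(press) = {p:.2f}")

Swapping "reinforcer" for "punisher" in the loop drives the probability toward zero, matching the definitions above.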
7
Q

Consequences

A
  • The consequences of the behavior can either be
  • Appetitive: a consequence that the organism wants
  • Aversive: a consequence the organism wants to avoid
  • Reinforcer – increases behavior
  • Punisher – decreases behavior
  • Positive – adding stimulus
  • Negative – removing stimulus
8
Q

Operant Conditioning

A
  • Positive reinforcers (SR+)
  • Rewards
  • Increase behavior
  • Negative reinforcers (SR-)
  • Removal of unpleasant
    stimuli
  • Increase behavior
  • Escape behavior – ends an aversive stimulus that is
    already present
  • Avoidance behavior – occurs before the aversive stimulus
    is presented
9
Q

Punishment

A

  • Presentation of an aversive stimulus (positive) or
    removal of a pleasant stimulus (negative)
  • Reduces the frequency of behavior
  • Often confused with negative reinforcement
  • Positive punishment
  • Fines, shock, spanking
  • Negative punishment
  • Time out, loss of privileges

10
Q

Negative Reinforcement DOES NOT = Punishment

A

Negative Reinforcement DOES NOT = Punishment

11
Q

Reinforcement &
Punishment

A

  • Positive Reinforcement – ADD something GOOD
  • Negative Reinforcement – TAKE AWAY something BAD
  • Positive Punishment – ADD something BAD
  • Negative Punishment – TAKE AWAY something GOOD
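
The four cells above can be expressed as a tiny lookup. A minimal Python sketch, assuming two illustrative inputs (whether a stimulus is added or removed, and whether it is appetitive or aversive); the function name and argument labels are mine.

    def classify_consequence(change, stimulus):
        """Map an operant consequence onto the four categories above.

        change:   "add" (positive) or "remove" (negative)
        stimulus: "good" (appetitive) or "bad" (aversive)
        """
        table = {
            ("add", "good"): "positive reinforcement",
            ("remove", "bad"): "negative reinforcement",
            ("add", "bad"): "positive punishment",
            ("remove", "good"): "negative punishment",
        }
        return table[(change, stimulus)]

    print(classify_consequence("add", "good"))     # positive reinforcement
    print(classify_consequence("remove", "bad"))   # negative reinforcement
    print(classify_consequence("add", "bad"))      # positive punishment
    print(classify_consequence("remove", "good"))  # negative punishment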

12
Q

Finding a reinforcer

A

a) Consumable Reinforcers: What do you like to eat or drink?
1. What do you like to eat most?
2. What do you like to drink most?
b) Activity Reinforcers: What things do you like to do?
1. What do you like to do in your home or residence?
2. What do you like to do in your yard or courtyard?
3. What activities do you like to do in your neighborhood?
4. What passive activities (e.g., watching TV) do you like to do?
c) Manipulative Reinforcers: What kinds of games do you like?
d) Possessional Reinforcers: What kinds of things or objects do
you like to possess?
e) Social Reinforcers: What social rewards do you like?
1. What types of praise statements do you like to receive?
2. What type of physical contact do you enjoy (e.g., hugging)?

13
Q

Operant Contingencies

A
  • Behavior modification is often more effective with
    positive reinforcement than with punishment
  • Example
  • If attempting to stop a child’s tantrums, it is better
    to positively reinforce behavior when the child is
    not misbehaving than to punish the child when she
    is misbehaving. The attention she receives during
    the punishment might also be rewarding.
14
Q

Immediate vs. delayed
reinforcement

A
  • The more immediate the reinforcer, the stronger its
    effect on behavior
  • Dickinson, Watt & Griffith (1992)
  • Rats were trained to press a lever to obtain food
  • Delays between pressing the lever and obtaining food
    ranged from 2 to 64 seconds
  • An increase in the delay of food for just
    a few seconds resulted in considerably
    less responding
  • Responding almost ceased by 64
    seconds
  • Results initially interpreted as a memory
    deficit – i.e., the rats forgot which
    response produced the food
  • Subsequent studies have shown that
    rats have excellent memory. The
    problem is that the rats could not figure
    out which response produced the food
  • Reinforcement delay allowed rats time
    to engage in other behaviors
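
One standard way to formalize “the more immediate the reinforcer, the stronger its effect” is a hyperbolic discounting curve, V = A / (1 + kD). The equation comes from the delay-discounting literature rather than from this card, and the k value below is arbitrary; the sketch just shows how quickly subjective value falls over the 2–64 s range used in the study above.

    def discounted_value(amount, delay_s, k=0.5):
        """Hyperbolic discounting: subjective value of a reinforcer of size
        `amount` delivered after `delay_s` seconds, with discount rate k."""
        return amount / (1.0 + k * delay_s)

    # Delays spanning the 2-64 s range described above:
    for delay in (2, 4, 8, 16, 32, 64):
        print(f"delay {delay:>2} s -> relative value {discounted_value(1.0, delay):.3f}")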
15
Q
  • Long-delay reinforcements
A
  • Strengthened due to rules or instructions either from ourselves or others
  • Learn delayed consequences
  • E.g., working on an essay early will probably earn a better
    grade than writing it the night before, even though the
    consequence does not arrive until the grade comes back
  • Example 1
  • You are potty training your puppy. When the puppy goes to
    the bathroom outside, it is best to immediately reinforce
    the behavior; delaying the reinforcement allows the
    puppy to engage in some other inappropriate behavior,
    which could inadvertently be reinforced instead.
  • Example 2
  • May explain why people smoke cigarettes: the short-term
    reinforcing properties (e.g., reduced anxiety) outweigh the
    delayed reinforcers of not smoking (e.g., living longer)
16
Q

Primary reinforcers

A
  • Do not require special training for their properties to be
    reinforcing
  • Naturally appetitive reinforcers are those that are
    necessary for the survival of the species (e.g., food,
    water, sex)
  • Effectiveness is influenced by deprivation & satiation
  • Researchers in the 1950s accumulated evidence that
    not all primary reinforcers were necessary for survival
    (and this type of primary reinforcer was NOT influenced
    by deprivation & satiation)
  • Psychological stimulation
  • Butler (1954)
  • Monkeys were placed in an enclosed cage with two wooden
    doors
  • Blue door = can see experimental room for 30 seconds
    (visual stimulation)
  • Yellow door = opaque screen (no sensory stimulation)
  • Monkeys continually pushed the blue door (one subject
    responded on every trial for 19 hours straight!)
  • Suggests that sensory stimulation is also a primary
    reinforcer
17
Q

Secondary reinforcers

A

  • Learned by being associated with some other
    reinforcer
  • Generalized reinforcers are associated with
    multiple other reinforcers
  • Examples: money, points to redeem for rewards,
    tickets to redeem for prizes, prestige/status, etc.

  • Wolfe (1936)
  • Trained 6 chimpanzees to place tokens (poker chips) in a
    machine (“chimp-o-mat”) to obtain grapes & bananas, etc.
  • They had to operate a heavy lever to obtain tokens
  • Wolfe found that they would work as hard to obtain tokens
    as they would to obtain direct access to grapes
  • Chimps would also hoard their chips (save them for later)
  • Blue token for 2 grapes, and white token for 1 grape –
    chimps learned to value blue tokens more
  • When tested in pairs, the dominant ape would push aside
    the subordinate ape to work the lever; the dominant ape
    would also steal tokens from the subordinate ape
18
Q
  • Operant behaviors can also become reinforcing
A
  • Example
  • Aid workers who visit foreign countries in times of
    crisis (R) are praised in the media for their work
    (SR). Over time, the act of helping becomes
    reinforcing even in the absence of external praise.
19
Q
  • Factors influencing the effectiveness of conditioned reinforcers
A
  • Strength of Backup Reinforcers
  • Variety of Backup Reinforcers
  • Simple Conditioned Reinforcer – paired with a single backup
    reinforcer
  • Generalized Conditioned Reinforcer – paired with many
    different kinds of backup reinforcers
  • Strength depends in part on the number of different backup
    reinforcers available for it.
  • Schedule of Pairing with Backup Reinforcer
  • More effective if it does not follow each occurrence
  • Extinction of the Conditioned Reinforcer
  • Must continue to pair conditioned reinforcer with backup
    reinforcer, at least occasionally
20
Q

Intrinsic and extrinsic
reinforcement

A
  • Intrinsic reinforcement
  • Reinforcement is provided by the act of performing the
    behavior
  • Example: doing quilting because you find it
    satisfying/enjoyable
  • Extrinsic reinforcement
  • The reinforcement provided by the external
    consequences of the behavior
  • Example: Child who cleans up his room in order to
    receive praise from parents.
  • Example: Juggle balls 20 times in a row to receive 25
    cents.
21
Q
  • Natural reinforcer
A
  • Expected within certain settings
22
Q
  • Contrived (artificial) reinforcer
A
  • Arranged to modify a behavior; not a typical consequence in that setting
23
Q
  • Natural reinforcers are more efficient
A
  • Contrived reinforcers used until natural ones can take over
24
Q

Theories of reinforcement

A
  • In the effort to answer the question, “What makes
    reinforcers work?”, researchers have developed
    some theories.
  • Drive reduction theory
  • The Premack principle
  • Response deprivation hypothesis
  • Behavioral bliss point approach
25
Q

Hull Drive Reduction

A
  • If you are hungry and go looking for food and eat
    some, you will feel more comfortable because the
    hunger has been reduced.
  • The desire to have the uncomfortable “hunger
    drive” reduced motivates you to seek out and eat
    the food
  • Biological needs (e.g., nutrients) lead to physiological drive
    states (e.g., hunger)
  • A stimulus acts as a reinforcer to the extent that it is
    associated with a reduction in some physiological drive (e.g.,
    hunger, thirst, sex)
  • For example, food deprivation produces a hunger-drive that
    makes the animal seek out food – when food is obtained the
    hunger is reduced
  • Example:
  • If a hungry rat in a T-maze turns right and finds food, the
    behavior of turning right is strengthened because it reduces
    the hunger-drive
  • Animals will repeat behaviors that produced stimuli that
    reduce the drive
26
Q

Incentive motivation

A
  • Sometimes, we just do things because they are
    FUN!
  • When this happens, we can say that motivation is
    coming from some property of the reinforcer itself
    rather than from some kind of internal drive
  • Examples include playing games and sports, putting
    spices on food, etc.
27
Q

Premack (1965, 1971) – Premack
Principle

A

  • Premack (1959)
  • Observed children with free access to a candy dispenser and a
    pinball machine to determine which behaviors were preferred
  • Some children preferred playing pinball to eating candy, whereas
    the reverse was true for other children
  • Premack found that he could increase the likelihood of children’s
    less preferred behavior by following it with the opportunity to
    engage in their more preferred behavior
  • For example, children who preferred candy could be conditioned
    to play pinball more if the reward was candy
  • Bobby, you can read those comic books once you have mowed
    the grass!
  • He also demonstrated that the opportunity to perform the less
    desired response (e.g., pinball) did not function to reinforce the
    more desired behavior (e.g., eating candy)
  • Premack also suggested that behavior preferences are not
    static
  • Preferences are influenced by:
  • Response deprivation (depriving the organism of the opportunity
    to make a response increases the desire of the organism to make
    that response)
  • Response satiation (the preference for a particular response can
    decrease when the response has been allowed to occur)
  • Response deprivation and response satiation experiments
    have shown that low-probability behaviors can sometimes
    be used to reinforce high-probability behaviors (e.g., Mazur, 1975)
  • Limitation – need to know the probabilities of two behaviors
    to determine whether one can be used to reinforce the
    other
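
A minimal Python sketch of the basic Premack test, assuming the baseline response probabilities for the behaviors are already known (the limitation noted in the last bullet); the numbers and function name are hypothetical.

    def can_reinforce(contingent_behavior, target_behavior, baseline_prob):
        """Premack principle: access to a higher-probability behavior can
        reinforce a lower-probability behavior, but not the reverse."""
        return baseline_prob[contingent_behavior] > baseline_prob[target_behavior]

    # Hypothetical baseline preferences for one child (share of free time):
    baseline = {"eating candy": 0.6, "playing pinball": 0.3, "mowing grass": 0.1}

    print(can_reinforce("eating candy", "playing pinball", baseline))   # True
    print(can_reinforce("playing pinball", "eating candy", baseline))   # False
    print(can_reinforce("playing pinball", "mowing grass", baseline))   # True

As the card notes, response deprivation can reverse these orderings; the response deprivation hypothesis on the next card handles that case explicitly.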

28
Q

Response deprivation hypothesis

A
  • A behavior can serve as a reinforcer when (1) access to the
    behavior is restricted and (2) its frequency falls below its
    preferred level
  • Example:
  • A rat runs in a spinning wheel for 30 mins per day (its
    preferred duration of running). If running time is restricted
    (e.g., 10 mins per day) it is unable to reach its preferred
    duration for that activity (response deprivation). The rat will
    likely be willing to work (e.g., lever press) to obtain more
    time on the running wheel.
  • Premack Principle – frequency of one behavior relative to
    another behavior
  • Response deprivation – frequency of one behavior relative
    to its baseline
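
A minimal sketch of the response deprivation test, assuming we know a behavior's baseline (preferred) level and the level currently allowed; the rat's numbers come from the example above, and the function name is mine.

    def is_potential_reinforcer(baseline_minutes, allowed_minutes):
        """Response deprivation hypothesis: access to a behavior can serve as
        a reinforcer when its allowed level falls below its baseline level."""
        return allowed_minutes < baseline_minutes

    # The rat example above: 30 min/day of wheel running preferred, 10 min allowed.
    print(is_potential_reinforcer(baseline_minutes=30, allowed_minutes=10))  # True
    print(is_potential_reinforcer(baseline_minutes=30, allowed_minutes=45))  # False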
29
Q

Behavioral Bliss Point

A
  • The Response Deprivation Hypothesis makes an
    assumption that there is an optimal or best level of
    behavior that a person or animal tries to maintain
  • If you could do ANYTHING at all you wanted to do,
    how would you distribute your time?
  • This would tell you your “behavioral bliss point” for
    each activity or behavior
  • If given free access to more than one behavior, the organism will
    distribute its activities to maximize overall reinforcement
  • Example:
  • A rat prefers to run in a spinning wheel for 30 mins per day
    and explore a maze for 1 hour per day (optimal level of
    reinforcement).
  • If performance of one activity is contingent on performance
    of the other, the optimum distribution may be unattainable.
  • Example:
  • The rat is required to do 2 minutes of spinning wheel
    running to receive 1 minute of maze exploration. In this case
    the rat will redistribute its activities in such a way as to
    maximize overall reinforcement
  • In other words, if you can do anything you want,
    you will spend time on each thing you do in a way
    that will give you the most pleasure
  • This means that you can almost never achieve your
    “behavioral bliss point”
  • So you have to compromise by coming as close as
    you can, given your circumstances
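
A minimal sketch of the bliss-point compromise in the rat example above (bliss point: 30 min of wheel running and 60 min of maze exploration; contingency: 2 min of running per 1 min of exploration). The card only says the rat redistributes its activities to maximize overall reinforcement; treating that as minimizing squared deviation from the bliss point along the constraint is a simplifying assumption of this sketch.

    # Bliss point (preferred free-access levels), in minutes per day:
    BLISS_WHEEL, BLISS_MAZE = 30, 60

    def deviation_cost(wheel, maze):
        """Assumed cost: squared distance from the bliss point."""
        return (wheel - BLISS_WHEEL) ** 2 + (maze - BLISS_MAZE) ** 2

    # Contingency: 2 min of wheel running buys 1 min of maze time, so along the
    # constraint maze = wheel / 2. Brute-force search over wheel time in 1-min steps.
    best = min(((w, w / 2) for w in range(0, 201)),
               key=lambda alloc: deviation_cost(*alloc))

    print(f"compromise: {best[0]} min wheel, {best[1]:.0f} min maze")
    # -> 48 min wheel, 24 min maze: neither activity reaches its bliss point,
    #    which is the compromise the card describes.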