Lecture 6 - Operant Conditioning Flashcards

1
Q

Operant conditioning

A

(also known as instrumental conditioning and trial-and-error learning)

is associating a voluntary behavior (‘operation on the environment’) with an outcome.

some action the animal chooses to do is associated with an outcome

2
Q

Law of Effect

A
Animals learn that a behavior (or class of similar behaviors) predicts a particular outcome, and they seek the outcome by performing that behavior.

Behaviors with good outcomes increase; behaviors with bad outcomes decrease.

(Thorndike, 1911)

3
Q

Discrete trial paradigm

(Thorndike, 1911)

A

Cat opens the puzzle box and is reinforced with food reward.

The cat learned that flipping the switch was responsible for getting it out and getting it food, so that escape behavior becomes more likely (and faster) in the future.

Discrete because every time the cat got out counted as one trial, and for each new trial the cat had to be put back in the box.

4
Q

B. F. Skinner

free-operant paradigm

A

refined Thorndike’s method to allow the animal to respond repeatedly

==> allowed the animal to control the rate of responding ==> the animal controls when it gets the reward (food)

• Skinner box

5
Q

Skinner Box

A

A small automated contraption: it counted the number of times the lever was pressed and the number of times the reward was provided

made it easy to measure this activity over time

instead of recording trials you’re recording behaviors over time

• Behaviors could be automatically recorded in a Skinner box: count the number of behaviors and outcomes.

6
Q

Acquisition

A

reinforcing behavior: giving a reward every time the rat presses the lever

the number of responses goes up

7
Q

extinction

A

the rat keeps pressing the lever but no food comes

if you stop reinforcing the behavior, the behavior starts to go away

the number of responses decreases

8
Q

Basic elements of the free-operant paradigm:

A
  • discriminative stimulus (S)
  • behavioral response (R)
  • outcome (O)

S –> R –> O

9
Q

Through repeated trials, the animal learns that the outcome is contingent upon

A

the appropriate response.

10
Q

discriminative stimulus (S)

A

that helps you select the appropriate behavior (e.g. rat can see the lever).

the animal has to be able to identify something in the environment that it’s operating on

11
Q

behavioral response (R)

A

or class of similar responses, is performed in response to the stimulus (e.g. rat pushes lever with either paw).

12
Q

outcome (O)

A

follows that either reinforces or punishes the behavior (e.g. rat gets food, good outcome).

13
Q

reinforcers

A

Outcomes that increase the likelihood of the behavior

primary reinforcers

secondary reinforcers

14
Q

primary reinforcers

A

meet some innate need (e.g. food, water, sleep, and sex).

Note that these are not always reinforcing (i.e. you won’t work for water if already satiated).

15
Q

Secondary reinforcers

A

have no intrinsic value, but predict or are associated with primary reinforcers (e.g. money, good grades, gold stars, etc.).

something that has no value by itself becomes valuable through a learned association with something that does

16
Q

punishers

A

Outcomes that decrease the behavior

primary punisher

secondary punisher

17
Q

Primary punisher

A

Pain (shock), nausea, loud noises, social disapproval (?), loss of freedom (jail).

basically just aversive things

18
Q

Secondary punisher

A

Monetary fines, demerits, bad grades, etc.

19
Q

You are about to press a button on your iClicker. When you see that you got the correct answer to the question, that acts as a ______________.

A

Secondary reinforcer

20
Q

positive (+) conditioning

A

An outcome/consequence is added: you’re given something as a result of your behavior

this has nothing to do with “good” or “bad.”

21
Q

negative (-) conditioning.

A

An outcome/consequence is removed: something is taken away

this has nothing to do with “good” or “bad.”

22
Q

Positive reinforcement

A

when you want to increase the behavior (reinforce) and you do it by adding something (positive)

animal rewarded for doing a behavior –> given something to make the behavior more likely

response increases (reinforcement) + consequence is added (positive)

23
Q

Negative Reinforcement (escape/avoidance)

A

response increases (reinforcement) + consequence is removed (negative)

want the behavior to increase but take something away ==> if you do something I want, I’ll take away a “bad thing”

24
Q

Positive punishment

A

Response decreases (punishment) + Consequence is added (positive)

when you don’t want a behavior and you add something (electric shock)

25
Q

Negative punishment (omission)

A

Response decreases (punishment) + Consequence is removed (negative)

I don’t want you to do a behavior so I take something away (money, privileges, etc…)

“No more T.V. for you!”
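
The four paradigms on these cards come from crossing two independent questions: does the response increase or decrease, and is a consequence added or removed? As a rough memory aid, that 2x2 lookup can be sketched in Python. The function name `classify_paradigm` and its string arguments are illustrative assumptions, not terms from the lecture.

```python
def classify_paradigm(response_change, consequence):
    """Name the operant paradigm from two independent labels.

    response_change: "increases" or "decreases" (hypothetical labels)
    consequence:     "added" or "removed"       (hypothetical labels)
    """
    # "Positive"/"negative" only says whether something is added or removed;
    # it has nothing to do with "good" or "bad".
    sign = {"added": "positive", "removed": "negative"}[consequence]
    # "Reinforcement"/"punishment" only says which way the behavior changes.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[response_change]
    return f"{sign} {kind}"


if __name__ == "__main__":
    # Dessert for eating vegetables: behavior goes up, something is added.
    print(classify_paradigm("increases", "added"))
    # Loss of T.V. privileges: behavior goes down, something is removed.
    print(classify_paradigm("decreases", "removed"))
```

The two example calls correspond to positive reinforcement and negative punishment (omission), respectively.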

26
Q

Positive reinforcement example

A

Eat all your vegetables –> get some dessert.

“do something I want you to do and I give you something”

27
Q

Positive punishment example

A

Scratch the couch ==> get sprayed with water;

tease your sibling ==> parental scolding.

28
Q

Negative Reinforcement example

A

Shut off the alarm clock (aversive stimulus) ==> removal of an aversive stimulus;

- arm does flailing motion
- next morning you're more likely to make that same movement
- reinforcement of behavior that takes away an aversive stimulus 

take ibuprofen ==> reduce a headache.

- next time you have a headache you're more likely to grab that medicine again
- reinforces that behavior of taking the medicine
- you're not getting anything, something is being taken away (an aversive stimulus - a headache)
29
Q

Negative punishment example

A

Commit armed robbery ==> loss of freedom (jail).

30
Q

timing and context in operant conditioning

A

are critical for forming the association.

critical for how effective it’s going to be

31
Q

If the outcome is delayed….

A

… the association is not learned as well.

So, punishing your dog for something it did an hour ago is probably not very effective…

32
Q

To be effective, any kind of reinforcement needs to come

A

almost immediately

33
Q

Reinforcement schedules

A

(i.e. how often you get the outcome)

how the timing, frequency, and reliability of an outcome affect the rate at which associations are learned

how often and how reliably you get the outcome affects the rate of learning and its effectiveness over time

34
Q

continuous reinforcement schedule

A

When you get a reward after every behavior:

every time the rat presses the lever it gets a reward, with no break in the reward: every time you perform the action you get the outcome

35
Q

partial reinforcement schedule

A

anything that isn’t a continuous reinforcement schedule

the outcome follows less than 100% of the time

36
Q

variable-ratio schedule

A

A powerful form of partial reinforcement schedule

steep learning curve: if you don’t know when it’s coming you just keep banging away at the lever

you don’t get the outcome every time; you get it about every 5 or 10 responses, but exactly which response will pay off can’t be predicted

gambling!!! - the payout is variable

most effective schedule, with the steepest learning curve

37
Q

fixed ratio

A

the reward comes after a fixed number of responses (e.g. every fifth time you perform the action you get the reward)

rats: 5 responses and a pause (a plateau)
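
To compare the schedules on the last few cards side by side, here is a rough Python sketch that marks which responses get rewarded under each one. The function names and parameters are hypothetical. `variable_ratio` models unpredictability by rewarding each response with probability 1/mean_ratio, which is only one simple way to get a reward "about every N responses"; that modeling choice is an assumption, not something stated in the lecture.

```python
import random


def continuous(n_responses):
    """Continuous schedule: every response is rewarded."""
    return [True] * n_responses


def fixed_ratio(n_responses, ratio):
    """Fixed-ratio schedule: exactly every `ratio`-th response is rewarded."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Variable-ratio schedule (assumed model): each response is rewarded
    with probability 1/mean_ratio, so rewards arrive about every
    `mean_ratio` responses but the exact response cannot be predicted."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sequence
    return [rng.random() < 1.0 / mean_ratio for _ in range(n_responses)]


if __name__ == "__main__":
    # Fixed ratio of 5: rewards land on responses 5 and 10.
    print(fixed_ratio(10, 5))
    # Variable ratio with a mean of 5: rewards are scattered unpredictably.
    print(variable_ratio(10, 5))
```

Both `fixed_ratio` and `variable_ratio` are partial schedules, since the outcome follows fewer than 100% of responses; only `continuous` rewards every response.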

38
Q

Sheldon gave Penny chocolate each time she did something to please him. What kind of paradigm is this?

A

Positive reinforcement

39
Q

Sheldon sprayed water on Leonard when he disagreed.

What kind of paradigm is this?

A

Positive punishment

Sheldon wants Leonard to do it less (punishment) and he is adding something (positive)

something is being added to the situation and he wants him to not perform that behavior again

40
Q

Reinforcers and punishers can be

A

equally effective at changing behavior in laboratory (controlled) conditions; however, punishers can run into problems in the real world.

41
Q

Problems with punishers?

in the real world when you can’t control those discriminative stimuli or timing as easily

A
  1. If you punish a behavior, you may encourage cheating/circumvention. (“Don’t speed” becomes “Don’t get caught speeding”.)
  2. Concurrent reinforcement may undermine the punishment. (A student punished for talking in class may be reinforced with approval from other students.)
  3. Punishment can lead to more variable behavior. (If a specific behavior is decreased, what replaces it?)
    • If you punish a child for jumping on the couch, they may start jumping on the bed (it doesn’t get rid of the class of behaviors).
  4. The initial intensity of the punisher needs to be fairly high (otherwise you may get habituation).
  5. Punishment can lead to stress and anxiety, which are associated with other undesirable behaviors. (It creates states that aren’t conducive to encouraging the behaviors you want.)
42
Q

How do animals get trained to do complex (and sometimes stupid) things?

A
  • You can’t simply reinforce a complex behavior, because the animal may never perform it by accident.
  • Use chaining (chained learning) to create a series of reinforced behaviors that build on each other (start with something simple and keep adding one step at a time until you get something that looks much more complex).

Example: squirrel on water skis

S (See platform) –> R (Stand on platform) –> O (Food reward)

S (See handle) –> R (Stand on platform + place paws on handles) –> O (Food reward)

43
Q

Operant vs. classical conditioning

A

Classical conditioning:
• Passive: environment works on animal.
• Unconditioned stimulus (US) evokes a response.
• Animal learns that the CS predicts the US.
• Typically simple associations.

Operant conditioning:
• Active: animal operates on environment.
• A behavioral response produces an outcome.
• Animal learns that behavior predicts an outcome.
• More flexible and powerful, producing more complexity.

However, the two often work together (e.g. primary and secondary reinforcers can become associated classically).

44
Q

Evaluating situations to ID what kind of conditioning or paradigm

A

is it passive or active?

what’s being associated?

the more complex a behavior the more likely it’s operant conditioning

45
Q

Brain-based models for operant conditioning

A

any instance of operant conditioning involves the interaction of several neural systems.

46
Q

Law of effect

A

origins of operant conditioning

states that animals make associations between voluntary behaviors and contingent outcomes.

47
Q

_____ make a behavior more likely

A

reinforcers

48
Q

_____ make a behavior less likely

A

punishers

49
Q

Both reinforcers and punishers can be due to

A
intrinsic preferences (primary) or learned associations with intrinsic preferences (secondary).
50
Q

When you add something to the outcome (give a treat or shock), that is

A

positive

51
Q

When you take away something (pain or freedom), that is

A

negative

52
Q

_____ may not always be as effective as reinforcers in the real world, but are equally effective in the lab.

A

punishers