Chapter 5: Operant Conditioning Flashcards

1
Q

Define operant conditioning.

A

The process by which an organism learns to make, or to refrain from making, certain responses in order to obtain or avoid outcomes (the outcomes depend on the responses).

2
Q

Define positive reinforcement, positive punishment, negative reinforcement and negative punishment.

A

Reinforcement: behaviour increases
Punishment: behaviour decreases
Positive: something is added
Negative: something is taken away

3
Q

Describe the R (Response) part of the S-R-O arc.

A

The response is a voluntary action that becomes reflex-like over time. If the normal motor program is blocked, the animal will generalize the movement and use other means to achieve the same end.

4
Q

What experiment showed generalizability of the response in OC?

A

Rats were trained to wade through a maze partially filled with water; when the maze was later fully flooded, the rats swam to the goal with no problem.

5
Q

What is the Law of Effect?

A

When an animal’s behavior is followed by a positive outcome, the likelihood of the animal performing that behavior again increases.

6
Q

What were the methodological problems with Thorndike’s Puzzle Box? (3)

A
  1. Trials have to be repeated over and over, with the animal and the device reset each time.
  2. The experimenter may unconsciously add a reward or punishment.
  3. The experimenter decides when the trial is complete (experimenter bias).
7
Q

What did B.F. Skinner say were problems with comparing animals across trials of Thorndike-like experiments? (4)

A
  1. What if one animal is just slower? Does that mean it’s worse?
  2. What counts as the worst performance?
  3. Time to R decreases with learning, and we get progressively worse at discriminating small time differences.
  4. How do you generate a prediction from latencies? (What if the animal is a masochist?)
8
Q

What are the advantages of The Skinner Box? (2)

A
  1. The experimenter does not have to chase the escaping animal
  2. Self-paced trials (the animal dictates its own rate of response)

9
Q

What was Skinner’s Box?

A
  1. Sd (discriminative stimulus): a light that signals the box is “on”
  2. R (response): rate of lever pressing
  3. O (outcomes): food delivery (reinforcement), shock through the floor wires (punishment)
10
Q

What is the progression of operant conditioning?

A
  1. Pre-training: low spontaneous rate of R (exploring stage)
  2. Training: the contingency is introduced (if S, then R → O)
  3. Acquisition: the animal discovers the contingency; the rate of R increases
  4. Extinction: the contingency is eliminated; the rate of R decreases
11
Q

What are the four characteristics of Operant Conditioning?

A
  1. The animal operates on the environment
  2. A stimulus evokes a response to produce an outcome
  3. The animal connects context, behavior, and outcome
  4. Operant conditioning is more flexible/powerful (than classical conditioning)
12
Q

What are the three characteristics of classical conditioning (CC)?

A
  1. The environment operates on the animal
  2. A stimulus evokes a response
  3. The animal learns that a CS predicts a US
13
Q

Describe shaping.

A

Demonstrated in B.F. Skinner’s pigeon-turning experiment. Shaping evokes a behaviour through successive approximations, which build up a complex R incrementally. Initially, the contingency is introduced for a simple behaviour; as the rate of R improves, the contingency is moved to a more complex version of R. This gradually builds a complex R that the animal would never spontaneously produce.

14
Q

Describe chaining.

A

Builds complex R sequences by linking together S, R, O conditions. For example, first train the animal to pick up an object; next, reward it for picking up the object and then throwing it. This allows a SERIES of behaviours (as opposed to shaping, which elaborates a single response).

15
Q

Describe the process of backwards chaining.

A

The step closest to the outcome is trained first; earlier, more complicated steps are then added working backwards (sometimes easier than forward chaining).

16
Q

Talk about the mine-sniffing giant pouched rats.

A

Giant pouched rats in Mozambique are better than dogs at sniffing out TNT; they are partially blind, so they depend more on smell and hearing.
They are trained to detect the smell of TNT.
Most take 1 year to train by shaping (some elite rats train in 8 months). Clickers are often used for training animals (conditioning: the clicker is paired with food); clickers are secondary reinforcers (initially they have no value, but the animal eventually learns that they predict primary reinforcers).
A clicker is better than feeding by hand (the hand is not instantaneous).

17
Q

What did Clark Hull propose as a motivational source for OC? What’s a criticism?

A

He proposed drive reduction theory: obtaining primary reinforcers (food, water, mates) reduces an innate drive and satiates us. Criticism: that is not true for all learning; we are not always motivated by obtaining something.

18
Q

How was internal vs. external motivation measured experimentally?

A

In an experiment featuring the stopwatch and watch-stop tasks, participants were put into two groups: those given a 200-yen reward for each correct trial, and those told that completing the entire session would earn them 2,000 yen in total. The non-performance-based reward group spent more time practicing the task during the free time they were given than the group rewarded after each trial. Furthermore, when in a second session both groups were told that they would not receive a performance-based reward, the same pattern was observed!

19
Q

What is negative reinforcement contrast?

A

If rats are first trained to respond in order to obtain sugared water, and the reinforcer is then switched to (less preferred) food pellets, the rats’ rate of responding plummets.

20
Q

What is positive reinforcement contrast?

A

If rats are first trained to respond in order to obtain food pellets, and the reinforcer is then switched to sugared water, the rats’ rate of responding doesn’t go up as much as we would expect; the improvement has to be huge to elicit a noticeable increase.

21
Q

What are the problems with doing experiments with punishment? (3)

A
  1. Punishment may not stop only the target behaviour, but instead produce a general stopping of all behaviour.
  2. Using punishment can lead to more variable behaviour, since the alternatives are not specified precisely.
  3. It can encourage cheating: the behaviour occurs only when a certain discriminative stimulus is absent (called circumvention).
22
Q

What is concurrent reinforcement and why is it bad for OC? (3)

A

It can undermine punishment: for example, being punished by the teacher but reinforced by classmates’ laughter, or getting attention from parents despite being punished. It can also produce other emotions that impair behaviour, such as aggression, and it produces generalized behaviour disruption.

23
Q

What are three ways to make punishment effective?

A
  1. The contingency should always be in effect.
  2. Initial intensity should be strong (experiments show that gradually increasing shock intensity was less effective than starting out strong).
  3. Don’t use it at all! (Instead, reinforce behaviour that prevents the unwanted behaviour; this is used for children with severe autism.)
24
Q

Describe the fixed ratio schedule.

A

Rule: Every X Rs produce 1 outcome.

Responding: Steady upward responding until reinforcement.
Post-reinforcement pause (flat line): a time-out from responding after each reward. The more Rs required per O, the longer the pause after each reward (often the animal is consuming the reward during the pause).

25
Q

Describe the Variable Ratio schedule.

A

Rule: Every X Rs produce 1 outcome, but X changes with each reinforcer.

Response: Behavior is constant, with a high rate of responding; much faster than on FR. Video games are one example.

26
Q

Describe the Fixed Interval Schedule.

A

Rule: After Y seconds, 1R produces 1O.

Response: Scallop-shaped responding: no responding at the beginning of the interval, then a rapid rate of response just before the interval expires (e.g., watching the clock before an appointment).

27
Q

Describe the Variable Interval Schedule.

A

Rule: After Y seconds, 1R produces 1O, but Y changes after each O.

Response: Behavior is steady, with a low rate of responding (e.g., checking email).
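
The four schedule rules above are mechanical enough to express as a short program. The sketch below is purely illustrative and not from the lecture: the class names, the respond(t) interface (t = the time of a response, in seconds), and the once-per-second usage at the end are all assumptions made for the example. Each schedule is modeled as a procedure that decides, response by response, whether an outcome is delivered.

import random

class FR:
    # Fixed ratio X: every X-th response produces one outcome.
    def __init__(self, x):
        self.x, self.count = x, 0

    def respond(self, t):
        self.count += 1
        if self.count == self.x:
            self.count = 0
            return True  # outcome delivered
        return False

class VR:
    # Variable ratio X: an outcome after a variable number of responses
    # with mean X; the requirement is resampled after each outcome.
    def __init__(self, x):
        self.x, self.count = x, 0
        self.required = random.randint(1, 2 * x - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = random.randint(1, 2 * self.x - 1)
            return True
        return False

class FI:
    # Fixed interval Y: the first response at least Y seconds after the
    # previous outcome produces the next outcome.
    def __init__(self, y):
        self.y, self.ready_at = y, y

    def respond(self, t):
        if t >= self.ready_at:
            self.ready_at = t + self.y
            return True
        return False

class VI:
    # Variable interval Y: like FI, but the delay varies, with mean Y.
    def __init__(self, y):
        self.y = y
        self.ready_at = random.uniform(0, 2 * y)

    def respond(self, t):
        if t >= self.ready_at:
            self.ready_at = t + random.uniform(0, 2 * self.y)
            return True
        return False

# Illustrative usage: one lever press per second for a minute on FR 10.
box = FR(10)
print(sum(box.respond(t) for t in range(60)))  # prints 6

Note that the characteristic response patterns on the cards (post-reinforcement pauses, scallops) come from how animals time their responses, not from these delivery rules themselves.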

28
Q

What is a schedule?

A

A pattern of behavioural contingency (e.g., “if 10 Rs, then O” or “if 10 minutes pass and then 1 R, then O”).

29
Q

What is a concurrent reinforcement schedule?

A

When two or more schedules are presented at the same time (e.g., two levers, one on a VI 2 schedule and one on VI 4).
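
A quick worked example (assuming, purely for illustration, that the VI values are in minutes): the two levers then differ in the maximum rate of reinforcement they can deliver,

\[
r_{\text{VI 2}} = \frac{60 \text{ min/hr}}{2 \text{ min per O}} = 30 \text{ O/hr},
\qquad
r_{\text{VI 4}} = \frac{60 \text{ min/hr}}{4 \text{ min per O}} = 15 \text{ O/hr}.
\]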

30
Q

What is the matching law of choice to behavior?

A

Relative response rates on concurrent VI schedules tend to correspond to the relative rates of reinforcement provided by each schedule.
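
For reference, this is Herrnstein’s matching law, usually written as a relative-rate equation; the worked example below reuses the hypothetical VI 2 and VI 4 levers from the previous card (intervals assumed to be in minutes):

\[
\frac{B_1}{B_1 + B_2} = \frac{r_1}{r_1 + r_2},
\]

where B_i is the response rate on schedule i and r_i is the reinforcement rate that schedule provides. With r_1 = 30 O/hr (VI 2) and r_2 = 15 O/hr (VI 4), matching predicts 30/(30 + 15) = 2/3 of all responses on the VI 2 lever.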