Chapter 6: Reinforcement and Choice Flashcards

1
Q

Intrinsic reinforcer

A

Reinforcing value is obtained while engaging in the behaviour itself; the activity is intrinsically motivating
ex. social contact, exercise

2
Q

Extrinsic reinforcer

A

Things provided as a consequence of the behaviour to encourage more of that behaviour in the future

ex. reading in children
the only way to teach kids to read is to get them to read, and this usually involves enticing them with social reinforcement (saying “good job”) or other kinds of external reinforcement

3
Q

More reward does not always mean more ____________. Why?

A

Reinforcement

ex. a bonus for making more parts (i.e., over 50): the only difference was between the group that got no bonus and the groups that did. There was no difference among the groups that got different bonus amounts; they were all equally reinforced

4
Q

Aversives can __________ behaviour

A

reinforce

the aversiveness drives the behaviour!

5
Q

Continuous Reinforcement

A

Behaviour is reinforced every time it occurs

6
Q

Ratio Schedules

A

Reinforcer is given after the animal makes the required number of responses

7
Q

Fixed ratio (FR):

A

A fixed ratio between the number of responses made and reinforcers delivered (e.g., FR 10)
• Key elements: post-reinforcement pause, ratio run, and ratio strain (e.g., when jumping from 1 peck to 100 pecks per reinforcer, subjects tend to stop responding)

see graph slide 9

8
Q

Cumulative Record

A

Based on the old cumulative recorder device (constant paper output; the pen steps up with each response)
Shows the rate of responding across time (the steeper the slope, the higher the response rate)!

9
Q

Variable Ratio (VR):

A

A different number of responses is required for the delivery of each reinforcer

The schedule value equals the average number of responses required per reinforcer (e.g., VR 5)

Responding depends on the schedule’s average and its minimum ratio

ex. gambling

see graph slide 19 (steep slope: responding at a high rate, no post-reinforcement pause!)

10
Q

Interval Schedules

A

Responses are reinforced only if they occur after a certain time interval has elapsed.

11
Q

Fixed interval (FI):

A

A response is reinforced only if it occurs after a set amount of time has elapsed (responses during the interval don’t matter)

Key elements: fixed-interval scallop, limited hold

i.e., with FI 10 s the bird has to wait 10 secs; once 10 secs have elapsed, the first peck at the key earns the reward!
ex. cramming before tests

see graph slide 16 (low rates of responding, scalloped responding, post-reinforcement pause!)

12
Q

Variable interval (VI):

A

Responses are reinforced if they occur after a variable interval of time has elapsed

see graph slide 19
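
The four basic schedules reduce to simple decision rules. Below is a minimal Python sketch of those rules; the class names and parameters are illustrative assumptions, not from the slides:

  import random

  class RatioSchedule:
      # FR/VR: reinforce once the required number of responses is made
      def __init__(self, mean, variable=False):
          self.mean, self.variable = mean, variable
          self.required, self.count = self._draw(), 0

      def _draw(self):
          # VR: requirement varies around the mean; FR: fixed requirement
          return random.randint(1, 2 * self.mean - 1) if self.variable else self.mean

      def respond(self):
          self.count += 1
          if self.count >= self.required:
              self.count, self.required = 0, self._draw()
              return True  # reinforcer delivered
          return False

  class IntervalSchedule:
      # FI/VI: reinforce the first response after the interval elapses
      def __init__(self, mean, variable=False):
          self.mean, self.variable = mean, variable
          self.deadline = self._draw()

      def _draw(self):
          return random.uniform(0, 2 * self.mean) if self.variable else self.mean

      def respond(self, t):
          if t >= self.deadline:
              self.deadline = t + self._draw()
              return True  # first response past the deadline is reinforced
          return False

Here RatioSchedule(10) behaves as FR 10, RatioSchedule(5, variable=True) approximates VR 5, IntervalSchedule(10) is FI 10 s, and passing variable=True approximates VI 10 s.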

13
Q

Reynolds 1975

Ratio and Interval Schedules Compared

A

• Compared rates of key pecking of pigeons on VR and VI schedules
• Opportunities for reinforcement were made identical for each bird
• The VI bird could receive reward when the VR bird was within one response of its reward

With an equivalent rate of reinforcement, variable ratio schedules produce a higher rate of responding than variable interval schedules

14
Q

Variable schedules produce _________ responding compared to Fixed

A

Variable schedules produce steadier responding compared to Fixed

fixed schedules produce a post-reinforcement pause

15
Q

Ratio schedules produce ________ of responding than Interval

A

Ratio schedules produce higher rates of responding than Interval

16
Q

Source of Differences Between Ratio and Interval Schedules:

Differential reinforcement of Inter-response times

A

Ratio schedules reinforce shorter IRTs (fast bursts of responding are more likely to be reinforced)

Interval schedules reinforce longer IRTs (after a long pause, the next response is more likely to be reinforced)

17
Q

Source of Differences Between Ratio and Interval Schedules: Feedback function

A

More feedback (reinforcement) comes with more responding on Ratio schedules; not so for Interval schedules (different jobs differ on this aspect)
18
Q

Intermittent Schedules

A

Fewer reinforcers needed
More resistant to extinction

Variable (ratio and interval) schedules are especially resistant to extinction

19
Q

Differential reinforcement of high rates (DRH)

A

A minimum number of responses per interval is required (only high rates of responding are reinforced)

20
Q

Differential reinforcement of low rates (DRL)

A

A maximum number of responses per interval is allowed (only low rates of responding are reinforced)

Used to make a behaviour occur less often without eliminating it (when you don’t hate the behaviour but want less of it)

21
Q

Differential reinforcement of paced rates (DRP)

A

Responding is reinforced only if its rate falls within a range: above a minimum per interval (as in DRH) and below a maximum per interval (as in DRL)
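
DRH, DRL, and DRP share a single rate check; a minimal Python sketch with a hypothetical function name:

  import math

  def paced_reinforce(responses_in_interval, min_count=0, max_count=math.inf):
      # DRH: set only min_count; DRL: set only max_count; DRP: set both
      return min_count <= responses_in_interval <= max_count

e.g., paced_reinforce(12, min_count=10) implements a DRH check, and paced_reinforce(2, max_count=3) a DRL check.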

22
Q

Duration Schedules:

A

A response must be made continuously for a period of time

23
Q

Complex schedules:

A

Conjunctive schedules, Adjusting schedules, Chained schedules…

24
Q

Noncontingent Schedules: Fixed time (FT)

A

Reinforcer occurs following a predictable amount of time, regardless of behaviour

25
Q

Choice

A

Usually considered a cognitive deliberation
- Here, measured by the effects of different, concurrent payoff schedules

With a true and fickle “freedom of choice”, choices would be unpredictable
- Understanding choices in terms of their consequences allows for prediction

(measured with concurrent schedules!)

26
Q

Herrnstein, 1961

matching law

A

Would the animal figure out how to distribute its responding based on how much food each option delivers?

Measurement of choice using concurrent schedules

27
Q

Measures of choice

A
  1. Relative rate of responding (behaviour): BL/(BL + BR)

BL = rate of responding to the left choice
BR = rate of responding to the right choice
(BL + BR) = total responding

  2. Relative rate of reinforcement: RL/(RL + RR)

RL = rate of reinforcement for the left choice
RR = rate of reinforcement for the right choice
28
Q

Matching law

A

Herrnstein, 1961:
The proportion of responding (choice) is equal to the proportion of reinforcement for doing so.

There is a correlation between behaviour and the environment.

Setting the first measure of choice equal to the second (they are proportional) gives the matching law:

Relative rates of responding match relative rates of reinforcement

BL/(BL + BR) = RL/(RL + RR)
OR
BL/BR = RL/RR
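
A minimal worked example in Python (the rates are made-up numbers for illustration) checking the matching relation:

  # Observed rates: pecks/min on the left (B_L) and right (B_R) keys,
  # and reinforcers/hour earned on each side (R_L, R_R)
  B_L, B_R = 60.0, 20.0
  R_L, R_R = 45.0, 15.0

  rel_responding = B_L / (B_L + B_R)        # 0.75
  rel_reinforcement = R_L / (R_L + R_R)     # 0.75

  # Matching law: the two proportions are (approximately) equal
  print(abs(rel_responding - rel_reinforcement) < 1e-9)  # True: perfect matching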

29
Q

Basketball matching

A

26 players on a large university basketball team
Relative choice of different shot types = relative rate of reinforcement (baskets made)

Approximates a VR schedule, because you may need to make some shots before taking a 3-point one

30
Q

BL/BR = b(rL/rR)^s

real matching law

A

The “real” (generalized) matching law
b = bias
s = sensitivity

• Perfect matching: s = 1
• Undermatching: s < 1
• Overmatching: s > 1
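
A minimal sketch of the generalized law’s prediction (the parameter values are assumptions for illustration):

  def predicted_response_ratio(r_left, r_right, b=1.0, s=1.0):
      # Generalized matching law: B_L/B_R = b * (r_L/r_R)**s
      return b * (r_left / r_right) ** s

  # With a 4:1 reinforcement ratio:
  print(predicted_response_ratio(40, 10))          # 4.0  (perfect matching, s = 1)
  print(predicted_response_ratio(40, 10, s=0.5))   # 2.0  (undermatching: less extreme)
  print(predicted_response_ratio(40, 10, s=2.0))   # 16.0 (overmatching: more extreme)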

31
Q

Undermatching

A

s < 1
– Decreased sensitivity to rates of reinforcement
– Responding is less extreme than the reinforcement ratios predict

32
Q

Overmatching

A

s > 1
– Increased sensitivity to rates of reinforcement
– “Stick to the best option”
– Common with high cost of switching

The animal responds mostly to the best option; common when switching is costly (e.g., a long changeover delay). It will rarely sample other options that pay off at lower rates.

33
Q

Response Bias

A

Important when there is a difference between operant behaviours
- Commonly: side bias

Important when there is a choice between reinforcers or responses
- Biological predispositions
- Quality

34
Q

Matching Law and Simple Schedules

A

Rate of the operant response (Bx) and rate of other (Bo) activities:

Bx/(Bx + Bo) = rx/(rx + ro)

35
Q

Matching law describes ________, but does not ________

A

Matching Law describes the behaviour, but does not explain it

36
Q

Maximizing theories

A

Organisms distribute their behaviour so as to obtain the maximum amount of reinforcement over time
Explains ratio schedule choice
Doesn’t always hold

37
Q

Melioration theories

A

Making the situation “better” than the recent past

Change from one alternative to the next to improve the local rate of reinforcement

Animals respond so that the local rate is the same on each alternative

Predicted issue: Behaviour is strongly controlled by immediate consequences

38
Q

Serial Reversal Learning

A

Over many reversals, re-acquisition speeds up

Compare with initial acquisition and number of reversals
- Behavioural Flexibility

39
Q

Mid-session Reversal

A

With a reversal partway through the session:

  • Perseverative Errors
  • Anticipatory Errors
  • Errors shift with changes in time
  • Pigeons predict reversal based on time
  • Self-Control?
40
Q

Concurrent Chain Schedules

A

• Method to determine choice
– e.g., whether variety is preferred
• Different from concurrent schedules since animals are not free to switch
• Able to investigate choice with commitment

41
Q

Self-Control

A

Commonly used as “willpower”

  • Circular logic
  • Describes outcome, not process

Better described as:
- Choice of impulsive vs. delayed options

42
Q

Temporal Self-Control

A

Choose: Smaller-Sooner reward (SS) vs. Larger-Later reward (LL)
• Self-control vs. impulsivity

43
Q

Waiting in Animals

A

Different species tolerate delays differently, e.g., the time they will wait for a threefold increase in reward

  • Temporal Self-Control
  • chimps are very good at this!
44
Q

Waiting in Humans (Rosati et al. (2007))

A
  • Temporal Self-Control
  • chimps are very good at this!
  • Chimps are better than humans (except when the reward is money)
45
Q

Rachlin & Green (1972)

A

• In Phase 1, pigeons chose the small reward when there was no delay
• In Phase 2, pigeons chose the large, delayed reward when T (the time between the initial and terminal phases) was increased

46
Q

Delay-discounting

A

Value discounting function: value (V) is directly related to magnitude (M) and inversely related to delay (D), or
V = M/(1 + KD)

V = value of a reinforcer
M = magnitude
K = discounting parameter
D = delay
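
A minimal worked example in Python (K is an assumed value for illustration):

  def discounted_value(magnitude, delay, k=0.1):
      # Hyperbolic discounting: V = M / (1 + K*D)
      return magnitude / (1 + k * delay)

  # A larger-later reward can be worth less *now* than a smaller-sooner one:
  print(discounted_value(10, delay=0))    # 10.0 (SS: 10 units, immediate)
  print(discounted_value(30, delay=60))   # ~4.3 (LL: 30 units, 60 s delay)

This is why steep discounting (a large K) favours the smaller-sooner option.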

47
Q

Madden et al. 1997

A

Opioid-addicted participants steeply discount delayed money and (especially) heroin

Delay discounting is the tendency to give greater value to rewards as they move away from the temporal horizon and towards the “now”.

48
Q

Small-But-Cumulative Effects

A

Malott (1989)
Each choice of an SS reward over an LL reward has only a small effect
- Builds over time
- Difficulty in impulse control
- Establishing rules for acceptable vs. unacceptable behaviour
- Relapse handling: Dealing with steady stream of temptations

49
Q

Long-term Effects

delayed gratification

A

Mischel: Delayed Gratification
Eating the first marshmallow / less-preferred food correlated longitudinally with:
- Lower SAT scores
- Less educational and professional achievement
- Higher rates of drug use
50
Q

Simple Self-Control Methods (Skinner)

A

Physical restraint
Deprivation/Satiation
Distraction
DRO (differential reinforcement of other behaviour)
Self-Reinforcement
Self-Punishment
Shaping
51
Q

Clinical implications of self-control issues

A

ADHD, Predominantly Hyperactive-Impulsive Type
Substance abuse disorders
Impulsive overeating
Other impulse-control disorders
Pathological gambling