Schedules of Reinforcement and Choice Behaviours Flashcards

1
Q

schedules of reinforcement

A

indicates what has to be done for the reinforcer to be delivered; specifies which occurrence of the instrumental response is followed by the reinforcer
- reinforcer delivery can depend on
1. presence of certain stimuli
2. passage of time
3. number of responses
4. etc.
- produce predictable patterns of B
- influences how instrumental responses are learned/maintained by reinforcement

2
Q

Why are schedules of reinforcement important?

A

determine:
- rate of instrumental behavior
- pattern of instrumental behavior
- persistence of instrumental behavior

3
Q

schedules of reinforcement: schedule effects

A
  • highly relevant to motivation of B
  • whether a person is industrious or lazy has little to do with personality
  • it has more to do with the reinforcement schedule in effect
4
Q

in the real world, instrumental responses _____ get reinforced each time they occur.
what is the name of this concept?

A

rarely
intermittent schedules of reinforcement

5
Q

simple schedule of reinforcement

A
  • single factor determines which occurrence of the instrumental response is reinforced
  • e.g. how many responses have occurred
  • e.g. how much time has passed before the target response can be reinforced
6
Q

schedules of reinforcement: ratio

A
  • depend only on the number of responses; time is irrelevant
  • reinforcement depends only on the number of responses the subject has to perform
  • reinforcer is delivered each time the set number of responses is reached
7
Q

ratio schedules: CRF (continuous reinforcement)

A
  • each response results in delivery of the reinforcer
  • often part of contingency management programs in drug addiction rehab
  • e.g. clean urine = money reward
  • e.g. entering the correct ATM PIN = lets you withdraw cash
  • this is the only schedule where reinforcement is NOT intermittent
8
Q

ratio schedules: partial/intermittent

A
  • responding is reinforced only some of the time
  • e.g. enter the correct ATM PIN, BUT receive an “out of order” message
9
Q

cumulative record

A
  • way of representing how a response is repeated over time
  • shows total (cumulative) number of responses that have occurred up to a particular point in time
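
Not from the lecture, but a minimal sketch of how a cumulative record could be computed and plotted from a list of response times (all names are illustrative). The slope at any point is the response rate: steep segments mean fast responding, flat segments mean pauses.

```python
import matplotlib.pyplot as plt

def cumulative_record(response_times):
    """Pair each response time with the cumulative count of responses so far."""
    counts = list(range(1, len(response_times) + 1))
    return response_times, counts

# Hypothetical FR-style data: a run of responses, a pause, then another run.
times = [5.0, 5.5, 6.0, 6.5, 7.0, 15.0, 15.4, 15.8, 16.2, 16.6]
x, y = cumulative_record(times)
plt.step(x, y, where="post")   # step plot mimics a cumulative recorder
plt.xlabel("time (s)")
plt.ylabel("cumulative responses")
plt.show()
```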
10
Q

ratio run

A

high and steady rate of responding that completes each ratio requirement

11
Q

ratio strain

A
  • if the ratio requirement is suddenly increased (e.g., from FR 120 to FR 500), the animal is likely to pause periodically before completing the ratio requirement
  • in extreme cases: ratio strain may be so great that the animal stops responding altogether
12
Q

avoiding ratio strain

A

must be careful not to raise the ratio requirement too quickly when working toward the desired FR response requirement

13
Q

what is the best FR ratio when strengthening a new response?

A

CRF - FR1

14
Q

disadvantages of FR1

A
  • satiety and reduced effort due to it being so easy
  • time + resource consuming
15
Q

what is the best approach with fixed-ratio schedules when training a B?

A
  • moving from a low ratio requirement (a dense schedule) to a high ratio requirement (a lean schedule).
  • should be done gradually to avoid “ratio strain” or burnout.
16
Q

At higher ratios, you can _________ the response to a higher/faster rate.

A

At higher ratios, you can increase the response to a higher/faster rate.

17
Q

fixed-ratio schedule

A
  • reinforcer earned at a specific, predictable response instance in a sequence of responses
  • e.g. 10 responses per reinforcer = FR 10
  • e.g. dialing a complete cell number (responses) = FR; reaching the person = reinforcer
  • e.g. being paid per item manufactured in a factory
  • e.g. delivering a quota of 50 flyers (responses); being paid (reinforcer) = FR 50
18
Q

fixed-ratio schedule: cumulative record

A
  • total number of responses that have occurred up to a particular point in time or within a specific interval
  • complete visual record of when and how often the subject responded during a session
19
Q

fixed-ratio schedule: post-reinforcement pause

A
  • zero rate of responding that typically occurs just after reinforcement on FR schedules
  • controlled by the upcoming ratio requirement (number of responses)
  • should really be called a pre-ratio pause: when you see an intimidating task ahead, you pause
20
Q

variable ratio (VR)

A
  • unpredictable amount of effort (number of responses) required to earn the reinforcer (see the sketch below)
  • e.g. pigeon must make 10 responses in trial 1, 13 in trial 2, 7 in trial 3
  • predictable pauses in the rate of responding are less likely under VR than under FR
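
As an illustration only (not from the lecture), the difference between FR and VR can be reduced to how the next response requirement is generated: fixed on FR, unpredictable but averaging to the nominal value on VR.

```python
import random

def next_requirement(n, variable=False):
    """Responses required for the next reinforcer: exactly n on FR n;
    on VR n, a random count between 1 and 2n-1 whose mean is n."""
    return random.randint(1, 2 * n - 1) if variable else n

print([next_requirement(10) for _ in range(3)])        # FR 10 -> [10, 10, 10]
print([next_requirement(10, True) for _ in range(3)])  # VR 10 -> e.g. [13, 4, 17]
```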
21
Q

Real life examples of VR?

A
  • gambling
  • fishing
22
Q

FR + VR responding rates are similar…

A
  • … provided similar numbers of responses are required
  • … but generally, FR responding shows a pause-and-run pattern, while VR schedules produce steady responding
23
Q

fixed-interval schedule (FI)

A
  • a fixed amount of time must pass before the reinforcer becomes available
  • the interval is constant from one trial to the next
  • e.g. pigeons are only reinforced for pecking after 4 minutes; the pigeon learns to wait after reinforcement, INCREASING its response rate toward the end of each FI
  • addition of a timing cue increases the duration of the post-reinforcement pause, BUT shifts responding closer to the end of the FI
  • the interval determines only when the reinforcer becomes available
  • once it is available, the subject must still make the instrumental response to obtain it
24
Q

variable-interval schedules

A
  • time required to set up the reinforcer = unpredictable
  • varies from one trial to the next, unlike FI
  • subject still has to respond to obtain the set-up reinforcer (see the sketch below)
  • maintains steady + stable rates of responding without regular pauses (unlike FR and FI)
  • found in situations where an unpredictable amount of time is required to prepare the reinforcer
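
A small sketch of interval-schedule logic, under my own simplifying assumptions (VI waits drawn from an exponential distribution): the reinforcer is set up once the interval elapses, but delivery still requires a response.

```python
import random

def interval_schedule(response_times, interval, variable=False):
    """Count reinforcers earned on an FI (fixed wait) or VI (random wait,
    mean = interval) schedule; the first response after set-up collects it."""
    def wait():
        return random.expovariate(1 / interval) if variable else interval
    earned, next_setup = 0, wait()
    for t in sorted(response_times):
        if t >= next_setup:          # reinforcer already set up: this response is reinforced
            earned += 1
            next_setup = t + wait()  # timing restarts after each delivery
    return earned

responses = [i * 2.0 for i in range(300)]  # one response every 2 s for 10 min
print("FI 60 s:", interval_schedule(responses, 60))        # 9 reinforcers
print("VI 60 s:", interval_schedule(responses, 60, True))  # ~10 on average
```

Note that responding much faster than once per 2 s would earn almost no extra reinforcers here, which previews the ratio-vs-interval contrast below.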
25
Q

ratio vs interval schedules: FR + FI BOTH….

A
  • have post-reinforcement pause after each reinforcer
  • produce high rates of responding just before next reinforcer delivery
26
Q

ratio vs interval schedules: VR + VI BOTH….

A

have steady rates of responding without predictable pauses

27
Q

ratio vs interval schedules: VR + VI differences

A

motivate B differently:
- VR induces more responses and motivates the most vigorous instrumental B

28
Q

Why does VR induce a higher rate of responses?

A
  • due to the reinforcement of short inter-response times (IRTs) and the relationship between response rate and reinforcement rate
  • the faster the organism completes the VR requirement, the sooner it is reinforced
  • VI schedules instead favour waiting longer between responses
  • frequent responses made before the food is set up (short IRTs) are not reinforced; reinforcement is more likely after the interval has timed out
29
Q

Would you study more for fixed exams or pop quizzes?

A

pop quizzes (VI) rather than fixed exams (FI): unpredictable quizzes sustain steady studying, while a fixed exam date produces a long pause followed by cramming just before the exam

30
Q

Why have so many different schedules?

A

different schedules support different learning/training techniques

31
Q

reasons for high VR vs VI rates

A
  • reinforcement = consequence of responding
  • the faster a ratio schedule is completed, the sooner the next reinforcer is obtained
  • in VI, e.g. a VI 2-min schedule, even if the organism obtains every reinforcer, there is still a limit to the number of reinforcers it can obtain in a given amount of time (at most ~30/hour)
32
Q

feedback function

A
  • the relationship between response rates and reinforcement rates calculated over an entire experimental session or an extended period of time
  • reinforcement is considered to be the feedback or consequence of responding
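
To make the contrast concrete, here is a common idealization (my notation, not the lecture's): with response rate B, a ratio schedule's feedback function is linear, while an interval schedule's is capped by the programmed interval t.

```latex
R_{\mathrm{FR}\,n}(B) = \frac{B}{n}
\qquad\qquad
R_{\mathrm{VI}\,t}(B) \approx \min\!\left(B,\ \frac{1}{t}\right)
```

So on a VI 2-min schedule the reinforcement rate can never exceed ~30/hour no matter how fast the subject responds, whereas on FR 10 doubling the response rate doubles the reinforcement rate, which is why ratio schedules motivate faster responding.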
33
Q

choice behaviour - concurrent schedule

A
  • two or more response options/reinforcers available at the same time
  • allows continuous measurement of choice because the subject is free to switch back and forth between alternatives
  • e.g. a row of slot machines
  • used to investigate the mechanisms of choice
  • often used in laboratory settings
34
Q

measuring choice behaviour

A
  • calculated as the relative rate of responding on each alternative
  • if both alternatives are on the same VI schedule, the relative rate of reinforcement is the same for each
  • in that case the relative rate of responding equals the relative rate of reinforcement: responses match reinforcers
35
Q

matching law

A
  • relative rate of responding = relative rate of reinforcement (see the equation below)
  • even if 2 alternative responses are not reinforced according to the same schedule, the relative rate of responding will still track the relative rate of reinforcement
  • choice = not random
  • whether a B occurs frequently or not depends on:
    1. its schedule of reinforcement
    2. availability of alternative sources of reinforcement
  • rates of response/reinforcement are averaged over the duration of the experiment
  • slide 14
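
In symbols (the classic form, with B = response rate and r = reinforcement rate on each alternative):

```latex
\frac{B_1}{B_1 + B_2} = \frac{r_1}{r_1 + r_2}
\qquad\Longleftrightarrow\qquad
\frac{B_1}{B_2} = \frac{r_1}{r_2}
```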
36
Q

“…even if 2 alternative responses are not reinforced according to the same schedule, the relative rate of responding will still track the relative rate of reinforcement…” example

A
  • if situation 1 (FR 5) requires 5 responses to get 1 reinforcer, and situation 2 (FR 10) requires 10 responses for 1 reinforcer…
  • they will still have similar relative rates of responding
  • reinforcement in situation 1 will simply occur twice as often as in situation 2
37
Q

differences in the matching of response rates

A

can be accounted for by the generalized form of the matching law (see the equation below), which adds two parameters:
1. response bias
- occurs when response alternatives require different amounts of effort
- occurs if the reinforcer for one response is more attractive
2. sensitivity
- sensitivity of choice B to the relative rates of reinforcement for the response alternatives
- undermatching: response ratio lower than reinforcement ratio
- overmatching: response ratio higher than reinforcement ratio
https://www.youtube.com/watch?v=kzrj0CSTq3Q&ab_channel=BrettDiNoviBehavioralKarma
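
Both parameters appear in the standard log-ratio form of the generalized matching law (b = bias, s = sensitivity; s = 1 and b = 1 recover perfect matching):

```latex
\log\frac{B_1}{B_2} = s\,\log\frac{r_1}{r_2} + \log b
```

Undermatching corresponds to s < 1, overmatching to s > 1.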

38
Q

molar theories

A
  • refer to notebook
39
Q

molecular theories

A
  • refer to notebook
40
Q

melioration

A
  • literally, “to make something better”
  • operates on a time scale between molar and molecular theories
  • focuses on local rates of responding + reinforcement (based on the time period that a subject devotes to a choice alternative)
  • predicts subjects will shift their response choice toward the alternative that provides the highest local rate of reinforcement
  • involves the assumption of melioration
41
Q

assumption of melioration

A
  • adjustments in the distribution of B between choice alternatives will continue until responses produce the same local rate of reinforcement on each alternative
  • this process yields the matching law
  • considered a 3rd mechanism of choice
42
Q

concurrent chain schedules

A

in a standard concurrent schedule of reinforcement:
- 2+ response alternatives are available
- switching can occur at any time
in complex B, however, once a choice is made, the remaining options are limited

43
Q

concurrent chain schedules: studying how choices involve commitment to one choice

A

stage 1: choice link
- subject chooses between 2 schedules
- either A or B
stage 2: terminal link
- opportunity occurs after the initial choice
- subject is “chained” to the chosen schedule until the end of the trial
refer to drawing in notebook

44
Q

self-control

A
  • choosing a large delayed reward over an immediate smaller reward
  • why is it hard to work for large but delayed rewards?
45
Q

delay discounting

A
  • value of a reinforcer declines as a function of how long you have to wait to obtain it (see the equation below)
  • would you rather have $50 now or next week?
  • value of a reinforcer is also directly related to reward magnitude
  • slide 24
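
The standard hyperbolic form of delay discounting makes this concrete (V = discounted value, A = reward amount, D = delay, k = discounting rate; the k value below is purely illustrative):

```latex
V = \frac{A}{1 + kD}
```

e.g., with a hypothetical k = 0.1/day, $50 delayed by one week is worth 50 / (1 + 0.1 × 7) ≈ $29, so an immediate $30 would be preferred.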
46
Q

training self-control

A

shaping
- training toward preference for the larger delayed reward, where the delay is increased gradually in steps
introducing a distracting task during the delay period may also increase self-control
- i.e., increase preference for the delayed larger reward

47
Q

fMRI

A

imaging method
- measures the haemodynamic response of the brain to neural activity
- more blood flow to a specific region indicates more neural activity there
oxygen is delivered to neurons by hemoglobin
- hemoglobin = oxygen transport protein in red blood cells
- when neuronal activity increases, demand for oxygen increases
most common fMRI method = BOLD (blood oxygenation level dependent) signal
- BOLD corresponds mainly to the concentration of deoxyhemoglobin
- an increase in blood flow produces an increase in the ratio of oxygenated to deoxygenated hemoglobin in an active area
- this allows fMRI to produce an effective map of which brain regions are active
- refer to slide 32

48
Q

reward and movement: basolateral amygdala

A

a) provides an index of reward magnitude and valence
- positive valence = pleasant emotional stimulus
- negative valence = unpleasant emotional stimulus
b) can motivate B and reinforce new learning
- drug-paired cues can increase drug-taking; this can be eliminated by lesioning the BLA
- lesioning the BLA also disrupts escape from a fear-eliciting cue
- BUT lesioning the BLA has no effect on reward devaluation

49
Q

biological view of reward: striatum

A
  • caudate, putamen, nucleus accumbens
  • modifies B by integrating positive/negative outcomes via 2 pathways, both eventually projecting to the thalamus
    1. direct pathway (“go”; slides 35-36):
  • disinhibits thalamic nuclei (inhibition of inhibitory regions)
  • activation of the direct pathway results in more output from the thalamus (because it is disinhibited)
    2. indirect pathway (“no go”):
  • acts to suppress B
    refer to diagram in notebook
50
Q

rewards: rats vs humans

A

rats
- DA microinjection/electrical stimulation of the striatum = highly rewarding
- striatal lesions diminish responding and increase sensitivity to reward devaluation
humans
- disruption of DA input from the substantia nigra results in movement abnormalities (Parkinson’s)

51
Q

reward: OFC

A
  • orbitofrontal cortex
  • part of the PFC; anatomically connected to the amygdala + striatum, connecting it to reward
  • implicated in executive control; weighs the relative value of each ALTERNATIVE CHOICE
  • rewarded actions = greater activation of the medial OFC
  • response inhibition = greater activation of the lateral OFC
52
Q

damage to OFC: effects

A
  • the value of a predicted outcome cannot be used to guide B
  • difficulty incorporating negative feedback from previous B to guide future B (“what did I do wrong?”)
  • B of such people = controlled by the impulsive amygdala
53
Q

reward: instrumental learning

A
  1. lower-level subcortical structures
    - includes amygdala + striatum
    - learning occurs in an incremental fashion
    - guided by predictability and error signals
  2. OFC
    - capable of biasing/changing B based on new info
    - weighs the benefit of delayed rewards + sets goals
    - allows delayed gratification in order to pick the larger reward
    - in rats, greater activation of lower-level limbic areas occurs with immediate reward selection
54
Q

addictive B

A
  • choosing an immediate outcome/reward despite knowledge of long-term negative effects
  • opioid addicts undervalue delayed rewards
  • may be due to an overactive impulsive system (guided by amygdala and striatum)
  • drug abuse artificially changes the reward system, biasing it toward impulsivity