Chapter 6 Flashcards
schedule of reinforcement
a rule determining if a response will be followed by a reinforcer
schedules influence how a response is ______ and ________
learned; maintained
simple schedules definition and 4 types
a single factor determines occurrence of the reinforcer
Ratio (fixed and variable)
Interval (fixed and variable)
Ratio schedule
reinforcement depends upon the number of responses performed/response accumulation
continuous reinforcement
every occurrence of the response is followed by the reinforcer
partial reinforcement
response is reinforced only some of the time
fixed ratio (FR)
a fixed number of responses is required to produce the reinforcer; the ratio of responses to reinforcers is constant
which one produces the most vigorous responding: continuous reinforcement (FR1) or partial reinforcement (FR50)?
FR50
3 FR characteristics
post-reinforcement pause
ratio run
ratio strain
post-reinforcement pause
a pause in responding just after a reinforcer is delivered
ratio run
a high steady rate of responding that completes the ratio (usually between reinforcers)
ratio strain
a rapid increase in the FR requirement results in long pauses before the reinforcer
when the requirement is raised too quickly, the animal stops and takes breaks, usually resuming after a while
the higher the requirement, the more likely you are to see ratio strain
on a variable ratio schedule, reinforcement is predictable: T/F?
False; unpredictable!
Variable ratio (VR)
the number of responses required for reinforcement varies from reinforcer to reinforcer
the average number of responses required = the VR value
characteristics of VR when compared to FR
fewer post-reinforcement pauses
fewer ratio runs
more resistance to ratio strain
interval schedules
a response is reinforced only if it occurs after a set amount of time has elapsed
there is still a response requirement: the reinforcer is not delivered automatically when the time is up
fixed interval schedule (FI) and example
the time between reinforcers is constant
ex. washing clothes in a washing machine: once started, it tells you how long it will take
variable interval (VI) and example
time between reinforcers is variable/not constant
the average time = the VI value; you won't know the average until it has run multiple times
ex. calling to see if your car is fixed; sometimes it takes 45 minutes, other times 3 hours
2 characteristics of FI
responses cluster toward the end of the interval, just before reinforcer delivery, aka the FI scallop
- ex. silly first-years studying the night before the exam
depends upon the ability to perceive time
- ex. visual stimuli increase scalloping; a modern example is Google Calendar
3 characteristics of VI
VI schedules support steady, stable rates of responding
once the interval has passed, the next response is reinforced
limited hold
limited hold
in some instances, a restriction can be placed on the length of time a reinforcer will be available
ex. a surfer waiting for the perfect wave: pass up too many waves holding out for a better one and you may miss your chance to surf!
inter-response time (IRT) and what happens if short vs. long IRTs are reinforced
the interval between successive responses
if short IRTs are reinforced, responding speeds up
if long IRTs are reinforced, responding slows down
response-rate schedules and example
requires a certain number of responses at a specified rate
ex. assembly line:
too fast = piss off others
too slow = shut down line
just right = team player
2 types of response-rate schedules
differential reinforcement of high rates (DRH)
differential reinforcement of low rates (DRL)
differential reinforcement of high rates (DRH) and example
responses are reinforced only if a set number accumulates before a given time
encourages a high rate of responding
ex. on DRH 12, a rat must press the lever 12 or more times per minute in order to be reinforced
differential reinforcement of low rates (DRL) and example
responses are reinforced only if responding is restrained within a given time
encourages a low rate of responding; a look at self-control
ex. on DRL 3, a pigeon must peck 3 or fewer times per minute for reinforcement
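a minimal sketch (function names are hypothetical; criterion values taken from the two cards above) of how DRH and DRL criteria decide whether a response count earns reinforcement:

```python
def reinforced_drh(responses_per_min: int, criterion: int = 12) -> bool:
    # DRH: reinforce only if the count meets or exceeds the criterion
    return responses_per_min >= criterion

def reinforced_drl(responses_per_min: int, criterion: int = 3) -> bool:
    # DRL: reinforce only if the count stays at or below the criterion
    return responses_per_min <= criterion

print(reinforced_drh(14))  # True: the rat pressed 14 >= 12 times/min
print(reinforced_drl(5))   # False: the pigeon pecked 5 > 3 times/min
```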
Spot Check: In research methods you need to answer at least 5 questions and hand them in at the end of each week to get full credit for the class. This is an example of:
a fixed ratio schedule
a variable ratio schedule
a fixed interval schedule
a response-rate schedule
a response-rate schedule, specifically DRH
2 common techniques for studying choice
Skinner box
concurrent schedules
relative rate of responding
abbreviated RR
RRa for key A = Ra / (Ra + Rb)
RRb for key B = Rb / (Ra + Rb)
meaning of RRa > 0.5?
schedule A is preferred over schedule B (more responding on A)
meaning of RRa = 0.5?
no preference; schedules A and B are responded to equally
relative rate of reinforcement
abbreviated rr
calculated the same way as RR:
rra for key A = ra / (ra + rb)
rrb for key B = rb / (ra + rb)
we choose/respond (more/less) to things that are reinforced
more
matching law
relative rate of responding (RR) on a given alternative is approximately equal to the relative rate of reinforcement (rr) earned on that alternative
matching law equation
RRa = rra
equivalently: Ra / Rb = ra / rb
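a small sketch with made-up session counts (not from the cards) showing the relative-rate calculations and the matching-law check:

```python
# Hypothetical counts for two response keys
Ra, Rb = 75, 25    # responses on key A and key B
ra, rb = 30, 10    # reinforcers earned on key A and key B

RRa = Ra / (Ra + Rb)   # relative rate of responding on A: 0.75
rra = ra / (ra + rb)   # relative rate of reinforcement on A: 0.75

# Matching law: RRa should approximately equal rra
print(RRa, rra, abs(RRa - rra) < 0.05)  # 0.75 0.75 True
```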
matching law is affected by 3 variables…
sensitivity (s)
bias (b)
reinforcer value
Sensitivity (s) in matching law and the equation
how closely choice responding tracks the relative rate of reinforcement; a tendency to keep choosing a particular schedule despite lost reinforcement reflects low sensitivity
Ra / Rb = (ra / rb)^s
undermatching in sensitivity
choice responding is less extreme than predicted, s < 1.0
ex. the matching law predicts a 2:1 response ratio, but the observed choice is less extreme than 2:1 (ex. 1:1)
overmatching in sensitivity
choice responding is more extreme than predicted, s > 1.0
ex. the matching law predicts a 2:1 response ratio, but the observed choice is more extreme than 2:1 (ex. 3:1)
bias (b) in matching law and equation
tendencies to certain responses and/or reinforcers (not about schedules!)
if b > 1.0 = more preferred
if b < 1.0 = less preferred
Ra / Rb = b * (ra / rb)^s
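a sketch of the full equation with illustrative parameter values (the function name and numbers are made up) showing how s and b move the predicted response ratio away from strict matching:

```python
def predicted_response_ratio(ra, rb, b=1.0, s=1.0):
    # Generalized matching law: Ra/Rb = b * (ra/rb)^s
    return b * (ra / rb) ** s

# Reinforcement ratio ra:rb of 2:1
print(predicted_response_ratio(2, 1))         # strict matching: 2.0
print(predicted_response_ratio(2, 1, s=0.5))  # undermatching (s < 1): ~1.41
print(predicted_response_ratio(2, 1, s=1.5))  # overmatching (s > 1): ~2.83
print(predicted_response_ratio(2, 1, b=1.2))  # bias toward A (b > 1): 2.4
```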
response vs. reinforcer bias
response: how would you rather respond?
ex. drinks at the bar or drinks at home?
reinforcer: what would you rather receive?
ex. Diet Coke or Dr Pepper?
have to learn biases by testing (ideally) ______ organisms
multiple
reinforcer value in matching law (3 features)
reinforcer features influence rate of responding (R)
ex. amount
ex. palatability
ex. immediacy
basketball matching law example
in basketball, players choose:
- RRa (3-pointers): farther away but worth more points (rra)
- RRb (2-pointers): easier but worth fewer points (rrb)
the proportion of shots taken (RR) was roughly proportional to the shooting percentage (rr) of those shots; teams whose shot choices don't match lose
3 levels of choice
molecular
melioration
molar
levels of choice: molecular
individual responses (choosing A > B)
levels of choice: melioration
we respond to improve local rates of responding
levels of choice: molar
sum of responses (all choices, no matter A or B)
molecular vs molar maximizing
molecular: choosing the response that is best at a single point in time
molar: choosing the response that will maximize reinforcement over the long run
lab examples of molecular and molar maximizing
molecular: what key light does a pigeon choose to peck in a single instant
molar: how many lever presses does a rat make on 2 levers over 3 days
melioration: local rate definition
the response rate computed over only the time a subject spends responding on a particular alternative
ex. a rat presses a lever 60 times in a 60-minute session, but all presses occur in the first 30 min
overall (molar) rate: 1 press/min BUT local rate (over the first 30 min): 2 presses/min
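a quick check of the card's arithmetic (molar vs. local rate), using the numbers from the example:

```python
presses = 60        # total lever presses
session_min = 60    # whole session length
active_min = 30     # time actually spent responding (first 30 min)

overall_rate = presses / session_min  # molar rate: 1.0 press/min
local_rate = presses / active_min     # local rate: 2.0 presses/min
print(overall_rate, local_rate)       # 1.0 2.0
```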
Spot check question:
It's Girl Scout cookie season. We all have our favorites. You prefer Samoas, your friend is a Thin Mints lover. These preferences are examples of response biases.
true or false
false! response bias is about how you would respond (how you would get the cookies), while reinforcer bias is about which reinforcer you would rather receive; cookie preferences are reinforcer biases
concurrent-chain schedule
used to test choice and self-control
two links: a choice link followed by a terminal link
concurrent-chain schedule: terminal link
the second link, which is reinforced; responding here leads to the chosen schedule of reinforcement
concurrent-chain schedule: choice link
the first link; not reinforced
once a choice is made, the subject is committed to that choice
self-control and the test used to measure it
choosing a large delayed reward over a small immediate reward
marshmallow test
in a concurrent-chain schedule, does a pigeon choose a delayed large reward or a small immediate one?
delayed large
how to quantify self-control
value discounting function
value discounting function and equation
the value of the reinforcer is reduced by how long you have to wait for it
V = M / (1 + kD)
V = value of the reinforcer
M = reward magnitude
k = decay parameter; tells you how strongly delay reduces the reinforcer's value
D = reward delay
in the value discounting function: as D increases, the value of the reward …
decreases
In the value discounting function, what happens if D = 0?
V = M; you receive the reward immediately at its full value
consequences of the VDF (value discounting function) and three terms
as reward value decays over time, choice shifts:
T0 (onset): no decay yet; the value of the large reward is greater
T1 (early): the immediate small reward is preferred because the large reward's value decays over its delay, like a "direct choice"
T2 (late): at long delays, the large reward retains relatively more value and is preferred, like a "concurrent-chain schedule"
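a sketch of the discounting function with made-up magnitudes, delays, and k (none are from the cards), illustrating the T1-to-T2 preference shift:

```python
def value(M, D, k=0.2):
    # Value discounting function: V = M / (1 + kD)
    return M / (1 + k * D)

small, large = 2.0, 10.0  # hypothetical reward magnitudes

# T1 (early): small reward now vs. large reward after a 30-unit delay
print(value(small, 0), value(large, 30))   # 2.0 vs ~1.43 -> small preferred

# T2 (late): add a 30-unit wait to both options
print(value(small, 30), value(large, 60))  # ~0.29 vs ~0.77 -> large preferred
```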
the longer we wait = _________ the likelihood of choosing large reward
increases
in the VDF, what does a small/large k signify? what study showed this?
small k = shallow function, more self-control
large k = steep function, less self-control
Madden et al. compared heroin users with a control group to observe self-control; heroin users had a large k (less self-control); it is debated whether this is learned from the environment or a genetic predisposition
can we teach self control?
yes