Lecture 1: Increasing Behaviour Flashcards
Define Reinforcement
An increase in the future probability of
a behavior as a result of a contingent
consequence.
The relation between a behaviour and
a consequence. The future rate of
behaviour has to increase in order for
it to be called a reinforcer.
What is Thorndike’s Law of Effect?
Thorndike studied learning in animals
(usually cats). He devised a classic
experiment in which he used a puzzle
box to empirically test the laws of
learning.
He placed a cat in the puzzle box,
which was encouraged to escape to
reach a scrap of fish placed outside.
Thorndike would put a cat into the box
and time how long it took to escape.
The cats experimented with different
ways to escape the puzzle box and
reach the fish.
Eventually they would stumble upon
the lever which opened the cage.
When it had escaped it was put in
again, and once more the time it took
to escape was noted. In successive
trials the cats would learn that
pressing the lever would have
favorable consequences and they
would adopt this behavior, becoming
increasingly quick at pressing the
lever.
Edward Thorndike put forward a “Law
of effect” which stated that any
behavior that is followed by pleasant
consequences is likely to be repeated,
and any behavior followed by
unpleasant consequences is likely to
be stopped.
The graph illustrates that it took cats
40 trials for them to learn which levers
to push to get out of the box quickly.
An acquisition curve or learning effect is
shown in the graph as cats continue to
engage in previously reinforced
behaviours.
Successful behaviours at opening the
box are stamped in and unsuccessful
behaviours are stamped out – law of
effect. Thus, learning is a
gradual/incremental process that is
not smooth (ups and downs).
The law of effect depicts trial and error
learning and suggests that learning is a
gradual process.
What is insight learning?
A different type of learning.
Look at situation, analyse it cognitively,
and find a solution.
Chimps were placed in a room with a
banana at the top of a rope they couldn’t
reach. They moved a box to reach it.
This is learning without reinforcement.
It is planning and learning without trial
and error.
However, the hypothesis was flawed
and chimps did use trial and error
learning (i.e., there is trial and error
learning in insight learning). For
example, they tried jumping, found it
didn’t work, and then tried another
solution.
Insight learning for reading would be
to immerse them in and expose them to
rich linguistic behaviours.
Wolfgang Kohler’s experiments
suggest that learning occurs via insight
or a sudden realisation as to the
solution of a problem that is arrived by
a reasoning process.
These two theories are not mutually
exclusive. You can learn with both.
Insight and trial and error may be
relevant: students may require time to
learn but others may get it right away.
Perhaps insight learning is better for
some conditions and trial and error
learning is better for others?
Is reinforcement or punishment better?
It is far better to build in reinforcement for appropriate behaviour than to wait for a child to screw up and punish them.
(4) Types of Reinforcement
Natural & Contrived
Reinforcers that occur naturally in the environment, such as praise (natural), or that are manipulated/organized by the teacher, such as providing a break when students successfully finish a task or monetary rewards (contrived).
In ABA we look to place behaviour under the control of natural reinforcers to promote maintenance and generalization.
Primary & Secondary/Conditioned
Reinforcers with biological significance such as food, drinks, shelter, or warmth (primary); these can be natural or contrived. Reinforcers whose value has to be learned, such as praise, money, or special privileges (secondary).
The distinction between them is arbitrary and depends on the individual’s preferences about what is considered a primary or secondary reinforcer.
Positive & Negative
Addition of a stimulus such as praise, a token, or a reward (positive) or removal of an aversive stimulus, e.g., using an umbrella in the rain, applying cream to remove an itch, or doing a behaviour to stop a parent yelling (negative).
Both increase future rate of behaviour.
Theoretically this distinction is also arbitrary: is eating adding a positive stimulus or removing hunger? Is using an umbrella when it is raining introducing dryness or removing rain? Is putting on a jacket removing cold or adding warmth? The distinction has applied significance. Positive reinforcement – adding a positive stimulus; negative reinforcement – removing a negative stimulus.
Immediate & Delayed
Usually in seconds
In ABA we aim to teach students to tolerate an acceptable amount of delay to receive reinforcement.
Learning occurs quicker when behaviour is immediately reinforced and if the delay is too long learning may not occur.
(4) Ways to Identify Reinforcers
Indirect survey
o parent, teacher, or child can fill out
a survey about what students like –
positive reinforcers and dislike –
negative reinforcers
o may not actually reflect how students
will respond (people can’t predict
child behaviour accurately and saying
is different to doing)
Free operant assessment
o Direct
o Observe what child does in free time
to identify potential reinforcers
o More time – reinforcer (if you do
behaviour x, you can have the
opportunity to engage in a preferred
activity)
o Less time – not reinforcer
o It requires teachers to be able to
restrict access to the activity to be able
to use it contingently to reinforce
appropriate behaviours
Direct preference assessment using
choice methodology
o Single (successive) item presentation
(stimuli presented one at a time)
o Simultaneous/multiple stimulus
presentation with or without
replacement
Reinforcer sampling
o Providing potential reinforcer
noncontingently for child to see if they
like it. If they do, then you can provide
it contingently to an appropriate
response (i.e., music, food, or activity).
o Give them a little taste or sample of the
reinforcer.
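A minimal sketch of how free operant assessment data could be
summarised, assuming the observer logs each activity with the
seconds spent on it. The function name, the log format, and the
activities are hypothetical and not from the lecture.

```python
from collections import defaultdict


def rank_by_time(observations):
    """Total the seconds spent on each activity and rank from most to least."""
    totals = defaultdict(int)
    for activity, seconds in observations:
        totals[activity] += seconds
    # Sort by total time (descending); ties are broken alphabetically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical free-time log: (activity, seconds observed).
free_time_log = [
    ("blocks", 120),
    ("tablet", 300),
    ("books", 45),
    ("tablet", 180),
    ("blocks", 60),
]

ranking = rank_by_time(free_time_log)
for activity, seconds in ranking:
    print(activity, seconds)
```

Activities near the top of the ranking are candidates to test as
reinforcers; the ranking alone does not show that they will
increase behaviour.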
Steps for Identifying Reinforcers
Consider age, interests
(developmentally and age appropriate)
Magnitude and type of reinforcer
should match response effort (how
much work needed to get how much
reinforcer; need to be proportional)
Create a menu of varied reinforcers &
allow choice (avoid satiation)
Use the Premack Principle (make high
rate behaviour contingent on low rate
behaviour)
Ask the person
Try sampling to introduce novelty
Aim to remove contrived reinforcers
Follow the child’s momentary
interests, follow the child’s lead
Collect data
To maximize reinforcing effects consider
Contingency (ensure reinforcer can
only be obtained after producing the
desired response)
Immediacy (more immediate the
better)
Magnitude (higher magnitude the
better; proportional to effort of
behaviour)
Ratio (schedule of reinforcement;
denser the better to learn behaviour
and then thinned to maintain it)
(8) Schedules of Reinforcement
Each schedule of reinforcement
produces a distinct pattern of
responding.
Continuous (CRF)
o Better for learning new behaviour.
o Each behaviour is reinforced.
o It leads to satiation (stop responding
when reinforcer loses value)
o It does not reflect the frequency of
natural reinforcers.
o Start with CRF and then thin schedule
of reinforcement.
Fixed ratio (FR)
o A fixed number of responses must be
made to gain reinforcer (e.g., 10 math
problems)
o Produces steady rate of responding
until they gain reinforcer, and they
pause until they start responding
again to gain a reinforcer. Longer
pauses occur when more responses
are needed to gain reinforcement.
o For example, procrastinating on writing
a large assignment. If we break it into
smaller sections we are more likely to
start writing our assignment earlier.
Variable ratio
o Roughly every _ number of responses
leads to reinforcement
o Gambling leads to consistent and
steady rate of responding and
addiction.
o Steepest line with higher rate of
responding and consistent
performance.
Fixed interval
o A fixed length of time to elapse before
behaviour can earn reinforcer (i.e., 1
question per minute).
o Scallop pattern when students pause
and a rapid increase in the rate of
responding at the end of the interval
to gain reinforcement and pause
again.
Variable interval
o Roughly __ length of time elapses
before behaviour earns a reinforcer
(i.e., on average one question per
minute).
o A steady rate of performance (not as
steep as VR) but is steady and
consistent.
Fixed duration
o Perform the behaviour for a fixed
period of time to earn reinforcer (i.e.,
read textbook for an hour)
Variable duration
o Perform the behaviour on average __
length of time to earn a reinforcer (i.e.,
on average reading the textbook for
an hour).
Extinction
o No longer reinforcing a previously
reinforced behaviour to reduce rate of
behaviour until it stops.
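The ratio and interval rules above can be sketched as small
decision functions that answer "is this response reinforced?".
This is an illustration only. The function names are hypothetical,
and drawing the variable ratio requirement uniformly from
1 to 2 x mean - 1 is an assumption chosen so that the average
equals the mean.

```python
import random


def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR: reinforce after a varying number of responses averaging `mean`."""
    # Assumption: requirement drawn uniformly from 1..(2 * mean - 1).
    required = rng.randint(1, 2 * mean - 1)
    count = 0

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response made once `seconds` have elapsed."""
    last_reinforced = 0.0

    def respond(now):
        nonlocal last_reinforced
        if now - last_reinforced >= seconds:
            last_reinforced = now
            return True
        return False

    return respond


crf = fixed_ratio(1)    # continuous reinforcement behaves like FR-1
fr10 = fixed_ratio(10)  # e.g., 10 math problems per reinforcer
print([fr10() for _ in range(10)])  # only the 10th response is reinforced
```

Thinning a schedule then amounts to swapping fixed_ratio(1) for a
larger ratio once the behaviour is established.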
Two Examples of Negative Reinforcement
Example 1:
o Aversive Stimulus = Sun strike, Glare
o Response = Sunglasses on, Visor down
o Negative reinforcer = Glare is reduced
and you can also see better
Example 2:
o Aversive Stimulus = Raining
o Response = you put on your raincoat
and use an umbrella
o Negative reinforcer = Stay dry
What is an extinction burst?
A problem that arises when
reinforcement is withheld from a
previously reinforced behaviour.
A temporary increase in behaviour
following extinction before the rate
lowers back to baseline (increased
intensity of behaviour; it gets worse
before it gets better; it’s how you know
extinction is working).
What is a Token Economy?
A system in which the learner earns
tokens for engaging in a targeted
behaviour.
Components of Token Systems
Operationally defined behaviours that
will earn tokens.
Rules regarding how many tokens will
be earned for what behaviours.
Rules regarding the exchange of
tokens for reinforcers.
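The three components above can be sketched as a small data
structure: a table of target behaviours and the tokens each one
earns, plus a table of exchange prices. The class name, the
behaviours, and the prices are hypothetical examples, not part of
the lecture.

```python
class TokenSystem:
    """Minimal token economy: earning rules plus exchange rules."""

    def __init__(self, earn_rules, exchange_rules):
        self.earn_rules = earn_rules          # target behaviour -> tokens earned
        self.exchange_rules = exchange_rules  # back-up reinforcer -> token price
        self.balance = 0

    def record(self, behaviour):
        """Award tokens only for operationally defined target behaviours."""
        earned = self.earn_rules.get(behaviour, 0)
        self.balance += earned
        return earned

    def exchange(self, reinforcer):
        """Trade tokens for a back-up reinforcer; True if the trade succeeded."""
        price = self.exchange_rules[reinforcer]
        if self.balance < price:
            return False
        self.balance -= price
        return True


# Hypothetical rules for illustration.
system = TokenSystem(
    earn_rules={"raises hand": 1, "completes worksheet": 2},
    exchange_rules={"5 min free play": 5},
)
for behaviour in ["raises hand", "completes worksheet", "completes worksheet"]:
    system.record(behaviour)
print(system.balance)                      # 5 tokens earned so far
print(system.exchange("5 min free play"))  # True, balance returns to 0
```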
An Analysis of token reinforcement using multiple-schedule assessment (Fiske et al., 2020)
Aim
o To evaluate the relative reinforcing
efficacy of individual components of a
token economy compared to other
reinforcement types and schedules.
Participants
o Jack (5), Hank (10), Annie (11), and Cam
(8).
Reinforcers
o Goldfish crackers, toy, or book.
Conditions
o Extinction – no consequences for
responding for up to 5 min
o Tokens without back-up – tokens
(pennies) were given but they were not
exchanged for any primary reinforcers
o Yoked FR – access to reinforcers after
making 5 responses
o Tokens with back-up – tokens
(pennies) were given and when 5 were
earned they were exchanged for
primary reinforcers
o Primary – access to primary
reinforcers after each response
Results
o Access to primary reinforcers
produced higher rates of responding
(1st) and tokens with back up was the
second most effective reinforcement
strategy.
Discussion
o Results highlight the importance of
initially having a low response cost/high
reinforcement rate.
o Results also highlight the importance
of having highly preferred back-up
reinforcers.
o Token systems are not as
effective as primary reinforcers but if
you do use them you need to provide
access to a highly preferred back up
reinforcer (students don’t just respond
to tokens, they want to trade the token
in for a primary reinforcer).
Comparing the Effectiveness and Ease of Implementation of Token Economy, Response Cost, and a Combination Condition in Rural Elementary School Classrooms (DeJager et al., 2020)
Conditions
o TE – Earn tokens for appropriate
behaviour.
o RC – Lose tokens for inappropriate
behaviour.
o CB – earn tokens and lose tokens for
appropriate and inappropriate
behaviour, respectively.
Response Definitions
o Engagement – the student was looking
at materials, raising a hand, working
on specified tasks, and/or engaging in
classroom relevant communication.
Recorded during 25-min sessions
using 15-s momentary time sampling.
o Inappropriate behaviour – fidgeting,
drawing on self, talking out, interacting
with peers in ways that interfered with
learning, leaving assigned area,
disruptive vocalizations with sufficient
intensity to draw teacher’s attention.
Momentary Time Sampling
o An interval recording strategy that
involves observing whether or not a
behavior was occurring at the end of a
specified time interval.
o For example at the end of each 15-s
interval.
o So in a 10-min session there would
be 40 intervals and in a 25-min session
there would be _______ intervals.
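The interval arithmetic above can be sketched as follows. The
function names and the sample record are hypothetical; the only
figures taken from the notes are the 15-s interval and the 10-min
and 25-min session lengths.

```python
def interval_count(session_minutes, interval_seconds):
    """Number of whole observation intervals that fit in a session."""
    return (session_minutes * 60) // interval_seconds


def percent_engaged(samples):
    """Percentage of interval end-points at which the behaviour was occurring."""
    if not samples:
        raise ValueError("at least one sample is required")
    return 100 * sum(samples) / len(samples)


print(interval_count(10, 15))  # 40, matching the 10-min example
print(interval_count(25, 15))  # same rule applied to a 25-min session

# Hypothetical record: True means the student was engaged at the
# moment the interval ended.
record = [True, True, False, True]
print(percent_engaged(record))  # 75.0
```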
Method
o Two 1st-grade classrooms in the USA
o 25-min sessions
o Recorded problem behaviour and
academic engagement
Results
o Alternating treatment design
comparing token economy, response
cost, and combined.
o Challenging behaviour was lowest
in the token economy or the combined
(token economy and response cost)
conditions. You can
maintain reductions in challenging
behaviour by rewarding appropriate
behaviour without having to punish
challenging behaviour.
o Increase in engagement.
o Problem of the first instance: you can
only reinforce engagement if children
are already engaged. How do you get
the behaviour you want to occur
so you can reinforce it?
o Token economy reduced problem
behaviour and increased engagement.