Instrumental Conditioning Flashcards
Instrumental Conditioning:
the learning of a contingency between a behaviour and its consequence
instrumental conditioning involves explicit training of the relationship between
voluntary behaviours and their consequences
a specific behaviour leads to
a specific consequence
“stamping in” and “stamping out” determine
whether a behaviour is maintained or eliminated, respectively
Edward L. Thorndike and his puzzle box: what behaviours were stamped in?
Behaviours like rope pulling were stamped in because they were followed by the favourable consequence of access to food
Edward L. Thorndike and his puzzle box: what behaviours were stamped out?
Random behaviours, like turning in a circle, were stamped out
What do “stamping in” and “stamping out” lead to in Thorndike's puzzle box?
The process leads to refinement: the cat learns the contingency between the specific behaviour of rope pulling and
the specific consequence of food reward
Law of Effect:
Behaviours with positive consequences are stamped in and produced more frequently
Behaviours with negative consequences are stamped out and produced less frequently
4 different types of instrumental conditioning
- Presenting a positive reinforcer
- Removing a positive reinforcer
- Presenting a negative reinforcer
- Removing a negative reinforcer
Reward Training
the presentation of a positive reinforcer following a response, which increases the frequency of the behaviour
An example of reward training
if you present your puppy with a treat every time he sits on command, the behaviour is likely to increase
Punishment Training:
the presentation of a negative reinforcer following a response that decreases the frequency of the behaviour
An example of punishment training is:
if little Billy teases his sister, and his mother tugs his ear and scolds him, he will likely decrease the behaviour
Omission Training
removing a positive reinforcer following a response that decreases the behaviour
An Example of omission training is:
little Billy is doing his 2 favourite things, watching his favourite TV show and teasing his sister
- Billy’s mom wants to eliminate the teasing behaviour
- she decides to turn off the TV for 30 seconds every time Billy teases Sally
- access to the TV show is a positive reinforcer, and removing it will likely cause Billy to stop his teasing behaviour
Escape Training:
removing a negative reinforcer following a response that increases the behaviour
a constant negative reinforcer is presented that the learner is motivated to have removed
An example of escape training is:
ex. the floor of one side of a rat's cage delivers a constant mild electric shock, which the rat can escape by moving to the opposite side of the cage
Instrumental conditioning proceeds best when (timing)
the consequence immediately follows the response
Acquisition
when an organism learns the contingency between a response and its consequence
Autoshaping:
for simple behaviours,
which can be learned without the careful guidance of the researcher
shaping by successive
approximations:
the complex behaviour can be organized into smaller steps which gradually build up to the full response that we hope to condition
Discriminative Stimulus (SD/S+):
signals when a contingency between a particular response and reinforcement is on
SΔ (S−):
a cue which indicates when the contingent relationship is not valid
An example of Discriminative Stimulus (SD/S+):
ex. the environment of the child's parents' home becomes an SD for the response of vegetable-eating behaviour, which is reinforced with access to a dessert reward
An example of SΔ (S−):
ex. the environment of the grandparents' house becomes an SΔ for the response of vegetable eating
SD (S+) and SΔ (S−)
are cues that predict whether or not the contingent relationship is valid
In contrast to classical conditioning, where the CS is paired with a US to elicit a response, the SD itself does not
elicit the response… the SD sets the occasion for a response by signalling when the response–reinforcer relationship is valid
Continuous Reinforcement:
in all the above examples, a response leads to a reinforcer on every trial
Partial Reinforcement:
reinforcement delivery is
determined by either the total number of responses or the time elapsed
4 basic schedules of reinforcement
Fixed Ratio (FR-#), Variable Ratio (VR-#), Fixed Interval (FI-#), Variable Interval (VI-#)
Ratio Schedules
- Fixed Ratio or Variable Ratio
based on the # of responses made by a
subject, which determines when reinforcement is given
Interval Schedules
Fixed Interval or Variable interval
based on the time elapsed since the last response
that was reinforced
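The four schedule rules can be sketched as simple predicates that decide whether the current response earns reinforcement. This is an illustrative sketch in Python, not part of the course material; the function names and parameters are my own.

```python
import random

def fixed_ratio(n, responses_since_reward):
    # FR-n: reinforcement after a constant ("fixed") number of responses
    return responses_since_reward >= n

def variable_ratio(n, rng=random):
    # VR-n: each response has a 1/n chance of reinforcement, so the
    # number of responses required varies but averages out at n
    return rng.random() < 1.0 / n

def fixed_interval(t, time_since_reward):
    # FI-t: the first response after t time units is reinforced
    return time_since_reward >= t

def variable_interval(time_since_reward, required_t):
    # VI: like FI, but required_t is redrawn after each reward so that
    # it averages out at a specific mean interval
    return time_since_reward >= required_t
```

Note how the ratio rules depend only on a response count while the interval rules depend only on elapsed time, which is the distinction the two flashcards above draw.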
FI- 10 Min Schedule
ex. a pigeon on an FI-10 min schedule is rewarded with
food for the first pecking response after a 10-minute
period
- over an hour, the pigeon has the potential to earn only 6
food pellets
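The 6-pellet figure follows from the schedule's arithmetic: at most one reinforcement per 10-minute interval, so 60 / 10 = 6 per hour. A minimal simulation (my own sketch, assuming the pigeon pecks once every minute without fail) reproduces it:

```python
def simulate_fi(session_minutes, interval_minutes):
    # Count reinforcements on a fixed-interval schedule, assuming the
    # subject responds once every minute without fail.
    rewards = 0
    time_since_reward = 0
    for _ in range(session_minutes):
        time_since_reward += 1
        # only the first response after the interval elapses is reinforced
        if time_since_reward >= interval_minutes:
            rewards += 1
            time_since_reward = 0
    return rewards

print(simulate_fi(60, 10))  # -> 6
```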
Fixed Ratio
A fixed-ratio schedule of reinforcement means that reinforcement should be delivered after a constant or “fixed” number of correct responses
subjects on a fixed-ratio schedule display a “pause and run” pattern
Variable-Ratio Schedule (VR)
When using a variable-ratio (VR) schedule of reinforcement the delivery of reinforcement will “vary” but must average out at a specific number.
- ex. the reinforcement you may receive by playing a slot machine in a casino
Fixed-Interval Schedule (FI)
A fixed-interval schedule means that reinforcement becomes available after a specific period of time.
Variable-Interval Schedule (VI)
The variable-interval (VI) schedule of reinforcement means the time periods that must pass before reinforcement becomes available will “vary” but must average out at a specific time interval.
the slope of a variable-ratio schedule's cumulative record will look like…
a diagonal line with no pauses between reinforcements
- the slope reflects the average # of responses required before reinforcement is delivered
An example of Fixed-Interval Schedule (FI)
a course with weekly quizzes
- this means that study-behaviour responses will start
ramping up just before each quiz
Fixed-Interval Schedule (FI) will look like…
following reinforcement, there is a lull period in which
responding drops, then slowly picks up again, peaking just before the next reinforcement is scheduled to become available
Variable-Interval Schedule (VI) will look like…
is shown as a straight diagonal line on the cumulative record, reflecting a steady response rate