Chapter 5 Flashcards
Instrumental Conditioning: Foundations
Define instrumental behaviour:
Behaviour that occurs because it was previously instrumental in producing certain consequences
Define instrumental conditioning
A form of learning in which behaviour is modified by
administering rewards and/or punishments
Behaviourist School of Thought (4):
– Skinner + Watson
– Human behaviour is shaped primarily by the environment
– Learning is a product of reinforcement and punishment
– We are born as blank slates
What is the “equation” for instrumental conditioning?
Voluntary response/behaviour (ex: biting one’s nails)
+
Consequence (ex: punishment)
=
Increase or decrease in the voluntary response (ex: no more nail biting)
What is the “equation” for classical conditioning:
stimulus + stimulus = conditioned reflexive response
In instrumental conditioning, voluntary responses are __
modified
Give the basic procedure of instrumental conditioning (Step 1, Step 2, and consequence):
Step 1: The organism ‘reacts or behaves’
Step 2: A behaviour modification technique is applied
Consequence: The reaction or behaviour either occurs more frequently or is reduced/stopped
Instrumental conditioning can be used to:
produce complex behaviours
Instrumental conditioning is a type of learning in which the __
consequences of behaviour tend to modify that behaviour in the future
Instrumental conditioning:
procedures developed to study instrumental behaviour
Instrumental conditioning rationale: behaviour that is rewarded or reinforced tends to be:
repeated
Instrumental conditioning rationale: behaviour that is ignored or punished is __
less likely to be repeated
Edward L. Thorndike
The first serious theoretical analysis
of instrumental conditioning
Thorndike studied instrumental conditioning using the
puzzle box
Thorndike’s Early Studies
Initially, a lot of behaviours are tried out
* Animal tracks outcomes of behaviours
– S -> R -> O
– In context (S), response (R) produces outcome (O)
* This knowledge guides future behaviours:
– Behaviours with positive outcomes increase
– Behaviours with negative outcomes decrease
Thorndike’s “Law of Effect”
If a response in the presence of a stimulus is followed by a satisfying event, association between
the stimulus (S) and the response (R) is strengthened
Conversely:
If a response is followed by an undesirable event, the
S-R association is weakened
Notes on Thorndike’s Law of effect: The resulting event is ___ of the association
not part
Notes on Thorndike’s Law of effect:The satisfying or annoying consequence serves to ____
Strengthen or weaken the S-R association
Define variables S-R-O:
S=stimuli
R=Response
O= outcome
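The Law of Effect above can be sketched as a simple strength-update rule. This is a hypothetical formalization for illustration only (Thorndike stated the law qualitatively); the function name, step size, and outcome labels are all assumptions:

```python
# Hypothetical numeric sketch of Thorndike's Law of Effect.
# Note: the outcome itself is NOT stored in the association; it only
# serves to strengthen or weaken the S-R bond.

def update_sr_strength(strength, outcome, step=0.1):
    """Return the new S-R association strength after an outcome."""
    if outcome == "satisfying":
        return strength + step            # S-R bond strengthened
    if outcome == "annoying":
        return max(0.0, strength - step)  # S-R bond weakened (floor at 0)
    return strength                       # neutral: no change

# Cat in a puzzle box (S) pulls a loop (R); escaping to food is satisfying,
# so the S-R bond grows and the response becomes more likely next time.
s_r = 0.5
s_r = update_sr_strength(s_r, "satisfying")
```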
What are some methodological problems with Thorndike’s puzzle boxes (5)?
(1) Trials have to be repeated over and over, resetting the animal and the device
(2) Cutoff: what counts as the worst performance?
(3) The main measure (escape latency) decreases with learning
(4) Hard to compare across animals and trials
(5) How do you generate a prediction from latencies?
what are the two types of procedures to study instrumental conditioning?
(1)Discrete-trial procedures (puzzle boxes + maze learning)
(2) Free operant procedures
What are the two types of discrete-trial procedures?
(1) Puzzle boxes
(2) Maze learning (T-Maze, 8-Arm Radial Maze)
What are the different types of mazes?
(1) Runway maze (aka straight-alley maze)
(2) T-maze
(3) 8-arm Radial Maze
8-arm Radial Maze
Often used for memory tasks
Arms are high off the ground and rats are hesitant to walk off; this makes learning more obvious when they DO walk off
Free operant procedures, in comparison to discrete-trial procedures, allow:
more dependent variables; we can look at the RATE of responding
The operant response is defined in terms of its:
effect on the environment
Different types of operant responses:
- Lever-press
- Chain pull
- Nose-poke
- Peck
What are the dependent variables in free operant procedures (3)?
(1) Response rate
(2) Total number of responses
(3) Latency to respond
B.F. Skinner is considered to be:
the leading authority on IC
B.F. Skinner was influenced by:
Thorndike
B.F. Skinner invented the “Skinner box” to:
test IC through shaping
Shaping reinforces:
any movement in the direction of the desired response
Shaping rewards:
gradual successive approximations
Shaping is __ than waiting for the response to occur and then reinforcing it:
quicker
Shaping is used effectively to condition humans and many types of animals
Shaping:
Shaping through successive approximation builds a complex R incrementally
Describe the steps in shaping (3):
- Initially, the contingency is introduced for a simple behaviour (R)
- As the rate of R improves, the contingency is moved to a more complex version of R
- Gradually, this builds a complex R that the animal would never “spontaneously” produce
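The shaping steps above can be sketched as a loop that tightens the reinforcement criterion step by step. All numbers here (criterion values, the variability range, the attempt budget) are illustrative assumptions, not parameters from the course material:

```python
import random

def shape(criteria, attempts_per_step=50, seed=0):
    """Reinforce successive approximations: walk through increasingly
    strict criteria, reinforcing any response that meets the current one.
    Returns the list of criteria the animal actually met."""
    rng = random.Random(seed)
    reached = []
    baseline = 0.0  # how close the animal's typical response is to the goal
    for criterion in criteria:
        for _ in range(attempts_per_step):
            # Shaping depends on inherent response variability:
            response = baseline + rng.uniform(0.0, 0.3)
            if response >= criterion:
                baseline = criterion  # reinforcement raises the baseline
                reached.append(criterion)
                break
        else:
            break  # criterion moved too fast; this step was never met
    return reached

# approach -> orient -> touch -> paw on lever -> full press
steps = [0.2, 0.4, 0.6, 0.8, 1.0]
shaped = shape(steps)
```

Note the failure mode built into the sketch: if a criterion jumps farther than the animal's response variability can reach, shaping stalls, matching the card below that shaping and chaining cannot move too fast.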
Chaining:
Chaining builds complex R
sequences by linking together
S–>R–>O conditions
Describe the chaining process (3):
- Initially, train the animal to pick up an object
- Next, reward it for picking it up and then throwing it
- Continue linking additional responses in the same way to build the full sequence
Chaining allows:
A series of behaviours
(as opposed to shaping, which simply elaborates on a single response)
Shaping and chaining can be used together to:
Train animals to complete incredibly complex behaviours
Shaping and chaining cannot:
move too fast
Shaping involves
combining familiar response
components into a new activity
Shaping depends on:
inherent response variability
How To Get a Rat To Lever Press: Shaping:
- Magazine/food port training (“food is available here!”)
- Shaping:
– Define the final response
– Identify the starting point of the behaviour
– Divide the progression from starting point to final point into a series of steps – a training plan
– Reinforce successive approximations of the final behavioural response (and non-reinforcement of earlier response forms)
In the Skinner box, the animal is:
free in the chamber, with no experimenter intervention -> free operant learning!
Positive reinforcement:
Press lever (R) -> Get food
Negative punishment:
Press lever (R) -> Food stops
Negative reinforcement:
Press lever (R) -> End shock
Positive punishment:
Press lever (R) -> Get shocked
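The four procedures above can be summarized in one lookup table keyed by what the response does to the stimulus and what kind of stimulus it is (the dictionary layout is just an illustrative summary):

```python
# The four instrumental-conditioning procedures.
# Key: (what the response makes the stimulus do, stimulus type)
# Value: (name of procedure, effect on the rate of the response)
PROCEDURES = {
    ("appear",    "appetitive"): ("positive reinforcement", "increases"),
    ("disappear", "appetitive"): ("negative punishment",    "decreases"),
    ("disappear", "aversive"):   ("negative reinforcement", "increases"),
    ("appear",    "aversive"):   ("positive punishment",    "decreases"),
}

# Press lever (R) -> get food: food appears and is appetitive.
name, effect = PROCEDURES[("appear", "appetitive")]
```

Reading the table: "positive/negative" names what happens to the stimulus (appears/disappears), while "reinforcement/punishment" names the effect on the response rate (increases/decreases).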
Describe IC in the Skinner box:
(1) Initially, the rat tries many things; eventually, it accidentally presses the lever, producing a positive effect
(2) Now it starts hanging around the lever and accidentally presses it again
(3) The rat has learned a contingency: if light on (S), pressing lever (R) -> food (O); it spends much of its day pressing and eating
Basic Pattern of IC: Pre-training:
Low spontaneous rate of R
Basic Pattern of IC: Training:
Contingency is introduced:
* If S, R -> O
Basic Pattern of IC: Acquisition:
– Animal discovers contingency
– Rate of R increases
Basic Pattern of IC: Extinction:
Contingency is eliminated: R -> __ (no outcome)
Rate of R decreases
Basic Pattern of IC: R has a __ initial rate:
R has a LOW initial rate; the animal must discover the contingency
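The pre-training / training / acquisition / extinction pattern can be sketched with a toy simulation (the trial counts, learning rate, and starting probability are all illustrative assumptions):

```python
def simulate(pre=20, training=60, extinction=60, lr=0.2):
    """Track response probability across the phases: low spontaneous
    rate -> rises once the contingency is introduced (acquisition) ->
    falls once the contingency is eliminated (extinction)."""
    p = 0.05  # low spontaneous rate of R before training
    rates = []
    for trial in range(pre + training + extinction):
        contingency_on = pre <= trial < pre + training
        target = 1.0 if contingency_on else 0.0
        p += lr * (target - p)  # simple error-correction update
        rates.append(p)
    return rates

rates = simulate()
# With the defaults, the rate climbs during trials 20-79 (training)
# and declines once the contingency is removed at trial 80.
```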
__ occurs in IC
generalization
Generalization:
Responding to other, similar
stimuli
Example of generalization:
Pigeons respond to different colours of disks; the less similar the colour, the lower the pecking rate
Discrimination:
learning to distinguish
between a stimulus that has been reinforced and others that may be similar
Instrumental Conditioning: Influencing Factors (5):
(1) ‘Quality’ of the outcome (appetitive stimulus / aversive stimulus)
(2) Relationship between the instrumental behaviour and the outcome (positive or negative contingency)
(3) Magnitude of reinforcement
(4) Immediacy of reinforcement
(5) Level of motivation
Influencing factor: What are the two different “quality of the outcome” factors:
(1) Appetitive stimulus: “pleasant” event or outcome in the context of instrumental conditioning
(2) Aversive stimulus:‘unpleasant’ event or outcome
in the context of instrumental conditioning
define appetitive stimulus:
‘pleasant’ event or outcome
in the context of instrumental conditioning
Define “aversive stimulus”:
‘unpleasant’ event or outcome
in the context of instrumental conditioning
Influencing factors: What are the two relationships between the instrumental behaviour and the outcome:
(1) Positive contingency
(2) Negative contingency
Define positive contingency:
The instrumental response
causes an outcome/stimulus to APPEAR
Define negative contingency:
The instrumental response
causes a stimulus to DISAPPEAR or be ELIMINATED
Influencing factors: 3. Magnitude of reinforcement: As magnitude increases (3):
(1) Acquisition of a response is faster
(2) Rate of responding is higher
(3) Resistance to extinction is greater
ex: people work harder for $30/hr than $10/hr
Influencing factors: 4. Immediacy of reinforcement: describe two points or “rules”
(1)If reinforcement is immediate, responses are
conditioned more effectively
(2)As a rule, the longer the delay in reinforcement,
the more slowly the response will be acquired
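The two delay rules above are often formalized with Mazur's hyperbolic discounting equation, V = A / (1 + kD): the longer the delay D, the smaller the effective value V of a reinforcer of size A. The k value below is an illustrative assumption:

```python
def discounted_value(amount, delay, k=0.5):
    """Effective value of a reinforcer of size `amount` delivered after
    `delay` (hyperbolic discounting). Longer delays shrink the effective
    value, which is one way to model why delayed reinforcement
    conditions responses more slowly."""
    return amount / (1 + k * delay)

immediate = discounted_value(10, 0)  # full value: 10.0
delayed = discounted_value(10, 4)    # same reinforcer, worth only 10/3
```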
Influencing factors: 5. Level of motivation: describe:
Higher motivation leads
to faster learning
Changes in instrumental behaviour are determined by:
The nature of the outcome, and whether or not the outcome is presented or eliminated
Define reinforcement:
Where the relationship between the response (R) and the outcome (O) INCREASES the probability of the response occurring
Define punishment :
Where the relationship between the response (R) and the outcome (O) DECREASES the probability of a response occurring
Reinforcement:
Anything that STRENGTHENS a response (or increases the probability that the response will occur)
What are primary reinforcers?
Primary reinforcers fulfill basic physical needs for survival
Primary reinforcers do not depend on :
learning