Unit 4: Operant Conditioning Flashcards
What’s the difference between instrumental and classical conditioning?
classical: presence or absence of stimulus causes response
operant: behaviour causes presence or absence of stimulus (consequence)
Who did the study of instrumental conditioning start with and what was he interested in?
Edward Thorndike
interested in animal intelligence
How were Thorndike’s experiments generally structured?
hungry animals placed in puzzle boxes
food outside of boxes but in view
-> animals had to learn how to escape the box to obtain food
How did the animal’s behaviours change in the puzzle box?
initially unable to escape
slow to make right response
with continued practice, escape latencies became shorter
How did the animals learn how to solve the puzzle box?
trial and error to discover behaviour required to escape
successful behaviours retained
useless behaviours eliminated
How did Thorndike label the animals’ ability to learn how to escape a puzzle box?
animal intelligence
Why is animal intelligence not an accurate term?
many behaviours seem unintelligent
initial presence of various responses typical for confined animal, with some leading to a desirable result
consequences reinforce the action
-> cat doesn’t understand how the lever works; it presses it because that response was followed by reward (escape and food) in the past
Law of Effect
If a response (R) in the presence of a stimulus (S) is followed by a satisfying event -> S-R association becomes strengthened
If it is followed by an annoying event -> S-R association becomes weakened
What do we measure in discrete-trial procedures?
rat runs down a maze to get reward
measures response latency (time it takes for the rat to leave the start box) and running speed (how fast it reaches the end)
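The two runway measures can be sketched in a few lines. This is a minimal illustration only: the function name, the timestamps, and the runway length are hypothetical, and running speed is assumed to mean runway length divided by the time between leaving the start box and reaching the goal.

```python
def runway_measures(door_opened_s, left_start_s, reached_goal_s, runway_length_m):
    """Return (response latency in s, running speed in m/s) for one trial.

    All arguments are hypothetical measurements for a single discrete trial.
    """
    # Latency: time from the start of the trial until the rat leaves the start box.
    latency = left_start_s - door_opened_s
    # Running speed: distance covered divided by the time spent running.
    running_time = reached_goal_s - left_start_s
    speed = runway_length_m / running_time
    return latency, speed


latency, speed = runway_measures(0.0, 2.5, 6.5, 2.0)
print(latency, speed)  # 2.5 0.5
```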
What’s a T maze trial?
type of discrete-trial procedure
allows us to measure percentage of correct choices
What are trials?
specific periods of time during which the animal can show instrumental responses
set by the experimenter
Why didn’t Skinner use discrete-trial procedures?
behaviour is continuous (one leads to the next)
-> trials more natural if animals aren’t removed
behaviour can be broken down into measurable units: operants
What is magazine training?
sound of the food-delivery device (CS) paired with food (US) via classical conditioning
sound elicits sign-tracking response
How does response shaping work? (example: rat in Skinner box)
after magazine training, rat can learn operant response
1. food given if rat goes on hind legs anywhere in chamber
2. food given if rat leans over lever
3. food given if rat goes up on hind legs and presses lever
=> sequence called shaping/ reinforcement of successive approximations
What are operant responses in free-operant procedures measured as?
rates
Response rate
frequency of instrumental behaviours occurring
high: high probability of behaviour occurring
low: low probability of behaviour occurring
How can we differentiate outcomes?
appetitive vs aversive
positive vs negative
Which components do all instrumental conditioning procedures involve?
instrumental response
outcome (reinforcement, punishment)
stimulus
association between response and outcome
positive reinforcement
behaviour produces (adds) appetitive outcome
negative reinforcement
behaviour produces absence of aversive stimulus
positive punishment
behaviour produces aversive outcome
negative punishment
behaviour produces absence of appetitive outcome
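The four procedures above follow from two yes/no questions: is the outcome appetitive or aversive, and does the behaviour produce it or remove it? A small sketch of that 2x2 scheme (the function and parameter names are made up for illustration):

```python
def classify_procedure(outcome_is_appetitive, response_produces_outcome):
    """Name the instrumental procedure for one response-outcome relation."""
    if response_produces_outcome:
        # "Positive": the behaviour adds the outcome.
        if outcome_is_appetitive:
            return "positive reinforcement"
        return "positive punishment"
    # "Negative": the behaviour removes or prevents the outcome.
    if outcome_is_appetitive:
        return "negative punishment"
    return "negative reinforcement"


print(classify_procedure(True, True))    # positive reinforcement
print(classify_procedure(False, False))  # negative reinforcement
```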
Can a behaviour always be reinforced (by anything)?
no, only if behaviour is naturally linked to reinforcement
e.g. can’t reinforce yawning in cats with opening box, because yawning isn’t naturally linked with release from confinement
What does the presence of a stimulus activate?
behaviour system related to that stimulus
e.g. hunger (S) causes hamsters to start digging and scrabbling (behavioural system linked to hunger), while stopping self-care behaviour (behaviour doesn’t address hunger)
What does instrumental conditioning depend on with regards to the reinforcement?
quality and quantity of reinforcement
nature of reinforcement
previous reinforcements for same instrumental behaviour
Behavioural contrast effect
big reward perceived as especially good after small reward and vice versa
Which types of relationships between response and reinforcement are there?
temporal relationship: contiguity
causal relationship: contingency
Are temporal and causal factors dependent on each other?
no
What can we say about temporal relations?
immediate reinforcement is preferable to delayed reinforcement
credit assignment
if too much time passes, we won’t be able to link specific behaviours to the reinforcement
(credit assignment is the reason for this)
What’s more important to create associations, contingency or contiguity?
contiguity
The fact that a behaviour occurred just before the reinforcement was more important than whether it caused the reinforcement. What is this kind of reinforcement called?
adventitious/ accidental reinforcement
learned helplessness
occurs after repeated exposure to aversive events that can’t be controlled
-> feeling of being incapable of changing the situation
Why does the reinforcer not occur after every response in instrumental conditioning procedures?
reflects nature of real world
What’s a schedule of reinforcement?
rule that determines how and when a reinforcer follows a response
Ratio schedules
reinforcer occurs after a set number of responses
continuous reinforcement schedules
reinforcement delivered after every response
commonly used in drug abuse treatments
Is continuous reinforcement common in real life?
no
Token economies
reward system where tokens can be exchanged for bigger rewards
used to reduce disruptive behaviours
Partial/ Intermittent reinforcement
reinforcement only occurs sometimes
fixed-ratio schedules
number of reinforcers received per number of responses is fixed
Are continuous reinforcement schedules a type of fixed-ratio reinforcements?
yes
How strong is the rate of responding generated by continuous reinforcement?
steady and moderate
What is steady responding PRECEDED by? (in fixed-ratio schedules)
a brief pause
How/ where do we see rates of responding?
in cumulative records
-> show total number of responses during a period of time
How do cumulative recordings work?
pen rests on piece of paper
moves up vertically after each response
-> time between responses (horizontal distance)
-> slope: rate of responding (number of responses per unit of time)
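A cumulative record can be imitated with a list of response times. The sketch below is illustrative only: the function names and the example times are hypothetical. Each response moves the count up by one, and the slope over a time window is the response rate in that window.

```python
def cumulative_record(response_times):
    """Return (time, cumulative response count) points, one per response."""
    times = sorted(response_times)
    # The "pen" moves up one step at the time of each response.
    return [(t, i + 1) for i, t in enumerate(times)]


def response_rate(response_times, start, end):
    """Slope of the record between start and end: responses per unit of time."""
    count = sum(1 for t in response_times if start <= t < end)
    return count / (end - start)


times = [1, 2, 3, 4, 10, 20]          # hypothetical response times in seconds
print(cumulative_record(times)[-1])   # (20, 6)
print(response_rate(times, 0, 5))     # 0.8  (steep slope: fast responding)
print(response_rate(times, 5, 25))    # 0.1  (shallow slope: slow responding)
```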
What does an increase in fixed-ratio requirements tend to cause?
increase in post-reinforcement pause despite no change in rate of responding during ratio run
What’s ratio strain and what causes it?
caused by dramatic increases in fixed-ratio requirements
-> periodic pauses during ratio run
-> in extreme cases responses stop completely
Variable-ratio schedules
number of responses required to achieve reinforcement varies
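Ratio schedules can be sketched as small counters that are called once per response. This is a toy model, not a standard implementation: the function names are hypothetical, and the variable-ratio requirement is assumed to be drawn uniformly from 1 to 2 x mean - 1 so that it averages the stated mean.

```python
import random


def fixed_ratio(n):
    """Return respond(): True when the current response earns the reinforcer."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:      # every n-th response is reinforced
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Like fixed_ratio, but the requirement changes after each reinforcer."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)  # assumed distribution

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])   # reinforcer on the 5th and 10th response
crf = fixed_ratio(1)                # continuous reinforcement is FR 1
print([crf() for _ in range(3)])    # [True, True, True]
vr5 = variable_ratio(5, random.Random(0))
print(sum(vr5() for _ in range(1000)))  # roughly 200, spacing unpredictable
```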
Do variable-ratio schedules cause post-reinforcement pauses?
no, they are less likely to cause pauses
Why don’t variable-ratio schedules cause post-reinforcement pauses?
subject doesn’t know how many responses are required for reinforcement
-> maintains stable rate of responding
Is fixed or variable-ratio reinforcement more effective for long term effects?
variable-ratio
What’s the difference between variable reinforcement and intermittent reinforcement?
Intermittent reinforcement: broad category, includes variable and fixed ratio/ interval reinforcement, rewards only given sometimes
variable reinforcement: reinforcement given after an unpredictable number of responses or amount of time, can’t be a fixed number of responses/ time
What can be said about the effects (rate of responding and resistance) of variable and intermittent reinforcement?
variable: higher rate of responding, less resistant to extinction
intermittent: lower rate of responding, more resistant to extinction
What’s an interval schedule?
a response is only reinforced if it occurs after a certain amount of time has passed
Fixed-interval schedules
amount of time that has to pass before response is reinforced is constant
Do fixed interval schedules always lead to the reinforcement being delivered after a certain amount of time?
no, the interval only determines when a reward is available, not when it’s delivered
-> response still has to occur
Variable interval schedule
amount of time that passes between a reference point and response that produces a reward varies
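Interval schedules can be sketched the same way, with the time of each response as input. This is a simplified model under stated assumptions: the names are hypothetical, the interval is timed from the start of the session and then from each delivered reinforcer, and the variable interval is drawn uniformly from 0 to 2 x mean. It shows that the interval only makes the reward available; a response is still needed to collect it.

```python
import random


def fixed_interval(interval_s):
    """Return respond(t): True if a response at time t (seconds) is reinforced."""
    available_at = interval_s

    def respond(t):
        nonlocal available_at
        if t >= available_at:                # reward has become available
            available_at = t + interval_s    # next interval starts at delivery
            return True
        return False                         # too early: response has no effect

    return respond


def variable_interval(mean_interval_s, rng):
    """Like fixed_interval, but each interval is drawn at random."""
    available_at = rng.uniform(0, 2 * mean_interval_s)  # assumed distribution

    def respond(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + rng.uniform(0, 2 * mean_interval_s)
            return True
        return False

    return respond


fi60 = fixed_interval(60)
# Reward is available from 60 s, but is only delivered at the response at 75 s.
print([fi60(t) for t in (10, 59, 75, 100, 140)])
# [False, False, True, False, True]
```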
Do fixed interval schedules also lead to post-reinforcement pauses (and a higher response rate before the delivery of the next reinforcement)?
yes
Do variable-ratio and interval schedules cause the same effects?
no, variable-ratio leads to higher response rate while variable-interval leads to more steady response rate
Why are the different effects of ratio and interval schedules relevant for us?
implications for human behavior (motivation)
What happens in concurrent schedule trials?
different responses can be associated with different reinforcement on different schedules of reinforcement
(your responses in a situation are rewarded at different rates depending on the behavior, so you have to choose which behavior to carry out)
How do we measure choice behaviour?
assessing how responses are distributed across response alternatives
How do you mathematically measure response rates of concurrent schedules?
responses on A (e.g. total left responses) / total responses (A + B, e.g. left + right)
How do you mathematically measure the rate of reinforcement (concurrent schedules)?
reinforcements earned on A / total reinforcements (A + B)
What does the matching law describe?
rate of responding on alternatives matches rate of reinforcement (40% of all reinforcements come from A -> responds to A 40% of the time)
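The two relative rates and the matching relation can be checked with a few lines of arithmetic. The counts below are invented for illustration, and the function name is hypothetical.

```python
import math


def relative_rate(count_a, count_b):
    """Share of the total that falls on alternative A."""
    return count_a / (count_a + count_b)


# Hypothetical session totals for a two-lever concurrent schedule.
responses_a, responses_b = 120, 180
reinforcers_a, reinforcers_b = 20, 30

relative_responding = relative_rate(responses_a, responses_b)
relative_reinforcement = relative_rate(reinforcers_a, reinforcers_b)
print(relative_responding, relative_reinforcement)  # 0.4 0.4

# Matching law: the two proportions are (approximately) equal.
print(math.isclose(relative_responding, relative_reinforcement))  # True
```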
What are the two perspectives from which you can explain what motivates instrumental behavior?
Associative structure of instrumental conditioning (molecular perspective): how stimuli, responses and outcomes are related
Response-allocation approach (molar perspective): how instrumental behavior relates to long-term goals
Does instrumental responding only involve the response and a reinforcement? If no, what’s missing?
No
the environment
What are the three events of instrumental responding?
Contextual Stimuli
instrumental response
response outcome (only weakens/ strengthens response)
What are habits and how much of our behavior do they constitute?
things we do automatically
45% (estimate)
Two-process theory
two types of learning: Pavlovian and instrumental
S-O association established through classical conditioning activates reward expectancy/ emotional state
-> depending on nature of emotional state (appetitive/ aversive): motivates instrumental behavior
What does the associative analysis of instrumental conditioning view behavior as?
result of associations between stimuli, responses and outcomes
What is instrumental behavior motivated by? (context: organism’s behavior)
restricting an organism’s natural behavior (only 1 response leads to reward, rest doesn’t)
Consummatory-response theory
it isn’t the reinforcer stimulus itself that is reinforcing, but the behavior of consuming it (e.g. eating, drinking)
What does the Premack principle state?
high-probability behaviors can be effective reinforcements for low-probability behaviors
How is a high probability of reinforcing behavior maintained in instrumental conditioning procedures?
restricting access to reinforcement
Response deprivation hypothesis
restricted access to reinforcement is critical for motivating instrumental responding
Instrumental contingency & behavioral bliss point
instrumental contingency: amount of reward activity is tied to amount of instrumental response
(e.g. time on Facebook must equal time studying)
-> behavioral bliss point: preferred distribution of activities when there are no restrictions
What does behavioral economics study?
what alternative reinforcements are available
how alternatives are related to the reinforcement in question
costs of obtaining alternatives
-> understand what motivates instrumental responding
demand curve
how consumption of products is influenced by the price
elasticity of demand
how much consumption can vary in relation to increasing costs
high elasticity = high variability
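Elasticity can be put into numbers with the usual economics definition: percentage change in consumption divided by percentage change in price. The sketch below assumes that definition; the function name and the figures are hypothetical, and "price" could stand for the number of responses required per reinforcer.

```python
def price_elasticity(price_before, price_after, consumed_before, consumed_after):
    """Percentage change in consumption per percentage change in price."""
    pct_price = (price_after - price_before) / price_before
    pct_consumed = (consumed_after - consumed_before) / consumed_before
    return pct_consumed / pct_price


# Price rises by 20% in both cases (10 -> 12).
elastic = price_elasticity(10, 12, 100, 50)    # consumption drops by 50%
inelastic = price_elasticity(10, 12, 100, 95)  # consumption drops by 5%

# A larger absolute value means consumption varies more with price.
print(abs(elastic) > abs(inelastic))  # True
```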
Does elasticity of demand depend on the availability of other products?
yes
more options = higher elasticity
What else influences how much an increase in price affects consumption? (not availability of alternatives)
income
instrumental contingencies