Chapter 7: Motivation Flashcards
Two approaches: What motivates instrumental behaviour?
- Associative structure of instrumental conditioning
• Pavlov
• Thorndike
• Narrow approach
- Behavioural regulation
• Skinner
• Broad approach
Three-term Contingency
- More than just a response and reinforcer:
- Stimulus context (S)
- Instrumental response (R)
- Response outcome (O)
- Associative structure: S-R-O
• The three terms (S, R and O) allow for many different types of associations to be formed (what is it that the animal is learning? Why does it perform this behaviour? Is it because it sees the S, or because it is thinking about the O?)
S-R association
- Thorndike thought that the S-R association was key to instrumental conditioning
- Reinforcers merely work to develop the association between S and R
- Implications: No learning about the reinforcer (O) or R-O
- This idea was mostly abandoned in the 1960s
- However, currently used to describe human habits, including drug addiction
- e.g., Do you sit in the same seat each time you come to class?
You pursue a general outcome (i.e., going to class for a better grade), but the particulars are done without thinking (i.e., where you sit in class)
S-O Association and Expectancy
• Expectancy of reward (O) forms when S is present
• Pavlovian conditioning may be involved in instrumental learning via S-O pairings
• Instrumental response motivated by two factors:
1. S-R association
2. S-O association
• Precursor to modern two-process theory
Two-Process Theory
Rescorla & Solomon, 1967
Assumes two distinct types of learning:
• Pavlovian Conditioning
• Instrumental Conditioning
S-O association activates an emotional state which motivates instrumental behaviour (R)
• Type of emotional state depends on types of reinforcer (O)
How do you test this idea….
• Pavlovian expectancy motivates instrumental behaviour
• Presenting a Pavlovian CS should alter instrumental behaviour
• Test with a transfer-of-control experiment
Pavlovian Instrumental Transfer Test
• Three steps
1. Instrumental conditioning*
2. Pavlovian conditioning*
3. Instrumental responding with presentations of Pavlovian CS
• Predictions
• If Pavlovian S-O motivates, then responding should increase when CS is presented
This tests two-process theory
• Positive and negative emotions can be elicited by CS
• Positive emotions increase instrumental responding
• Negative emotions decrease instrumental responding
• However, the CS can elicit more than just positive or negative emotional states
- The CS can elicit reward-specific expectancies
Transfer of Control Experiment
Lovibond, 1983
Trained rabbits in an operant task
Followed this by classical conditioning
• Exp. 1 & 2, CS+
• Exp. 3, CS+ and CS-
Tested the effects of the CSs on operant responding
Rabbits pressed their head against a lever to get food
Then trained with a particular stimulus turning on to signal food (no access to the lever at this point)
Trained with CS+ and CS- (two different emotional states)
Test: if the rabbit now has the lever again, will it respond more when the CS+ is presented at the same time?
Effects of Excitatory CS on Operant Responding
• Instrumental responding increased when CS+ was presented in the experimental group (paired)
• Instrumental responding stayed the same in the unpaired control group
Effects of an Inhibitory CS on Operant Responding
• Instrumental responding increased when CS+ was presented in the experimental group (paired)
• Instrumental responding did NOT decrease with CS- presentation
• BUT, presenting CS+ and CS- simultaneously also did NOT increase instrumental responding
Reward specific expectancies – Transfer of control with 2 CSs
- Pavlovian training with 2 CSs
- One CS is for food
- Another CS is for water
- Test each CS independently in an instrumental transfer test with either a food or a water outcome
IF positive emotional state
• Increased responding to both CSs (whichever outcome each one signals)
IF reward specific
• Increased responding for CS that matches current instrumental reinforcer
Two-process theory ignores ______ __________
- Two-process theory ignores R-O associations
- This is counterintuitive considering how humans verbally describe their behaviour
- e.g., I am doing this (R) to get this (O)
How to test for R-O associations?
• Devalue the O
• Then test for R
• Remember US devaluation? (Tested for S-R versus S-S learning)
O devaluation
• Rats reinforced for pushing a rod to the left or right
• One direction = food
• Other direction = sugar water
• After training, devalue one reinforcer via LiCl injections
• Test preference
IF S-O association:
• Decrease in both responses, because devaluation has affected the properties of S (the lever, the chamber, etc.)
IF R-O association:
• One response (the one that produces the devalued outcome) should be lower, suggesting that separate R-O associations were formed
• Difficult to explain different responses to different stimuli
IF S-R association:
• No change in responding as model does not include the reinforcer (O)
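The three predictions above can be laid out side by side. This is only an illustrative sketch: the function name `predicted_change` and the left/right outcome assignment are assumptions made for the example, not part of the original study description.

```python
# Hypothetical assignment of outcomes to rod directions (assumed for illustration).
RESPONSE_OUTCOME = {"left": "food", "right": "sugar water"}


def predicted_change(association: str, devalued: str) -> dict:
    """Predicted change in each response after one outcome is devalued."""
    if association == "S-R":
        # The reinforcer (O) is not part of what is learned, so nothing changes.
        return {r: "no change" for r in RESPONSE_OUTCOME}
    if association == "S-O":
        # Devaluation alters the shared stimulus context, so both responses drop.
        return {r: "decrease" for r in RESPONSE_OUTCOME}
    if association == "R-O":
        # Only the response that produces the devalued outcome drops.
        return {
            r: "decrease" if o == devalued else "no change"
            for r, o in RESPONSE_OUTCOME.items()
        }
    raise ValueError(f"unknown association: {association}")


for assoc in ("S-R", "S-O", "R-O"):
    print(assoc, predicted_change(assoc, devalued="food"))
```

Only the R-O account predicts a selective drop in one response, which is why outcome devaluation can separate it from the other two.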
S(R-O) Hierarchical relations
- R-O associations can’t act alone to produce instrumental behaviour
- Response activates an expectancy of a reinforcer, but this does not tell us why the response happens in the first place
- R-O association is activated in the presence of S
- Remember, Law of Effect suggests that S activates R, and this association is “stamped in” by O
- Skinner’s (1938) three term contingency
Drive Reduction Theory
Hull (1943): Events are reinforcing if they reduce a physiological drive
• Obtain water if thirsty
• Acquire conversation if lonely
Establishing operation: Making a rat hungry so it will work for food
Drive Reduction Theory is not…
Not a comprehensive theory of reinforcement
• Rats will press a lever to obtain saccharin
• People will pay money for ‘frivolous’ goods
• Too many reinforcers that neither reduce drives nor are associated with primary reinforcers
Consummatory-Response Theory
- Reinforcement is caused by access to species-specific unconditioned responses (e.g., chewing and swallowing)
- Reinforcers as responses, not stimuli
- However, many CSs have appetitive value even if they do not always have consummatory responses
Relative Value Theory
• With total freedom, different behaviours have different probabilities of occurring
• e.g., watching Netflix → high probability, studying → low probability
• Premack Principle: H = high probability response, L = low probability response
• L→H, reinforces L
• H→L, does not reinforce H
How to test Premack Principle
- Establish baseline responding of animals for different behaviours
- Instrumental conditioning procedure with:
• L→H
• H→L
• Implications: Any high probability response can serve as a reinforcer for a lower probability response
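A minimal sketch of the Premack prediction, assuming baseline probabilities have already been measured. The function name `premack_reinforces` and the baseline numbers are made up for illustration.

```python
def premack_reinforces(baseline: dict, instrumental: str, contingent: str) -> bool:
    """True if access to `contingent` is predicted to reinforce `instrumental`.

    Premack Principle: the contingent behaviour must have a higher
    baseline probability than the instrumental behaviour.
    """
    return baseline[contingent] > baseline[instrumental]


# Made-up proportions of free time spent on each behaviour.
baseline = {"watching Netflix": 0.60, "studying": 0.10}

# L -> H: studying earns Netflix, so studying should be reinforced.
print(premack_reinforces(baseline, "studying", "watching Netflix"))
# H -> L: Netflix earns studying, so Netflix should not be reinforced.
print(premack_reinforces(baseline, "watching Netflix", "studying"))
```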
Premack Principle & Children
free-eating and pinball playing
Any high probability response can serve as a reinforcer for a lower probability response
Applications of Premack Principle
• Clinical patients
• Find out what behaviour is reinforcing (high probability of occurring) for each individual
• Delayed Echolalia vs. Perseverative behaviours in children with autism
• Notion that reinforcers are responses, and not stimuli
Premack Principle and BMod
• Data from 2 children with autism
• Individual preferred behaviour was a better reinforcer than food
Any activity…
…could be a reinforcer…
… if it is more likely (“preferred”) than the operant response.
To determine what reinforcers to use for an individual, look at what they do
Response-Deprivation hypothesis (Disequilibrium Model)
Behaviour is reinforcing when the individual is prevented from engaging in the behaviour at its normal frequency
• a sated rat will not perform an operant response for food
Premack realized that restricting access was important to this, but it was not central to his theory
- Deviations from Premack Principle: a low-probability (L) behaviour can reinforce a high-probability (H) behaviour if the L behaviour is restricted to below-baseline levels
- Reinforcer produced by operant contingency itself
- Implications: Any behaviour can be reinforcing if access to that behaviour is restricted
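The restriction idea is commonly stated as an inequality that these notes do not give: a schedule deprives the contingent behaviour when I/C > Oi/Oc, where I is the amount of instrumental behaviour required, C is the amount of contingent behaviour allowed in return, and Oi and Oc are their baseline levels. A minimal sketch under that assumption, with made-up numbers and a hypothetical function name:

```python
def is_response_deprived(i_required: float, c_allowed: float,
                         baseline_i: float, baseline_c: float) -> bool:
    """Check the response-deprivation condition I/C > Oi/Oc.

    Cross-multiplied to avoid dividing by zero: I * Oc > C * Oi.
    """
    return i_required * baseline_c > c_allowed * baseline_i


# Made-up baseline: 60 min running (H) and 10 min drinking (L) per session.
# Schedule A: 10 min of running buys 1 min of drinking.
schedule_a = is_response_deprived(10, 1, baseline_i=60, baseline_c=10)
# Schedule B: 3 min of running buys 1 min of drinking.
schedule_b = is_response_deprived(3, 1, baseline_i=60, baseline_c=10)
print(schedule_a, schedule_b)
```

Under schedule A, drinking is pushed below its baseline, so it is predicted to reinforce running even though drinking is the lower-probability behaviour. Schedule B leaves drinking unrestricted, so no reinforcement effect is predicted.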
Response-Deprivation hypothesis (Disequilibrium Model)
Premack (1962)
rat study
water deprived rat
- Water is H
- running is L
- rat will run to drink
Unlimited water
- Water is L
- running is H
- rat will drink to run
Homeostasis of Behaviour Distribution
• Assumption that all animals have a preferred distribution of activities
Behavioural Bliss Point
• Organisms thought to have a preferred distribution of behaviour (Bliss Point)
Behavioural regulation focuses on how response-reinforcer contingencies force animals away from ___________. How?
their bliss point
- Measure behaviours performed
• w/ no constraints
- Place constraints on behavioural allocation
• Impose instrumental contingency
Slater & Wood (1977)
Average of 12, 30-min cycles of a male zebra finch
1. Measure behaviours in unconstrained situation
2. Place constraints on behavioural allocation
• Puts pressure on Bliss Point
• Act to defend challenges to Bliss Point
- But requirements of contingency (may) make achieving Bliss Point impossible
• Compromise required
• Redistribute responses so as to get as close to Bliss Point as possible
Minimum-deviation
Time spent playing video games (min) vs. Time spent studying (min)
Organism will redistribute its behaviour to minimize the deviations of the two behaviours from the bliss point
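A worked sketch of the minimum-deviation idea, assuming both behaviours are weighted equally (plain Euclidean distance) and the schedule is a fixed ratio of game minutes earned per minute of studying. The bliss-point values and the function name are made up for illustration.

```python
def minimum_deviation_point(bliss_games: float, bliss_study: float,
                            games_per_study_min: float) -> tuple:
    """Point on the schedule line closest to the bliss point.

    The schedule line is games = k * study. The closest point is the
    orthogonal projection of the bliss point onto that line.
    """
    k = games_per_study_min
    study = (k * bliss_games + bliss_study) / (k * k + 1)
    return k * study, study


# Made-up bliss point: 60 min of video games, 15 min of studying.
# Schedule: each minute of studying earns one minute of games.
games, study = minimum_deviation_point(60, 15, games_per_study_min=1)
print(games, study)
```

With these numbers the compromise is 37.5 min of each: studying rises above its 15 min baseline (the reinforcement effect) while gaming falls below its 60 min baseline.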
Reinforcement effect
- Increase performance of the instrumental response from baseline responding
- e.g., more time spent studying
• Results from behavioural-regulatory mechanisms that function to minimize deviations from the bliss point
However….
• There may be many other behaviours that the animal can engage in
• Alternate sources of reinforcement
• e.g., texting or Netflix
Behavioural Economics
• How do organisms organize their behaviour based on the rules of the system?
• Microeconomic choices are strongly driven by cost, which is an instrumental contingency
• Elasticity of demand
Behavioural Economics
- Elasticity determinants
Determinants of elasticity:
• Competitors / Substitutes
• Range of pricing
• Income
• Complementary Commodities
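Elasticity of demand can be illustrated with the standard midpoint (arc) formula: percentage change in quantity consumed divided by percentage change in price. In this sketch, "price" is assumed to be the number of responses required per reinforcer and "quantity" the number of reinforcers obtained. All numbers are made up.

```python
def arc_elasticity(p1: float, q1: float, p2: float, q2: float) -> float:
    """Midpoint (arc) price elasticity of demand between two price points."""
    pct_quantity = (q2 - q1) / ((q1 + q2) / 2)
    pct_price = (p2 - p1) / ((p1 + p2) / 2)
    return pct_quantity / pct_price


# Price doubles from 10 to 20 responses per reinforcer.
# Reinforcer with few substitutes: consumption barely drops (100 -> 90).
inelastic = arc_elasticity(10, 100, 20, 90)
# Reinforcer with a ready substitute: consumption drops sharply (100 -> 40).
elastic = arc_elasticity(10, 100, 20, 40)
print(round(inelastic, 2), round(elastic, 2))
```

An absolute value below 1 indicates inelastic demand (consumption is defended despite the higher cost). A value above 1 indicates elastic demand.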
Response-Allocation in General
• Behaviour systems are complex and organisms array their behaviour to produce maximal benefit
• Reinforcement and punishment direct changes in this allocation
Behaviour is reinforced by the opportunity to engage in _______ ________ _______
Behaviour is reinforced by the opportunity to engage in other, more desirable behaviours
We can create a reinforcer by…
We can “create” a reinforcer by restricting access to it (reorganize behaviour)
We can gauge “desirability” by…
We can gauge “desirability” by noting baseline activity levels
Are Rewards Bad? (Paul Chance) (7)
- Whenever possible use intangible rewards. Don’t provide a toy if a compliment will do.
- Avoid using rewards as incentives. Instead of saying, “If you do X, I’ll give you Y,” wait for the behavior (or for something approximating the behavior) and then provide the reward without having promised it.
- Reward at a high rate in the early stages of learning, and gradually reduce the frequency of rewards as learning progresses.
- Reward only behavior that you want repeated. If you don’t want a child to whine and complain, don’t provide rewards to quiet her when she whines and complains.
- Remember that what is an effective reward for one person may not be effective for another. Praise, recognition, approval, status, awards, and free time are usually rewarding, but not everyone marches to the same drummer.
- Reward success, and set standards so that success is achievable.
- Bring attention to the intrinsic rewards the activity itself offers. Point out, for example, the fun to be had from word play in poetry, or the fun of discovering something from a hands-on experiment.