Lecture 10: Operant Conditioning - Elements & Schedules Flashcards
What is the fundamental component of operant behaviour?
- Response is controlled by its consequences
What are the 3 elements of operant behaviour?
- Instrumental response
- Outcome of the response (the reinforcer)
- Relation/contingency b/n response and outcome (i.e. how they’re linked)
What does reinforcement usually lead to?
- Increased uniformity of response pattern
What is stereotypy?
- Uniformity of behaviour
- Short, repeated segments of behaviour (eventually emerges b/c of underlying brain mechanisms)
How can response variability be increased?
- If the instrumental reinforcement procedure requires variable behaviour
- “reinforcement of variability”, ex. L-R alternations
- In the absence of explicit reinforcement of variability, responding becomes increasingly stereotyped with continued conditioning (ex. drawing triangles)
How is the belongingness of the response relevant to operant conditioning?
- Refers to the response-reinforcer relation
- Certain responses naturally ‘belong with’ the reinforcer b/c of the animal’s evolutionary history
- Ex. Thorndike’s puzzle boxes: cats did not learn yawning/scratching as escape responses, as these are not naturally linked to release, but operating a latch/pulling a chain are manipulatory responses naturally related to release
- What makes sense to the animals?
What is instinctive drift?
- Extra, interfering responses that develop during reinforcement training that are naturally-occurring behaviours related to the reinforcer
- Ex. food reinforcer can push toward the development/interference of food-related responses
- Can be quite strong and interfere with responses required by training procedures
- Ex. raccoons would not deposit coins in a slot, as their instinct is to hold and rub them (food-handling behaviour)
What is the Behaviour Systems Theory?
- When a particular system is activated, associated behaviours are engaged
- Includes both reinforced behaviours and those that are associated with instinctive drift (opposing motivations)
What happens when hamsters must dig or face-wash in order to receive a food reward? Why?
- Digging is much easier to reinforce, as it is part of the activated (feeding) behaviour system
- Responses that become more likely when the animal is hungry are more readily reinforced with food
Is the quantity of the reinforcer relevant in operant conditioning?
- Yes
- Rats run faster to receive more of the same reinforcer
- How much effort is it worth?
- Motivation is a factor
Is the quality of the reinforcer relevant in operant conditioning?
- Yes
- Dogs run faster to get more palatable reinforcers (sausage > dry food)
What happens when the quality of the reinforcer is shifted in operant conditioning?
- Response changes
- Ex. When a dog is expecting sausage and gets dry food, responding decreases drastically
- Response increases again when the dog is again given sausage
- Acting on expectation of previous outcome (learning history)
What did Crespi’s experiment find?
- Responding to a particular reward also depends on animal’s past experiences with other reinforcers
What is negative behavioural contrast?
- When the reinforcer decreases from a previously-experienced quantity, response speed drops below that of animals that only ever received the smaller amount
- Ex. dropping the number of pellets given => response speed falls below the original low-reward baseline
- What you’re used to affects the change in behaviour
What is positive behavioural contrast?
- When the reinforcer increases from a previously-experienced quantity/quality, response speed rises above that of animals that only ever received the larger amount
What is response-reinforcer contiguity?
- Delivery of reinforcer immediately after response
- A form of temporal relation
What is response-reinforcer contingency?
- Extent to which response is necessary and sufficient for occurrence of reinforcer
- Relates to causal relation
How are temporal and causal factors related to each other in operant conditioning?
- They are independent of each other
What effect on instrumental learning does delay of reinforcement have?
- Instrumental learning is disrupted by delaying the reinforcer after the response
- Ex. experiments in which delivery of the reinforcer was delayed after the response occurred
- Increase in delay decreases response rate
- With increasing delays, difficult to assign causality
Which response is ‘credited’ with reinforcement after delay?
- With a delay, other responses occur between the target response and the reinforcer, so it is unclear which response should be ‘credited’ (a credit-assignment problem)
- An effective response-reinforcer relation therefore requires temporal contiguity
What is accidental/adventitious reinforcement?
- Instrumental learning can develop as a result of this
- Ex. Skinner’s ‘superstition’ experiment: no contingency; pigeons had food delivered every X seconds; superstitious behaviour emerged; Skinner claimed temporal contiguity drove the learning
What are terminal responses?
- Occurred more often at end of food-food interval
- Ex. Pigeons orienting to food magazine, pecking at magazine
What are interim responses?
- Occurred more often near the middle of the food-food interval
- Ex. Pigeons pecking at floor
What influences the acquisition of instrumental behaviour?
- Contiguity
- Contingency
What are ratio schedules?
- A response is reinforced only once a certain number of responses is made
What are interval schedules?
- A response is reinforced only if it occurs after a certain amount of time has passed
What is a fixed schedule?
- Same number of responses/specific amount of time
What is a variable schedule?
- Number of responses/time varies
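The ratio/interval and fixed/variable definitions above combine into concrete rules. Here is a minimal sketch of a fixed-ratio and a fixed-interval rule in Python (the function names and structure are illustrative, not from the lecture):

```python
def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0       # requirement resets after each reinforcer
            return True     # reinforcer delivered
        return False
    return respond

def fixed_interval(interval):
    """FI: reinforce the first response after `interval` time units have passed."""
    last_reinforcer = 0.0
    def respond(now):
        nonlocal last_reinforcer
        if now - last_reinforcer >= interval:
            last_reinforcer = now
            return True
        return False        # responses before the interval elapses go unreinforced
    return respond

fr5 = fixed_ratio(5)
outcomes = [fr5() for _ in range(10)]   # only the 5th and 10th responses pay off
```

Note how the FR rule depends only on a response counter, while the FI rule depends only on elapsed time — the same independence the flashcards below rely on.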
What is a fixed ratio schedule?
- A fixed number of responses leads to reinforcement (ex. FR10)
- Steady and moderate rate of responding with brief, predictable pauses
What are post-reinforcement pauses, and in what schedule are they found?
- Zero responding just after reinforcement on FR
- Fixed Ratio
What is the ratio run and in what schedule of reinforcement is it found?
- High and steady responding that completes each ratio requirement
- Fixed Ratio
What is ratio strain and in what schedule of reinforcement is it found?
- Zero responding after the ratio requirement is increased too quickly (e.g. FR1 is switched to FR50)
- The sudden jump in required effort disrupts responding
- Increase the ratio gradually to prevent this
What is a cumulative record?
- Way to represent how a response is repeated over time
- Shows the total number of responses that occurred up to a particular point in time
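The idea of a cumulative record can be made concrete in a few lines of code: for each moment in time, count the responses made so far; the slope of the resulting curve is the response rate. A small sketch (the timestamps are invented for illustration):

```python
def cumulative_record(response_times, sample_times):
    """Total responses made at or before each sample time."""
    times = sorted(response_times)
    record, i = [], 0
    for t in sample_times:
        while i < len(times) and times[i] <= t:
            i += 1
        record.append(i)    # running total: a steeper rise = faster responding
    return record

# Responses cluster early, then pause, then one late response:
record = cumulative_record([1, 2, 3, 10], [0, 2, 4, 6, 8, 10])
# record == [0, 2, 3, 3, 3, 4]  (the flat stretch is the pause)
```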
What is the simplest schedule of reinforcement?
- Fixed Ratio 1
- AKA continuous reinforcement
What is the Variable Ratio schedule of reinforcement? What is the rate of responding like?
- Number of responses to obtain reinforcer varies from one reinforcement to the next (e.g. 10 responses for 1st, 13 for 2nd, 7 for 3rd, etc)
- The VR schedule is labelled by the average number of responses per reinforcer (e.g. VR10)
- Steady rate of responding because response number is less predictable
- Ex. scratch tickets, lottery
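One way to picture the VR rule is to draw a fresh, random requirement after each reinforcer; a hypothetical Python sketch (not from the lecture):

```python
import random

def variable_ratio(avg, seed=0):
    """VR-avg: reinforce after a random number of responses averaging ~avg."""
    rng = random.Random(seed)             # seeded for reproducibility
    count = 0
    target = rng.randint(1, 2 * avg - 1)  # requirements average roughly avg
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * avg - 1)  # next requirement is unpredictable
            return True
        return False
    return respond

vr10 = variable_ratio(10)
reinforcers = sum(vr10() for _ in range(1000))
# Roughly 1000 / 10 ≈ 100 reinforcers, but their spacing is irregular —
# that unpredictability is why VR schedules sustain steady responding.
```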
What is a progressive ratio schedule of reinforcement?
- Gradual, systematic increase in the ratio requirement after each reinforcer
- Often grouped with variable-ratio schedules, since the requirement changes from one reinforcer to the next
What is the fixed-interval schedule?
- Amount of time that has to pass is constant from trial to trial (e.g. FI15)
- Predictable
- Performance reflects accuracy in timing (improves with training)
- Low levels of responding
What is the fixed-interval scallop?
- Slower response rate immediately after reinforcement (NOT the same as FR post-reinforcement pause)
- Gradually increasing rate across interval
What is the variable-interval schedule?
- Amount of time that has to pass is not constant from trial to trial, but varies around an average for the session
- A range around the average is typically defined
- Maintain steady and stable rates of responding without pauses, similar to VR schedules
- Slower than VR
When does a pause in responding happen?
- With a fixed schedule, where there is predictability
What does the steepness of the cumulative record mean?
- Rate of responding
What schedule of reinforcement leads to the highest rate of responding?
- Variable Ratio
Which schedule of reinforcement leads to the lowest rate of responding?
- Fixed interval
What is another term for omission-training procedures?
- Differential reinforcement of other behaviour (DRO)
- Highlights that the individual receives the appetitive stimulus only if they are engaged in behaviour other than the response specified by the procedure
Who studied behavioural contrast effects?
- Crespi
What is temporal contiguity?
- Delivery of reinforcer immediately after response
What is a marking procedure?
- Facilitates learning with delayed reinforcement
- Mark target instrumental response in some way to make it distinguishable from other activities of the organism
- Can be accomplished by introducing brief light/noise after target response, or moving animal to holding box for delay interval
What is the learned-helplessness effect?
- Seligman & Maier studied the effects of exposure to uncontrollable shock on subsequent escape-avoidance learning in dogs
- Found that exposure to uncontrollable shock disrupted subsequent learning
What is the triadic design used in studies of learned-helplessness effect?
- Different exposures = different results
- Escapable shock => rapid-avoidance learning
- Yoked inescapable shock => slow-avoidance learning
- Restricted to apparatus => rapid-avoidance learning
What is the learned-helplessness hypothesis?
- Assumes that during exposure to uncontrollable shocks, animals learn that shocks are independent of behaviour
- Come to expect that reinforcers will continue to be independent of behaviour in the future
- Undermines ability to learn a new instrumental response
- Expectation of lack of control reduces motivation to perform the instrumental response
- Even if the animal makes the response and gets reinforced, the previously learned expectation of lack of control makes it harder to learn that the new behaviour is effective
What is the difference between the learned-helplessness hypothesis and effect?
- Effect = pattern of results obtained with triadic design (disruption of instrumental conditioning caused by prior exposure to inescapable shock)
- Hypothesis = explanation/interpretation of the effect
What is the attention deficit hypothesis?
- Exposure to inescapable shock reduces extent to which animals pay attention to own behaviour (=> learning deficit)
What are shock-cessation feedback cues?
- Some of the response-produced stimuli are experienced at the start of the escape response, just before the shock is turned off
What are safety-signal feedback cues?
- Other response-produced stimuli are experienced as the animal completes the response, just after shock has been turned off at start of intertrial interval
What is a schedule of reinforcement?
- Program or rule that determines which occurrence of a response is followed by the reinforcer
What is partial/intermittent reinforcement?
- Situations in which responding is reinforced only some of the time
What is the length of the post-reinforcement pause controlled by?
- Upcoming ratio requirement
What is limited hold?
- Restriction on how long a reinforcer remains available
- Added to FI or VI schedules