Behaviour Modification (Operant Conditioning) Flashcards
What is operant conditioning
Learning process in which a response becomes more or less frequent as a function of the consequence it produces
What is a respondent
A behaviour that is elicited by a specific stimulus
How do we influence the rate of occurrence of a behaviour
Manipulating the consequence it produces
What are the three steps of operant learning
Antecedent -> behaviour (R) -> consequence (e.g. food reward)
Another term for operant learning
Instrumental learning
How do you teach more complex behaviours
String together Rs
Touch wand -> touch hoop -> jump through hoop
What is a consequence that increases the frequency of behaviour called
Reinforcer
What is a consequence that decreases the frequency of behaviour called
Punisher
Describe positive reinforcement, negative reinforcement, positive punishment and negative punishment in terms of consequence and change in response
PR = add a pleasant stimulus to increase behaviour
NR = remove an aversive stimulus to increase behaviour
PP = add an aversive stimulus to decrease behaviour
NP = remove a pleasant stimulus to decrease behaviour
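The four quadrants above follow from two yes/no questions, which a minimal sketch can make explicit. The function name and the string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change: str, behaviour_change: str) -> str:
    """Name the operant quadrant.

    stimulus_change:  "add" or "remove" (what happens after the behaviour)
    behaviour_change: "increase" or "decrease" (effect on future frequency)
    """
    # "Positive" means something is added, "negative" means something is removed.
    sign = {"add": "positive", "remove": "negative"}[stimulus_change]
    # Reinforcement increases behaviour, punishment decreases it.
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


for change in ("add", "remove"):
    for effect in ("increase", "decrease"):
        print(change, effect, "->", classify_consequence(change, effect))
```

Note that "positive" and "negative" describe only whether a stimulus is added or removed, not whether the outcome is pleasant.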
When is a bridging stimulus needed
During positive reinforcement
What is a bridging stimulus, and what kind of temporal paradigm is it?
Bridges the response and the delivery of the reinforcement
Trace temporal paradigm (anticipatory)
When does the bridging stimulus become a secondary reinforcer
When it takes on the properties of the reinforcement (treat) via classical conditioning
A positive reinforcer is…
A reinforcer that produces an increase in the frequency of the desired behaviour (e.g. food)
A negative reinforcer is…
An aversive stimulus whose removal strengthens a behaviour
e.g. pressure from a choke collar
What is negative reinforcement
Negative reinforcer is removed, leading to termination of pain/reduction of fear
e.g. person scares dog, dog growls, person leaves
What is punishment
Presenting an aversive stimulus or removing a pleasurable stimulus after an undesirable behaviour has occurred, which reduces that behaviour in future
What is positive vs negative punishment
Positive = presentation of aversive stimulus
Negative = removal of pleasurable stimulus
How does the success of punishment compare to that of reinforcement?
Temporary effects
Behaviour that is suppressed might be “saved up” and appear again
Types of positive punishment
- Interactive punishment: animal associates unpleasant stimulus with person, behaves differently when person is around, fear-related problems may arise, aggression
- Remote punishment: association between punishing stimulus and person is removed (shock collars)
Example of negative punishment
Social punishment (remove social interaction; timeout)
What does successive approximation help us do
Modify existing behaviour or create new one
What is successive approximation
Trial and error learning
Animal’s natural curiosity
How does successive approximation work
Reinforce the dog’s movement in the right direction
Any increment that approximates the goal becomes the new threshold (what is needed for the reward). It does not have to be a significant change
Leads to extinction of previous behaviour that no longer meets the threshold
Selective reinforcement
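The moving threshold can be sketched as a small loop. This is only an illustration under stated assumptions: behaviour is scored on a hypothetical 0 to 10 "closeness to goal" scale, and the function name `shape` is invented here.

```python
def shape(responses, goal, start_threshold=0):
    """Return (reinforced_flags, final_threshold) for a sequence of responses.

    Each response is a score for how close the behaviour is to the goal
    (hypothetical scale, larger = closer).
    """
    threshold = start_threshold
    reinforced = []
    for r in responses:
        if r >= threshold:
            # The response meets the current criterion, so it is reinforced.
            reinforced.append(True)
            # Any improvement, however small, becomes the new criterion.
            threshold = max(threshold, min(r, goal))
        else:
            # Old approximations below the criterion go unreinforced (extinction).
            reinforced.append(False)
    return reinforced, threshold


flags, final = shape([1, 1, 3, 2, 5, 10], goal=10)
print(flags, final)  # [True, True, True, False, True, True] 10
```

The response scored 2 is not reinforced because the criterion had already moved up to 3, which is the selective reinforcement described above.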
What can we do with a behaviour once it is shaped
Chain it
What was Skinner’s rat trick
Rat pulls string releasing a marble
Pick up the marble with paws, carry it to a tube
Drop it in the tube, receive a pellet
What is a heterogeneous chain
Chain involving multiple types of behaviours
A chain that involves only one type of behaviour is…. e.g.?
Homogeneous chain
e.g. 1 press, 2 press, 3 press
Why are schedules of reinforcement helpful
Produce reliable patterns of behaviour that can be maintained
Two different schedules to provide reinforcement based on…
Time (interval)
Response (ratio)
What is a fixed interval schedule? example
Reinforces the first response after a set period of time has passed (FI 1 min = 1 min, 1 min, 1 min)
e.g. waiting for a bus that runs on a fixed timetable
What is a variable interval schedule? example?
Reinforcement for the first response occurring after a period of time that varies around an average (VI 1 min = reinforce after 10 s, 2 min, 50 s = average 1 min)
e.g. trying to get through on a busy phone line
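Both interval schedules can be sketched with one function: a response is reinforced only if enough time has passed since the last reinforcer. The function name, the choice to time each wait from the previous reinforcer, and the example times (in seconds) are all assumptions made for illustration.

```python
from itertools import cycle


def interval_schedule(response_times, intervals):
    """Return the response times that earn reinforcement.

    intervals: required waits in seconds, cycled in order.
               One value gives a fixed interval (FI) schedule,
               several values give a variable interval (VI) schedule.
    Assumption: each wait is timed from the previous reinforcer.
    """
    waits = cycle(intervals)
    available_at = next(waits)  # reinforcement first becomes available here
    reinforced = []
    for t in sorted(response_times):
        if t >= available_at:
            # First response after the interval has elapsed is reinforced.
            reinforced.append(t)
            available_at = t + next(waits)
    return reinforced


responses = [10, 30, 61, 70, 100, 125, 200]
print(interval_schedule(responses, [60]))           # FI 60 s -> [61, 125, 200]
print(interval_schedule(responses, [10, 120, 50]))  # VI, mean 60 s -> [10, 200]
```

Responses made before the interval has elapsed earn nothing, so extra responding does not speed up reinforcement on either schedule.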
What is a fixed ratio schedule?
Given # of responses made before reinforcement is delivered (FR 20 = 20 responses required)
What follows the reinforcement in a fixed ratio schedule
A post-reinforcement pause, then a high, steady rate of responding
What is a variable ratio schedule? Ex?
Reinforcement after a varying # of responses that averages out to a set value (VR 25 = sometimes after 2 responses, sometimes after 100, but 25 responses on average)
e.g. slot machines, phone apps
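Ratio schedules count responses rather than time, which a short sketch can show for both the fixed and variable case. The function name and the example requirement lists are hypothetical; a real VR schedule would usually draw its requirements at random around the average.

```python
from itertools import cycle


def ratio_schedule(n_responses, requirements):
    """Return the 1-based positions of the responses that are reinforced.

    requirements: number of responses needed per reinforcer, cycled in order.
                  One value gives a fixed ratio (FR) schedule,
                  several values give a variable ratio (VR) schedule.
    """
    needed_iter = cycle(requirements)
    needed = next(needed_iter)
    count = 0
    reinforced = []
    for i in range(1, n_responses + 1):
        count += 1
        if count == needed:
            # Requirement met: deliver the reinforcer and start counting again.
            reinforced.append(i)
            count = 0
            needed = next(needed_iter)
    return reinforced


print(ratio_schedule(60, [20]))         # FR 20 -> [20, 40, 60]
print(ratio_schedule(80, [2, 48, 25]))  # VR, mean 25 -> [2, 50, 75, 77]
```

On the variable schedule the learner cannot tell which response will pay off, which is one common explanation for why responding stays steady.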
How is a behaviour maintained? What happens if it is not
Occasional reinforcement
If not, extinction
Difference between extinction and forgetting
Extinction = active process
Forgetting = passive process
Best schedule for behaviour maintenance?
Variable ratio
What is the main limitation of maintenance?
If it goes against the animal’s natural behaviours
Instinctive drift