Animal Learning Flashcards
Animal Learning modules, "He Said, She Said, Science Says" by Dr. Friedman
Foundation of all training—classes of conditioning
immediate consequences (OC) and associations/patterns between events (CC)
Real Work of Dog Trainers
practical understanding
- Improve the dog’s behavior.
- Bring owner’s expectations into a realistic range.
- Find the sweet spot in the middle.
Intervention Categories
- Management of behavior
- Training and behavior modification
- Normalizing, education, empathy building
- Exercise, diet, mental stimulation
Operant Conditioning
Supplying immediate consequences contingent on particular operant behaviors you want to change.
“Dogs do what works.”
Learner’s choice is inherent to OC.
One of the most studied phenomena in the history of psychology, and quite possibly THE biggest goldmine for dog trainers.
Alternate OC term
Instrumental Conditioning
Classical Conditioning
CC
Learned association between events—anticipating an event when another reliably predicts it.
CS predicts UCS, resulting in CR.
Affects emotions.
Tip-offs about what will happen next. Behavior has no effect on outcome.
Alternate CC terms
- Pavlovian Conditioning
- Respondent Conditioning
Edward Lee Thorndike
coined term
Law of Effect
Define
Law of Effect
Behavior is a function of its consequences.
Animals adjust their behavior depending on the effects it achieves.
Edward Lee Thorndike
John Watson
coined term
Behaviorism
Define
Behaviorism
Behavior—rather than internal events—should be the stuff of psychology.
B. F. Skinner
coined terms and major focus
- Operant Conditioning
- Reinforcement
- Punishment
- Reinforcement schedules
How R and P affect the frequency of behavior.
“Stay out of the black box.”
Reminder from B. F. Skinner not to try to get in the animal’s head—R and P are strictly defined by their effect on behavior.
This is ABA.
What is R or P is not always intuitive—focus on the change in behavior.
Ivan Pavlov
coined term
Classical Conditioning
First question in training
Watershed decision—tops the Technique Choice Flow Chart
Is this dog upset?
Examples of “upset”
emotions
- fearful
- anxious
- worried
- stressed
- uncomfortable
- shutdown
Does not include amped up or excited.
Technique Choice Flow Chart
Systematic guide for which training technique to use based on actual circumstances.
Training a comfortable dog
Dog is not upset
Manipulate consequences using OC.
Technique Choice Flow Chart
Training an upset dog
CC—+CER
- Change the underlying emotional response
- Learn that whatever is upsetting is safe or even good
- Ends the motivation to hide, bark, growl, behave aggressively, etc.
Define:
CER
Conditioned Emotional Response
Define
Conditioned Emotional Response
how, + and -
- CER procedure—CC to change emotional response
- Counterconditioning
- Side effect of R+ in OC and DRI
- +CER—happy anticipation
- -CER—fear or anxiety
e.g. teaching a dog to like being body-handled
CER Execution Rules
Critical to success of CER! Must follow the rules to a T.
- Correct order of events
- CS occurs or starts before US
- 1:1 ratio of CS:US
- CS without US is an extinction trial
- Weaken competing CSs via extinction trials
Single CER trials at random times if possible
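The execution rules above can be sketched as a small check on a single trial. This is only an illustration: the function name, the use of start times in seconds, and the returned messages are assumptions made for this sketch, not part of the course material.

```python
from typing import Optional


def check_cer_trial(cs_start: Optional[float], us_start: Optional[float]) -> str:
    """Check one trial against the CER execution rules.

    cs_start / us_start are hypothetical start times in seconds;
    None means that stimulus did not occur in this trial.
    """
    if cs_start is None and us_start is None:
        raise ValueError("a trial needs at least one stimulus")
    if us_start is None:
        # CS occurred alone: this weakens the association.
        return "extinction trial: CS without US"
    if cs_start is None:
        # US occurred alone: the CS:US ratio is no longer 1:1.
        return "breaks 1:1 ratio: US without CS"
    if cs_start < us_start:
        return "ok"
    # CS started at the same time as, or after, the US.
    return "wrong order: CS must start before US"


print(check_cer_trial(0.0, 1.5))
print(check_cer_trial(2.0, None))
```

A trial only counts toward the +CER when the result is "ok". An extinction trial is useful when it is run on a competing CS on purpose.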
Very mildly upset dog
e.g. leery of new chrome garbage can
Habituation
Define
Habituation
Give examples (trashcan and mild fear of vacuum)
Passive learning through exposure rather than CC pairing. Decreased anxiety to a stimulus over time—does not predict anything.
e.g. no action around trashcan, or leaving the vacuum on for a long time until it gets old
Mildly upset dog
e.g. afraid of vacuum
- Classical counterconditioning procedure
- Habituation
- both
Can also use DRI for +CER side effect.
Define
Counterconditioning
mild vacuum example
Countering an existing emotional response with +CER.
e.g. cheese for +CER with a running vacuum
Moderately to intensely upset dog
e.g. severely afraid of vacuum
- Desensitization and counterconditioning
- Suggest med consult with vet
Define and steps (vacuum example)
Desensitization
Breaking down counterconditioning into smaller, easier steps the dog can handle.
Gradual increase in intensity of unpleasant stimulus needed for moderate to severe emotional response.
For severe fear of vacuum
- Lay the vacuum down while it is off
- Strong R+ like cheese at a fairly comfortable distance
- Gradually decrease distance
- Put the vacuum in the upright position
- Repeat distance reduction as above
- Turn vacuum on
- Repeat distance reduction as above
Technique Choice Flow Chart
First question if using OC
Goal to increase or decrease behavior?
- increase desired behaviors
- e.g. sit, down, stay, recall, etc.
- decrease unwanted/problem behaviors
- e.g. barking, chewing furniture, eliminating in the house, play biting, etc.
Elicit untrained behavior
why and how
Create opportunity to reinforce
- prompting
- shaping
- capturing
Reinforcement
Any consequence which increases or maintains the frequency of a behavior.
Reinforceables
We need behaviors to happen so we can reinforce them.
Methods of decreasing a behavior
- DRI
- Punishment
- both
P- only—nothing scary or violent
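The Technique Choice Flow Chart cards above can be sketched as a small decision function. This is a study aid only: the function name and the string labels for upset level and goal are assumptions made for this sketch, and the real chart may have more branches than these cards record.

```python
def choose_technique(upset_level: str, goal: str = "increase") -> list:
    """Return the options the cards list for a given situation.

    upset_level: "none", "very mild", "mild", or "moderate to intense"
    goal: "increase" or "decrease" (only used when the dog is not upset)
    All of these labels are hypothetical, chosen for this sketch.
    """
    upset_options = {
        "very mild": ["habituation"],
        "mild": ["counterconditioning", "habituation"],  # either or both
        "moderate to intense": [
            "desensitization and counterconditioning",
            "suggest med consult with vet",
        ],
    }
    if upset_level in upset_options:
        # First question: is this dog upset? If so, use CC to build a +CER.
        return upset_options[upset_level]
    if upset_level != "none":
        raise ValueError(f"unknown upset level: {upset_level!r}")

    # Dog is comfortable: manipulate consequences using OC.
    oc_options = {
        "increase": ["prompting", "shaping", "capturing"],  # then reinforce
        "decrease": ["DRI", "P-"],  # either or both
    }
    if goal not in oc_options:
        raise ValueError(f"unknown goal: {goal!r}")
    return oc_options[goal]


print(choose_technique("mild"))
print(choose_technique("none", goal="decrease"))
```

The emotional question is checked before the OC question, so an upset dog never reaches the increase/decrease branch.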
Acronym
DRI
Differential Reinforcement of an Incompatible behavior
Define, side effect, and examples
Differential Reinforcement of Incompatible behavior
One option to decrease an unwanted behavior.
* Develop an alternative [mutually exclusive] behavior.
* “Do this instead of that.”
* e.g. Sit for greeting means they can’t jump
- R+ training has a +CER side effect
- e.g. training to jump over the vacuum teaches a trick and creates +CER to the vacuum itself
Punishment
Any consequence which decreases the frequency of a behavior.
Define in general
Contingency
If/then relationship—one thing depends on another happening.
Training depends on the learner noticing the contingency—both in CC & OC
Define and complete the examples
OC contingency
examples (crate barking, raiding trash, taking aspirin, touching hot stove)
Behavior-consequence
If this behavior, then that consequence.
- owner lets dog out of crate for barking
- dog eats tasty food for raiding the trash
- headache goes away after taking aspirin
- burned by touching hot stove
Connection between behavior (acting on the environment) and consequence.
Pattern over time— B -> C -> modified B
Define and give examples
CC contingency
Order of events
If event X happens, then event Y follows.
* Behavior has no effect on the contingency
* Owner picks up briefcase, then dog is left alone for 6 hours
* Fed at the same time every day
* Car rides predict trip to the dog park
* Car rides predict scary vet visits
* Different routes predict different destinations
aka Association between events
Discrimination Learning
Strength in dogs! Recognizing fine discriminations between similar events.
“When is it worthwhile to spend behavioral dollars?”
e.g. route to vet vs. route to dog park
Define
OC Quadrants
Classes of consequences defined by their method and effects on behavior.
* method—adding or removing stimulus
* effect on behavior—increased/maintained or decreased
Can only be identified after the behavior change is observed.
They define the four corresponding kinds of OC.
Often but not always intuitive—intention does not equal effect on behavior.
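The two axes above (method and effect) can be sketched as a lookup. The function name and the string labels are assumptions made for this sketch. The point it illustrates is that the quadrant is named from the observed change in behavior, never from what the trainer intended.

```python
def classify_quadrant(stimulus_change: str, behavior_change: str) -> str:
    """Name the OC quadrant from the method and the observed effect.

    stimulus_change: "added" or "removed" (the method)
    behavior_change: "increased", "maintained", or "decreased" (the effect)
    These labels are hypothetical, chosen for this sketch.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError(f"unknown stimulus change: {stimulus_change!r}")
    if behavior_change not in ("increased", "maintained", "decreased"):
        raise ValueError(f"unknown behavior change: {behavior_change!r}")

    # Positive = stimulus added, negative = stimulus removed.
    sign = "+" if stimulus_change == "added" else "-"
    # Reinforcement increases or maintains behavior; punishment decreases it.
    kind = "R" if behavior_change in ("increased", "maintained") else "P"
    return kind + sign


print(classify_quadrant("added", "increased"))    # R+
print(classify_quadrant("removed", "decreased"))  # P-
```

Because the second argument is the observed effect, the same stimulus can land in different quadrants for different dogs.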
Summarize
“Behavior doesn’t just flow like a fountain. Behavior is a tool animals use to produce consequences.”
quotation by Dr. Susan Friedman
No motivation, no training.
Reinforcement and punishment are the natural effects of consequences—do more of what works, and less of what doesn’t.
All behavior has costs, and needs an offsetting benefit to be worthwhile.
Behavior
as defined by Dr. Susan Friedman
A tool animals use to produce consequences.
Dogs do what works! Trainer’s job is to identify & employ motivators
Various motivations for dogs
Basic needs and comfort
- Avoid pain and extreme temperatures
- Food
- Water
- Preferred resting surfaces like beds or sofas
Prey Drive and Instincts
- Critters running away
- Toys that simulate critters (ball, frisbee, tug)
- Interesting smells
- Walks
Social Behaviors
- Being with someone the dog is bonded to
- Praise, patting, and attention
- Play opportunities
- Other dogs
Varies by dog and by time for each dog.
Food as a motivator
Works on all animals.
Give examples
Play Opportunities
- tug
- fetch
- rough housing
- dog-dog play
Trainer’s job
in OC
Identify current motivators and make them contingent on desired behaviors.
Manipulating consequences to change behavior.
Define and give examples
Operant
- A class of behavior.
- e.g. sitting, barking, pawing, urinating, nose-touching, etc.
- Operating on the environment to produce certain immediate consequences.
- Animals use operant behaviors on the environment to see what works (R as consequence) and what doesn’t (P as consequence).
Response
A single repetition of a behavior.
If your dog sits, that’s one repetition of the operant “sitting.”
Define and give examples
R+
Positive Reinforcement
Addition of a motivator as consequence of target behavior.
Anything given that increases or maintains a behavior.
Good stuff happens or starts.
Intuitive examples: treat, door opening, play with toy, access to bed or sofa, patting
Define and give examples
R-
Negative Reinforcement, aka relief
Termination of ongoing punishment in response to target behavior.
Anything taken away that increases or maintains a behavior.
Bad stuff stops or goes away, aka relief from P+.
Intuitive examples: stopping shock, ear pinch, or collar tightening
Define and give examples
P+
Positive Punishment
Addition of a punisher as the consequence of unwanted behavior.
Anything added that decreases the frequency of a behavior.
Bad stuff starts or happens.
Intuitive examples: hurting or scaring the dog by yelling, striking, rolling or pinning, shocking, tightening prong collar, leash corrections/jerks, shake cans, spray collars
Define and give examples
P-
Negative Punishment
Removal of a motivator or end of an enjoyable activity as the consequence of unwanted behavior.
Anything terminated that decreases the frequency of a behavior.
Nothing scary or violent is necessary for highly motivating punishment!
Good stuff stops or goes away—timeout ends freedom and R+ opportunities.
Intuitive examples: timeout, toy put away, playmate disengages, no food reward, canceling a game or training session
The quadrants free of aversives
i.e. deal with good stuff
R+ and P-
Acronym
ABA
Applied Behavior Analysis
Define
Applied Behavior Analysis
Detailed definition from “He Said, She Said, Science Says” by Dr. Friedman
The implementation of behavior principles and methods to solve practical behavior problems by carefully arranging antecedents [and consequences].
About the actual effect on behavior, not the intention of the trainer.
More broadly, ABA is the modern use of OC in applied settings.
Intention
in OC/ABA
Irrelevant.
R & P are defined by the effect on behavior (increase/maintain or decrease).
Give examples
Adult human P-
Take away time and money.
- parking tickets
- fines, taxes, loss of income
- ice cream falling on the sidewalk
- a boring meeting
- a penalty in sports (10 yards away from goal in football)
Chicken Camp P-
visual discrimination task
- trained to peck target of particular color or shape
- 2 minute training sessions
- consequence of pecking an incorrect target is removal of the correct one for 20-30 seconds
- loss of opportunity for R+
- high magnitude
- major change in behavior (decrease in pecking incorrect target) after just a few P-
Learning
From “He Said, She Said, Science Says” by Dr. Friedman
Behavior change due to experience [in OC and CC?]
Stimulus
Anything an animal can perceive—visual, auditory, tactile, olfactory, or taste
Compound stimulus
Multiple stimuli occurring simultaneously
CS
Conditioned Stimulus
“Here it comes!”
Always starts before US
Novel CS
First experience with a stimulus is a more powerful conditioner
Major opportunities for +CER, more so for puppies
CS pre-exposure effect
Prior experience with a CS creates a learned response, slowing a CER
US or UCS
Unconditioned Stimulus
The “it” in, “Here it comes!”
The final event in the stimulus chain
Increase US potency
regarding CER
Rarity—pair a particular high magnitude reward 1:1 with the target CS for increased CER effect
CR
Conditioned Response
The response obtained after CC
UCR
Unconditioned Response
Natural response to US/UCS (e.g. salivating)
Also applies to pre-CC response to CS
Offset training
end of CS/US
End of a CS predicts the end of a US
“There it goes!” or “closing the bar”
Example—+CER for dogs being around, and reinforcers end when the dogs leave
Temporal conditioning
Dogs are excellent at learning and estimating repeated time intervals, for better and for worse.
Be careful not to create an accidental CS!
Behavior chain
Sequence of behaviors
Define
Behavior analysis
From “He Said, She Said, Science Says” by Dr. Friedman
The science of behavior change that
studies functional relations between behavior and environmental events.
ABC
aka functional assessment/analysis
as taught by Dr. Susan Friedman
Antecedent-Behavior-Consequence contingency
Smallest unit in OC—used to analyze the behaviors we want to understand, predict and change.
ID target B, then isolate & control immediate A and C to change behavior
“Bleeding”
regarding CER training
CS with duration starts before and overlaps the US
Advantageous but not required for CER training
Backwards conditioning
Counterproductive CC/CER—US before target CS
Reduces CS potency
Don’t do it!
Simultaneous conditioning
Counterproductive CC/CER—presenting CS with US at the same time as a compound stimulus
Competing CS
real world training
In the stimulus-rich, messy real world, your CS is always part of a compound stimulus
Examples—time interval, putting on bait pouch, reaching for treats, bag crinkle, smell of food, praise
Overshadow
CS ignored in favor of intrinsically salient stimuli—smells, noticeable touch
Block
A CS with established CR out-competing the new CS—reaching for treat pouch or pocket, bag crinkle
Time-shift
competing CSs
Delay competing CSs until after the target CS
needs an example
Positive
Adding or initiating a stimulus as a consequence.
Regardless of whether it is reinforcement or punishment.
Negative
Removing, terminating, or subtracting a stimulus as a consequence.
Regardless of whether it is reinforcement or punishment.
Consequence
C
The result which is contingent on a particular behavior
OC is the manipulation of consequences resulting in a change in behavior
Define
Aversive stimulus
aka aversives
Anything painful or scary.
If it’s ongoing, the beginning is P+ and end is R-.
Cost of aversive stimulus
quotation from “He Said, She Said, Science Says” by Dr. Friedman
“People should view forceful and coercive training methods as stealing behavior that can be given to us instead by skillful use of positive reinforcement and facilitative antecedents.”
Magnitude
Intensity or severity of a stimulus.
higher magnitude = more motivating
Applies to both reinforcers and punishers.
Force-free
Training without the use of aversive stimuli.
R+ and P- only.
Define, use in training, list formal antecedents
Antecedent
The stimuli, events, and conditions that occur immediately before a behavior.
In training, indicates when a behavior will be reinforced, increasing likelihood.
What works when.
B is a function of C in the presence of A
Two categories of immediate/formal antecedents—prompts and cues
Define and examples
Inadvertent (passive) antecedent
Clues to the dog for when something might “work” that we don’t notice or intend.
- getting ready to leave
- certain person is present
- TV on in the evening
Consequence
The stimuli, events, or conditions that immediately follow a behavior, influencing future frequency
Define and examples
Prompt
Antecedent that works naturally to elicit a behavior—even untrained dogs tend to respond
- food lure
- crouching down
- enticing/high-pitched noises or clapping
- enticing creature or object
- moving away quickly
Define, examples, and when to add
Cue
Signal which elicits a behavior that only acquires meaning through training.
“Stylized” antecedent—no natural tendency toward a behavior
verbal cue, hand signal, or other trained cue
Only install on robust terminal behavior! (What and why then when.)
Use of Prompting
Coaching or manufacturing a behavior so it can be reinforced.
Showing the “what” behavior.
Capturing
Reinforcing a behavior when a dog happens to do it.
Shaping
Rewarding the closest approximation to a target behavior, gradually increasing criteria until the final behavior is achieved.
Prompt or capture each approximation to develop the new behavior
Luring
Orienting prompt used to guide into a desired behavior.
Fading
Gradual elimination of a prompt after the target behavior is strong
Stimulus control—removing one antecedent in favor of another
Purpose of Fading
Removing a prompt while still getting the behavior.
Sneaky key to fading a food lure
dog’s side
The dog has to have faith that reinforcement will come for following an empty-handed signal.
Define and how to respond
“Literal” dogs
Won’t perform a behavior without a lure.
Use splits toward hand signal.
Bury the lure, pay from the other hand, gesture higher/faster
Latency
Lag time between an antecedent and a behavior
Usually faster with repeated reinforcement—B as a consequence of C ASAP
Infinity latency
Infinite time to perform a behavior after an antecedent—freeze and wait.
Dog still needs to be engaged. No repeating prompts or cues!
Define general and formal
Stimulus control
Attaching a behavior strongly to a cue.
Formal stimulus control means always performing the behavior on cue, never for any other cue, and never spontaneously “off cue”
Practical Stimulus Control
Training as far on each behavior as the owner needs.
Off cue, multiple antecedents (“Sit” or at the door), or guessing wrong
Fifty Buck Bet
Reminder to be highly confident that a behavior will occur before you start adding a cue ahead of the prompt.
quote from Gary Wilkes
Define contingency and steps
Cue installation
Creating CC between new antecedent (cue or Ac) and a known prompt (Ap)
Must be sequential! Ac -> Ap -> B -> C
Prompt Jumping
After a cue, the dog performs the B without waiting for the prompt
Response cost—
Shopping for verbal
what, when, and side effect/remedy
Reliably prompt jumping, so higher criteria. No behavior on verbal? Incentivize the jump.
- Use the prompt
- Lower magnitude R+
- Praise but no treat, etc.
- Conditioned punisher (NRM) and mini timeout
- Brief dead period in training session
- No opportunity for R+
- Lower magnitude R+
May crash RoR—can alternate cost/no-cost to stay engaged.
Define
Antecedent intervention
in behavior problems
Reduce or eliminate stimuli that precede problematic behavior
e.g. physical barrier to prevent seeing passersby to head off barking
Reinforcement Schedule
How often correct responses will be reinforced (and when they will not).
Different schedules have different uses and effects on behavior.
Continuous reinforcement schedule
aka CRF or FR1
Every correct response is paid
Best schedule when building a new behavior
Intermittent schedule
aka intermittent ratio schedule
Correct responses are sometimes paid
Variable schedules best maintain behavior
Casinos use intermittent R+ to keep people playing.
Fixed ratio schedule
aka FR# (e.g. FR3)
Reinforcement at a fixed ratio, such as FR3—every third correct response is paid
More reliable (resilient to extinction) B than continuous schedule
Caution for Fixed Ratio Schedules
If the R is consistently too far apart, the dog might decide it isn’t worthwhile.
Define and key benefit
Variable ratio schedule
aka VR# (e.g. VR5)
Paying a set ratio of correct responses on average, but on unpredictable trials, such as VR3—correct behavior is reinforced every third time on average (any 10 out of 30)
Most resistant to extinction (behavior survives longer without R)
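The difference between fixed and variable ratio schedules can be sketched by marking which correct responses get paid. The function names are made up for this sketch. Drawing each gap uniformly from 1 to 2n-1 is just one assumed way to get an average of n; the cards do not say how the variable gaps should be chosen.

```python
import random


def fixed_ratio(n: int, responses: int) -> list:
    """FR-n: pay every n-th correct response (FR1 is continuous)."""
    return [i % n == 0 for i in range(1, responses + 1)]


def variable_ratio(n: int, responses: int, rng: random.Random) -> list:
    """VR-n: pay after a random number of correct responses averaging n.

    Assumption for this sketch: each gap is drawn uniformly from
    1..(2n - 1), which has a mean of n.
    """
    paid = []
    until_next = rng.randint(1, 2 * n - 1)
    for _ in range(responses):
        until_next -= 1
        if until_next == 0:
            paid.append(True)
            until_next = rng.randint(1, 2 * n - 1)
        else:
            paid.append(False)
    return paid


print(fixed_ratio(3, 6))  # [False, False, True, False, False, True]
vr = variable_ratio(3, 30, random.Random())
print(sum(vr), "of", len(vr), "responses paid")
```

On FR3 the learner can predict exactly which response pays. On VR3 any response might pay, which fits the card's note that variable ratio schedules are the most resistant to extinction.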
Interval schedule
Reinforcement based on duration of an ongoing behavior.
- Fixed interval—every X seconds
- Variable interval—every Y seconds on average
- Still structured—not at random
For behaviors without discrete instances, such as duration down-stay
Applications for intermittent schedules
- Building duration
- Resistance to extinction
- Increasing variability
Extinction
OC and CC
OC—Unreinforced behavior decreases
CC—CS without US weakens conditioning
Breaking down contingencies to reverse conditioning. No longer “works.”
Extinction trial
Each occurrence of a CS not followed by the US
Matching Law
Expensive behaviors need high value rewards often enough.
Animals invest their behavior to maximize known reinforcement schedule and magnitude.
They learn patterns and optimize for what works. We use this to get the behavior we want, and shake it up to keep getting what we want.
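The card above states the idea informally. For reference, the usual textbook form of the matching law for two competing behaviors (Herrnstein's equation, which this deck does not spell out) is sketched below. Here B_1 and B_2 are the rates of the two behaviors, and R_1 and R_2 are the rates of reinforcement each one earns.

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
```

Read it as: the share of behavior spent on an option tracks the share of reinforcement that option pays.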
How we use the Matching Law
- Behaviors we want must pay well enough
- Unwanted behavior must not pay off
Learner decides what is reinforcement—not intention!
Superstitious learning
Attachment of non-contingent (coincidental) reinforcement to a behavior by the learner, causing an unintended behavioral increase.
Imagining an OC contingency that isn’t there—adding something unnecessary along with the target behavior.
Self-reinforcing—behavior increases, so coincidences become more likely
Establishing Operation
Reduce or eliminate a freely given motivator to establish it as a more potent reinforcer.
Closing the economy. “Nothing in life is free.”
Closed Economy
100% of a motivator is earned through training, creating the strongest possible motivation.
Open Economy
When even a small percentage of a motivator is given freely.
Abolishing Operation
Decrease the potency of a competing motivator through saturation.
Almost anything gets old if you get enough of it.
e.g. 5 minute play session at the beginning of a group class, and another midway through.
Premack’s Principle
Grandma’s Rule
Any high probability (preferred) behavior can be used to reinforce a lower probability (less preferred) behavior.
Motivation always possible by using “distractions” as reinforcers.
The high probability behavior of eating a snacko can reward Sit or Come.
Dunbar’s take: turning distractions into rewards
Primary Reinforcers
aka Primaries or Unconditioned Reinforcers
Intrinsically rewarding stimuli—food, play, attention, interesting smells
Conditioned Reinforcer
aka Secondary Reinforcer, Reward Mark, or Bridge
Bridges the gap between the exact desired behavior and the delivery of a primary reinforcer.
Only valuable in OC
How to install a conditioned reinforcer
Straight CC—charging the secondary reinforcer (e.g. clicker) by immediately following with a primary (like cheese) repeatedly
Define
Conditioning Trial
aka Pairing—give general and ideal procedures
Each click/reward instance in charging a secondary reinforcer.
- Vary time between pairs to keep each trial separate.
Ideal
- spread throughout the day
- pair with a variety of primaries
Anticipatory Behaviors
React to incoming primary
[update]
Feeding for position.
Conditioned Punishers
aka Secondary Punishers
Marks the moment of unwanted behavior that earns P-. Always followed by a primary punishment such as timeout.
“Too bad” or other marker. Can be charged on the fly as needed.
A ticket is punishing because it always precedes a primary punishment.
Warning cues
“Careful,” “gentle,” or similar to allow informed choice—repeating the behavior will result in P-
Puppy bites too hard—“careful.” Puppy bites just as hard again—P-!
Give ONCE and only once for efficacy.
“Safety” cue
Indicates correct choice after warning cue (e.g. “thank you”)
give an example
Punishment Schedules
Punishment must be used every time. Continuous punishment schedules.
Warning Cues
Combine duplicates
“Careful,” “don’t,” “easy,” or “gentle.”
Gives warning before a secondary punisher, allowing for an informed choice of next behavior.
Use ONCE.
P- client compliance
Go over importance of doing P- each and every time.
Human tendency to not follow through because it feels like a lot of work (Premackian broccoli!)—strict consistency makes P- highly efficacious
AL2 slide 27: mentions Timeouts: What to Expect handout
Punishment Magnitude
Increase motivation by matching a more undesirable behavior with a more expensive penalty.
Cancelling whole training session instead of delaying a few seconds
Taste Aversion Learning
Major exception to the close timing between CS and US needed for CC. The illness can come minutes or hours after the food, especially if it is a novel food.
Science
as explained by Dr. Susan Friedman in “He Said, She Said, Science Says”
A process of self-correction over time through peer-review and independent verification of findings—more valuable than conventional wisdom
May change later, but it provides the very best, most reliable info now.
Limits to B mod strategies
From “He Said, She Said, Science Says” by Dr. Friedman
“Behavior change strategies are limited only by our imagination and our commitment to using the most positive, least intrusive, effective strategies.”
Empowerment
Choice & control
From “He Said, She Said, Science Says” by Dr. Friedman
“…To the greatest extent possible all animals should be empowered to exercise personal control over significant environmental events.”
“…One part of what makes consequences reinforcing is the power to control one’s own outcomes.”
Study: babies with control over mobiles happier than babies without it
Learned helplessness and its impact
From “He Said, She Said, Science Says” by Dr. Friedman
“…A lack of control can have pathological effects including depression, learning disabilities, emotional problems …and suppressed immune system activity.”
Animals subjected to aversive stimuli without ability to escape will later remain passive in the presence of the stimuli even with ability to escape. Adverse effects of lack of control can be minimized “by providing them with experiences in which their behavior is effective.”
“Ripples in a pond”
by Dr. Susan Friedman
Behavior is like a stone thrown into a pond, with antecedents and consequences rippling out from it.
Antecedents ripple backwards in time, and consequences ripple forwards.
99% of the time we only need to understand the first ripple—the most immediate antecedents and consequences.
True or False?
Animal behavior is random.
False
Any healthy animal’s behavior is organized (OC).