Learning Theory Flashcards
Ivan Petrovich Pavlov
Classical conditioning
You can condition the response to a stimulus.
Taking a previously neutral stimulus (bell) and creating a conditioned response (salivation) by association with an unconditioned stimulus (food). Bell is rung before the food is presented.
John B. Watson
Founder of Behaviorism, based on Pavlov’s work
All behavior is just conditioned responses.
Watson suggested that animal behavior, including human behavior, is primarily the result of conditioned responses, or in simpler terms, behavior tends to be based on responding to a given stimulus – just like Pavlov’s dogs responded to the stimulus of the bell or the presence of food.
He terrorized Little Albert (“Baby Albert”) to create fear in response to a previously neutral stimulus (white rat).
Edward L. Thorndike
Law of Effect—basis of operant conditioning
Learning by trial and error. The association between stimulus and response is a connection. A consequence strengthens or weakens this connection.
If doing something makes a good thing happen, I am more likely to repeat it. If a bad thing happens, I probably won’t do it again or as often.
Burrhus Frederic Skinner
Operant Conditioning
Making the Action more or less likely to occur depending on whether the Consequence was good or bad.
Proposed that behavior is controlled by a stimulus immediately followed by an action and a consequence. Introduced the term reinforcement.
Behavior which is reinforced tends to be repeated (stronger connection). Behavior which is not reinforced will tend to die out (weaker connection).
David Premack
Premack Principle, or relativity theory of reinforcement
An animal will do something they DON’T like to do so that they get to do something else that they DO like to do.
Form of operant conditioning.
Enjoyable behaviors are “higher probability.”
Unenjoyable behaviors are “lower probability.”
Reinforcing a lower probability target behavior by rewarding the animal with the opportunity to engage in a more desirable, higher probability behavior.
Examples: Wait before walk. Veggies before dessert. Study before playtime.
Classical conditioning
Taking a previously neutral stimulus (bell) and creating a conditioned response (salivation) by association with an unconditioned stimulus (food). Bell is rung before the food is presented.
A learning process that occurs when two stimuli are repeatedly paired: a response which is at first elicited by the second stimulus is eventually elicited by the first stimulus alone.
Food > salivate (unconditioned response)
Food + bell > salivate
Bell > salivate (conditioned response)
Can happen intentionally, yet also happens organically in daily activities, or in single traumatic events.
Classical conditioning involves an involuntary response. (Scared a critter? “Classic.”)
aka Pavlovian Conditioning and Associative Learning
In classical conditioning, that response is an involuntary or automatic response. It can also create an emotional response.
Example: thunder while in car, doesn’t want to get in car anymore
Stages of classical conditioning
Simple:
before: there is a new stimulus (doorbell)
during: pairing the new stimulus (doorbell) with the old stimulus (treats)
after: the new thing (doorbell) and the involuntary response are paired
Now the dog hears a doorbell and looks to their person for a treat.
- Before conditioning (no learning has happened)
a. unconditioned stimulus (US) prompts
b. unconditioned response (UR) is a natural, reflexive response
- During conditioning (learning)
a. neutral stimulus (NS) happens before
b. unconditioned stimulus
c. NS becomes conditioned stimulus (CS)
- After conditioning (association established)
a. CS prompts conditioned response (CR)
b. the behavior which was the original UR is now a CR, as it is now happening in response to a CS
- Later, a novel neutral stimulus can be paired with a conditioned stimulus to cause a conditioned response to a secondary conditioned stimulus
Second-order conditioning
aka higher-order conditioning
A neutral stimulus (NS) is paired with a conditioned stimulus (CS), creating a secondary conditioned response (CR) without direct involvement of the unconditioned stimulus (US).
The dog hears the bag of food rustling (CS) and comes running (CR) for dinner (US). Later, they hear the pantry door opening (NS), which is followed by the bag rustling (CS). Eventually, the pantry door makes them come running for dinner.
CC: Acquisition
When a neutral stimulus (NS) becomes a conditioned stimulus (CS).
The conditioned response (CR) behavior increases as the association is strengthened.
Acquisition is the learning bit—it’s when the new stimulus (bell) starts to mean something.
CC: Extinction
The conditioned response (CR) decreases as the association between the conditioned stimulus (CS) and the unconditioned stimulus (US) is weakened.
“Unlearning” the conditioned response (CR) (salivating) by consistently presenting the conditioned stimulus (CS) (bell) without the unconditioned stimulus (US) (food).
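A rough way to picture acquisition and extinction is as an association strength that rises with each bell-plus-food pairing and falls with each bell-alone trial. The sketch below is a toy illustration only: the function name `run_trials`, the 0-to-1 strength scale, and the learning rate of 0.3 are assumptions made for this example and do not come from these cards.

```python
def run_trials(strength, trials, paired, rate=0.3):
    """Return the association strength after each trial (toy model).

    paired=True  -> bell followed by food (acquisition)
    paired=False -> bell presented alone (extinction)
    The 0..1 scale and the rate are illustrative assumptions.
    """
    history = []
    target = 1.0 if paired else 0.0
    for _ in range(trials):
        # Move a fixed fraction of the remaining distance toward the target.
        strength += rate * (target - strength)
        history.append(round(strength, 3))
    return history


# Five bell + food pairings, then five bell-alone trials.
acquisition = run_trials(0.0, 5, paired=True)
extinction = run_trials(acquisition[-1], 5, paired=False)
print("acquisition:", acquisition)
print("extinction: ", extinction)
```

In this sketch the strength climbs toward 1 while the bell predicts food and drifts back toward 0 once the bell is presented without food. It does not model spontaneous recovery.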
Spontaneous recovery
The re-emergence of a previously extinguished conditioned response (CR) following a delay.
Example: extinguished behavior (Patti pulling out). She did great for 2 months, then randomly pulled out at a coffee shop (spontaneous recovery).
Classical Counterconditioning
- Change the response to a stimulus
- Create a new conditioned emotional response
Example: changing Ember’s emotional response to someone new coming into the room. Stand at her crate and give her roasted chicken until the person is in the room and Ember’s body is relaxed.
Be sure to keep the animal under threshold of fear throughout the process.
Classical conditioning stage 1
No learning is taking place yet.
US produces UR.
The unconditioned response is a natural, reflexive response.
Classical conditioning stage 2
Acquisition stage
A neutral stimulus (NS) is paired with an unconditioned stimulus (US).
The neutral stimulus becomes the conditioned stimulus (CS) in this stage.
Classical conditioning stage 3
After learning.
The conditioned stimulus (CS) elicits the conditioned response (CR).
CC: Desensitization
The process of very gradually exposing the dog to a scary stimulus, ensuring he stays under the threshold at which he would react or show signs of fear or stress. This is a planned-out process that keeps the dog calm and neutral every step of the way.
Operant conditioning (OC)
Founded on Thorndike’s Law of Effect.
Animal learns to associate a behavior with a consequence.
Skinner conducted experiments to study how behaviors are strengthened, which is called reinforcement, or weakened, which is called punishment.
A fundamental principle of operant conditioning is that a stimulus comes first, followed by a response or behavior, and then a consequence.
ABCs:
Antecedent
Behavior
Consequence
aka Instrumental Learning
The dog’s response is voluntary. They learn that there are consequences for their behavior.
Example: teaching a Sit, the dog makes an association between sitting and a treat. The learning is that the behavior of sitting results in the consequence of a reward. The dog has a degree of control in operant conditioning.
OC Reinforcement
Positive
Positive reinforcement is adding something that will increase the likelihood of that behavior happening again.
OC reinforcers
any stimulus that will help increase or strengthen a behavior
most common: food
OC Reinforcement Schedules:
Continuous Reinforcement Schedule (CRF)
Every correct response, a reward is given.
Intermittent Reinforcement schedule: Fixed Interval (FI)
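A set and unchanging amount of time between rewards. Among the intermittent schedules, this is generally the least productive and the most susceptible to extinction. Why: the learner can predict when the next reward is due, so responding drops off right after each reward, and a reward that fails to arrive on time is noticed quickly.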
Intermittent reinforcement schedule: Variable Interval (VI)
Changing and unpredictable amount of time between rewards.
Intermittent reinforcement schedule: Fixed Ratio (FR) 
When a behavior is rewarded after a set or predictable number of responses.
Intermittent reinforcement schedule: Variable Ratio (VR)
When a behavior is rewarded after an unknown or unpredictable number of responses.
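The four intermittent schedules differ only in the rule that decides whether a given correct response earns a reward. The sketch below is one hedged way to write those rules down. All function names are hypothetical, time is passed in as plain seconds, the interval schedules use the standard definition (reward the first correct response after the interval has elapsed), and the random ranges used for the variable schedules are assumptions chosen so that the average matches the stated mean.

```python
import random


def fixed_ratio(n):
    """FR: reward every n-th correct response. fixed_ratio(1) is continuous (CRF)."""
    count = 0

    def on_response():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return on_response


def variable_ratio(mean_n, rng):
    """VR: reward after an unpredictable number of responses averaging mean_n."""
    required = rng.randint(1, 2 * mean_n - 1)
    count = 0

    def on_response():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)  # draw the next requirement
            return True
        return False

    return on_response


def fixed_interval(seconds):
    """FI: reward the first correct response once `seconds` have passed since the last reward."""
    last_reward = 0.0

    def on_response(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return on_response


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the wait is redrawn unpredictably after every reward."""
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def on_response(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return on_response


fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

On the ratio schedules the learner controls how fast rewards arrive by responding more. On the interval schedules extra responses before the time is up earn nothing.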
Operant conditioning:
Punishment
A consequence that weakens or decreases the likelihood of a behavioral response.
Positive punishment
Adding something, usually an aversive stimulus, to decrease the likelihood of a behavior. This is always the last resort in the humane hierarchy. Can cause increased fear, aggression, or anxiety.
Negative punishment
Removing something pleasant that the dog wants, in order to decrease a behavior.
examples: dog pulling on leash, stop moving in the direction they want to go
food being used as a lure, dog jumps up on handler, food is taken away
OC: Punishers
Any stimulus that will decrease or weaken the likelihood of a behavior.
OC: Extinction
Reinforcement for a behavior stops, and the learned behavior is no longer displayed.
OC: Prompting
Visual signals or physical assistance to elicit a desired behavior, rather than waiting for the learner to spontaneously offer it.
Important to fade any prompt ASAP once the behavior is reliably offered.
OC: Prompting
Lure
Using food to draw the dog’s nose to follow, encouraging the desired behavior.
Generally creates a hand signal through the consistent movement.
OC: Physical Prompting
Physically guiding or touching the learner to help them perform the target behavior.
Example: using a leash to guide a dog, or gently touching their body.
OC: Visual Prompting
A visible signal to encourage a behavior, such as the hand gesture previously used in luring.
OC: Unintentional Prompting
Any action by the handler that often precedes a behavior and becomes a prompt.
Examples: nodding the head, leaning or twisting the body
OC: Fading the Lure
Used to prevent the dog from becoming dependent on the lure. Once the behavior is consistently offered, use a different prompt ASAP.
OC: Body Blocking
Using your body to block the dog, preventing them from going to a particular place.
When a dog is familiar with body blocking, you may be able to stop them by leaning as though you are going to block them.
OC: Shaping
Rewarding successive approximations for complex behaviors. The behavior is broken down into its component steps, and you reward each step that brings the learner closer to the target behavior.
As they progress, only reward behavior that more closely resembles the final behavior, until you can reward only the complete desired behavior.
Metaphor: Record video of the final behavior. Each frame is one step closer to the end behavior.
OC: Chaining
A series (“chain”) of behaviors in which each behavior becomes the cue to perform the next behavior. Reinforcement only occurs after the final behavior.
May also refer specifically to forward chaining, in which the first behavior in the sequence is taught first, progressing toward the last behavior.
OC: Stimulus Control
Discrimination
The learner only offers a behavior in response to a specific stimulus.
Examples: sits for “Sit,” does not sit for “Down”
“Go To Bed” for a dog bed, “To Your Mat” for a rug, “To Your Place” to enter a crate
OC: Stimulus Control
Generalization
Similar stimuli elicit a similar behavior, so a learned behavior can be performed in different situations.
examples: different location, different handler, in a distracting environment
OC: Reinforcement
Negative
Taking something unpleasant away from the dog in order to increase a behavior.
“The dog makes the bad thing go away.”
example: pulling up on leash until a dog sits, letting go when they sit
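The four operant quadrants come down to two questions: was something added or removed, and does the behavior increase or decrease? A minimal sketch of that lookup follows. The function name `quadrant` and its string arguments are made up for this example.

```python
def quadrant(stimulus_change, behavior_change):
    """Name the operant quadrant.

    stimulus_change: "added" or "removed"
    behavior_change: "increases" or "decreases"
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Leash pressure is released when the dog sits, and sitting becomes more likely.
print(quadrant("removed", "increases"))  # negative reinforcement

# The food lure is taken away when the dog jumps up, and jumping becomes less likely.
print(quadrant("removed", "decreases"))  # negative punishment
```

"Positive" and "negative" here only mean added and removed. They say nothing about whether the consequence is pleasant.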
Primary reinforcers
Generally anything that is biologically important to the survival of an animal, such as food, water, sleep, touch, pleasure, access to mates, and pooping.
aka unconditioned reinforcers
Secondary reinforcers
Something that is paired with a primary reinforcer. These are not important for survival, but are conditioned to have value.
example: clicker
aka conditioned reinforcer, marker, bridge
variable reinforcement
At some point in the training process, you’ll need to vary the reinforcers you use in order to keep the dog motivated.
Intermittent Reinforcement Schedule
Not every correct response receives a reward.
OC: Back Chaining
The last behavior in the chain is taught first, so it has the strongest reinforcement history.
Preferred by many trainers, and most dogs learn faster with back chaining than forward chaining.
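One way to see why the last behavior ends up with the strongest reinforcement history is to list what gets practiced at each teaching step. The sketch below does that for both chaining orders. The function names and the three-behavior retrieve chain are hypothetical examples, not taken from these cards.

```python
def forward_chaining_steps(chain):
    """Teach the first behavior, then add the next one at each step."""
    return [chain[:i] for i in range(1, len(chain) + 1)]


def back_chaining_steps(chain):
    """Teach the last behavior, then add the one before it at each step."""
    return [chain[-i:] for i in range(1, len(chain) + 1)]


# Hypothetical retrieve chain, used only for illustration.
chain = ["go to toy", "pick up toy", "bring toy back"]

for step in back_chaining_steps(chain):
    print(" -> ".join(step))
# bring toy back
# pick up toy -> bring toy back
# go to toy -> pick up toy -> bring toy back
```

With back chaining, every step ends on the behavior the dog already knows best, so the dog is always working toward familiar, well-rewarded ground.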
OC: generalization
Generalization is where similar stimuli elicit a similar behavioral response.
When a dog learns a behavior in one situation, they are able to perform the same behavior in different situations.
Example: Under
Conditioning
Both classical and operant conditioning involve learning by making an association between a stimulus and a response.
Classical vs Operant differences
Look at the dog’s behavior. The behavior is involuntary or emotional in classical conditioning. The stimulus comes before the response. Learning through association.
In operant conditioning, the behavior is a conscious, voluntary response. The event or consequence that drives the behavior comes after the response. Behavior on cue in anticipation of receiving a treat. If I do X, the result will be Y.
Counterconditioning (CC): Classical vs. Operant
CCC modifies the dog’s emotional response.
OCC also teaches them to perform a voluntary behavior.
CC: Classical Positive Conditioned Emotional Response (+CER)
CCC is usually combined with desensitization. They are used together with the goal of creating a +CER.
CC: Operant: Alternate Behavior
aka incompatible behavior
A behavior that replaces the unwanted behavior.
Example: barking and lunging at other dogs
AB: eye contact and touching hand
CC: Common classically conditioned stimuli
give examples
Desensitization: Form of non-associative learning
Through gradual exposure, the dog “gets used to it” without using food.
Desensitization: Gradual exposure
Very gradually, the strength of the stimulus is increased mindfully so the dog stays under threshold until they learn to ignore it at full strength.
Desensitization: Animal learns to ignore stimulus
99 cards of theory on the wall
Desensitization: DS/CC
Unpleasant stimulus is presented, and the dog is rewarded for noticing it. Gradually moved closer as long as dog doesn’t react. Learning to associate good things with the formerly unpleasant stimulus.