Conditioning and Learning Flashcards
Two differences between negative reinforcement and punishment
1) Negative reinforcement encourages a subject to behave a certain way; punishment encourages a subject to STOP behaving in a certain way 2) Negative reinforcement entails REMOVING a negative event; punishment entails INTRODUCING a negative event
Twin studies
A behavior genetic research method that involves comparison of the similarity of identical (monozygotic; MZ) and fraternal (dizygotic; DZ) twins.
What are the four parts of Social Learning?
1. Attention 2. Retention (involves memory) 3. Initiation – execution of the learned behaviour 4. Motivation – observer must want to engage in the observed behaviour
Chaining
Linking together a series of behaviours that result in reinforcement
- Learning the alphabet; each letter stimulates remembering the next
Conditioned Response (CR)
A learned response evoked by a Conditioned Stimulus. The CR is almost ALWAYS the same as the UR, but is contingent on a pairing event – e.g. drooling when we see a food chain logo is a CR, though drooling is elicited naturally when actually eating the food (UR).
Fixed interval schedule
Rewards come after a fixed passage of time (e.g., 5 minutes) rather than after a number of behaviours - arguably does little to motivate behaviour since the animal gets the same reward napping for 5 minutes then pressing a lever once as it does if it were to press the lever continuously for 5 minutes
Yerkes-Dodson Effect

Law of Effect
Thorndike - When a behaviour has a positive (satisfying) effect or consequence, it is likely to be repeated in the future, and vice versa. *The bigger the reinforcer or punisher, the stronger the learning
Goal-directed behaviour
Instrumental behavior that is influenced by the animal’s knowledge of the association between the behavior and its consequence and the current value of the consequence. Sensitive to the reinforcer devaluation effect.
Variable ratio schedule
Reinforcements delivered after a DIFFERENT number of correct responses - Ratio cannot be predicted - Takes the most time to learn, but very difficult to extinguish - Slot machines are a prime example – Vegas was built on a variable ratio strategy
Critique of drive-reduction and other homeostasis theories
Individuals often engage in “exploratory behaviour” – seek out stimulation, novel experience or self-destruction
M.E. Olds
Experimented with brain stimulation of pleasure centers in brains of animals
- Animals would perform behaviours to receive stimulation
- Evidence AGAINST drive-reduction theories
n(Ach)
Need for achievement (Henry Murray & David McClelland) - a theory of motivation in which people need success or need to avoid failure
Quantitative Law of Effect
A mathematical rule that states that the effectiveness of a reinforcer at strengthening an operant response depends on the amount of reinforcement earned for all alternative behaviors. A reinforcer is less effective if there is a lot of reinforcement in the environment for other behaviors.
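This rule is commonly written as Herrnstein's hyperbola, B = k * R / (R + R_e). The card itself gives no formula, so the symbols here are assumptions: B is the response rate, R the reinforcement earned by the target behaviour, R_e the reinforcement available for all alternative behaviours, and k the maximum possible response rate. The numbers are made up for illustration.

```python
def response_rate(r, r_e, k=100.0):
    """Herrnstein's hyperbola: B = k * R / (R + R_e).

    r   -- reinforcement earned by the target behaviour (assumed units)
    r_e -- reinforcement earned by all alternative behaviours
    k   -- maximum response rate (hypothetical value)
    """
    return k * r / (r + r_e)


# Same reinforcer for the target behaviour, two different environments.
lean = response_rate(r=20.0, r_e=5.0)   # few alternative reinforcers
rich = response_rate(r=20.0, r_e=60.0)  # many alternative reinforcers
print(lean, rich)
```

With these made-up values the same reinforcer supports a much lower response rate in the environment that is rich in alternative reinforcement.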
John Atkinson
People who set realistic goals with intermediate risk feel pride with accomplishment, and want to succeed more than they fear failure - Because success is so important, these people are unlikely to set unrealistic goals or to persist when success is unlikely
Taste aversion conditioning
The phenomenon in which a taste is paired with sickness (classical conditioning), and this causes the organism to reject—and dislike—that taste in the future.
Fixed ratio schedule
Reinforcement is delivered after a consistent number of responses - e.g. 6:1; after 6 responses, a reward will be given - Vulnerable to extinction
Renewal effect
Recovery of an extinguished response that occurs when the context is changed after extinction. Especially strong when the change of context involves return to the context in which conditioning originally occurred. Can occur after extinction in either classical or instrumental conditioning.
Preparedness
The idea that an organism’s evolutionary history can make it easy to learn a particular association. Because of preparedness, you are more likely to associate the taste of tequila, and not the circumstances surrounding drinking it, with getting sick. Similarly, humans are more likely to associate images of spiders and snakes than flowers and mushrooms with aversive outcomes like shocks.
The taste-sickness case is also called the GARCIA EFFECT (for John Garcia, who studied it)
Differential reinforcement of successive approximations
Shaping in operant conditioning
Spontaneous recovery
Recovery of an extinguished response that occurs with the passage of time after extinction. Can occur after extinction in either classical or instrumental conditioning.
Theories that assert that humans are primarily motivated to maintain physiological or psychological homeostasis
1. Fritz Heider’s Balance Theory 2. Charles Osgood and Percy Tannenbaum’s Congruity Theory 3. Leon Festinger’s Cognitive Dissonance Theory - All agree that people are driven to be balanced with respect to feelings, ideas and behaviours
Instrumental/Operant Conditioning
A behaviour (rather than a stimulus) is associated with the occurrence of a significant event.
Blocking
In classical conditioning, the finding that no conditioning occurs to a stimulus if it is combined with a previously conditioned stimulus during conditioning trials. Suggests that information, surprise value, or prediction error is important in conditioning.
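Blocking can be sketched with the Rescorla-Wagner model, which the card does not name and is assumed here: all stimuli present on a trial share a single prediction error. The stimulus names, learning rate, and trial counts are hypothetical.

```python
def rw_trial(strengths, present, lam, rate=0.3):
    """One Rescorla-Wagner trial (assumed model, not named on the card).

    Every stimulus present on the trial is updated by the same prediction
    error: the US magnitude `lam` minus the summed strength of present cues.
    """
    error = lam - sum(strengths[cue] for cue in present)
    for cue in present:
        strengths[cue] += rate * error
    return error


# Blocking group: tone is conditioned first, then tone + light together.
blocked = {"tone": 0.0, "light": 0.0}
for _ in range(30):
    rw_trial(blocked, ["tone"], lam=1.0)
for _ in range(30):
    rw_trial(blocked, ["tone", "light"], lam=1.0)

# Control group: tone + light together from the start.
control = {"tone": 0.0, "light": 0.0}
for _ in range(30):
    rw_trial(control, ["tone", "light"], lam=1.0)

print(round(blocked["light"], 3), round(control["light"], 3))
```

In the blocking group the tone already predicts the US, so the compound trials carry almost no surprise and the light gains almost no strength; in the control group the light gains about half of the available strength.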
What is a main difference between classical and operant conditioning?
Operant conditioning is the result of voluntary actions; classical conditioning, on the other hand, depends on “involuntary” behaviour (e.g. drooling is involuntary, while lever pressing involves a decision to do so)
Victor Vroom
Applied expectancy-value theory to the workplace - found that individuals lowest on the totem pole do not expect to receive company incentives, so are not very motivated to perform
Shaping
Experimenter rewards rats in Skinner Box with food for being near the lever (to encourage lever-pressing behaviour) AKA “Differential reinforcement of successive approximations”
Skinner Box
A rat in a laboratory learns to press a lever to get food – since the rat has no “natural” association between pressing a lever and getting food, the rat must learn this behaviour. Lever-pressing is the “operant behaviour” and food pellets serve as “reinforcers”
Approach-avoidance conflict
Neal Miller - the state one feels when a certain goal has both pros and cons - the further from the goal, the more one focuses on the pros (and vice versa for cons)
E. L. Thorndike
Suggested the law of effect - organisms do what rewards them and stop doing what doesn’t bring rewards - precursor of operant conditioning - wrote the FIRST educational psychology textbook in 1903
Fear conditioning
A type of classical or Pavlovian conditioning in which the conditioned stimulus (CS) is associated with an aversive unconditioned stimulus (US), such as a foot shock. As a consequence of learning, the CS comes to evoke fear. The phenomenon is thought to be involved in the development of anxiety disorders in humans. The CS triggers an emotion (fear) rather than a behaviour (e.g. drooling)
Latent learning
Takes place without reinforcement
E.g. watching someone play chess many times – you might not know that you are learning, but realize you have learned some tricks later on when you decide to play
Operant behaviour
A behaviour that involves an organism “operating” on its environment – a rat will explore its cage and, at first, accidentally press a lever – later on it will learn that each time this behaviour is performed, it is rewarded with food. Finding a “shortcut” in Mario Kart that reduces your overall time is another operant behaviour – receiving a shorter track completion time is your “reinforcer”
Negative reinforcement
NOT punishment. Reinforcement through the removal of a negative event. E.g. if a monkey were subjected to a blaring noise all the time except when it rode a tricycle, it would learn that riding the tricycle removes something negative (the noise)
Adoption Study
A behavior genetic research method that involves comparison of adopted children to their adoptive and biological parents.
Drive-reduction theory
Clark Hull - deviations from homeostasis create physiological needs. These needs result in psychological drive states that direct behavior to meet the need and, ultimately, bring the system back to homeostasis. When a physiological need is not satisfied, a negative state of tension is created; when the need is satisfied, the drive to satisfy that need is reduced and the organism returns to homeostasis. In this way, a drive can be thought of as an instinctual need that has the power to motivate behavior.
Unconditioned Response (UR)
An instinctual or natural response to a stimulus (does not require any training or teaching). Examples include jumping upon hearing a loud noise, salivating in response to the smell/sight of food, feeling grumpy when under-slept.
Reinforcer devaluation
The finding that an animal will stop performing an instrumental response that once led to a reinforcer if the reinforcer is separately made aversive or undesirable.
Partial reinforcement schedule
Not all correct responses are rewarded - Requires longer learning time, but is less prone to extinction than continuous reinforcement schedule
Stimulus control
When an operant behavior is controlled by a stimulus that precedes it. Example: we turn left in response to a green ARROW, not just the green light alone; “go” is controlled by the arrow rather than by the colour green itself. The “controller” stimulus is called the discriminative stimulus
Who founded Instrumental/Operant Conditioning?
Edward Thorndike & B.F. Skinner
Quantitative genetics
Scientific and mathematical methods for inferring genetic and environmental processes based on the degree of genetic and environmental similarity among organisms
Reinforcer
An effect that strengthens an organism’s desire to execute a specific behaviour – in the Skinner Box experiment, food pellets reinforce lever-pressing
Simultaneous Conditioning
US and CS are presented at the same time
Continuous vs. discrete motor tasks
Continuous (e.g. riding a bike) are easier to learn than discrete motor tasks
-Discrete tasks involve individual parts that do not facilitate recall of the others (e.g., setting up a chess board)
Overshadowing
Presence of two (compound) stimuli – animal learns an association via the more salient stimulus
Continuous reinforcement schedule
Every correct response is met with some form of reinforcement - facilitates the quickest learning, but is the most fragile learning –> as soon as the reward stops coming, the organism stops performing
Heritability coefficient
An easily misinterpreted statistical construct that purports to measure the role of genetics in the explanation of differences among individuals.
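One classic rough estimate from twin data is Falconer's formula, h² = 2(r_MZ − r_DZ), which this card does not mention and is brought in here as an assumption. The twin correlations below are invented, and the formula rests on strong assumptions (for example, that MZ and DZ twins share their environments equally), which is part of why the coefficient is easy to misread.

```python
def falconer(r_mz, r_dz):
    """Falconer-style variance split from twin correlations (assumed method).

    r_mz -- trait correlation between identical (MZ) twins
    r_dz -- trait correlation between fraternal (DZ) twins
    Returns (heritability, shared environment, non-shared environment).
    """
    h2 = 2 * (r_mz - r_dz)  # MZ twins share about twice the segregating genes of DZ twins
    c2 = 2 * r_dz - r_mz    # similarity not explained by genes
    e2 = 1 - r_mz           # what makes even MZ twins differ
    return h2, c2, e2


# Invented correlations, for illustration only.
h2, c2, e2 = falconer(r_mz=0.8, r_dz=0.5)
print(round(h2, 2), round(c2, 2), round(e2, 2))
```

The three components sum to 1 by construction. The estimate describes differences among individuals in one population and environment, not how much of any one person's trait is "due to genes".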
Kurt Lewin
Developed theory of association: precursor of operant conditioning - organisms associate certain behaviours with certain rewards and certain cues with certain situations
What is drug tolerance due to?
Conditioned compensatory response (CCR). As a result, overdoses often happen not due to an increase in dosage, but due to taking the drug in a new place/environment without the familiar cues that cause the CCR
Stimulus generalization
Make the same response to a group of similar stimuli
- Not all fire alarms sound alike but we know how to react to them
Classical (Pavlovian) Conditioning
The pairing of a neutral stimulus with a psychologically relevant event (e.g., a bell and food, fish and food poisoning)
What is learned in operant vs. classical conditioning?
Operant conditioning: a behaviour is associated with a significant event Classical conditioning: a STIMULUS is associated with a significant event
Prediction error
When the outcome of a conditioning trial is different from that which is predicted by the conditioned stimuli that are present on the trial (i.e., when the US is surprising). Prediction error is necessary to create Pavlovian conditioning (and associative learning generally). As learning occurs over repeated conditioning trials, the conditioned stimulus increasingly predicts the unconditioned stimulus, and prediction error declines. Conditioning works to correct or reduce prediction error.
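The decline of prediction error over trials can be sketched with a single-stimulus Rescorla-Wagner update. The model is not named on the card and is assumed here, and the learning rate and trial count are arbitrary.

```python
def rescorla_wagner(n_trials, lam=1.0, rate=0.2):
    """Single-CS Rescorla-Wagner learning (assumed model).

    lam  -- magnitude of the US that the CS could come to predict
    rate -- learning rate (hypothetical value)
    Returns the final associative strength and the error on each trial.
    """
    strength = 0.0
    errors = []
    for _ in range(n_trials):
        error = lam - strength    # how surprising the US is on this trial
        errors.append(error)
        strength += rate * error  # learning corrects part of the error
    return strength, errors


strength, errors = rescorla_wagner(20)
print(round(strength, 3), [round(e, 2) for e in errors[:5]])
```

The US is fully surprising on the first trial; each pairing shrinks the error, so learning slows as the CS comes to predict the US.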
B.F. Skinner
Proved experimentally that animals are influenced by reinforcement (Skinner box) - Wrote “Walden Two” and “Beyond Freedom and Dignity”, which discussed control of HUMAN behaviour
Habit
Instrumental behavior that occurs automatically in the presence of a stimulus and is no longer influenced by the animal’s knowledge of the value of the reinforcer. Insensitive to the reinforcer devaluation effect. Example: if a rat spends many months performing the lever-pressing behavior (turning such behavior into a habit), even when sucrose is again paired with illness, the rat will continue to press that lever. After all the practice, the instrumental response (pressing the lever) is no longer sensitive to reinforcer devaluation. The rat continues to respond automatically, regardless of the fact that the sucrose from this lever makes it sick.
4 types of partial reinforcement schedules
1. Fixed ratio 2. Variable ratio 3. Fixed interval 4. Variable interval
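The four schedules differ only in the rule that decides whether a given response earns a reward. A minimal sketch, assuming one response per second and invented parameters; the function names and the way the "variable" schedules are randomised are illustrative choices, not a standard procedure.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Reward every `ratio`-th response."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reward after an unpredictable number of responses averaging `mean_ratio`."""
    rewarded, count = [], 0
    needed = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(n_responses):
        count += 1
        hit = count >= needed
        rewarded.append(hit)
        if hit:
            count, needed = 0, rng.randint(1, 2 * mean_ratio - 1)
    return rewarded


def fixed_interval(response_times, interval):
    """Reward the first response made at least `interval` after the last reward."""
    rewarded, last = [], 0.0
    for t in response_times:
        hit = t - last >= interval
        rewarded.append(hit)
        if hit:
            last = t
    return rewarded


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but the required wait changes after every reward."""
    rewarded, last = [], 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        hit = t - last >= wait
        rewarded.append(hit)
        if hit:
            last, wait = t, rng.uniform(0, 2 * mean_interval)
    return rewarded


rng = random.Random(0)                    # seeded so runs are repeatable
times = [float(t) for t in range(1, 61)]  # one response per second for a minute
fr = fixed_ratio(60, ratio=6)
vr = variable_ratio(60, mean_ratio=6, rng=rng)
fi = fixed_interval(times, interval=10.0)
vi = variable_interval(times, mean_interval=10.0, rng=rng)
nap = fixed_interval([10.0], interval=10.0)  # wait, then respond once
print(sum(fr), sum(vr), sum(fi), sum(vi), nap)
```

On the fixed interval schedule a single response after waiting out the interval is rewarded just like continuous responding, whereas the ratio schedules pay only for responses actually made.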
Higher-order/second-order conditioning
A conditioning technique in which a previous CS now acts as a US. E.g. with Pavlov’s dogs, the bell could be used as a US after it reliably predicted food; the bell could be paired with a light until the light became the CS
Token economy
An artificial mini economy usually found in prisons, rehab centers, or mental hospitals. Individuals are motivated by tokens, aka *secondary reinforcers* (things with learned value) - desirable behaviours are reinforced with tokens, which can be cashed in for primary reinforcers like candy, books, cigarettes, privileges
Discriminative stimulus
In operant conditioning, a stimulus that signals whether the response will be reinforced. It is said to “set the occasion” for the operant response.
Premack principle
People are motivated to do what they DON’T want to do by rewarding themselves afterward with something they like, e.g. giving a young child dessert after he eats his vegetables
Extinction
Decrease in the strength of a learned behavior that occurs when the conditioned stimulus is presented without the unconditioned stimulus (in classical conditioning) or when the behavior is no longer reinforced (in instrumental conditioning). The term describes both the procedure (the US or reinforcer is no longer presented) as well as the result of the procedure (the learned response declines). Behaviors that have been reduced in strength through extinction are said to be “extinguished.” NB: extinction doesn’t “eliminate” a CR, it merely suppresses it
Clark Hull
Drive reduction theory - Performance = Drive x Habit - Individuals are first motivated by drive, then act according to past successful habits
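Because the rule is a product, performance drops to zero whenever either factor is zero. A small worked example with invented values on a 0 to 1 scale; the scale and the scenario names are assumptions, not part of the card.

```python
def hull_performance(drive, habit):
    """Performance = Drive x Habit, as stated on the card."""
    return drive * habit


# Hypothetical values on a 0-1 scale.
hungry_trained = hull_performance(drive=0.9, habit=0.8)
sated_trained = hull_performance(drive=0.0, habit=0.8)    # no drive
hungry_untrained = hull_performance(drive=0.9, habit=0.0)  # no habit
print(hungry_trained, sated_trained, hungry_untrained)
```

A well-trained but sated animal and a hungry but untrained animal both show no performance under this rule, which an additive rule would not predict.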
Edward Tolman
Expectancy value theory - Performance = Expectation x Value - People are motivated by goals that they actually think they can achieve
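The product rule can be used to compare goals: a moderately valued goal that seems achievable can outrank a highly valued goal that seems out of reach. The goals and numbers below are invented for illustration.

```python
def motivation(expectation, value):
    """Performance = Expectation x Value, as stated on the card."""
    return expectation * value


# Hypothetical goals: expectation is a subjective probability of success,
# value is how much the outcome is wanted (arbitrary units).
goals = {
    "win the lottery": {"expectation": 0.01, "value": 10.0},
    "pass the exam": {"expectation": 0.7, "value": 5.0},
    "tidy the desk": {"expectation": 0.99, "value": 1.0},
}

scores = {name: motivation(**goal) for name, goal in goals.items()}
best = max(scores, key=scores.get)
print(best)
```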
John B. Watson
Founded the school of behaviourism. Believed everything could be explained by stimulus-response chains and that conditioning was the key factor in developing these chains
Social models
Authorities that are the targets for observation and who model behaviours
Unconditioned Stimulus (US)
A stimulus that produces a natural or instinctual reaction (e.g. food makes us salivate, loud noises startle us, hot showers produce pleasure)
Conditioned compensatory response
In classical conditioning, a conditioned response that opposes, rather than is the same as, the unconditioned response. It functions to reduce the strength of the unconditioned response. Often seen in conditioning when drugs are used as unconditioned stimuli. E.g. for one who regularly takes opioids, drug paraphernalia (CS) can signal an increased sensitivity to pain, because the body anticipates that “the drug will take care of it” – the conditioned compensatory response decreases the effect of the drug on the body
Vicarious reinforcement
Learning that occurs by observing the reinforcement or punishment of another person. Children who witnessed an aggressive adult get punished after interacting with Bobo were less likely to be aggressive with Bobo themselves
Social Learning Theory
Albert Bandura - The theory that people can learn new responses and behaviors by observing the behavior of others.
Behavioural genetics
The empirical science of how genes and environments combine to generate behavior.
Conditioned Stimulus (CS)
A signal that has no importance to the organism until it is paired with something that DOES have importance (e.g., a bell is the CS in Pavlov’s experiment, and an alarm clock tone becomes a CS that makes us grumpy upon hearing it)
Observational Learning
Learning by observing the behaviour of others
Punishers
Effects that decrease behaviours
Variable interval schedule
Rewards delivered after differing time periods - second most effective after variable ratio schedule