learning final exam Flashcards
motivating operations
-a type of antecedent stimulus
-alter the value of a reinforcer
-make the behavior that produces the reinforcer more or less likely to occur
-temporary effects, only when stimulus is operating
establishing operation
-a type of motivating operation (antecedent)
-Antecedent events that make you want a reinforcer more and as a result increase behavior to get the reinforcer
-temporary effects
ex: Spilling sriracha sauce on your shirt before an interview (increases the value of a clean shirt and the probability of finding and changing into one, or cleaning the one you are wearing)
ex:
-not having eaten for a long time
-eating salty food makes you want water
-sleep deprivation
-being in pain
-not having enough money
-not having attention from others
abolishing operation
-a type of motivating operation
-environmental events that occur that make a person NOT want something anymore
-temporary effects
ex: you eat too much on Thanksgiving and your stomach hurts. This makes you not want to eat for a while
ex:
-having too much of something
-prolonged exposure to a reinforcer
stimulus control
when antecedent stimuli
-make behavior more or less likely to occur
-make the rate of behavior increase or decrease
people, places, things, etc that cue or evoke behavior
doesn’t make you want a reinforcer, which makes it different from MOs
ex:
-at an intersection while driving, you stop when the light is red and go when it is green
-Antecedent - green light; red light
-Behavior - press the accelerator; press brake pedal
-Consequence - you get to go; no accident
differential reinforcement
-in DR, antecedent stimuli are established as Sds and s-deltas
-specific desired behavior is actively reinforced while simultaneously withholding reinforcement for undesirable behavior
ex: a teacher calls on a student who raises their hand, but not a student who yells out the answer
Sd (discriminative stimulus)
-an antecedent stimulus
-signals that a behavior will be reinforced
-evokes behavior
ex:
-lit vending machine
-walk symbol for crossing the street
S-delta
-an antecedent stimulus
-signals that a behavior will not be reinforced
-suppresses behavior, behavior is less likely to occur
ex:
-dark/off vending machine (know it won’t give the snack)
-stop hand for walking into/crossing the street
transfer test
-an assessment that measures how well someone can apply/adapt knowledge or skills learned in one context to a different, new situation
-can learned behavior be applied in new conditions/contexts
-lemur test with A,B,C,D: question was whether, when presented with the novel pairings, the lemurs would transfer their understanding of the relative rankings to the novel pair
stimulus discrimination
-differentiating between similar stimuli and only responding to the one that is associated with reinforcement
-can be trained through discrimination training
-discrimination training can cause a peak shift effect: the peak of the generalization gradient shifts away from the Sd to a stimulus that is further removed from the s-delta
-occurs often in classical conditioning
ex: dog salivates to tone of 2000Hz but not 1900Hz
matching law
-law that states that the proportion of responding on each option in a concurrent schedule will equal the proportion of reinforcement obtained on that option
ex: A child spends more time on math homework than on reading homework if math homework produces more frequent praise
concurrent schedule (CONC)
-type of reinforcement schedule where there are two or more response options that each have their own Sds and reinforcement schedule (independent requirements)
-allows us to see how/where people respond
-can switch back and forth between 2 response options to receive reinforcers from both options
ex: choosing between watching TV or tiktoks, each of which has its own reinforcement schedule (tiktok - more frequent reinforcement)
-used to study how we make choices/decide between different options
-Herrnstein pigeon experiment
VR schedule
-variable ratio
-Reinforcer is delivered after an unpredictable number of responses that varies around an average
-ex: slot machines pay out on a variable ratio, typically a high one (many plays per payout)
-high and constant response rate
-few or no post-reinforcement pauses
-produces the highest response rate because it is beneficial for the person to continue responding
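A minimal sketch of how a VR schedule sets its response requirements. The function name `vr_requirements` and the uniform distribution are assumptions made for illustration; real schedules can draw requirements from other distributions with the same average.

```python
import random

def vr_requirements(mean_ratio, n_reinforcers, seed=0):
    """Draw the number of responses required for each successive reinforcer.

    Assumption: requirements are uniform on 1 .. 2*mean_ratio - 1, which
    averages out to mean_ratio (e.g., VR 25 -> values from 1 to 49).
    """
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n_reinforcers)]

# Hypothetical VR 25 schedule
reqs = vr_requirements(mean_ratio=25, n_reinforcers=10_000)
print(reqs[:5])               # varies unpredictably from one reinforcer to the next
print(sum(reqs) / len(reqs))  # close to 25 across many reinforcers
```

Because the next reinforcer could follow any response, pausing never pays off, which fits the high, steady response rate described on this card.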
VI schedule
-variable interval
-reinforcer is delivered for the first response made after an unpredictable amount of time has passed
-ex: tiktok algorithm: not every video is what you want to see, but as you scroll and spend time on the app, eventually a video will be what you want to watch
-fairly consistent, moderate rate of checking (is the reinforcer available yet?)
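A minimal sketch of the VI rule above (first response after the interval elapses is reinforced). The function name, the interval values, and the response times are hypothetical; the sketch also assumes the timer for the next interval starts when the previous reinforcer is collected.

```python
def vi_reinforcers(intervals, response_times):
    """Count reinforcers earned on a VI schedule.

    intervals: seconds that must pass before each successive reinforcer
               becomes available.
    response_times: sorted times (in seconds) at which responses occur.
    """
    earned = 0
    idx = 0
    available_at = intervals[0] if intervals else None
    for t in response_times:
        if available_at is not None and t >= available_at:
            earned += 1   # first response after the interval has elapsed
            idx += 1
            available_at = t + intervals[idx] if idx < len(intervals) else None
    return earned

intervals = [20, 40, 10, 50, 30]   # hypothetical VI 30-s schedule (mean = 30 s)
slow = list(range(10, 301, 10))    # one response every 10 s
fast = list(range(1, 301))         # one response every second
print(vi_reinforcers(intervals, slow), vi_reinforcers(intervals, fast))  # 5 5
```

In this example, responding ten times faster earns no extra reinforcers, which is one way to see why VI schedules support a moderate, steady rate of checking rather than a very high rate.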
self-control choice
self-control choice characteristics:
-usually involve delays to a better reinforcer (choosing a larger, later reinforcer over sooner, smaller one)
ex: saving $ instead of spending it impulsively
-preferences change over time
-people frequently choose the lesser reinforcer because they get it sooner (impulsive choice)
adjusting delay procedure
procedure that determines the subjective value of a reinforcer at different delays/determines preferences between reinforcements
-“would you rather receive $500 today or $1,000 in one year?”
-offer different dollar amount today
-use different delay periods (1 year, 6 months)
-determine the indifference point
-results reliably predict: class grades, sunscreen use, seat belt use, drinking, gambling, drug use, risky sexual behavior
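A minimal sketch of finding an indifference point by changing the amount offered today, as in the steps above. The simulated chooser is hypothetical: it assumes hyperbolic discounting, V = A / (1 + kD), with a made-up k of 0.005 per day. With a real person, `prefers_immediate` would be their actual answer to each question.

```python
def discounted_value(amount, delay_days, k):
    """Hyperbolic discounting, V = A / (1 + k*D). The k value is an assumption."""
    return amount / (1 + k * delay_days)

def find_indifference_point(delayed_amount, prefers_immediate, steps=20):
    """Adjust the amount offered today until the chooser is nearly indifferent.

    prefers_immediate(offer) -> True if the chooser takes `offer` today
    instead of waiting for the delayed amount.
    """
    low, high = 0.0, float(delayed_amount)
    for _ in range(steps):
        offer = (low + high) / 2
        if prefers_immediate(offer):
            high = offer   # offer accepted: try a smaller amount today
        else:
            low = offer    # offer rejected: try a larger amount today
    return (low + high) / 2

# Hypothetical chooser: "$X today or $1,000 in one year?"
K = 0.005
def chooser(offer_today):
    return offer_today > discounted_value(1000, 365, K)

point = find_indifference_point(1000, chooser)
print(round(point, 2))  # subjective value today of $1,000 delayed by one year
```

Repeating the search with different delays (6 months, 1 year, and so on) gives one indifference point per delay, which is how the procedure traces out how value falls off with delay.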
value discounting
the decrease in perceived value of a reward as the delay to receiving it increases
ex: think of homework question about the value of $100 over time
-by 300 days, value was down to very small amount compared to the original value at start
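A minimal sketch of value discounting for $100 at several delays. It assumes the hyperbolic form V = A / (1 + kD) with a hypothetical k of 0.05 per day; the numbers from the actual homework question are not reproduced here.

```python
def discounted_value(amount, delay_days, k=0.05):
    """Subjective value of a delayed reward under hyperbolic discounting.

    k is a hypothetical discount rate; a larger k means steeper discounting.
    """
    return amount / (1 + k * delay_days)

for delay in (0, 30, 100, 300):
    print(delay, round(discounted_value(100, delay), 2))
# 0 100.0
# 30 40.0
# 100 16.67
# 300 6.25
```

With this k, most of the value is lost in the first month, and by 300 days only a small fraction of the original $100 remains.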
self-control
-the ability to delay gratification and choose the delayed reinforcer over the immediate one
ex: choosing to study for a test instead of watching TV
-everyday choices often involve a conflict between short term and long term interests
-choice that is made is often not in our best interest
-consistently choosing the larger, later reinforcer over a smaller, sooner reinforcer
-preferences can change over time
self-monitoring
-Observing and recording your own behavior
-Do an A-B-C analysis
-Why to do this:
Own behavior is sensitive to reactivity (change in behavior when being observed)
-can produce an improvement in behavior
-often temporary because reactivity is temporary
-part of self-management
ex:
-tracking daily exercise, sleep habits, homework habits, diet, etc
impulsive choice
-choosing the smaller, sooner reinforcer over the larger, later reward
-due to getting the reinforcer sooner, instant gratification
-behavior is influenced more by immediate consequences than delayed consequences
commitment strategy/response
-arranging the environment in advance so that it is difficult or impossible to change your mind when later faced with temptation
ex: need to study for finals, so have roommates hide phone or TV remote to limit distractions and make studying the only option
ex: Gary would love to go running each evening but is always so tired when he gets home that he plops on the couch and eats chips. Gary decides to arrange specific times each evening to go running with a friend
self-reinforcement/management
-reinforcing your own behavior
-a self-control tactic
-providing reinforcement for oneself after completing the desired behavior
ex: after studying for finals for 3 hours, I go get myself an ice cream from Chucks
self-punishment
-people apply an aversive consequence to themselves each time they engage in an unwanted target behavior
-ex: every time I go to Target and spend more than $20, I have to do 20 pushups (I hate pushups)
-can decrease the frequency of own behavior
-problematic because people have the tendency to not carry out the punishment
behavior contract
-a self-management tactic
-written agreement which outlines specific behaviors that will happen and associated consequences or rewards
ex:
stickk weight loss program
-you commit to exercising/losing a certain amount of weight in a period of time or else your money is donated to a charity
-if you do the exercise/complete the contract, no money is lost
rules
-verbal descriptions of a 3-term contingency
-describes what happens if a person engages in a behavior in a context
-rules function as an Sd
-direct experience with the contingency may never have occurred
ex:
if you eat all of your dinner, you will get a sweet treat
if you speed, you’ll get a ticket
contingency-shaped behavior
-Behavior acquired and maintained by experiencing reinforcement, extinction or punishment
-Behavior shaped by direct experience with reinforcement and punishment
-ex: Learning not to touch a hot stove after being burned
rule-governed behavior
-behavior that is learned through verbal or written rules instead of through experiences or natural contingencies
-learned/taught during childhood/very early through exposure to rules
-pliance and tracking are both types of rule-governed behavior
-parents make requests and reinforce compliance, which makes compliance with requests more likely in the future
-experience with following instructions for completing tasks
-ignoring instructions can lead to a difficult time
-following rule-governed behavior results in praise, acknowledgment, access to privileges, avoidance of reprimands, etc
instructions
-Verbal or written descriptions of what to do to achieve reinforcement
-a type of rule; following instructions is rule-governed behavior
-help establish appropriate patterns of behavior
-simply have to follow rules that we have been given in order to behave effectively in settings instead of having to have direct experience with the setting
pliance
-Rule-governed behavior that is maintained by socially mediated positive or negative reinforcement
-following rule-governed behavior results in praise, acknowledgment, access to privileges, avoidance of reprimands, etc
-Ex: when the national anthem plays at a sporting event, you may stand up and take your hat off so that others don’t look at you funny, not because you’re patriotic
-ex: Sam’s father states, “If you know what’s good for you, you will clean your room right now.” Sam begins cleaning his room - a socially mediated consequence
tracking
- a rule-governed behavior
-maintained because instructions appear to correctly describe the contingencies that operate in the real world
-following instructions was reinforced in the past
-will only continue to follow rule-governed behavior so long as individual still thinks that it reflects real life situations
ex:
-use GPS/Waze because it tells you to avoid the cops
-follow a recipe to cook rice: you will continue to follow the recipe because it appears to look right and will cook correctly
imitation
-close duplicate of a novel behavior
-a prerequisite for observational learning
-Meltzoff and Moore 1977
-12 to 21 day old babies were presented with passive adult faces for 90s, modeled stimulus presented 4 times for 15s, imitation period for 70s (passive face)
-imitation is important for observational learning
-not having the skills for observational learning can cause issues
ex:
-child is homeschooled then goes to school in middle school. They don’t know what to do when the fire alarm goes off. If they do not know how to imitate or observationally learn, they might not know what to do
observational learning
-the behavior of a model is witnessed by an observer and the observer’s behavior is subsequently changed
-essentially a social process
-involved in both classical and operant conditioning
-observer sees consequences of the model’s behavior
-looking at the model is reinforced; looking/paying attention to them is very important
-observer MUST have the skills to do the modeled/observed behavior
model
-The individual demonstrating a behavior during observational learning
ex: teacher showing how to do a math problem for their students
-observer is more likely to pay attention to a model if the model is similar to the observer, is an authority figure, or is attractive
-model’s behavior is reinforced
-observer must pay attention to the model!
observer
-the person who watches/observes the behavior in observational learning
-observer will be reinforced for attending to the model’s behavior; very important
-observer must have the skills to do the observed behavior
ex: if teacher is showing a hard astrophysics problem, observer may not pay attention if they know that they can’t do the problem on their own
-Imitating by the observer is likely to be reinforced (Sd’s are present)
-Imitating by the observer has been reinforced in the past
how MO’s affect the value of reinforcers and the rate of behavior
MO’s:
-Make the behavior that produces the reinforcer more or less likely to occur
-alter the value of a reinforcer (make people want it more or less and behavior will increase or decrease depending on the value)
antecedents (MOs) affect the current rate of behavior based on past experiences
MO examples
-Establishing operations (EOs) examples:
Not having eaten for a long time
Eating salty food
Sleep deprivation
Being in pain
Not having enough $
Not having had attention from others
-Abolishing operations (AOs) examples:
Having eaten a lot of food
Having too much of something
Prolonged exposure to a reinforcer
compare and contrast EOs and AOs
contrast:
EO:
-makes you want reinforcer
-evokes behavior in the moment
AO:
-make you NOT want the reinforcer
-decreases behavior
compare:
-both antecedent events and motivating operations
-both alter the value of a reinforcer
-both make the behavior that produces the reinforcer more or less likely to occur (alter the rate of behavior)
stimulus control examples
ex: At an intersection while driving, you stop when the light is red and go when it is green
-Antecedent - green light; red light
-Behavior - press the accelerator; press brake pedal
-Consequence - you get to go; no accident
ex: When the vending machine light is on, you drop in some coins and get a snack. You do not do this when the light is off
A - light on; light off
B - put in $; put in $
C - get a snack; no snack
3-term contingency
-Antecedent - behavior - consequence
-Antecedent: stimulus control; motivating operations
-Consequence: reinforcement; extinction; punishment
-Antecedent stimuli - people, places, emotional states, things present before the behavior occurs; Set the context for behavior
-Antecedents: Present before the behavior occurs; Affect the current rate of behavior based on past experiences
-Consequences: Happen after behavior because the behavior occurred; Affect the future rate of behavior
differentiate an antecedent stimulus from a discriminative stimulus
-Discriminative stimulus (SD): specific type of antecedent stimulus that signals a behavior will be reinforced (has stimulus control over behavior)
-Antecedent stimulus: present before the behavior occurs, affects the current rate of behavior based on past experiences
identify discriminative stimuli in an example
-Lit vending machine signals that snack will be available
-Walk symbol is an Sd for walking into the street - signals that it is safe to cross street
how does stimulus control develop (differential reinforcement)
-develops through differential reinforcement by selectively reinforcing a specific behavior in the presence of a particular stimulus, while not reinforcing the same behavior when other stimuli are present
-behavior is reinforced in the presence of an Sd but not in the presence of a s-delta
ex:
-child learned to say “please” because it is reinforced with attention, while demanding is ignored
-this is differential reinforcement and stimulus control
describe transfer tests and why they’re used
-Tests if a learned behavior applies in a new context without additional training.
-Purpose: To evaluate generalization of learning
-Example: Testing if a trained dog responds to “sit” in a new environment (trained command in the house, will dog still do it at the park)
describe principles of stimulus control used in Hero Rat training and teaching parrots to facetime parrot friends
-Hero Rats: Use differential reinforcement to train rats to detect specific smells (e.g., landmines)
(NOTE: IT’S THE SMELL NOT THE LANDMINE ITSELF)
-Parrots: Use stimulus discrimination to teach parrots to recognize tablets and communicate via Facetime.
Reynolds’ experiment and what it told us about stimulus control
-experiment: Demonstrated stimulus control by showing that pigeons could be trained to respond to one aspect of a complex stimulus (e.g., color or shape), highlighting discrimination learning
-pigeons could distinguish between the two visual stimuli (colored key or shape) and would respond selectively based on the features of the stimuli
how can differential reinforcement be used to teach skills
ex: Teaching a child to share toys by praising sharing and ignoring grabbing or withholding toys from others
-this is differential reinforcement because one behavior is reinforced/praised (sharing), while another is not reinforced (grabbing, not sharing)
define choice and give examples
choice is how we decide between different options
-Selecting between two or more behaviors, each associated with different reinforcement
-deciding between watching TV or studying for an exam - each has different reinforcement (TV is instant, studying is delayed)
describe a concurrent schedule and what behavior reinforced on this schedule looks like
concurrent schedule is when there are two or more response options that each have their own individual reinforcement schedule
-example could be slot machines at a casino. The machine on the left pays out on a VR 25 schedule while the machine on the right pays out on a VR 75 schedule. The individual will respond on both machines, but the majority of their responses will be on the left machine (VR 25) because it is reinforced more frequently
contrast expected behavior patterns on concurrent VR schedules with that on concurrent VI schedules
-VR Schedule: Produces high, consistent responding (slot machines).
-VI Schedule: Produces steady, moderate responding (checking email)
Herrnstein’s experiment
-demonstrated the matching law
-training pigeons to peck at two keys on concurrent variable interval (VI) schedules
-the pigeons distributed their responses in proportion to the reinforcement rates of the schedules (more often reinforced schedule receives more responses)
-showed that behavior is sensitive to the relative rate of reinforcement
describe matching law and be able to calculate proportions using it
the proportion of responses to an option matches the proportion of reinforcement obtained from that option
formula:
-Total amount of responding = responses on r1 + responses on r2
-Responding on r1 = amount of responses on r1 / total number of responses from both
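B1/(B1+B2) = R1/(R1+R2), where B = number of responses on an option and R = number of reinforcers obtained from that option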
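A minimal sketch of calculating proportions with the matching law. The function names and the reinforcement numbers are hypothetical, chosen only to show the arithmetic.

```python
def matching_proportion(r1, r2):
    """Predicted proportion of responses on option 1: R1 / (R1 + R2)."""
    return r1 / (r1 + r2)

def predicted_responses(total_responses, r1, r2):
    """Split a total number of responses between two options by matching."""
    b1 = total_responses * matching_proportion(r1, r2)
    return b1, total_responses - b1

# Hypothetical example: option 1 yields 40 reinforcers per hour, option 2 yields 10
print(matching_proportion(40, 10))        # 0.8
print(predicted_responses(500, 40, 10))   # (400.0, 100.0)
```

Here option 1 provides 80% of the reinforcers, so matching predicts 80% of the responses (400 of 500) will go to option 1.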
describe an experiment looking at matching in a social situation
-Matching can occur in social interactions, such as a group of children playing with two teachers
-The children spend time with each teacher proportional to the attention or reinforcement (e.g., praise) they receive from each
-if one teacher is attending and praising the children more often, the children will spend more time interacting with that teacher than the other
graphs of matching
-A graph of matching behavior typically shows a straight line, indicating proportionality between the response rates and reinforcement rates
-Deviations from the line indicate under- or over-matching
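A minimal sketch of labeling a data point relative to the matching line. It assumes the usual definitions: undermatching means the response proportion is less extreme (closer to 0.5) than the reinforcement proportion, and overmatching means it is more extreme. The function name and the tolerance are hypothetical.

```python
def classify_matching(response_prop, reinforcement_prop, tolerance=0.02):
    """Label one data point as matching, undermatching, or overmatching.

    Both arguments are proportions for option 1 (between 0 and 1).
    """
    if abs(response_prop - reinforcement_prop) <= tolerance:
        return "matching"
    # Measure how far each proportion sits from indifference (0.5),
    # in the direction of the richer option.
    direction = 1 if reinforcement_prop >= 0.5 else -1
    response_extremity = direction * (response_prop - 0.5)
    reinforcement_extremity = direction * (reinforcement_prop - 0.5)
    if response_extremity < reinforcement_extremity:
        return "undermatching"   # preference is weaker than matching predicts
    return "overmatching"        # preference is stronger than matching predicts

print(classify_matching(0.80, 0.80))  # matching
print(classify_matching(0.65, 0.80))  # undermatching
print(classify_matching(0.90, 0.80))  # overmatching
```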
define self-control, give examples
-The ability to choose a larger, delayed reward over a smaller, immediate reward.
-Ex: Choosing to save money for a vacation instead of buying something unnecessary
why are explanations of self-control that invoke willpower not useful
-cannot rely on willpower alone
-willpower is vague and relies on circular reasoning
-treats self-control as something you either have or don’t have, which explains nothing about why choices differ
-the real reason that people have difficulty controlling their behavior is because of the consequences for different choice behaviors
-other theories like Ainslie-Rachlin explain choice better and use measurable variables instead of “willpower” idea
different types of self-control choices
-Immediate reinforcers versus delayed punishers
*The enjoyment of drinking too much with friends vs a hangover
-Immediate reinforcers vs a cumulatively significant punisher
*Eating extra dessert vs too many calories, excess cholesterol, etc
-Small immediate punishers vs delayed, cumulatively significant reinforcers
*Not smoking now involves an immediate punisher and delayed reward
-Immediate smaller punisher vs a larger, later punisher
*Going to the doctor at the first symptom vs waiting
Marshmallow experiment
-In this experiment, children were given the choice between eating one marshmallow immediately or waiting to receive two marshmallows
-Results showed that delayed gratification (self-control) was associated with better life outcomes, such as:
-higher academic achievement
-Were better able to cope with frustration
-Got along better with their peers