Problem 6 Flashcards
Operant conditioning
Refers to learning on the basis of the law of effect
–> organisms “operate” on the environment in a way that causes an outcome to occur
Discriminative Stimulus S –> Response R –> Outcome O
Law of effect
States that behaviors followed by positive consequences are strengthened + more likely to be repeated
What is the main difference between operant + classical conditioning ?
- In classical conditioning, organisms experience an outcome (US) whether they perform the CR or not
- In operant conditioning the outcome (O) doesn’t occur if the response (R) isn’t performed
Free operant paradigm
Skinner
Refers to an operant conditioning paradigm in which the animal can operate the experimental apparatus “freely”
–> can respond to obtain a reinforcement when it chooses
Discrete trials paradigm
Thorndike
Refers to an operant conditioning paradigm in which the experimenter defines the beginning + end points of each trial
–> more controlled
Skinner box
Refers to a conditioning chamber in which lever-press responses (R) made while the light is switched on (S) are reinforced by the delivery of food (O)
Cumulative recorder
Device that records behavioral responses
–> height represents the number of responses that have been made up to the present time
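The height of a cumulative record is simply a running total of responses. A minimal sketch, assuming responses are counted in equal time bins (the function name and the counts below are made up for illustration):

```python
from itertools import accumulate


def cumulative_record(responses_per_bin):
    """Return the running total of responses after each time bin."""
    return list(accumulate(responses_per_bin))


# Hypothetical lever-press counts in five successive time bins.
presses = [0, 2, 3, 0, 4]
record = cumulative_record(presses)
print(record)  # [0, 2, 5, 5, 9]
```

Flat stretches of the record (5 –> 5) mean no responding; steep stretches mean fast responding.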
Discriminative Stimuli (S)
Refer to stimuli that signal whether a particular response will lead to a particular outcome
ex.: light on –> food, therefore lever must be pressed;
light off –> no food
Shaping
Refers to training in which successive approximations to the desired response are reinforced, until the desired response is learned
Response (R)
Refers to the sequence of movements needed to obtain a particular outcome
ex.: pressing a lever –> door opens (O)
Chaining
Organisms are gradually trained to execute complicated sequences of discrete responses
–> occurs gradually
ex.: learns A, then AB, then ABC
Reinforcer/
Positive outcome
Refers to a consequence of behavior that leads to an INCREASE in the likelihood of that behavior in the future
ex.: food when hungry
Primary reinforcers
Refer to stimuli that have innate biological value to an organism
–> organisms will therefore repeat behaviors that provide access to these things
ex.: food, water, sex, sleep
Drive reduction theory
States that all learning reflects the innate, biological need to obtain primary reinforcers
–> one wants to reduce those drives
Why are primary reinforcers not always reinforcing ?
- A reinforcer of the same category can evoke a stronger response than another (Negative contrast)
ex. : one will work harder for food one likes than for food one doesn’t like
- Once the organism is satiated, further delivery of the reinforcer won’t be reinforcing
ex. : drinking until not thirsty anymore –> no more water needed
Secondary reinforcers
Refer to stimuli that have no biological value but that have been paired with primary reinforcers
ex.: money –> can be exchanged for food, sex etc
Token economy
Refers to an environment in which tokens can be exchanged for privileges
- -> function the same way as money does in the outside world
- -> used to modify behavior
ex.: prison, school
Negative contrast
Refers to a situation in which an organism will respond less strongly to a less-preferred reinforcer that is provided in place of an expected preferred reinforcer
–> it would have responded more strongly if the less-preferred reinforcer had been provided all along
Why does the identity of the reinforcer matter ?
Organisms learn that a certain response (R) will result in a PARTICULAR outcome (O)
–> a switch in the outcome may produce changes in responding
Punisher/
Negative outcome
Refers to a consequence of behavior that leads to a DECREASE in the likelihood of the behavior occurring again in the future
–> opposite to reinforcer
Are punishments as effective as reinforcements ?
No,
the effects of punishment are erratic + unreliable
–> can sometimes result in paradoxical increases in punished behaviour
Which factors determine how effective the punishment will be ?
- Punishment might produce VARIATION IN BEHAVIOR, as the organism explores other possible responses
- Discriminative stimuli for punishment can ENCOURAGE CHEATING
ex. : one will resume speeding in the absence of police cars
- CONCURRENT REINFORCEMENT can undermine punishment
ex. : one will not stop talking in class when the behavior is punished by the teacher but simultaneously reinforced by classmates
- Punishment is most effective if a STRONG PUNISHER is used from the beginning
–> if not, one might become insensitive later to stronger ones
Differential reinforcement of alternative behaviors
DRA
Refers to a method to decrease the frequency of unwanted behaviors by instead reinforcing preferred alternate behaviors
–> works best if the rewarded behavior is incompatible with the unwanted one
Reinforcement schedule
Refers to the rules determining when + how often outcomes are delivered in an experiment
When does learning occur the fastest ?
If there is no delay between the response + reinforcement
(Temporal contiguity)
–> then the most recent behavior will be identified as the cause of the outcome
Self control/
Delayed gratification
Refers to an organism’s willingness to forgo a small immediate reward in favor of a larger future reward
Pre-commitment
Making a choice that is difficult to change later
–> will improve delayed gratification
Negative reinforcement
Behavior is reinforced because it causes something to be subtracted from the environment
ex.: headache (S) –> take aspirin (R) –> no more headache (O)
Positive reinforcement
Behavior is reinforced because it causes something to be added to the environment
ex.: present pot (S) –> peeing (R) –> praise (O)
Negative punishment
Behavior is punished by subtracting (taking away) something from the environment
ex.: Siblings –> aggressive behaviour –> grounding
Positive punishment
Behavior is punished by adding something to the environment
ex.: Class –> disturbing the class –> scolding
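The four cards above differ on only two questions: is something added or subtracted, and does the behavior become more or less likely? A small sketch of that 2×2 scheme (the function name and its boolean parameters are hypothetical, chosen only for illustration):

```python
def classify_contingency(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant contingency from the two defining questions."""
    # "positive" = something is added, "negative" = something is taken away
    sign = "positive" if stimulus_added else "negative"
    # reinforcement strengthens the behavior, punishment weakens it
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"


# The examples from the cards:
print(classify_contingency(False, True))   # aspirin removes headache
print(classify_contingency(True, True))    # praise is added
print(classify_contingency(False, False))  # grounding removes freedom
print(classify_contingency(True, False))   # scolding is added
```

Note that "positive" + "negative" here never mean good or bad, only added or subtracted.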
Continuous reinforcement schedule
Refers to a reinforcement schedule in which every instance of the response is followed by a consequence
–> each response (R) is always followed by an outcome (O)
Partial reinforcement schedule
Refers to a reinforcement schedule in which a response is followed by an outcome less than 100% of the time
–> there are 4 different types
Fixed-ratio schedule (FR)
Partial reinforcement schedule
A fixed number of responses must be made before a reinforcer is delivered
ex.: Press lever 5 times to obtain one food pellet (5:1)
Postreinforcement pause
Refers to a brief pause following a period of fast responding leading to reinforcement
–> length of the pause is related to the number of responses required to obtain the next reinforcement
Fixed-interval schedule (FI)
Partial reinforcement schedule
The first response after a fixed amount of time is reinforced
–> rate of responding gradually increases as the end of the interval nears
ex.: checking time only one time at beginning of class vs checking it every 10 seconds as class approaches the end
Variable-ratio schedule (VR)
Partial reinforcement schedule
A certain number of responses, on average, are required before a reinforcer is delivered
–> steady high rate of responding, because one never knows exactly when reinforcement is coming
=> eliminates/reduces the postreinforcement pause
Variable-interval schedule (VI)
Partial reinforcement schedule
Reinforcement is delivered to the first response after an interval that averages a particular length of time
–> steady rate of responding (usually lower than on a VR schedule)
=> eliminates the postreinforcement pause
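The four partial schedules can be sketched as small rules that decide whether a given response earns a reinforcer. This is only an illustration under stated assumptions: all function names are hypothetical, time is an arbitrary number, and the variable schedules draw their requirement from a uniform distribution around the mean, which is one of many possible choices.

```python
import random


def fixed_ratio(n_required):
    """FR: reinforce every n_required-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n_required:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_required, rng):
    """VR: reinforce after a random number of responses averaging mean_required."""
    def draw():
        # Assumption: uniform on 1 .. 2*mean-1, whose average is mean_required.
        return rng.randint(1, 2 * mean_required - 1)

    remaining = draw()

    def respond():
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            remaining = draw()
            return True
        return False

    return respond


def fixed_interval(interval):
    """FI: reinforce the first response made once `interval` has elapsed."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """VI: like FI, but the required wait varies around mean_interval."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_interval)  # assumption: uniform waits

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_interval)
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])           # every 5th press pays off
fi10 = fixed_interval(10)
print([fi10(t) for t in (3, 10, 12, 21)])   # [False, True, False, True]
```

On the ratio schedules, responding faster brings the reinforcer sooner; on the interval schedules it does not, since only the first response after the wait counts.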
Concurrent reinforcement schedules
Refer to schedules in which the organism can make any of several responses, each leading to a different outcome
ex.: watching the preferred program, but switching over to different channels during commercials
Matching law of choice behavior
The idea that an organism, given a choice between multiple responses, will make a particular response at a rate proportional to how often that response is reinforced relative to the other choices
–> describes how one will allot one’s time + effort among a set of possible operant responses
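In its simplest form the matching law says the share of responses given to each option equals that option's share of the total reinforcement. A minimal sketch, assuming reinforcement is expressed as a rate per option (the function name and the numbers are made up for illustration):

```python
def matching_law_shares(reinforcement_rates):
    """Predict the proportion of responses allotted to each option.

    reinforcement_rates maps an option name to how often responding
    on that option is reinforced (e.g. reinforcers per hour).
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("at least one option must be reinforced")
    return {option: rate / total for option, rate in reinforcement_rates.items()}


# Hypothetical example: lever A pays off 30 times per hour, lever B 10 times.
shares = matching_law_shares({"A": 30, "B": 10})
print(shares)  # {'A': 0.75, 'B': 0.25}
```

So an animal obeying the matching law would make about three quarters of its responses on lever A.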
Behavioral economics
The study of how organisms allocate their time + resources among possible options
What does the economic theory predict ?
Each consumer will allocate resources in a way that maximizes their “subjective value” or relative satisfaction
–> subjective because it differs from person to person
Bliss point
Refers to the particular allocation of resources that provides maximal subjective value to an individual
Premack principle
Theory that the opportunity to perform a highly frequent behavior can reinforce a less frequent behavior
–> later refined
Response deprivation hypothesis
States that opportunity to perform any behavior can be reinforcing if access to that behavior is restricted
ex.: not allowed to watch tv if not finished homework before
–> refinement of the Premack principle
Dorsal striatum
Important for stimulus-response (S-R) learning, especially for responses that become automatic/habitual
–> region of basal ganglia
Orbitofrontal cortex
Important for response-outcome learning (R-O), meaning, learning to predict the outcomes of particular responses
- -> codes the identity of an outcome + whether the outcome is reinforcing or not
- -> contains neurons that respond to punishments + rewards
Ventral tegmental area
VTA
Contains dopamine-producing neurons which project to the frontal cortex + other regions of the brain
- -> Neurons are activated by reinforcers + punishers
- -> region of the midbrain
Motivational value
Refers to how hard we are willing to work to obtain the stimulus
–> how much we WANT it
Does either of the “wanting” and “liking” signals suffice to evoke responding at the arrival of the reinforcer?
No,
Only when both signals are present will the reinforcer evoke responding + strengthen the S-R association
–> “liking” isn’t enough
ex.: we may “like” cake, but may not be motivated to head into kitchen to get another slice, when we have already had 3 slices
Substantia nigra pars compacta
SNc
Contains dopamine-producing neurons which project to the dorsal striatum
- -> neurons are activated by reinforcers + punishers
- -> region of the basal ganglia
What does dopamine reduction lead to ?
Reduces the willingness to work for a stimulus (motivated responding)
–> “liking” remains the same
Incentive salience hypothesis
States that the role of dopamine in operant conditioning is to signal how much the animal “wants” a particular outcome
What does stimulation of the dopamine system lead to ?
Increased “wanting”
–> can be done naturally by exposure to a stimulus that has previously been associated with reinforcement
Dopamine
Is a naturally occurring neurotransmitter that strengthens learning for S-R associations during operant conditioning + promotes synaptic plasticity
–> signals WANTING/Anticipatory pleasure
Opiates
Refer to a class of drugs that mimic the effects of endogenous opioids by activating the same receptors
ex.: morphine, heroin
Endogenous opioids
Are naturally occurring neurotransmitter-like substances with many of the same effects as opiates
–> signal LIKING/consummatory pleasure
Endogenous opioids are released in response to … ?
Primary + sometimes secondary reinforcers
–> differences in the amount released may determine one’s preference for one reinforcer over another
Insula
Important for conscious awareness of our own bodies + emotional states
–> signals degree of unpleasantness/“DISLIKING”
Dorsal posterior insula
Important for perceiving physical pain + other negative emotional states
Dorsal anterior cingulate cortex
dACC
Implicated in the motivational value of pain, i.e. in suggesting an appropriate response to it
- -> the degree to which pain can drive changes in behavior
- -> its activity level is predictive of whether participants actually change their response
Pathological addiction
Refers to a strong habit that is maintained despite known harmful consequences
Why is addiction so hard to overcome ?
Because it involves
- Positive reinforcement
- -> the “high” elicited by the drug
- Negative reinforcement
- -> avoiding the withdrawal symptoms
=> both processes reinforce the drug taking behavior
What would damage to the Insula lead to ?
Disruption of drug seeking behavior
–> the insula helps maintain addiction by representing negative feelings like withdrawal + cravings
Behavioral addictions
Refer to addictions to behaviors, that produce reinforcement as well as cravings + withdrawal when the behavior is prevented
ex.: compulsive gambling
Why is gambling so seductive + highly addictive ?
It is reinforced on a VR schedule
–> you can never be sure when the next big payoff will come
How can you treat addiction ?
- Naltrexone treatment
- -> blocks opiate receptors
- Extinction
- -> if R stops eliciting O, frequency of R will decline
- Distancing
- -> avoiding triggering stimuli
- Differential reinforcement of alternate behaviors (DRA)
=> most effective to combine cognitive + behavioral therapy based on conditioning
Where does info of the Discriminative stimulus travel to ?
Brain regions
- Sensory motor cortex receives info from Discriminative stimulus
- SMC projects to the Basal ganglia
- Dorsal striatum in turn encodes info + projects to Orbitofrontal cortex
Internet addiction
Refers to an impulse-control disorder that does not involve intoxication
–> pathological gambling is the disorder most similar to addictive internet use
–> the internet itself is not addictive, but specific applications are; the more interactive an application, the more it supports pathological use, because it provides a unique form of reinforcement
ex.: social media
Basal ganglia
Includes the
a) dorsal striatum
b) Nucleus accumbens
–> helps link the motor + sensory cortex so that a stimulus elicits appropriate motor responses