Operant Conditioning (Exam 1) Flashcards
Operant conditioning
the process whereby organisms learn to make or to refrain from making certain responses in order to obtain or avoid certain outcomes
Law of effect
any behavior followed by a pleasant consequence is more likely to be repeated (its probability increases); any behavior followed by an unpleasant consequence is less likely to be repeated
Discriminative Stimulus
in operant conditioning, a stimulus indicating that a particular response (R) may lead to a particular outcome (O)
3-part association
Discriminative stimulus (S^D) --> Response (R) --> Outcome (O)
Discrete-trials paradigm
an operant conditioning paradigm in which the experimenter defines the beginning and end points of each trial (e.g., Edward Thorndike’s puzzle box for cats)
Free-operant paradigm
an operant conditioning paradigm in which the animal can operate the apparatus as it chooses in order to obtain reinforcement (or avoid punishment)
Skinner box
chamber used for operant conditioning and designed so that reinforcement or punishment is delivered automatically whenever an animal makes a particular response (e.g., pressing a lever)
Cumulative recorder
a device used for recording responses in operant conditioning, designed in such a way that the height of the line it draws represents the total (cumulative) number of responses made up to a given time
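A minimal sketch of what a cumulative record contains, assuming responses are logged as timestamps in seconds. The function name and the sample data are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

def cumulative_record(response_times, sample_times):
    """Return the total number of responses made up to each sample time.

    This mirrors the height of the line drawn by a cumulative recorder.
    """
    ordered = sorted(response_times)
    # bisect_right counts how many responses occurred at or before time t
    return [bisect_right(ordered, t) for t in sample_times]

# Hypothetical lever-press times (seconds)
presses = [1.0, 2.5, 2.7, 6.0, 6.2, 6.4]
print(cumulative_record(presses, [0, 2, 4, 6, 8]))  # [0, 1, 3, 4, 6]
```

A steep rise in these values corresponds to fast responding; a flat stretch corresponds to a pause.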
Discriminative Stimuli
- S^D (light on) --> R (press lever) --> O (get food)
- S^D (light off) --> R (press lever) --> O (no food)
Operant Conditioning: Responses
Rat may receive food if lever is pressed:
- S^D (lever in box) --> R (press lever) --> O (get food)
- S^D (messy room) --> R (tidy up) --> O (get allowance)
Shaping
an operant conditioning technique in which successive approximations to the desired responses are reinforced
Chaining
an operant conditioning technique in which an organism is trained to perform a sequence of responses (involves reward for each performed action)
Reinforcer
a consequence of behavior that leads to an increased likelihood of that behavior in the future
Punisher
a consequence of behavior that leads to decreased likelihood of that behavior in the future
Reinforcement
the process of providing outcomes (reinforcers) that lead to increased probability of a particular behavior occurring in the future
Punishment
the process of providing outcomes (punishers) that lead to decreased probability of a particular behavior occurring in the future
Primary reinforcers
a reinforcer, such as food, water, or sleep, that is of biological value to an organism
* Reinforcers are not created equal*
Drive reduction theory
theory, proposed by Clark Hull, that all learning reflects the innate, biological need to obtain primary reinforcers
Negative contrast
phenomenon in which the reinforcing value of one reward is reduced because a better reward is expected
Secondary reinforcer
a reinforcer that initially has no biological value but has been paired with (or predicts the arrival of) a primary reinforcer
Examples: money (paired with safety) or social media likes (paired with social value/capital)
Token economy
an environment (such as a prison or school room) in which tokens function the same way as money does in the outside world
Rewarding and reinforcing good habits
Punishment leads to more variable behavior
Punishment of (R) decreases the probability that (R) will occur, but does not predict what response will occur instead of (R)
Discriminative stimuli for punishment can encourage cheating
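The organism may learn to withhold the response only when the discriminative stimulus signaling punishment is present, and to respond freely when it is absent (e.g., speeding when no police car is in sight)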
Concurrent reinforcement can undermine the punishment
The effects of punishment can be counteracted if reinforcement occurs along with the punishment
Initial intensity matters
punishment has to be strong from the outset to be effective
Differential reinforcement of alternative behaviors (DRA)
Method used to decrease the frequency of unwanted behaviors by instead reinforcing preferred alternative behaviors
Positive Reinforcement
type of operant conditioning in which the response causes a reinforcer to be “added” to the environment; over time, the response becomes more frequent
S^D (dinnertime) --> R (set the table) --> O (praise)
Positive Punishment
type of operant conditioning in which the response causes an undesirable element to be “added” to the environment; over time the response becomes less frequent
S^D (tastant on thumb) --> R (thumb sucking) --> O (bitter taste)
Negative reinforcement
type of operant conditioning in which the response causes an undesirable element to be “subtracted from” the environment; over time, the response becomes more frequent
S^D (headache) --> R (take aspirin) --> O (no more headache)
Negative punishment
type of operant conditioning in which the response causes a desirable element to be “subtracted from” the environment; over time, the response becomes less frequent
S^D (playing at the birthday party) --> R (bad behavior) --> O (loss of playtime)
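The four paradigms above differ on only two questions: is something added or subtracted, and does the response become more or less frequent? A minimal sketch of that two-way classification follows; the function and parameter names are hypothetical.

```python
def classify_paradigm(outcome_is_added, response_increases):
    """Name the operant paradigm from its two defining features.

    outcome_is_added:   True if the response adds something to the environment,
                        False if it subtracts something.
    response_increases: True if the response becomes more frequent over time,
                        False if it becomes less frequent.
    """
    kind = "positive" if outcome_is_added else "negative"
    effect = "reinforcement" if response_increases else "punishment"
    return f"{kind} {effect}"

print(classify_paradigm(True, True))    # positive reinforcement (praise for setting the table)
print(classify_paradigm(True, False))   # positive punishment (bitter taste on thumb)
print(classify_paradigm(False, True))   # negative reinforcement (aspirin removes headache)
print(classify_paradigm(False, False))  # negative punishment (loss of playtime)
```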
Reinforcement Schedule
schedule that determines how often reinforcement is delivered in an operant conditioning paradigm
Continuous reinforcement schedule
reinforcement schedule in which every instance of the response is followed by the reinforcer
Partial reinforcement schedule (or intermittent)
a reinforcement schedule in which only some instances of the response are followed by the reinforcer
Fixed-ratio (FR) schedule
reinforcement schedule in which a specific number of responses must occur before a reinforcer is delivered; thus FR3 means the reinforcer arrives after every 3rd response
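A minimal sketch of a fixed-ratio schedule, assuming responses are simply numbered 1, 2, 3, and so on. The function name is hypothetical.

```python
def fixed_ratio(n_responses, ratio):
    """Return the (1-based) response numbers that earn a reinforcer on an FR schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]

# FR3: every 3rd response is reinforced
print(fixed_ratio(10, 3))  # [3, 6, 9]
```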
Post-reinforcement pause
in operant conditioning, a brief pause in responding that follows delivery of the reinforcer
Fixed-interval (FI) schedule
reinforcement schedule in which the first response after a fixed amount of time is reinforced; thus FI-1 min means that reinforcement follows the first response made after a 1-minute interval since the last reinforcement
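A minimal sketch of a fixed-interval schedule, assuming response times are given in seconds and that the session start counts as time zero for the first interval. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def fixed_interval(response_times, interval):
    """Return the response times that earn a reinforcer on an FI schedule."""
    reinforced = []
    last_reinforcer = 0.0  # assumption: the first interval starts at session start
    for t in sorted(response_times):
        # Only the first response after the interval has elapsed is reinforced
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
    return reinforced

# FI-1 min (60 s) with hypothetical lever-press times in seconds
presses = [10, 50, 65, 70, 120, 130, 200]
print(fixed_interval(presses, 60))  # [65, 130, 200]
```

Responses made before the interval has elapsed (here at 10, 50, 70, and 120 s) earn nothing, which is why extra responding does not speed up reinforcement on this schedule.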
Variable-ratio (VR) schedule
a reinforcement schedule in which a specific number of responses, on average, must occur before the reinforcer is delivered; thus, VR5 means that, on average, every fifth response is reinforced.
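A minimal sketch of a variable-ratio schedule. It assumes the required number of responses is drawn uniformly from 1 to 2 x mean - 1 so that it averages out to the nominal ratio. That distribution is an illustrative choice, not something the definition specifies, and the function name is hypothetical.

```python
import random

def variable_ratio(n_responses, mean_ratio, seed=0):
    """Return the (1-based) response numbers that earn a reinforcer on a VR schedule."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    reinforced = []
    # Assumption: requirement is uniform on 1 .. 2*mean_ratio - 1 (mean = mean_ratio)
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count == required:
            reinforced.append(i)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced

# VR5: on average, every fifth response is reinforced
earned = variable_ratio(10_000, 5)
print(10_000 / len(earned))  # close to 5 responses per reinforcer
```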
Variable-interval (VI) schedule
a reinforcement schedule in which the first response after a fixed amount of time, on average, is reinforced; thus VI-1 min means that reinforcement follows the first response made after a 1-minute interval, on average, since the last reinforcement.
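A minimal sketch of a variable-interval schedule. It assumes response times are in seconds and that each required wait is drawn uniformly from 0 to 2 x mean so that it averages out to the nominal interval. The distribution and the function name are illustrative assumptions.

```python
import random

def variable_interval(response_times, mean_interval, seed=0):
    """Return the response times that earn a reinforcer on a VI schedule."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    reinforced = []
    last_reinforcer = 0.0  # assumption: the first interval starts at session start
    # Assumption: wait is uniform on 0 .. 2*mean_interval (mean = mean_interval)
    wait = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        # The first response after the current wait has elapsed is reinforced
        if t - last_reinforcer >= wait:
            reinforced.append(t)
            last_reinforcer = t
            wait = rng.uniform(0, 2 * mean_interval)
    return reinforced

# VI-1 min (60 s), with a hypothetical animal that responds once per second for 100 minutes
earned = variable_interval(range(1, 6001), 60)
print(len(earned))  # roughly one reinforcer per minute
```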
Matching law of choice & behavior
principle that an organism, given a choice between two or more responses, will make each response at a rate proportional to how often that response is reinforced relative to the other choices.
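A minimal sketch of the prediction made by the matching law: the share of responses given to each option equals that option's share of the total reinforcement. The option names and reinforcement rates are hypothetical.

```python
def matching_law(reinforcement_rates):
    """Predict the proportion of responses allocated to each option.

    reinforcement_rates maps each option to how often it is reinforced
    (e.g., reinforcers per hour).
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("at least one option must be reinforced")
    return {option: rate / total for option, rate in reinforcement_rates.items()}

# Hypothetical choice between two keys: A pays off 30 times/hour, B pays off 10 times/hour
print(matching_law({"key_A": 30, "key_B": 10}))  # {'key_A': 0.75, 'key_B': 0.25}
```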
Behavioral economics
study of how organisms allocate their time and resources among possible options.
Bliss point
in behavioral economics, the allocation of resources that maximizes subjective value or satisfaction.
Delay discounting
Progressive reduction or discounting of the subjective value of a reward the longer it is delayed.
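A minimal sketch of delay discounting, assuming the commonly used hyperbolic form V = A / (1 + kD). The card above does not name a formula, so this model, the discount rate k, and the example amounts are all assumptions for illustration.

```python
def discounted_value(amount, delay, k=0.1):
    """Subjective value of a reward of size `amount` delivered after `delay`.

    Assumes hyperbolic discounting; k is a hypothetical discount rate
    (larger k means steeper discounting).
    """
    return amount / (1 + k * delay)

small_now = discounted_value(50, delay=0)      # 50.0
large_later = discounted_value(100, delay=30)  # 25.0
print(small_now, large_later)
```

Under these assumed numbers the small immediate reward has the higher subjective value, even though the delayed reward is objectively larger.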
Self-control
an organism’s willingness to forgo a small immediate reward in favor of a larger future reward
Altruism
in behavioral economics, an action or behavior that provides benefit to another at the expense of some cost to the actor.
Examples: volunteering, feeding the poor. Possible explanations: it benefits the survival of the population; it helps next of kin; it may also be a component of operant conditioning (seeking rewards)
Reciprocal altruism
in behavioral economics, the principle that one organism may donate time or resources to help another in the expectation that the other will return the favor later.
S^D (friend needs a favor) --> R (perform favor now) --> O (receive future benefits)
Dorsal Striatum
Stimulus-Response (S^D --> R) Learning
Basal ganglia
brain region that lies at the base of the forebrain and includes the dorsal striatum
Dorsal Striatum
region of the basal ganglia that is important for stimulus-response learning (behavioral response)
Orbitofrontal cortex
brain region that is important for learning to predict the outcomes of particular responses
Ventral Tegmental Area
brain region that contains dopamine-producing neurons projecting to the frontal cortex and other brain areas
Hedonic value
in operant conditioning, the subjective “goodness” or value of the reinforcer
Motivational value
in operant conditioning, the degree to which an organism is willing to work to obtain access to a stimulus
Hedonic facial expression “YUM”
Tongue protrusion (pleasant sweet taste)
Aversive facial expression “UGH”
Gaping (unpleasant bitter taste)
Substantia nigra pars compacta (SNc)
part of the basal ganglia that contains dopamine-producing neurons projecting to the striatum
Incentive salience hypothesis
hypothesis that dopamine helps provide organisms with the motivation to work for reinforcement
Endogenous Opioids: How the brain signals “Liking”
- Morphine makes sweet taste sweeter
- Morphine makes bitter food taste less bitter
- Babies suckle even harder for sweet water
How do “wanting” and “liking” interact?
Dopamine signals “wanting” and Endogenous opioids signal “liking”
Explanation:
- Endogenous opioids could signal “liking”, which in turn affects the VTA’s ability to signal information about “wanting”
- Different subpopulations of dopamine neurons may exist that convey salience (“wanting”) and valence (“liking”) separately
Insular cortex (insula)
brain region that is involved in conscious awareness of bodily and emotional states and that may play a role in signaling the aversive value of stimuli
Dorsal anterior cingulate cortex (dACC)
brain region that may play a role in the motivational value of pain
Addiction
driven by seeking the high (positive reinforcement) and by avoiding withdrawal (negative reinforcement)
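Behavioral addiction
pathological addiction to a behavior (such as gambling) rather than to a pharmacological substance
Naltrexone
medication used to manage alcohol or opioid use disorder by blocking the opioid receptors that produce hedonic reactions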