Operant Conditioning Flashcards
Operant conditioning can be formalized as a three-way association:
S->R->O
Stimulus Response Outcome
Operant conditioning
Learning to make or refrain from certain responses in order to obtain or avoid certain outcomes
Discrete trial paradigm
Experimenter defines the beginning and end of each trial
Free-operant paradigm
Animal can operate the apparatus freely
Discriminative stimulus
Stimuli that signal whether a particular response will lead to a particular outcome
Shaping
Reinforcing successive approximations of the desired response, one step at a time
Chaining
Organisms are gradually trained to execute complicated sequences of discrete responses
Shaping and chaining are techniques for what kind of reinforcement?
Positive
The _____ determines the change in behavior
Outcome
Two outcomes that occur in operant conditioning:
Reinforcers and punishers
A consequence of behavior that leads to increased likelihood of that behavior in the future
Reinforcer
Primary reinforcer
Satisfy an innate drive (food, water, sleep)
Secondary reinforcer
Initially have no intrinsic value but have been paired with primary reinforcers (money)
Positive reinforcement
Performance of the response causes the reinforcer to be added to the environment (clean room, get allowance)
Negative reinforcement
Behavior is encouraged because it causes something to be taken away from the environment (take aspirin, stop headache)
Consequences of a behavior that lead to decreased likelihood of that behavior in the future
Punishers
Positive punishment
Performance of the response causes something aversive to be added to the environment (tease sister, get spanked)
Negative punishment
Behavior is discouraged because it causes something desirable to be taken away from the environment
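The four outcome types above follow from two questions: is something added to or removed from the environment, and does the behavior become more or less likely? A minimal sketch of that two-by-two rule, where the function name and argument labels are hypothetical and chosen only for illustration:

```python
def classify_outcome(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant outcome type (hypothetical helper, not a standard API).

    stimulus_change: "added" or "removed" (what happens to the environment)
    behavior_change: "increases" or "decreases" (future likelihood of the response)
    """
    # "Positive" and "negative" describe adding or removing, not good or bad.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement raises the likelihood of the response; punishment lowers it.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Take aspirin, headache goes away, aspirin-taking becomes more likely.
print(classify_outcome("removed", "increases"))  # negative reinforcement
```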
Partial reinforcement schedule
Some but not all responses are reinforced
Continuous reinforcement schedule
Each response is followed by the outcome
Fixed ratio
A fixed number of responses must be made before each reinforcer is delivered (stair-step response pattern)
Fixed interval
Reinforces the first response after a fixed amount of time (scalloped curve)
Variable ratio
Reinforcement is delivered after a number of responses that varies around a certain average (produces the highest rate of responding)
Variable interval
Reinforces the first response after an interval that averages a particular length of time
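The four schedules above differ only in the rule that decides whether a given response earns a reinforcer. The following is a rough sketch of those rules, not a model of animal behavior. All function names are hypothetical, and two modeling choices are assumptions: the variable-ratio rule reinforces each response with probability 1/mean, and the variable-interval rule draws its waits from an exponential distribution.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n (assumed rule),
    so a reinforcer arrives after mean_n responses on average."""

    def respond():
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(interval):
    """Reinforce the first response made at least `interval` time units
    after the previous reinforcer (time starts at 0)."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after each
    reinforcer from an exponential distribution (assumed) with the given mean."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_interval)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean_interval)
            return True
        return False

    return respond


fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

Ratio rules count responses and interval rules check elapsed time, which is why responding faster pays off on ratio schedules but not on interval schedules.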
Matching law of choice behavior
Given two responses on concurrent schedules, the relative rate of each response often matches the relative rate of reinforcement it earns
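For two response options, the matching law is usually written as B1 / (B1 + B2) = R1 / (R1 + R2), where B is a response rate and R is a reinforcement rate. A small sketch of that proportion follows; the function name is hypothetical and the numbers are made up for illustration.

```python
def predicted_response_share(r1: float, r2: float) -> float:
    """Share of responses the matching law predicts for option 1,
    given reinforcement rates r1 and r2 (same units, e.g. reinforcers per hour)."""
    total = r1 + r2
    if total <= 0:
        raise ValueError("at least one option must deliver reinforcement")
    return r1 / total


# Illustrative numbers: option 1 pays 30 reinforcers per hour, option 2 pays 10.
print(predicted_response_share(30, 10))  # 0.75
```

Under these assumed rates, the law predicts that about three quarters of responses go to the richer option rather than all of them.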
Brain substrates
Dorsal striatum, orbitofrontal cortex
One of the pleasure centers is:
The ventral tegmental area (VTA)
Endogenous opioids
Liking systems (endorphins)
Incentive salience hypothesis
Dopamine motivates learners to work for reinforcement
Anhedonia hypothesis
Dopamine gives reinforcers their goodness
Pathological addiction
A strong habit maintained despite harmful consequences