Task 6 - Instrumental Conditioning Flashcards
Give a definition of instrumental conditioning.
Instrumental (operant) conditioning is the process whereby an individual learns to perform or withhold a response based on the outcome of that response.
Sd (discriminative stimulus) -> R (response) -> O (outcome); if O is positive, the Sd-R association is strengthened.
What research led to the discovery of operant conditioning?
Thorndike’s Puzzle box
-> Animals had to escape the box by operating a mechanism in order to reach the reward.
Which law has been derived from Thorndike’s Puzzle Box?
Law of Effect -> Behavior followed by a positive outcome is more likely to be repeated.
What is a Habit Slip?
Executing a learned action without considering any other cues. (Getting up and getting ready for work out of habit on a Sunday.)
What is the difference between classical and operant conditioning?
In classical conditioning, the CS is always followed by the US, whereas in operant conditioning, the individual first has to perform an action (R) for an outcome to occur.
Describe two famous paradigms in operant conditioning. (Derived from Thorndike & Skinner)
- Discrete Trials Paradigm - Thorndike:
- > Experimenter defines start and end of each trial
- > Puzzle-Box
- Free-operant Paradigm - Skinner:
- > The subject operates the experiment independently.
- > Skinner Box (modification of puzzle-box)
Explain Shaping
- Reinforcing successive approximations of a behavior until the desired behavior is reached
- > The experiment doesn’t rely on the subject discovering the desired behavior by accident; instead, the subject is “shaped” towards it.
What is chaining?
- Gradually (step-by-step) training individuals to execute complex sequences of behavior
- > Teaching the individual steps in reverse order can be efficient in some cases
Describe the term “reinforcer” and its two subtypes.
Reinforcer: A consequence of a behavior that increases the likelihood of that behavior being repeated
- > Primary reinforcer: a reinforcer that has innate biological value
- > Secondary reinforcer: a reinforcer that has no innate biological value itself but has been associated with a primary reinforcer, e.g. money
- Secondary reinforcers can keep motivating action, whereas an individual can become satiated by a primary reinforcer.
What is the principle of negative contrast?
Receiving less of a reinforcer than expected leads to disappointment and weaker responding. If the same low amount had been expected from the start, there is no disappointment.
What are punishers and what are their characteristics / limitations?
- > A consequence of behavior that leads to a decrease in likelihood of repetition
- > Less efficient and predictable in controlling behavior than reinforcement
- > Can train someone to only suppress a certain action in the presence of one Sd and not in general.
- > Is most effective if the punisher is strong and applied at full strength from the outset.
What is an alternative form of preventing unwanted behavior?
Differential Reinforcement of alternative behaviors
Explain all the different forms of positive/negative punishment/reinforcement.
Positive Punishment: Action causes a punisher to occur
Negative Punishment: Action causes a reinforcer to be taken away
Positive Reinforcement: Action causes a reinforcer to occur
Negative Reinforcement: Action causes a punisher to be taken away
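The four cases above form a two-by-two table: which kind of consequence is involved, and whether the action adds or removes it. A minimal Python sketch of that table follows. The function name `classify_consequence` and its string arguments are illustrative choices made for this sketch, not terminology from the course material.

```python
def classify_consequence(change, stimulus):
    """Name the conditioning procedure for a consequence of an action.

    change:   "added" or "removed" (what the action does to the stimulus)
    stimulus: "reinforcer" or "punisher"
    Both argument conventions are assumptions made for this sketch.
    """
    table = {
        ("added", "punisher"): "Positive Punishment",
        ("removed", "reinforcer"): "Negative Punishment",
        ("added", "reinforcer"): "Positive Reinforcement",
        ("removed", "punisher"): "Negative Reinforcement",
    }
    return table[(change, stimulus)]


print(classify_consequence("removed", "punisher"))  # Negative Reinforcement
```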
What are alternative terms for positive / negative in this context?
Positive - Additive
Negative - Subtractive
Summarize Hull’s Drive Reduction Theory.
Theory that says that we have an innate drive to obtain primary reinforcers and that learning is driven by the desire to satisfy these needs.
Name the three main subtypes of reinforcement schedules.
- Continuous
- Partial / Intermittent
- Concurrent
Explain Continuous reinforcement schedules
R is always followed by a certain outcome
Explain the four types of Partial / Intermittent schedules
- Fixed Ratio Schedule: A fixed number of R’s leads to an O
- Variable Ratio Schedule: On average, after a certain number of R’s, an O will occur
- Fixed Interval Schedule: The first R after a fixed time period results in an O
- Variable Interval Schedule: The first R after a certain average time period results in an O
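The four partial schedules above differ only in the rule that decides whether a given response is reinforced. The Python sketch below models each rule as a function that takes the time of a response and returns True if an O follows. All names are hypothetical, and two modelling choices are assumptions of this sketch: the variable ratio schedule reinforces each response with probability 1/mean, and the variable interval schedule draws its waiting times from an exponential distribution.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n (assumed model)."""

    def respond(t):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(interval):
    """Reinforce the first response at least `interval` after the last O."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the wait is redrawn after every O."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_interval)  # assumed distribution

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean_interval)
            return True
        return False

    return respond


# One response per time unit for 20 time units on a fixed ratio of 5.
fr = fixed_ratio(5)
print(sum(fr(t) for t in range(1, 21)))  # 4
```

On the ratio schedules, responding faster earns more O's. On the interval schedules, extra responses before the interval has elapsed earn nothing.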
What are characteristics of Concurrent Reinforcement Schedules?
- The subject has the option of exhibiting a number of different R’s, which all have a different reinforcement schedule.
- > Lets us study behavioral economics
Explain the Matching Law of Choice Behavior
When several concurrent reinforcement schedules are available, the ratio of time spent on the different R’s is roughly equal to the ratio of the O-frequencies obtainable from those R’s.
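As a worked example of the matching law, the sketch below turns the reinforcement rate of each option into the predicted share of responding. The function name `matching_allocation` and the example rates are made up for illustration, and real behavior only approximates this prediction.

```python
def matching_allocation(reinforcement_rates):
    """Predicted share of responding per option under the matching law.

    reinforcement_rates maps each response (R) to the rate of outcomes (O)
    it can earn, e.g. reinforcers per hour. The shares sum to 1.
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("at least one option must be reinforced")
    return {r: rate / total for r, rate in reinforcement_rates.items()}


# Hypothetical example: option A pays off twice as often as option B.
shares = matching_allocation({"A": 40, "B": 20})
print(round(shares["A"], 2), round(shares["B"], 2))  # 0.67 0.33
```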
What is a Bliss-Point?
An allocation of exhibited R’s that leads to your subjectively optimal R-O-Ratio
When the option of performing a preferred R afterwards motivates the performing of a less liked R, this is called the ______.
Premack Principle
What is a refinement of Premack’s Principle?
Response Deprivation Hypothesis:
-> The option to perform any behavior A can reinforce the performance of a behavior B, if the access to A is restricted.
Which brain structures are of central importance for operant conditioning?
- Basal Ganglia
- Orbitofrontal Cortex
- Ventral Tegmental Area
What is the function of the Dorsal Striatum (Basal Ganglia) in terms of conditioning?
- It receives input from sensory cortical areas
- Gives output to motor cortex
- necessary for Sd-R-Associations
How does the Orbitofrontal Cortex contribute to conditioning?
- Contributes to predicting outcomes for goal-directed behavior
- Projects to the Dorsal Striatum
- Neurons code for the expected outcome of an action, including an expected punishment
- Neurons fire in correlation to the perceived value of each choice
What are two important brain structures for neural reinforcement and what do they do?
Ventral Tegmental Area (Midbrain):
- powerful pleasure center if stimulated
- produces dopamine
- projects to frontal cortex
Substantia Nigra Pars Compacta (SNc, part of Basal Ganglia):
- produces dopamine
- projects to dorsal striatum
What kind of values can reinforcers have?
Hedonic: Subjective Value or Quality
Motivational: Amount of work someone is willing to put into achieving it.
What role does Dopamine play?
- It is released during positive reinforcement
- It motivates us to work for things (motivational value); however, it doesn’t add hedonic value
- Enhances Sd-R-Associations because it promotes synaptic plasticity
- Can have unpredictable effects
What is the Incentive Salience Hypothesis?
The hypothesis that dopamine governs how much we “want” a reinforcer, not how much we “like” it: even without dopamine, we still “like” and “dislike” things, but our willingness to work for them diminishes.
Explain the Reward-Prediction Hypothesis
If a reinforcement comes unpredicted, it will cause more dopamine release.
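One way to make this concrete is to treat the dopamine response as tracking the prediction error, i.e. the reward received minus the reward expected. That equivalence, the update rule, and the learning rate of 0.5 are simplifying assumptions of this sketch, not claims from the flashcards. The code shows the error shrinking as the same reward becomes predictable.

```python
def prediction_error(reward, expected):
    """Difference between the reward received and the reward expected."""
    return reward - expected


def learn(rewards, learning_rate=0.5):
    """Track the expectation across trials; return the error on each trial."""
    expected = 0.0
    errors = []
    for reward in rewards:
        error = prediction_error(reward, expected)
        errors.append(error)
        # Move the expectation part of the way toward the observed reward.
        expected += learning_rate * error
    return errors


# The same reward on three trials: surprising at first, then less so.
print(learn([1.0, 1.0, 1.0]))  # [1.0, 0.5, 0.25]
```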
What are Endogenous Opioids?
- Naturally occurring “opiates” (peptides)
- Decrease pain and produce euphoria
- It is believed that these peptides control the hedonic value of things
Define “Addiction”
A strong habit that is continued despite harmful consequences
Why is addiction maintained?
- > Positive reinforcement: It gives us pleasure
- > Negative reinforcement: It helps us avoid withdrawal effects
- > Many drugs manipulate our dopaminergic system and thus make us want to work to obtain them.
Drug craving is associated with activity in which brain area?
Insula
What is Naltrexone and what are its effects?
- A drug that blocks opiate receptors
- > decreases the ability of drugs to bind to these receptors and cause a hedonic reaction
- > Doesn’t reduce craving
- > Must be taken daily
What are other ways of treating an addiction?
- Self-help groups / Cognitive Therapy
- Achieving Extinction
- Distancing
- Delayed reinforcement (reduces the number of intakes per day, weakens learning)
- Differential reinforcement of alternative behaviors