Task 6 - Instrumental Conditioning Flashcards
Operant conditioning
process whereby organisms learn to make or refrain from making certain responses in order to obtain or avoid certain outcomes
example: Thorndike's puzzle box
Reinforcement
the process of providing an outcome for a behaviour that increases the probability of that behaviour
Deciding whether a paradigm is operant or classical
–> focus on the outcome
- when the outcome happens regardless of what the organism does –> classical
- when the outcome only happens if the organism makes a particular response –> operant
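The decision rule above can be sketched as a small check on whether the outcome depends on the response. This is only an illustration: the function name and its two boolean arguments are hypothetical and chosen for this card, not taken from any textbook or library.

```python
def classify_paradigm(outcome_if_response: bool, outcome_if_no_response: bool) -> str:
    """Classify a paradigm by asking whether the outcome depends on the response.

    Both arguments are hypothetical observations made for illustration:
    does the outcome occur when the organism responds, and does it occur
    when the organism does not respond?
    """
    if outcome_if_response and outcome_if_no_response:
        # Outcome arrives regardless of behaviour.
        return "classical"
    if outcome_if_response and not outcome_if_no_response:
        # Outcome arrives only if the organism does something.
        return "operant"
    # The rule on this card does not cover the remaining cases.
    return "undetermined"


# Food follows the bell whether or not the dog salivates.
bell_and_food = classify_paradigm(True, True)
# Food is delivered only if the rat presses the lever.
lever_and_food = classify_paradigm(True, False)
```

The two example calls mirror the standard bell-and-food and lever-press setups; any other combination is left as "undetermined" because the card gives no rule for it.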
Free-operant paradigm
paradigm in which the animal can operate the apparatus freely, whenever it chooses (e.g. when Skinner added a return ramp to his maze, so the rat could start the next trial by itself)
Discrete trials paradigm
paradigm in which the experimenter controls when each trial begins and ends
Skinner box
cage devised by B. F. Skinner, with a trough in one wall through which food can be delivered automatically
Cumulative recorder
device that draws a learning curve: a pen marks a roll of paper that advances at a steady rate, and the pen rises by a fixed amount for every response of the organism, such as a lever press by a rat in a Skinner box or a peck by a pigeon at an illuminated plastic key; the steeper the line, the higher the response rate – comparable to the odometer in a car, which adds up the total distance
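The way a cumulative record builds up can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name, the one-second tick, the step size, and the lever-press times are all made up for illustration and do not come from real data.

```python
def cumulative_record(response_times, duration, step=1.0, dt=1.0):
    """Return (time, pen_height) points for a simulated cumulative recorder.

    Time advances in fixed ticks of `dt` (the paper moving at a steady rate),
    and the pen height grows by `step` for every response that has occurred
    up to that tick. All parameter names are hypothetical.
    """
    events = sorted(response_times)
    points = []
    height = 0.0
    i = 0
    n_ticks = int(duration / dt)
    for k in range(n_ticks + 1):
        t = k * dt
        # Move the pen up once for each response made so far.
        while i < len(events) and events[i] <= t:
            height += step
            i += 1
        points.append((t, height))
    return points


# Made-up lever-press times in seconds: a burst early on, then a long pause.
presses = [1.0, 2.0, 2.5, 3.0, 8.0]
record = cumulative_record(presses, duration=10)
for t, h in record:
    print(f"t={t:4.1f}s  height={h:.0f}")
```

The height never decreases, so the final value equals the total number of responses, while flat stretches (here between 3 s and 8 s) show periods with no responding.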
Discriminative Stimuli
stimuli that signal whether a particular response will lead to a particular outcome
–> they help the learner discriminate or distinguish the conditions where a response will be followed by a particular outcome
Shaping
training technique in which successive approximations to the desired response are reinforced
Chaining (–> backward chaining)
technique in which organisms are gradually trained to execute sequences of discrete responses
- related technique to shaping
- –> backward chaining: it is sometimes more effective to train the steps in reverse order
Reinforcer
a consequence of behavior that leads to an increased likelihood of that behavior in the future
Primary reinforcers
stimuli that are of biological value to the organism; organisms therefore tend to repeat behaviors that provide access to them
- examples: food, water, sleep, a comfortable temperature, and sex
Drive reduction theory (Clark Hull)
theory proposing that all learning reflects the innate, biological need to obtain primary reinforcers
–> complication: primary reinforcers are not always reinforcing
Secondary reinforcers
reinforcers that initially have no biological value, but that have been paired with (or predict the arrival of) primary reinforcers (they can be as strongly reinforcing as primary reinforcers)
- example: money
Token economies
systems in which desired behavior earns tokens that can be exchanged for privileges; often used in prisons, psychiatric hospitals, and other institutions where the staff has to motivate inmates or patients to behave well and to perform chores such as making beds or taking medications
- tokens function in the same way as money does in the outside world
- animals will also work for secondary reinforcers
Negative contrast
organisms given a less-preferred reinforcer in place of an expected and preferred reinforcer will respond less strongly for the less-preferred reinforcer than if they had been given that less-preferred reinforcer all along
- e.g. the monkey that throws away the cucumber, its less-preferred food, once it has seen the grapes
Punishment
the process of providing an outcome for a behaviour that decreases the probability of that behaviour
Punishers or negative outcomes
common punishers for animals include pain, confinement, and exposure to predators (or even the scent of predators)
Four most important factors that determine how effective the punishment will be
- Punishment leads to more variable behaviour
- Discriminative stimuli for punishment can encourage cheating
- Concurrent reinforcement can undermine the punishment
- Initial intensity matters
Differential reinforcement of alternative behaviors (DRA)
process in which, rather than delivering punishment each time the unwanted behaviour is exhibited, preferred alternative behaviours are rewarded
Reinforcement schedules
the rules determining when outcomes are delivered in an experiment
Timing affects learning
Normally, immediate outcomes produce the fastest learning
Delays undermine a punishment's effectiveness and may weaken learning
Response consequence delay
the longer one waits to punish someone or something, the weaker the association that is formed between the punishment and the …
Self-control
an organism’s willingness to forego a small immediate reward in favor of a larger future reward
Positive (reinforcement)
positive does not mean good –> instead it means added
Positive reinforcement
the desired response causes the reinforcer to be added to the environment