Instrumental conditioning flashcards
Instrumental conditioning or operant conditioning
Learning a contingency between a behaviour and a consequence
A key difference from classical conditioning is that the behaviours considered here are overt: the actor operates on the environment, and the behaviour leads to a reinforcer
Law of effect
Behaviours with positive consequences are stamped in and performed more frequently
Behaviours with negative consequences are stamped out and performed less frequently
Reinforcer
Any stimulus that, when presented after a response, changes how frequently that response is performed. Behaviour can be changed by either presenting or removing a reinforcer
Presentation of a positive reinforcer and removal of a negative reinforcer
Increase in behaviour
Presentation of a negative reinforcer and removal of a positive reinforcer
Decrease in behaviour
Reward training
Presents a positive reinforcer to encourage a behaviour
Punishment training
Presents a negative reinforcer to discourage a behaviour. This may be unethical, and the authority figure delivering the punishment may inflict fear
Omission training
Removes a positive reinforcer to discourage a behaviour. A time out is an example of this. Punishment and omission training both decrease unwanted behaviours, but they use different methods
Escape training
Removes a negative reinforcer to encourage a behaviour
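The four training types above can be summarised as a lookup on two questions: is the reinforcer positive or negative, and is it presented or removed? The sketch below is only an illustration of that table; the function name `training_type` and the string labels are hypothetical, not standard terminology from any library.

```python
# Hypothetical helper: names and labels are illustrative assumptions.
def training_type(reinforcer, operation):
    """Map (reinforcer kind, operation) to (training type, effect on behaviour)."""
    table = {
        ("positive", "present"): ("reward training", "increase"),
        ("negative", "present"): ("punishment training", "decrease"),
        ("positive", "remove"): ("omission training", "decrease"),
        ("negative", "remove"): ("escape training", "increase"),
    }
    return table[(reinforcer, operation)]


# A time out removes a positive reinforcer.
print(training_type("positive", "remove"))  # ('omission training', 'decrease')
```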
Acquisition
Learning a contingency between a response and its consequence. Acquisition is tracked through the response rate of the behaviour
Cumulative graph for the response rate of a behaviour
Horizontal line = no response
Upward slope = a response has been made
The y-axis is the cumulative number of responses and the x-axis is time. The pattern of responding depends on the participant, the complexity of the behaviour and the type of behaviour used
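As a rough illustration of how a cumulative graph is built, the sketch below turns a list of response times into a cumulative count per time step. The function name `cumulative_record`, the whole-number time steps, and the sample data are assumptions made for the example.

```python
# Hypothetical helper: builds the y-values of a cumulative graph.
def cumulative_record(response_times, duration):
    """Return the cumulative number of responses at each time step 0..duration."""
    times = sorted(response_times)
    record = []
    count = 0
    i = 0
    for t in range(duration + 1):
        # Count every response that has occurred up to and including time t.
        while i < len(times) and times[i] <= t:
            count += 1
            i += 1
        record.append(count)
    return record


# Responses at t = 1, 2, 3, then a pause, then one more at t = 7.
print(cumulative_record([1, 2, 3, 7], duration=8))
# [0, 1, 2, 3, 3, 3, 3, 4, 4]
```

The run of repeated 3s is the horizontal line (no response); each step up is a response.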
Auto shaping
A behaviour learned without any direct guidance. For example, a pigeon in a cage pecks the keyhole and gets a grain. This contingency is learned without any help
Shaping through successive approximation
Used for behaviours that are too complex to be auto shaped. The behaviour is built up through gradual, smaller approximations, and a reward is presented for each closer approximation. Used by animal trainers.
Chaining
A technique used to develop a sequence of behaviours. Each behaviour is reinforced with the opportunity to perform the next behaviour in the sequence. Helpful for learning complex behaviour
Shaping vs chaining
Shaping: Reinforces a behaviour only if it is a closer approximation of the desired final behaviour than the behaviour last reinforced. Reinforcement is on the basis of improvement
Chaining: Reinforces the behaviour so long as it is performed in a defined order. The behaviours and their order are set prior to the training
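The contrast between the two rules can be sketched in code. This is a simplified illustration, not a model of real training: it assumes each shaping attempt can be scored as a number whose distance from a target measures how close the approximation is, and the function names and sample behaviours are hypothetical.

```python
# Hypothetical sketch: shaping rewards improvement, chaining rewards order.
def shaping_reinforced(attempts, target):
    """Reinforce an attempt only if it is closer to the target than the last reinforced one."""
    best_distance = float("inf")
    outcomes = []
    for attempt in attempts:
        distance = abs(target - attempt)
        improved = distance < best_distance
        if improved:
            best_distance = distance  # the bar is raised after each reinforcement
        outcomes.append(improved)
    return outcomes


def chaining_reinforced(performed, chain):
    """Reinforce each behaviour while the predefined order is followed; stop at the first error."""
    outcomes = []
    for step, behaviour in enumerate(performed):
        in_order = step < len(chain) and behaviour == chain[step]
        outcomes.append(in_order)
        if not in_order:
            break
    return outcomes


print(shaping_reinforced([2, 5, 4, 8, 10], target=10))
# [True, True, False, True, True]
print(chaining_reinforced(["sit", "roll", "bark"], ["sit", "roll", "jump"]))
# [True, True, False]
```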
Discriminative stimulus (SD/S+)
Indicates when a contingency is valid
SDelta/S-
Indicates a contingency is invalid
Partial reinforcement
Reinforcement follows only some responses, according to a ratio schedule (based on number of responses) or an interval schedule (based on time). Both can be fixed or variable
Four basic schedules of reinforcement
Fixed ratio (FR), variable ratio (VR)
Fixed interval (FI), variable interval (VI)
Fixed ratio
Delivers reinforcement after a set number of responses. Responding follows a pause and run pattern, and the schedule may lead to ratio strain
Variable ratio
Delivers reinforcement after a set average number of responses. Can support a high response rate (a climbing slope on the cumulative graph)
Fixed interval
Reinforces the first response made after a set interval of time has passed. Rarely seen outside of the lab
Variable interval
Delivers reinforcement after a set average amount of time. Produces a steady response rate (a straight line on the cumulative graph)
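The four schedules differ only in what is counted (responses or time) and whether the requirement is fixed or varies around an average. The sketch below is one possible way to express that, under stated assumptions: the function names are hypothetical, variable requirements are drawn from a uniform distribution centred on the set average, and interval schedules reinforce the first response made after the interval has elapsed.

```python
import random


def ratio_schedule(n_responses, ratio, variable=False, seed=0):
    """Return the response numbers (1-based) that earn reinforcement."""
    rng = random.Random(seed)

    def next_requirement():
        # Fixed: always `ratio`. Variable (assumed): uniform on 1..2*ratio-1, mean `ratio`.
        return rng.randint(1, 2 * ratio - 1) if variable else ratio

    reinforced = []
    required = next_requirement()
    since_last = 0
    for response in range(1, n_responses + 1):
        since_last += 1
        if since_last >= required:
            reinforced.append(response)
            since_last = 0
            required = next_requirement()
    return reinforced


def interval_schedule(response_times, interval, variable=False, seed=0):
    """Return the response times that earn reinforcement."""
    rng = random.Random(seed)

    def next_interval():
        # Fixed: always `interval`. Variable (assumed): uniform on 0..2*interval.
        return rng.uniform(0, 2 * interval) if variable else interval

    reinforced = []
    available_at = next_interval()
    for t in sorted(response_times):
        if t >= available_at:  # first response after the interval has elapsed
            reinforced.append(t)
            available_at = t + next_interval()
    return reinforced


print(ratio_schedule(10, ratio=5))                       # FR-5: [5, 10]
print(interval_schedule(range(0, 31, 2), interval=10))   # FI-10: [10, 20, 30]
```

Passing `variable=True` gives the VR and VI versions; their output depends on the random seed, so no expected values are shown.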
Robust learning
Behaviour learned under partial reinforcement is more robust than behaviour learned under continuous reinforcement: it is less susceptible to extinction. Variable schedules are more robust than fixed schedules
Primary reinforcer
A reinforcer with intrinsic value, such as food, water or a mate
Secondary reinforcer
A reinforcer that acquired its value through previous learning and can be exchanged for a primary reinforcer. Example: money
Negative contrast
A response originally receiving a high reward is shifted to a lower reward resulting in reduced response
Positive contrast
A response originally receiving a low reward is shifted to a high reward resulting in increased response
Overjustification effect
Promoting intrinsic motivation is important for the long-term adoption of a behaviour: if someone relies on extrinsic rewards, they will lose motivation when the rewards stop
Mirror neurons
Neurons that respond both when an organism performs a behaviour and when it observes that behaviour, producing activity roughly equivalent to that of the behaviour observed