L11 - Operant Conditioning Flashcards
Operant Conditioning:
Operant (instrumental) conditioning: Learning controlled by the consequences of the organism’s behaviour
Operants: Behaviours produced in order to receive a reward
Classical vs Operant:
- Classical = Autonomic reflex response + stimuli presented independently of behaviour (e.g. irrespective of the amount of saliva the dog produces, UCS and CS are still presented)
- Operant = Voluntary behaviours + stimulus presence/absence is conditional on the behaviour (e.g. the dog only gets the treat when it performs the trick)
Law of effect:
If a behaviour that follows a stimulus results in a reward, the stimulus is more likely to elicit that behaviour in the future
- No “ah ha!” moment of insight into the correct solution; the animal just became more efficient at the trial-and-error process
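The gradual strengthening described by the law of effect can be illustrated with a minimal sketch. The behaviour names, starting strengths and step size below are hypothetical, and the update rule is only one simple way to model "rewarded behaviour becomes more likely"; it is not a claim about how the original experiments were analysed.

```python
def choice_probabilities(strengths):
    """Convert response strengths into the probability of each behaviour being tried."""
    total = sum(strengths.values())
    return {behaviour: s / total for behaviour, s in strengths.items()}


def apply_law_of_effect(strengths, behaviour, rewarded, step=1.0):
    """Return updated strengths: a rewarded behaviour is strengthened, others are unchanged."""
    updated = dict(strengths)
    if rewarded:
        updated[behaviour] += step  # assumed additive update, chosen for simplicity
    return updated


# Hypothetical behaviours, all equally likely at the start.
strengths = {"scratch": 1.0, "meow": 1.0, "press lever": 1.0}

history = []
for trial in range(5):
    # Record how likely the correct behaviour is before this trial's reward.
    history.append(choice_probabilities(strengths)["press lever"])
    # Assumption: only pressing the lever is rewarded.
    strengths = apply_law_of_effect(strengths, "press lever", rewarded=True)

print([round(p, 2) for p in history])
```

The probability of the rewarded behaviour rises a little on every trial rather than jumping straight to 1, which matches the "no ah ha! moment" point above.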
Extinction burst:
Brief ↑ in the intensity of a response during extinction
Positive reinforcement techniques used for animal training:
- Shaping: Progressively reinforcing behaviours that come closer and closer to the target behaviour
- Chaining: Linking simple interrelated behaviours together, with each behaviour becoming a cue for the next behaviour
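Shaping can be thought of as a criterion that tightens after each reinforcement. The sketch below is a toy illustration under assumed values: responses are numbers, the target is a number, and the function name `shape` and the halving rule are hypothetical choices.

```python
def shape(responses, target, start_tolerance, shrink=0.5):
    """Return the responses that get reinforced under a progressively stricter criterion."""
    tolerance = start_tolerance
    reinforced = []
    for response in responses:
        if abs(response - target) <= tolerance:
            reinforced.append(response)
            tolerance *= shrink  # the next reward requires a closer approximation
    return reinforced


# Hypothetical sequence of attempts moving toward a target value of 10.
attempts = [2, 6, 5, 8, 9, 10]
rewarded_attempts = shape(attempts, target=10, start_tolerance=8)
print(rewarded_attempts)
```

In this run the attempt `5` goes unrewarded: it would have been close enough early on, but the criterion had already tightened by the time it occurred.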
Phobia and superstitious behaviours:
- Accidental operant conditioning might be partially responsible for superstitious behaviours
- Phobia two-process theory: Phobia is acquired via classical conditioning → Strengthened and persists due to operant conditioning, with ongoing avoidance of the stimulus acting as negative reinforcement
Types of Operant Conditioning: Positive vs negative:
- Positive = Adding a stimulus
- Negative = Removing a stimulus
Types of Operant Conditioning: Reinforcement vs punishment:
Reinforcement = Trying to ↑ desired behaviour
- Partial reinforcement (Humphreys’ Paradox): Behaviours are more difficult to extinguish if they have only been occasionally reinforced rather than continuously reinforced (e.g. gambling) [See schedules of reinforcement]
Punishment = Trying to ↓ unwanted behaviour
- Only tells subject what NOT to do without providing information about what behaviour should be done instead
- Can result in anxiety, subversive behaviour and aggression
- Best applied selectively IN CONJUNCTION with reinforcing a desired behaviour
Types of Operant Conditioning: EXAMPLES
Positive reinforcement: Award (desirable stimulus applied) provided when grades are high (desired behaviour)
Negative reinforcement: Car stops beeping at you (undesirable stimulus removed) when you put your seatbelt on (desired behaviour)
Positive punishment: Child gets in trouble (undesirable stimulus applied) for telling a lie (unwanted behaviour)
Negative punishment: Child is sent to bed without dessert (desirable stimulus removed) for refusing to eat their spinach at dinner (unwanted behaviour)
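The four types come from crossing two questions: was a stimulus added or removed, and is the behaviour meant to increase or decrease? A small sketch of that 2 × 2 lookup, using the examples above (the function name and the string labels are hypothetical):

```python
def classify_consequence(stimulus_change, behaviour_effect):
    """Name the type of operant conditioning.

    stimulus_change: "added" or "removed"
    behaviour_effect: "increase" or "decrease"
    """
    sign = {"added": "Positive", "removed": "Negative"}[stimulus_change]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behaviour_effect]
    return f"{sign} {kind}"


examples = {
    "award for high grades": ("added", "increase"),
    "car stops beeping once seatbelt is on": ("removed", "increase"),
    "child gets in trouble for lying": ("added", "decrease"),
    "no dessert after refusing spinach": ("removed", "decrease"),
}

labels = {name: classify_consequence(*args) for name, args in examples.items()}
for name, label in labels.items():
    print(f"{label}: {name}")
```

Note that "positive" and "negative" only describe whether a stimulus is added or removed, not whether the outcome is pleasant.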
Schedules of Reinforcement:
Effectiveness of operant conditioning depends on the pattern in which a desired behaviour is reinforced during the acquisition phase. There are 2 types of schedule:
- Continuous reinforcement
- Partial reinforcement
Continuous reinforcement:
Consequence is given every time a target behaviour is performed.
Humphreys’ Paradox – A phenomenon whereby behaviours are more difficult to extinguish if they have only been occasionally reinforced rather than continuously reinforced
Partial reinforcement:
Reinforcing a target behaviour intermittently rather than continuously
- Fixed ratio - reinforcement is provided after a fixed number of responses
- Fixed interval - reinforcement is provided after a fixed time has elapsed (provided a behaviour is performed)
- Variable ratio - reinforcement is provided after a varying number of responses that averages a set value
- Variable interval - reinforcement is provided after a varying amount of time, averaging a set length, has elapsed (provided a behaviour is performed)
Ratio schedules (vs interval) and variable schedules (vs fixed) are more resistant to extinction
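The four partial schedules can be sketched as small rules that decide, for each response, whether a reinforcer is delivered. This is a toy illustration only: the function names, the one-response-per-second pattern and the way the variable requirements are drawn (uniformly around the mean) are all assumptions.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def on_response(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return on_response


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses that averages mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)
    def on_response(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have passed since the last reward."""
    last = 0.0
    def on_response(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return on_response


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)
    def on_response(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return on_response


def count_rewards(schedule, response_times):
    """Count how many responses in the sequence are reinforced."""
    return sum(schedule(t) for t in response_times)


# Assumed behaviour pattern: one response per second for a minute.
response_times = [float(t) for t in range(1, 61)]

fr_rewards = count_rewards(fixed_ratio(5), response_times)      # 12 rewards
fi_rewards = count_rewards(fixed_interval(10), response_times)  # 6 rewards
vr_rewards = count_rewards(variable_ratio(5, random.Random(0)), response_times)
vi_rewards = count_rewards(variable_interval(10, random.Random(0)), response_times)
```

On ratio schedules, more responses earn more rewards. On interval schedules, extra responses within the waiting period earn nothing.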