Operant conditioning Flashcards
What is Thorndike’s law of effect?
If, in a specific situation, a response is followed by a reinforcer, the response will become associated with that situation and will be more likely to occur again in that situation.
What is operant conditioning?
The organism operates on its environment in some way to achieve some desirable outcome
Behaviour is associated with consequences
What are 3 key features of Skinner’s operant box?
• Some behaviour that can be done to obtain reward.
―Rate measured by experimenter
• A dispenser of food or liquid used as a reinforcer (reward)
• Tones or lights to signal availability of opportunity for reward or pending punishment
―Used in discrimination and generalisation studies
What is shaping?
• Shaping is the use of reinforcement of successive approximations of a desired behaviour.
• Specifically, when using a shaping technique, each approximate desired behaviour that is demonstrated is reinforced, while behaviours that are not approximations of the desired behaviour are not reinforced
Incrementally build towards a behaviour (step by step)
You know the end point, but lots of actions need to happen in order to get there, so reward each step in sequence
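The step-by-step idea above can be sketched in code. This is only an illustration: scoring each attempt as a number (higher = closer to the desired behaviour) and the names `shape`, `start_criterion` and `step` are assumptions made for the example, not part of any standard procedure.

```python
def shape(attempts, target, start_criterion, step):
    """Reinforce attempts that meet the current criterion, then raise the criterion.

    attempts: numbers scoring how close each attempt is to the desired
              behaviour (hypothetical scoring, higher = closer).
    Returns a list of (attempt, reinforced, criterion_in_force) tuples.
    """
    criterion = start_criterion
    log = []
    for attempt in attempts:
        reinforced = attempt >= criterion
        log.append((attempt, reinforced, criterion))
        # Once an approximation has been reinforced, demand a closer one next time.
        if reinforced and criterion < target:
            criterion = min(target, criterion + step)
    return log


log = shape([1, 2, 2, 4, 3, 6], target=6, start_criterion=2, step=2)
print([reinforced for _, reinforced, _ in log])
# [False, True, False, True, False, True]
```

Note that the second attempt scoring 2 is not reinforced: once an approximation has been rewarded, only a closer approximation earns the next reward.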
What is positive reinforcement?
Something is added to the environment that causes the behaviour to increase in frequency ∴ that something must have been pleasant
What is positive punishment?
Something is added to the environment that causes the behaviour to decrease in frequency ∴ that something must have been unpleasant
What is negative punishment?
Something is removed from the environment, that causes the behaviour to decrease in frequency ∴ that something must have been pleasant
AKA Response Cost or Omission Training – but regardless of name – they all involve the removal of a stimulus, following the targeted behaviour, that the person values/desires/enjoys.
To facilitate the process they may be reinforced for exhibiting another more desirable behaviour (DRO: Differential Reinforcement of Other behaviour)
If the person makes the “wrong” response then they will lose something of value
So they should learn to inhibit or omit the “wrong” behaviour (omission learning).
What is negative reinforcement?
Something is removed from the environment, that causes the behaviour to increase in frequency ∴ that something must have been unpleasant
Something negative removed from the environment increases the behaviour that allowed us to avoid the negative (e.g. applying sunscreen to avoid sunburn)
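The four definitions above come down to two questions: was something added or removed, and did the behaviour increase or decrease? A minimal sketch of that lookup follows; the function name `classify_consequence` and its string arguments are made up for illustration.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the procedure from the stimulus change and the behaviour change.

    stimulus_change:  "added" or "removed"
    behaviour_change: "increase" or "decrease"
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


print(classify_consequence("removed", "increase"))  # negative reinforcement
print(classify_consequence("added", "decrease"))    # positive punishment
```

Positive/negative describes only whether a stimulus is added or removed, not whether the outcome is good or bad.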
How do different types of reinforcement interact with different emotions?
- Happiness: Positive Reinforcement; Application of Pleasant Stimulus
- Anger: Omission Learning; Removal of a Pleasant Stimulus = Negative punishment
- Relief: Negative Reinforcement; Removal of an Unpleasant Stimulus
- Fear: Positive Punishment; Application of Unpleasant Stimulus
What is a continuous schedule of reinforcement?
- Behaviour is followed by a consequence each time it occurs
- Excellent for getting a new behaviour started
- Behaviour stops quickly when reinforcement stops
- Schedule of choice for punishment and time-out
What is thinning intermittent reinforcement?
• One of two methods commonly used:
―Gradually increasing the response ratio or the duration of the time interval between response –> reinforcer
Response ratio = how many times they have to respond to get a reward
Can change behaviour to get the response ratio we want
Time interval = no matter how many times they respond, they only get the reward once the set time has elapsed
―Providing instructions such as rules, directions and signs to communicate the schedule of reinforcement
i.e. give a cue/signal that reinforcement is on its way
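As a rough illustration of the first method, the response ratio can be stepped up in small increments rather than in one jump. The helper `thinned_ratios` and the step size are hypothetical; the cards only say the increase should be gradual.

```python
def thinned_ratios(start, end, step):
    """List the response ratios used while thinning from `start` up to `end`.

    `step` (an assumed parameter) is how much the requirement grows each stage.
    """
    ratios = list(range(start, end, step))
    ratios.append(end)  # always finish on the final target ratio
    return ratios


print(thinned_ratios(1, 10, 3))  # [1, 4, 7, 10]
```

Here the learner moves from a reward for every response (1) to a reward for every 10th response in four stages.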
What are 4 partial schedules used for resistance to extinction?
• Ratio Schedules: (Responses/actions)
• e.g. after the pre-determined number of responses has been made –> outcome
• Interval Schedules: (Time lapse)
• e.g. the 1st response after the specified time has elapsed –> outcome
• Fixed Schedules: (set rate/time)
• e.g. every 5 responses (ratio) or every 5 mins (interval) –> outcome
• i.e. a predictable schedule
• Variable Schedules: (random average)
• e.g. every 2 - 5 responses (ratio) or every 2 - 5 mins (interval) –> outcome
• i.e. an unpredictable schedule
Combinations: • Fixed-Ratio • Variable-Ratio • Fixed-Interval • Variable-Interval
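The four combinations can be sketched as small rule functions that decide whether a given response earns the outcome. This is a simplified model, not a standard implementation: the function names, the use of a plain number for time, and drawing the variable requirement uniformly from a range are all assumptions for illustration.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def respond(_time):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(low, high, rng):
    """Reinforce after a response count redrawn from low..high each time."""
    count, target = 0, rng.randint(low, high)
    def respond(_time):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, rng.randint(low, high)
            return True
        return False
    return respond


def fixed_interval(wait):
    """Reinforce the first response made once `wait` time units have elapsed."""
    last = 0.0
    def respond(time):
        nonlocal last
        if time - last >= wait:
            last = time
            return True
        return False
    return respond


def variable_interval(low, high, rng):
    """Like fixed_interval, but the wait is redrawn from low..high each time."""
    last, wait = 0.0, rng.uniform(low, high)
    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last, wait = time, rng.uniform(low, high)
            return True
        return False
    return respond


# One response per time unit for 12 time units.
fr5 = fixed_ratio(5)
print([t for t in range(1, 13) if fr5(t)])  # [5, 10]
fi5 = fixed_interval(5)
print([t for t in range(1, 13) if fi5(t)])  # [5, 10]
```

On a ratio schedule, responding faster brings the outcome sooner; on an interval schedule, extra responses before the time has elapsed earn nothing.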
What is a fixed ratio schedule?
The same ratio applies throughout
• Behaviour/reinforcement (e.g. 100/1 or 15/1)
• Response rate: higher ratio = faster responding
• Behaviour: tend to work hard (ratio run); receive reinforcement; then a brief post-reinforcement pause (PRP); then work hard again
• Resistance to extinction: low
• High rates of responding –> pause after receiving reward (PRP) –> then onwards for the next reward
• Make the number of responses too high –> ratio strain: a disruption in responding due to an overly demanding response requirement
• Note also that the closer they get to their target number of responses, the faster the rate of bar pressing; this is known as a ratio run
What is ratio strain?
― A result of abrupt increases in ratio requirements
― Characteristics include: avoidance, aggression, and unpredictable pauses in responding
― Ratio strain is the point of too much energy expended in exchange for too little in return.
What is the goal gradient hypothesis?
Animals in traversing a maze will move at a progressively more rapid pace as the goal is approached