Instrumental learning flashcards
What are the instrumental conditioning procedures
Positive reinforcement = R->appetitive (more R)
Punishment = R->aversive (less R)
Negative reinforcement = R->no aversive (more R)
Omission training = R->no appetitive (less R)
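The four procedures form a 2x2 grid: the outcome is either appetitive or aversive, and the response either produces it or prevents it. A minimal Python sketch of that grid follows; the function name `procedure` and its argument names are illustrative assumptions, not terms from the course.

```python
from typing import Tuple


def procedure(outcome: str, response_produces_outcome: bool) -> Tuple[str, str]:
    """Return (procedure name, effect on responding R).

    outcome: "appetitive" or "aversive" (hypothetical labels for this sketch).
    response_produces_outcome: True if R leads to the outcome,
    False if R prevents or cancels it.
    """
    table = {
        ("appetitive", True): ("positive reinforcement", "more R"),
        ("aversive", True): ("punishment", "less R"),
        ("aversive", False): ("negative reinforcement", "more R"),
        ("appetitive", False): ("omission training", "less R"),
    }
    return table[(outcome, response_produces_outcome)]
```

For example, `procedure("aversive", False)` corresponds to negative reinforcement: the response removes something unpleasant, so responding increases.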
What is Thorndike’s law of effect
animals repeat actions that lead to a satisfying state of affairs, and this is called reinforcement
what does Hull believe reinforcement is down to
reinforcement is due to drive reduction, hence the animal will work for food if it is hungry, or for water if it is thirsty etc.
simple schedules and their effects
- Continuous reinforcement, CRF – reinforce every response
- Fixed ratio, FR – reinforce every nth response. Pause after each reinforcement followed by fast responding
- The pause is put down to the animal being full after a while
- Variable ratio, VR – reinforce every nth response on average. Continuous fast responding
- Fixed interval, FI – reinforce the first response after time t has elapsed since the last reinforcer. Pause after each reinforcement followed by gradually increasing response rate
- e.g. with t = 30 seconds, the first response after 30 seconds have elapsed earns a pellet
- As the 30-second mark gets closer, the animal gets excited and responds faster
- Variable interval, VI – same as FI but with a variable time period. Continuous moderate response rate
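The schedule rules above can be sketched as code. This is only a sketch of which responses get reinforced, not of how the animal behaves (the pauses and response rates are empirical findings, not something this code produces). The function names, the uniform distributions used for the variable schedules, and starting the interval timer at time 0 are all assumptions made for illustration.

```python
import random


def ratio_schedule(n_responses, ratio, variable=False, seed=0):
    """Return the 1-based indices of the responses that are reinforced.

    FR: every `ratio`-th response is reinforced (CRF is ratio=1).
    VR: each requirement is drawn uniformly from 1..2*ratio-1,
        so the mean requirement is `ratio` (an assumed distribution).
    """
    rng = random.Random(seed)

    def next_requirement():
        return rng.randint(1, 2 * ratio - 1) if variable else ratio

    reinforced = []
    requirement = next_requirement()
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= requirement:
            reinforced.append(i)
            count = 0
            requirement = next_requirement()
    return reinforced


def interval_schedule(response_times, interval, variable=False, seed=0):
    """Return the response times (in seconds) that are reinforced.

    FI: the first response at least `interval` s after the last reinforcer.
    VI: the same, but each wait is drawn uniformly from 0..2*interval,
        so the mean wait is `interval` (an assumed distribution).
    The timer is assumed to start at time 0.
    """
    rng = random.Random(seed)

    def next_wait():
        return rng.uniform(0, 2 * interval) if variable else interval

    reinforced = []
    last_reinforcer = 0.0
    wait = next_wait()
    for t in sorted(response_times):
        if t - last_reinforcer >= wait:
            reinforced.append(t)
            last_reinforcer = t
            wait = next_wait()
    return reinforced
```

With FI 30 s and responses at 10, 29, 31, 40 and 62 seconds, `interval_schedule([10, 29, 31, 40, 62], 30)` reinforces only the responses at 31 and 62: responding early earns nothing, which fits the pause seen after each reinforcer.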
how do we know an animal learns an action rather than an association
Training the hamster to go the other way would show that it has learned the action and not just the association.
With this bidirectional control, the hamster can change direction. It has learned the action, so this is instrumental learning.
what happened when animals were overtrained
Here, the devalued group press the lever a bit more than the nondevalued.
They press the lever but would leave the reward. This is a habitual, automatic reflex.
The overtrained animals are exhibiting what Adams and Dickinson called a habit, something that an S->R account would expect, where the current outcome value has no impact on the probability of making a response in the presence of the discriminative stimulus. But what about the other group of animals?
what did Colwill & Rescorla 1990 do
- Put a rat in a box
- It can pull a chain or press a lever
- If they pull a chain they get food, lever is sugar water
- Which outcome you get is dependent on the stimulus presented (light or tone)
- Then pair the sugar water with lithium chloride
- When tested in extinction, they won't press the lever
- Their action depends on whether they still desire the outcome
Tony Dickinson suggested two kinds of instrumental learning;
habits and actions
What is the castaway dilemma
In this, someone who is castaway on a desert island is hungry but manages to find and eat coconuts. Then they become thirsty and there’s no water available - what do they do?
The answer is pretty obvious - they drink coconut milk - but would an animal have the ability to learn this?
What did Dawson and Dickinson find originally
They (Dawson and Dickinson) found no difference in performance of the two actions
i.e. the animals did not solve the castaway dilemma
what did Dickinson eventually find
If the animal doesn't know that the sugar water is good when it is thirsty, it won't solve the problem. It first has to find out that sugar water is the better option.
Then it can solve the dilemma.
If it has no such experience, there is no effect.
What is the model of instrumental performance
- Tony Dickinson has argued for a model of instrumental performance that requires inference on the basis of these results. Thus the animal is postulated to reason that:
- I’m thirsty
- If I pull the chain I get sugar water
- Sugar water is good when I’m thirsty
- I’ll pull the chain then.
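The chain of inference above can be sketched as a small lookup: the animal combines what it knows about action->outcome relations with what it has learned about how good each outcome is in its current state. This is a hedged illustration only. The function name `choose_action`, the numeric values, and the food pellet on the lever are hypothetical, not part of Dickinson's model as stated in these notes.

```python
def choose_action(drive_state, action_outcomes, incentive_values):
    """Pick the action whose outcome has the highest learned value.

    drive_state: e.g. "thirsty".
    action_outcomes: action -> outcome ("If I pull the chain I get sugar water").
    incentive_values: (outcome, drive_state) -> value, learned only through
        experience of the outcome in that state ("Sugar water is good when
        I'm thirsty"). Missing entries mean no such experience.
    Returns None when no outcome has a known positive value.
    """
    best_action, best_value = None, 0.0
    for action, outcome in action_outcomes.items():
        value = incentive_values.get((outcome, drive_state))
        if value is not None and value > best_value:
            best_action, best_value = action, value
    return best_action


# Hypothetical example: chain -> sugar water, lever -> food pellet.
actions = {"chain": "sugar water", "lever": "food pellet"}
experienced = {("sugar water", "thirsty"): 1.0, ("food pellet", "thirsty"): 0.1}

print(choose_action("thirsty", actions, experienced))  # chain
print(choose_action("thirsty", actions, {}))           # None
```

The second call shows why experience matters: with no learned value for sugar water while thirsty, the sketch has no basis for preferring either action.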