Instrumental conditioning Flashcards
What did Thorndike and his puzzle boxes demonstrate?
That animals learn a response to get access to a reinforcer
Learning achieved through simple trial and error, with no evidence of any active understanding of how to solve the problem
Learn from consequences of actions
What was Thorndike’s “law of effect”?
Any behaviour followed by pleasant consequences is likely to be repeated, and any behaviour followed by unpleasant consequences is less likely to be repeated
How does operant conditioning resemble/differ from classical conditioning?
Similarities - Basic principles e.g. acquisition, extinction, spontaneous recovery and stimulus generalisation
Differences - instrumental involves animal having to DO something to get reinforcement i.e. animal has control, voluntary responses
What are the 3 basic concepts within instrumental conditioning?
R - instrumental response e.g. lever press
rF - reinforcement e.g. access to food
SD - discriminative stimulus i.e. stimulus that informs animal about availability of reinforcement
What happens in operant conditioning experiments using the skinner box?
Rat must press the lever to obtain food
As trials continue, number of presses per minute steadily increases as lever press becomes more strongly associated with food reward
What does the law of effect suggest regarding what animals actually learn in operant conditioning experiments?
By associative learning theory, we have a node that corresponds to the discriminative stimulus i.e. lever, and a node for response (lever press) and reinforcement (food)
Law of effect would suggest association only exists between SD and R i.e. lever and lever press, with reinforcement having modulatory effect aiding establishment of association (initial reinforcement attracts animal close to lever)
Animal learns to press lever through trial and error and learning from consequences but cannot predict consequences of behaviour because no link between response and rF or between SD and rF
How did Adams and Dickinson test this theory?
PHASE 1 - both exp and control group receive same treatment, presented with lever which released food when pressed so SD-R association learned
PHASE 2 - Devaluation of reinforcer via induction of illness; exp group received food and illness close together so association formed; tests whether the food actually has impact on SD-R association
Law of effect would predict that devaluing the reinforcer in this way will not affect results of test - SD-R relationship should still exist and both groups should still perform the same
What did Adams and Dickinson find?
Animals in exp group stopped responding to large extent while those in control continued high rate of pressing
Challenges law of effect, demonstrating that reinforcement is part of the association and we need to account for its value in understanding how learning occurs
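The logic of the devaluation test can be sketched as a toy comparison of two accounts. This is an illustration only, not Adams and Dickinson's analysis: the function names, the association strength of 1.0 and the outcome values of 1.0 and 0.1 are all hypothetical numbers chosen to show the direction of each prediction.

```python
def rate_law_of_effect(sd_r_strength, outcome_value):
    """SD-R account: responding depends only on the SD-R association.

    outcome_value is accepted but deliberately ignored, because on this
    account the reinforcer is not part of what is learned.
    """
    return sd_r_strength


def rate_outcome_sensitive(sd_r_strength, outcome_value):
    """Alternative account: responding is scaled by the current value of
    the reinforcer (hypothetical multiplicative rule, for illustration)."""
    return sd_r_strength * outcome_value


# Hypothetical values: both groups learn the same SD-R strength in phase 1;
# phase 2 lowers the food's value for the experimental group only.
groups = {"control": 1.0, "experimental": 0.1}
for name, value in groups.items():
    print(name,
          rate_law_of_effect(1.0, value),
          rate_outcome_sensitive(1.0, value))
```

Only the outcome-sensitive rule predicts the drop in responding that was reported for the experimental group; the SD-R rule predicts no difference between groups.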
How do the behaviourist movement and the modern learning theory differ?
Behaviourism - only associations between stimuli and responses established during learning episodes
Modern - more interactive relationship wherein associations may develop between different stimuli, responses and outcomes (reinforcers) in standard conditioning tasks
What are the different procedures used to develop understanding of operant conditioning?
Positive procedures - Response produces presentation of an event, can be reinforcement e.g. food, or punishment e.g. shock
Negative procedures - Response terminates or prevents presentation of an event, can be omission e.g. food withheld (aversive outcome), or escape/avoidance e.g. avoiding a shock (appetitive outcome)
How else can the procedures be classified?
Reinforcement - Increase response rate; can be positive i.e. using food to encourage lever press, or can be negative i.e. learning that pressing lever allows escape and avoidance of shock
Punishment - Decrease response rate; can be positive e.g. giving shock when press lever thus discouraging further pressing, or negative i.e. something appetitive being taken away (common strategy when children misbehave)
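The two classifications above combine into a 2x2 grid, which can be sketched as a small lookup. The function name and the string labels ("presents", "removes", "appetitive", "aversive") are hypothetical, chosen only for this illustration.

```python
def classify_procedure(response_effect, event_type):
    """Name the procedure from what the response does and to what kind of event.

    response_effect: "presents" or "removes" (removes also covers preventing)
    event_type: "appetitive" (e.g. food) or "aversive" (e.g. shock)
    """
    if response_effect not in ("presents", "removes"):
        raise ValueError("response_effect must be 'presents' or 'removes'")
    if event_type not in ("appetitive", "aversive"):
        raise ValueError("event_type must be 'appetitive' or 'aversive'")

    # Positive = response produces the event; negative = response removes it.
    sign = "positive" if response_effect == "presents" else "negative"

    # Response rate increases when the response brings something good
    # or takes away something bad; otherwise it decreases.
    increases = (response_effect == "presents") == (event_type == "appetitive")
    kind = "reinforcement" if increases else "punishment"
    return f"{sign} {kind}"


print(classify_procedure("presents", "appetitive"))  # food for a lever press
print(classify_procedure("removes", "aversive"))     # lever press avoids shock
print(classify_procedure("presents", "aversive"))    # shock for a lever press
print(classify_procedure("removes", "appetitive"))   # something pleasant taken away
```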
What is the conditioned emotional response procedure?
STEP 1 - Training in instrumental task (lever press -> food)
STEP 2 - Classical conditioning training once lever is associated with food (tone -> shock) - shock is the aversive US and interruption of lever pressing is the UR; CR is when tone (CS) alone interrupts lever pressing
What do we need to take care with in order for the CER procedure to work correctly?
How we conduct step 1 - if animal gets a food reward every time it presses the lever, it will soon be satiated and no longer motivated to press - at this point there is no lever pressing for the tone to interrupt, so no classical conditioning will be observable
To avoid this we need to limit number of rewards during a session - has been shown to sustain a high response rate for long periods
What is meant by an interval schedule of reinforcement?
Reward only becomes available every now and again, after a period of time has passed e.g. every minute
What is meant by a fixed interval schedule?
Reward for the first response made after a fixed period of time since the last reward - during the interval any further lever presses have no effect, but once the interval is up the next lever press is rewarded. Animals somehow learn about time: responding is not uniform but is concentrated towards the end of each interval
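The fixed interval rule can be sketched as a short function. This is a minimal illustration, not a description of any particular apparatus: the function name, the press times and the 60-second interval are made up, and it assumes the first interval is timed from the start of the session.

```python
def fixed_interval_rewards(press_times, interval):
    """Return the lever-press times that earn a reward on a fixed interval schedule.

    press_times: times of lever presses, in seconds from session start
    interval: fixed time that must pass since the last reward
    """
    rewarded = []
    last_reward = 0.0  # assumption: first interval starts at session start
    for t in sorted(press_times):
        if t - last_reward >= interval:
            # Interval is up: this press is rewarded and the clock restarts.
            rewarded.append(t)
            last_reward = t
        # Otherwise the press falls inside the interval and has no effect.
    return rewarded


presses = [10, 30, 65, 70, 100, 128, 140]  # hypothetical press times
print(fixed_interval_rewards(presses, interval=60))  # [65, 128]
```

Note that the interval restarts from the rewarded press, not from a fixed clock, so a late press delays the next opportunity for reward.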