Learning Theories Flashcards
What is Classical Conditioning?
Classical conditioning involves a naturally occurring stimulus that automatically elicits a response in an organism. Food is an unconditioned stimulus (UCS: produces an unlearned, natural response) and salivation in response to food is an unconditioned response (UCR: a reflex action that requires no learning). A neutral stimulus (one that doesn’t yet produce the response), such as the sound of a can of cat food being opened, must be paired with the unconditioned stimulus before it can produce the response.
If the neutral stimulus and the unconditioned stimulus are repeatedly paired, an association is formed (the cat hears the can, gets food, and responds by salivating). The neutral stimulus becomes the conditioned stimulus (CS) and the unconditioned response (UCR) becomes the conditioned response (CR), which is the behaviour shown in response to the learned stimulus. Therefore, the cat salivates when it hears the sound of the can being opened.
In classical conditioning, once a response has been conditioned, stimuli similar to the conditioned stimulus (CS) tend to produce the same behaviour. This is generalisation. Using the cat example, a cat may salivate when any can is opened in the kitchen. Generalisation means the stimulus triggering a reaction doesn’t have to be the exact one involved in learning, but the more similar it is, the more likely it is to produce the conditioned response. Discrimination can also occur, meaning that, over time, the response only occurs to a specific stimulus. For example, your cat may only respond to a can being opened at a certain time of day, or only to a tin of food but not a glass jar.
What is Extinction and Spontaneous Recovery?
Extinction is the removal of a learned behaviour. In our cat example, if the conditioned stimulus (sound of the can opening) is continually presented without food being paired with it, the cat gradually learns to dissociate the two stimuli and will no longer salivate at the sound of the can being opened.
Spontaneous recovery is the reappearance of an extinguished response, for example when the cat hears the can again after a break. Relatedly, if the tin is paired with food again following extinction, the cat will relearn the association much more quickly than it first learned it. Both show that extinction isn’t the same as unlearning: the response has disappeared but isn’t completely eradicated.
What was Pavlov’s Experiment?
While carrying out his experiments, Pavlov was studying reflex reactions: when a dog encounters the stimulus of food, saliva starts to pour from its salivary glands. He observed that the dogs also drooled and produced saliva without the proper stimulus (food) being present.
In a sequence of experiments, Pavlov then tried to establish how the two phenomena were linked. He created a soundproof lab to see whether presenting precise stimuli would evoke a response in conditions that ensured no direct contact between the dogs and the experimenter. Pavlov knew that food (UCS) would lead to salivation (UCR). He then used a neutral stimulus that wouldn’t elicit the response, i.e., a metronome. Over several learning trials, the dog was presented with the ticking metronome immediately before food was given. Because the ticking occurred in close association with their meal, the dogs learned to associate the sound of the metronome with food, and after a while the sound of the metronome (conditioned stimulus) alone elicited drooling (conditioned response). Pavlov concluded that environmental stimuli unrelated to the reflex action (such as the sound of a metronome) could, through repeated pairings, come to trigger the salivation reflex, and that through this process of conditioning the conditioned stimulus leads to the conditioned response.
Pavlov worked to establish the reliability of his findings. He set out to see whether the same system of learning would work with other neutral stimuli, e.g. presenting a vanilla odour, or a visual stimulus involving a rotating disk, before food was given. He went on to pair a further neutral stimulus with the conditioned stimulus, e.g. a shape or colour (CS2) with the sound of the metronome (CS1), and found that higher order conditioning was possible. He also found that dogs showed stimulus generalisation to sounds of similar tone, but were able to discriminate between sounds of different tone. The more similar the new stimulus was to the conditioned stimulus, the more the dog drooled.
What is Operant Conditioning?
Operant conditioning involves learning through consequences (the outcomes that follow behaviour). An association is made between a behaviour and its consequence. If we are punished for a behaviour, we’re less likely to repeat it in future. However, if a behaviour is followed by positive reinforcement (praise or a physical reward), it is likely to be repeated.
For example, when a lab pigeon taps a blue button with its beak, it receives a food pellet as a reward. However, when the pigeon taps a red button, it receives a mild electric shock. As a result of learning these consequences, the pigeon learns to press the blue button (positive reinforcement) but avoid the red button (positive punishment).
Edward Thorndike (1911) labelled this form of learning instrumental learning. His research involved what he called the puzzle box. This was a box in which he placed a cat, which had to solve a puzzle to escape the box and obtain a food reward. Initially, he observed that the cat would move randomly around the box and then accidentally hit the latch that opened the door; when this happened, the cat was given food. After several learning trials the cat escaped faster, so it had learned by trial and error that finding and opening the latch meant a food reward. Thorndike termed this the Law of Effect: a response followed by a pleasant consequence (i.e., a reward) tends to be repeated, while one followed by an unpleasant consequence (i.e., punishment) tends not to be repeated. Moreover, according to Thorndike’s Law of Exercise, all things being equal, the more often a response is performed in a given situation, the more likely it is to be repeated.
What did Skinner do with Operant Conditioning?
B.F. Skinner renamed instrumental conditioning as operant conditioning. Skinner took a strictly scientific approach and felt that he couldn’t study something that wasn’t directly observable, such as the mind. He believed that, to understand human behaviour, it was necessary to apply scientific principles and methods. He felt the term operant conditioning was more appropriate because, with this form of learning, you are operating on, and being influenced by, the environment.
Skinner began his research in the 1930s using lab experiments with the Skinner Box, which could dispense food or deliver shocks to animals, e.g. rats or pigeons. Skinner created the ABC model of operant conditioning to explain learning:
Antecedent- Present stimuli (Lights/ Noise) that will trigger behaviour.
Behaviour- Response made by the animal that can be observed/measured as an outcome of antecedent.
Consequence- Reward (Food) or Punishment (Shock) that follows the behaviour.
The stimulus-response association is only repeated or learned if the consequence of the pairing is positive. A negative consequence will weaken the stimulus-response link.
Therefore, if a rat or pigeon is given something pleasurable, e.g. a food pellet, following the desired behaviour (lever pressing), it is more likely to repeat the behaviour in future. This is known as positive reinforcement. On the other hand, negative reinforcement is the removal of something unpleasant in response to the desired behaviour. This also increases the likelihood of the behaviour being repeated, in order to avoid the unpleasant stimulus. Therefore, if a rat or pigeon is given electric shocks until the lever is pressed, it is more likely to press the lever in future to stop or avoid the shock.
In summary, both positive and negative reinforcement produce repeated behaviour. Punishment, on the other hand, weakens behaviour, either by presenting something unpleasant or painful whenever the behaviour is shown, such as rats being shocked for pressing a lever (positive punishment), or by removing a pleasant or desirable stimulus when the behaviour is shown, such as withdrawing attention (which the dog wants) from a dog that jumps up (negative punishment). Both should reduce the unwanted behaviour, i.e. the rat’s lever pressing and the dog’s jumping.
What are Types of Reinforcers?
In operant conditioning, there are two types of reinforcers that increase the likelihood of a behaviour being learned. Primary reinforcers occur naturally and satisfy basic needs, such as food, water, and shelter. Secondary reinforcers, on the other hand, strengthen behaviour because they are associated with a primary reinforcer, e.g. money can be used to buy food, accommodation, clothing, and so on.
What are Schedules of Reinforcement?
When and how often you reinforce a behaviour can have a large impact on the strength and likelihood of the behavioural response. A schedule of reinforcement is a “rule” which dictates the situations in which a behaviour will be reinforced. In some situations a behaviour may be reinforced every time it is seen (continuous reinforcement), although in day-to-day life behaviour may only be reinforced some of the time (partial reinforcement). Behaviour acquired through partial reinforcement may take longer to learn but is more resistant to extinction.
The four schedules of partial reinforcement are:
Fixed interval- The first correct response is rewarded only after a pre-set amount of time has passed. Learning takes longer, but the animal’s response rate (the number of responses that occur within a specified time interval) is higher towards the end of each interval. There’s a scalloping effect (a dramatic drop-off in responding immediately after reinforcement).
For example, a rat in a Skinner Box gets a food pellet for pressing the lever only after a 30-second delay.
Variable interval- The first correct response is rewarded after a set amount of time has passed, after which a new, different time period is set. Learning is still noticeable, but the scalloping effect seen in fixed interval reinforcement isn’t seen here.
For example, a rat may have to wait 30 seconds, and then a minute.
Fixed ratio- A response is reinforced only after a specified number of responses.
For example, providing a food pellet to a rat after it has pressed the lever eight times.
Variable ratio- A response is reinforced after a set number of correct responses has been given. Once this has been achieved, the number of correct responses required for the next reinforcement changes. Skinner argued this schedule is good for maintaining behaviour.
For example, a rat may be given food after eight lever presses, and then after sixteen lever presses.
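The four partial schedules above can be sketched as two small rules, one counting responses and one timing them. This is only an illustrative sketch: the function names (ratio_schedule, interval_schedule) are made up for this example, and the "variable" schedules simply cycle through a fixed list of requirements (such as 8 then 16 presses), whereas a real variable schedule would change unpredictably.

```python
from itertools import cycle


def ratio_schedule(num_responses, requirements):
    """Mark which responses are reinforced on a ratio schedule.

    requirements = [8] models fixed ratio (every 8th response);
    requirements = [8, 16] models a simplified variable ratio
    (assumption: the requirement list is cycled, not randomised).
    """
    needed_values = cycle(requirements)
    needed = next(needed_values)
    count = 0
    reinforced = []
    for _ in range(num_responses):
        count += 1
        if count == needed:
            reinforced.append(True)
            count = 0
            needed = next(needed_values)  # set the next requirement
        else:
            reinforced.append(False)
    return reinforced


def interval_schedule(response_times, intervals):
    """Mark which responses are reinforced on an interval schedule.

    response_times are seconds since the start of the session.
    The first response made once the current interval has elapsed
    (measured from the last reinforcement) is reinforced.
    intervals = [30] models fixed interval; [30, 60] models a
    simplified variable interval (again cycled, not randomised).
    """
    waits = cycle(intervals)
    wait = next(waits)
    last_reinforced_at = 0.0
    reinforced = []
    for t in response_times:
        if t - last_reinforced_at >= wait:
            reinforced.append(True)
            last_reinforced_at = t
            wait = next(waits)  # set the next waiting period
        else:
            reinforced.append(False)
    return reinforced


presses = [10, 30, 35, 61, 90]  # hypothetical lever-press times in seconds
print(interval_schedule(presses, [30]))      # fixed interval
print(interval_schedule(presses, [30, 60]))  # variable interval
print(sum(ratio_schedule(24, [8])))          # pellets under fixed ratio
print(sum(ratio_schedule(24, [8, 16])))      # pellets under variable ratio
```

With the same 24 lever presses, the fixed ratio rule delivers three pellets and the 8-then-16 rule delivers two, which shows how the schedule alone changes how often the same behaviour pays off.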
What is Behaviour Modification?
Behaviour modification (including shaping behaviour)- Changing behaviour gradually over time.
The ideas are to: extinguish undesirable behaviour by removing the reinforcer, replace the original behaviour with a desirable one, and reinforce the new behaviour.
Skinner developed the idea to include “methods of successive approximation”. At the start, general behaviours are rewarded, then more specific behaviours are rewarded, so that behaviour is rewarded the closer it gets to the desired behaviour (a gradual change).
Used in numerous contexts- Therapy for ADHD, OCD, and autism. A target behaviour is identified and rewards are given for behaviour that gradually gets closer to the target. For example, therapists may use rewards to encourage a specific behaviour in a child.
What is Social Learning Theory?
Social learning theory explains behaviour as learned through observation and is attributed to the work of Bandura. The belief is that humans and animals learn by observing the world around them and imitating or copying the behaviour they see. The individuals who are observed are called models.
Mineka and Cook (1988) observed rhesus monkeys raised in captivity that originally showed no fear of snakes but did show alarm after watching the anxious reactions of wild monkeys in the presence of snakes. Similarly, children are surrounded by many role models (significant individuals in a person’s life) such as parents, peers, and teachers. Daily, these models provide examples of behaviour for children to observe and replicate.
Behaviour is more likely to be copied if the observer can identify with the role model and the observed behaviour in some way. Effective role models are typically the same sex as the observer and/or are admired for their higher status or power. Similarly, an observer is more likely to reproduce the model’s behaviour if its consequences for the role model are rewarding rather than punishing. For example, if a younger sibling watches their older sibling eat lunch and get praised for using a knife and fork, they’re more likely to copy the behaviour. They are unlikely to copy eating behaviour that they have seen punished, for example the older sibling eating lunch with their mouth open. This process is known as vicarious reinforcement- Learning through the consequences of another person’s behaviour.
What are the Stages of Social Learning Theory?
Attention- One of the required conditions for effective modelling (a way of learning through imitation) is attention. This illustrates a clear cognitive element within Bandura’s theory, and one that can determine whether a behaviour is copied or not. Attention is necessary for learning, but it depends on many factors, such as the distinctiveness of the behaviour being modelled and factors within the person observing the model, such as their level of arousal.
Retention- Observers must retain (store) what they have attended to, and can use language and imagery to assist retention. Humans store the behaviours they observe in the form of mental images or verbal descriptions and can recall these when reproducing the behaviour.
Reproduction- Showing the modelled behaviour, reproducing what has been observed. Bandura made clear that factors such as physical capabilities of individual as well as self-observation of reproduction affected showing of behaviour. If the behaviour is beyond our capabilities, then it cannot be reproduced.
Motivation- The final process refers to incentive. If a reward is offered, we’re more likely to reproduce the behaviour.
Intrinsic motivation refers to inherent satisfaction rather than a physical outcome, e.g. a girl may imitate her mom’s behaviour because it makes her feel more like her mom.
Extrinsic motivation refers to something tangible, like a sportsperson receiving a trophy or medal.
Vicarious reinforcement is a form of motivation that doesn’t directly reward the individual themselves. For example, a child could witness another child showing good behaviour and getting a reward and, even though the observing child doesn’t get the reward, they think “If I act like that, I could get a reward too”.
What is the Aim and Sample of Bobo Doll study?
Aim- To investigate whether exposure to aggression would influence behaviour.
Sample- 72 children from Stanford University Nursery School: 36 boys and 36 girls with a mean age of 52 months. The children were split into eight experimental groups (six in each) and a control group of 24 children. Half the children in the experimental groups observed an aggressive role model and the other half observed a non-aggressive role model. Bandura then split the groups again so that half the children in each of the aggressive and non-aggressive conditions saw a same-sex role model and the other half didn’t.
The control group didn’t experience the presence of a role model; their behaviour was observed when the children were allowed to play with the toys in the final stage.
In order to control for baseline levels of aggression (physical aggression, verbal aggression, aggression inhibition, and aggression towards inanimate objects), participants were rated on each of these characteristics on four separate 5-point scales, and the children in each group were then matched for aggression so that the groups were similar.
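The group sizes in the sample can be checked with a little arithmetic. The sketch below is only a worked check of the numbers given above, and it assumes the eight experimental groups come from crossing model behaviour, model sex (same or opposite), and the child's sex. That third factor is an inference from the 36 boys and 36 girls, not something stated in the notes.

```python
from itertools import product

TOTAL_CHILDREN = 72
CONTROL_GROUP = 24
CHILDREN_PER_EXPERIMENTAL_GROUP = 6

model_behaviour = ["aggressive", "non-aggressive"]
model_sex = ["same-sex model", "opposite-sex model"]
child_sex = ["boys", "girls"]  # assumption: inferred from 36 boys + 36 girls

# Every combination of the three factors is one experimental group.
experimental_groups = list(product(model_behaviour, model_sex, child_sex))
experimental_children = len(experimental_groups) * CHILDREN_PER_EXPERIMENTAL_GROUP


def children_who_saw(behaviour):
    """Count children across all groups that saw the given model behaviour."""
    return sum(
        CHILDREN_PER_EXPERIMENTAL_GROUP
        for group in experimental_groups
        if group[0] == behaviour
    )


print(len(experimental_groups))                # number of experimental groups
print(experimental_children)                   # children in experimental groups
print(experimental_children + CONTROL_GROUP)   # should equal TOTAL_CHILDREN
print(children_who_saw("aggressive"))          # children who saw aggression
```

Under these assumptions the numbers add up: 8 groups of 6 give 48 children, plus 24 controls gives 72, with 24 children seeing each type of model.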
What is the Procedure of Bobo Doll study?
The child was brought into a room and shown how to play nicely with various toys, such as the Bobo doll and tinker toys.
The child was placed in a room with the model and the experimenter; the child sat in one corner and the model stood in the other. After some time with the child, the experimenter left, and the child observed the model, who played normally with the toys at first, then:
In aggressive condition, the model performed distinctive aggressive acts on the doll (kicked, punched, threw, hit doll with mallet), repeated 3 times and accompanied by verbal statements like “Pow”.
In non-aggressive condition- Model played with other toys and ignored doll
After 10 minutes, the child spent 2 minutes in a room playing with toys until told the toys weren’t for them but were for other children, in order to provoke mild aggression arousal
In next room, child spent 20 minutes with toys (Categorised as aggressive and non-aggressive for experiment) being observed through one-way mirror (Covert) through interval sampling (Behaviour observed at regular time interval)
Three types of imitative behaviour were used to score participant behaviour: Imitative physical aggression- Punching, kicking, and throwing the doll; Imitative verbal aggression- “Pow, Hit him down” (imitating the adult); Imitative non-aggressive verbal statements- “He really is a tough fella”.
Other categories included Mallet aggression- Mallet used to hit other objects; Acts of non-imitative physical or verbal aggression- Acts not modelled to the children, like aggression towards other objects and statements like “Shoot the bobo”; Aggressive gun play- Child aims gun at other objects and “shoots”
Observations of non-aggressive behaviour (Sitting quietly) also made
What are the results and conclusions of Bobo Doll study?
Results (Key Observations)- Participants in aggressive model condition tended to display a lot more physically and verbally aggressive acts.
Imitation wasn’t just linked to aggressive acts: 1/3 of participants in the aggressive condition also repeated the model’s non-aggressive verbal statements.
Participants in the aggressive condition were more likely to display non-imitative aggression.
There was also partial imitation of the model’s behaviour. For example, use of the mallet differed significantly between conditions, and sitting on the Bobo doll was significantly more common in the aggressive condition than in the non-aggressive and control conditions.
Conclusions- Bandura concluded that, if a child was exposed to an aggressive model, it was likely that they’d imitate the model’s behaviour. Boys were more likely than girls to imitate a same-sex role model.
Evaluate Bandura Bobo Doll Study
The sample was 72, large enough that anomalies such as disturbed children may be cancelled out, increasing the generalisability of the results. However, because the children were taken from a university nursery, they may have had unusual home lives and highly educated parents. This makes the sample unrepresentative of children from different backgrounds or with less educated parents, meaning the results regarding imitation of model behaviour cannot be generalised to children from these backgrounds.
Method 1- Two observers were used behind a one-way mirror, which gives the results inter-rater reliability because behaviour had to be noted by both researchers or it didn’t count. However, observations can be subjective, so the reliability of the results can still be questioned.
Method 2- The Bobo doll’s purpose is to be hit and knocked over, so children may have felt encouraged to act unnaturally and show physically aggressive behaviour towards the doll (demand characteristics), which lowers the internal validity of the results. However, children in the aggressive and non-aggressive groups were matched on baseline levels of aggression, which strengthens the causal link between observing an aggressive model and exhibiting aggression, increasing the internal validity of the results.
Conclusions- Bandura concluded that, if child was exposed to an aggressive model, it was likely that they’d imitate their behaviour. It suggests children observe and imitate adults, so if you want your children to grow up calm and well-behaved, you need to keep your temper and keep them away from aggressive role models. Calm role models seem to have a big effect, which might apply to “buddy” systems used in schools or prisons to help troubled students or prisoners learn from a role model.
To conclude, the research is strong in that the results have high inter-rater reliability and the conclusions can be applied to reducing aggressive behaviour in education and through parenting. However, the results are limited in generalisability (unrepresentative sample) and internal validity (demand characteristics).
What is the Aim and Sample of Bandura (1965) Bobo doll experiment with vicarious reinforcement?
Aim- Unlike the original Bobo doll study, in this variation Bandura arranged for children to watch a televised model exhibiting novel verbally and physically aggressive behaviour, to investigate whether children would be more aggressive when they saw a model rewarded for their aggression.
Sample- 33 male and 33 female participants from Stanford University Nursery School were randomly allocated to one of three conditions:
* Model rewarded for Aggressive behaviour.
* Model Punished for aggressive behaviour.
* No consequences (Control)
What is the procedure of Bandura (1965) Bobo doll experiment with vicarious reinforcement?
Children followed the researcher into a room. They were told that, before they could go to a surprise playroom, they would have to wait while the experimenter dealt with some business, and that while they waited they might want to watch television (which showed a 5-minute programme in which the model exhibited aggressive behaviour). Depending on the condition, the model was rewarded, punished or, in the control condition, received no response at all to their behaviour. In the film, the model initially walked up to the Bobo doll and asked it to clear the way. The model then showed four distinctive aggressive responses along with verbal statements (not considered to be in a child’s normal verbal repertoire):
Model put bobo doll on side and sat on it, punching nose and saying “Pow, right in the nose, boom”.
Bobo doll was allowed to come back up again before model hit it on head with mallet, accompanied with statement “Sockeroo…stay down!”
Model kicked doll around room and said, “Fly away”.
Model threw rubber balls and would shout “Bang” every time one hit.
Order of behaviour repeated twice during programme.
In closing scene of programme, model was either rewarded, punished, or nothing happened.
Model rewarded condition- Second adult approached model with soft drink and sweets. Adult then stated to model that he was “Strong Champion” and that aggressive behaviour was seen as deserving “Considerable treats”. While model was eating and drinking, second adult made further comments that positively reinforced aggressive behaviour.
Model Punished Condition- Second adult approached model shaking finger disapprovingly, stating “You big bully, you quit picking on that clown. I won’t tolerate it”. As model drew back from second adult, he tripped and fell. The second adult sat on model, hit him with rolled paper and reminded him of how bad his aggressive behaviour was. Model then ran off cowering and second adult said “If I catch you doing that again, you big bully, I’ll give you a hard spanking. You quit acting that way”.
No consequence condition (Control)- Closing scene of same film included no form of reinforcement.
Following exposure to the closing scene, the participant was taken to another room containing a Bobo doll, a mallet, three balls, a peg board, dart guns, plastic farm animals, and a doll house with dolls and furniture.
For a total of 5-10 minutes, children were observed, with behaviour being recorded every 5 seconds. Two observers recorded observations, but neither had any knowledge of which condition the children were assigned to.
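To get a feel for how much data this interval sampling produces, the number of recordings per child can be worked out from the figures above. This is a minimal sketch, and the function name observation_count is made up for illustration.

```python
def observation_count(minutes, interval_seconds=5):
    """Number of recording points when behaviour is noted once per interval."""
    return (minutes * 60) // interval_seconds


# A session lasting between 5 and 10 minutes, sampled every 5 seconds.
shortest = observation_count(5)
longest = observation_count(10)
print(shortest, longest)
```

So each child would contribute somewhere between 60 and 120 recorded intervals per observer.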
What are the results and Conclusions of Bandura (1965) Bobo doll experiment with vicarious reinforcement?
Results- Bandura’s results showed that children were more likely to imitate aggressive behaviour if the model was positively rewarded. Bandura’s belief that boys would perform more imitated responses than girls was also supported.
Conclusion- Children imitated novel verbal and physical behaviour shown by a televised role model.
Vicarious reinforcement took place and has a significant role in children learning aggressive behaviour.