learning and cond lect 2 operant cond Flashcards
learning curve (Thorndike)
-how long it takes cats to escape puzzle box to access food on outside
-inside box was lever that would open box
-same cat in box, cat learns lever opens box to access food
-time for cat to open the box was plotted on a graph into a learning curve
what does Thorndike's learning curve show us
-responses ending in satisfactory outcome are more likely to be repeated
-discomforting responses are less likely to be repeated
what is Thorndike's law of effect
-first formal statement of importance of consequences in learning
-established the beginning of operant conditioning even if not named at this stage
3 term contingency
ABCs of behaviour
-A= antecedent
-B = behaviour (what people say/ do, not attitudes or feelings)
-C = consequence
what are behaviour characteristics
they can have 1 or more dimensions, can be observed/described/recorded, have an impact on environment, may be overt or covert
effect of consequences on behaviour
e.g you get a higher mark on your essay for critical evaluations so you do this in your next essay
-reinforcement: a consequence after a behaviour that increases probability of future similar behaviours
-punishment: decreases probability of future similar behaviours
what are the 4 basic consequences
pos reinforcement: something added to increase likelihood of future behaviour
neg reinforcement: something removed to increase likelihood of future behaviour
pos punishment: something added to reduce likelihood of future behaviour
neg punishment: something removed to decrease likelihood of future behaviour
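The four consequences above come from two questions: was a stimulus added or removed, and did the behaviour become more or less likely. A minimal sketch of that 2x2 lookup follows; the function name and the string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change, behaviour_effect):
    """Name the consequence from the stimulus change and its effect on behaviour.

    stimulus_change: "added" or "removed" (hypothetical labels)
    behaviour_effect: "increases" or "decreases" (hypothetical labels)
    """
    # Positive/negative describes what happened to the stimulus.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes what happened to the behaviour.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_effect]
    return f"{sign} {kind}"


# Higher essay mark given, critical evaluation becomes more likely.
print(classify_consequence("added", "increases"))    # positive reinforcement
# Play time taken away, problem behaviour becomes less likely.
print(classify_consequence("removed", "decreases"))  # negative punishment
```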
SD =
discriminative stimulus (antecedent)
SR + =
pos reinforcement
role of motivation in reinforcement
-motivation is an establishing operation (EO)
-we expect a behaviour when we have an EO
e.g
EO (thirsty) - SD (tap marked with C for cold water) - response (turn tap on with C) - SR+ (cold water presented)
superstitious behaviour
Skinner: you can accidentally cond superstitious behaviour in animals
-when reinforcement e.g food pellet accidentally follows a behaviour that did not produce the reinforcement, animals can be cond for superstitious behaviours
e.g sports players sometimes wear lucky socks as they won a game wearing these once or twice
Skinner research
-pigeons provided with reinforcement at regular intervals with no reference to birds behaviour
-if bird happens to be executing a particular response at same time, they tend to repeat this behaviour
-if interval is not too long that the pigeon would have forgotten, it does the same thing again and by coincidence food pellet is given again (leads to accidental reinforcement of superstitious behaviour)
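The accidental reinforcement described above can be sketched as a toy simulation. This is not Skinner's procedure or a validated learning model: the behaviour names, the starting weights and the `boost` rule are all assumptions made for illustration. Food arrives every `food_every` ticks regardless of behaviour, and whatever the bird happens to be doing at that moment becomes more likely.

```python
import random


def simulate_superstition(ticks=600, food_every=15, boost=1.0, seed=0):
    """Toy model: food on a fixed timer strengthens whatever behaviour coincides with it."""
    rng = random.Random(seed)
    behaviours = ["turn", "peck", "head_bob", "wing_flap"]  # hypothetical responses
    weights = {b: 1.0 for b in behaviours}  # all equally likely at the start
    food_count = 0
    for t in range(1, ticks + 1):
        # The bird emits one behaviour per tick, chosen in proportion to its weight.
        current = rng.choices(behaviours, weights=[weights[b] for b in behaviours])[0]
        # Food is delivered on the timer, with no reference to the behaviour.
        if t % food_every == 0:
            food_count += 1
            weights[current] += boost  # accidental reinforcement
    return weights, food_count


weights, food_count = simulate_superstition()
print(food_count, weights)
```

Because a behaviour that was reinforced once is more likely to be occurring at the next delivery, the weights tend to drift apart even though no behaviour ever produced the food.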
escape contingency
-in escape contingency a response terminates a stimulus that is present e.g sun in eyes so move hand to cover sun, glare of sun escaped
avoidance contingency
-response prevents or postpones presentation of stimulus e.g hear a friend you dislike coming so look down so disliked friend is avoided
similarity between pos and neg reinforcement
increase in response via stimulus change
diff between pos and neg reinforcement
-pos produces previously absent stimulus
-neg removes stimulus present prior to behaviour
factors affecting reinforcement
-timing between stim and response (best if stim given straight after behaviour)
-reinforcement is regular after behaviour occurs
-reinforcement needs to be specific to desired behaviour
-for humans a verbal description for which behaviour is being reinforced is helpful
conditioning and awareness
Bradshaw and Reed
-pos reinforcement, schedules of behaviour and awareness
-over 10 experiments
-computer task: press button to earn points, told to find best way to earn points
-ratio schedule: points earned by pressing quickly
-interval schedule: most efficient way was to wait 10 secs then press button
-pp given questionnaire asking what was best way to score points
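Why fast pressing pays on a ratio schedule but waiting pays on an interval schedule can be shown with a short sketch. The scoring rules here are assumptions, not the study's actual task: a ratio of 5 presses per point, a fixed 10 second interval, and a 120 second session are hypothetical values.

```python
def points_on_ratio_schedule(n_presses, ratio=5):
    """One point per `ratio` presses (assumed rule): more presses, more points."""
    return n_presses // ratio


def points_on_interval_schedule(press_times, interval=10.0):
    """One point for the first press made once `interval` secs have elapsed
    since the last point (assumed rule); earlier presses earn nothing."""
    points = 0
    last_point = 0.0  # timer starts at the beginning of the session
    for t in sorted(press_times):
        if t - last_point >= interval:
            points += 1
            last_point = t
    return points


session = 120  # seconds (hypothetical session length)
fast = [float(t) for t in range(1, session + 1)]        # press every second
paced = [float(t) for t in range(10, session + 1, 10)]  # wait 10 secs then press

print(points_on_ratio_schedule(len(fast)), points_on_ratio_schedule(len(paced)))  # 24 2
print(points_on_interval_schedule(fast), points_on_interval_schedule(paced))      # 12 12
```

Under these assumptions the fast presser earns far more on the ratio schedule, while on the interval schedule both earn the same points and the paced presser uses a tenth of the presses.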
findings of Bradshaw and Reed cond and awareness study
-those who scored most points were aware of how to score these points
-clear relationship with contingency awareness
-performance on schedules of reinforcement was significantly related to awareness of which responses produced reinforcement
applying pos reinforcement
-tell learner about programme at outset
-describe desired behaviour
-use lots of praise and physical contact
-can gradually fade reinforcers once consistent e.g praise every 2
nature of reinforcement
it is NOT the case that some behaviours are more susceptible to reinforcement than others
using reinforcement effectively
1.use high quality reinforcement
2.set an easily achievable initial criterion
3.explain the contingency and provide prompts
4.deliver reinforcer immediately after behaviour
5.initially reinforce every occurrence of behaviour
6.gradually increase the response-to-reinforcement delay
7.use varied reinforcers
8.shift from contrived to naturally occurring reinforcers
strengths of pos/neg reinforcement
+ abundance of evidence for effectiveness (Skinner)
+ relatively ethical compared to punishment
+ can help indiv and society to develop good pos practices
+ can be effective in LT
weaknesses of pos/neg reinforcement
- occasional chance of reinforcing non target behaviour
- can be used unethically
- can be lengthy to implement
- important to get the reinforcer right
- not useful to prevent harmful behaviours rapidly
what is extinction
-reinforcement stops occurring and behaviour ceases
Williams 1959
-extinction of night time tantrums
-not giving child any attention/reinforcement
-tantrum reduced/ length of crying reduced after a few nights of no attention
Lovaas and Simmons
-extinction of self injurious behaviour
-children hitting head against wall/ with hand etc
-extinction used on 2 children, punishment used on 1
-behaviour of 1 child who experienced extinction improved, the other child's didn't
-punishment may not prevent behaviour as it is giving child more attention
side effects of extinction
-behaviour may increase in frequency, duration or intensity initially (extinction burst)/ resist extinction
-novel behaviours may occur, diff behaviours tried
-emotional response/ agg behaviour may occur
misconception about extinction
-doesn't mean simply ignoring behaviour, unless attention is the reinforcer
using extinction to reduce behaviour
-you have to know what the reinforcer is so it can be withheld after the behaviour (or, if the behaviour is negatively reinforced, so the aversive stimulus is no longer removed)
-must be consistent to work effectively
-using schedule of reinforcement and begin process of extinction (can take lots of time)
-can work better if another target behaviour is reinforced at same time as replacement
strength of extinction
+far gentler and more ethical than punishment
weaknesses of extinction
-can take time if behaviour is deeply entrenched
-if you can't follow through completely you risk behaviour returning stronger than ever
-not always apparent what the reinforcer is and how to remove it
punishment
-removes/reduces behaviour
-doesn't need to be aversive
3 term contingency
antecedent - behaviour - consequence
-uncond punisher is stimulus whose presentation functions as punishment without being paired with any other punishers e.g pain, certain odours, tastes
Sajwaj et al research
decreased life threatening rumination in 6 month old infant
-contingent delivery of small drop of lemon juice every time baby ruminates (pos punishment)
Luce et al
-decreased agg in child by telling them to stand up and sit down 10 times every time they hit someone
-hitting reduced
what is overcorrection (pos punishment)
client required to engage in effortful behaviour directly linked to problem behaviour for an extended period (Foxx and Azrin)
what is restitutional overcorrection (pos punishment)
client must correct environmental effects of problem behaviour and restore natural environment
what is positive practice (pos punishment)
client must engage in correct forms of relevant behaviour
what is time out (neg punishment)
-remove person from situation
what is response cost (neg punishment)
removal of specified amount of a reinforcer contingent on occurrence of a problem behaviour e.g loss of play time in mins
what are the factors influencing the power of a stimulus to act as a punisher/reinforcer
-contingency (consistency)
-immediacy
-magnitude
what are the risks with punishment
-emotional reactions
-escape and avoidance of punishment/punisher
-neg reinforcement of the punisher's behaviour
-neg modelling e.g children learning the wrong behaviours
-fails to teach an appropriate replacement behaviour
punishment in everyday life
in schools, law, friends, family, apps, everyday services e.g gym
strengths of punishment
+stops behaviour quickly
+easy
+sometimes necessary to prevent harmful behaviours
weaknesses of punishment
-potential for unethical use
-shouldn't be used as an easy option if extinction is an option
-doesn't indicate to subject which behaviours are desirable, only those undesirable
-the punisher's own behaviour may be neg reinforced (unwanted behaviour stops, so they are more likely to punish again)
what is continuous reinforcement (SCHEDULES OF REINFORCEMENT)
CRF
-provides reinforcement for every occurrence of a behaviour
-advantageous for gaining a new skill
intermittent schedules of reinforcement (INT)
-based on rule which specifies when a reinforcer is delivered
-used to strengthen established behaviours
-usually necessary for the progression to naturally occurring reinforcement
example of CRF and INT
CRF - vending machine
INT - gambling machine
intermittent pos reinforcement schedules
-Skinner: schedules of reinforcement can be used in place of continuous reinforcement with reliable effects on patterns of behav
what are ratio schedules (intermittent pos reinforcement schedules)
reinforcement delivered based on no. of responses emitted
what are interval schedules (intermittent pos reinforcement schedules)
reinforcement delivered based on first response emitted after specified amount of time
how can both ratio and interval schedules be subdivided
into fixed (exact values) or variable (changing value) schedules
what are the 4 main schedules for reinforcement
1.fixed ratio
2.variable ratio
3.fixed interval
4.variable interval
fixed schedules
-response ratio or time requirement remains constant
-fixed ratio: reinforcement after specified no of responses e.g fixed ratio 4 = after every 4th correct response
-fixed interval: reinforcement for the 1st response after a fixed amount of time has elapsed e.g fixed interval 2 mins = 1st response after 2 mins have elapsed is reinforced
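The two fixed schedules can be sketched in a few lines, taking fixed interval to mean that the first response made once the set time has elapsed is reinforced. The function names and the example response times are hypothetical, and times are in seconds.

```python
def fixed_ratio(n_responses, ratio):
    """Return the 1-based response numbers reinforced under a fixed ratio schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def fixed_interval(response_times, interval):
    """Return the response times reinforced under a fixed interval schedule.

    The first response made once `interval` has elapsed since the last
    reinforcer (or since the start) is reinforced; earlier responses are not.
    """
    reinforced = []
    interval_start = 0.0
    for t in sorted(response_times):
        if t - interval_start >= interval:
            reinforced.append(t)
            interval_start = t  # the next interval starts at this reinforcer
    return reinforced


print(fixed_ratio(10, 4))                            # fixed ratio 4 -> [4, 8]
print(fixed_ratio(3, 1))                             # continuous reinforcement -> [1, 2, 3]
print(fixed_interval([30, 90, 125, 130, 260], 120))  # fixed interval 2 mins -> [125, 260]
```

Continuous reinforcement falls out as the special case of fixed ratio 1, where every response is reinforced.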
variable schedules
-response ratio or time requirement can change from 1 reinforced response to another
-variable ratio: reinforcement after an average no of responses across schedule e.g variable ratio 4 = reinforcement after an average of every 4th occurrence
-variable interval: reinforcement following first response after an average amount of time has elapsed e.g variable interval 2 mins = reinforcing first occurrence after an average elapsed time of 2 mins
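A rough sketch of the two variable schedules follows. How the changing requirement is generated is an assumption here: each new requirement is drawn uniformly so that it averages out to the schedule value, which is only one of several possible ways to build such a schedule. Function names and the seed are hypothetical.

```python
import random


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Return the 1-based response numbers reinforced under a variable ratio schedule.

    Assumption: each requirement is drawn uniformly from 1..(2*mean_ratio - 1),
    so requirements average `mean_ratio`.
    """
    rng = random.Random(seed)
    reinforced = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count == needed:
            reinforced.append(i)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)  # requirement changes
    return reinforced


def variable_interval(response_times, mean_interval, seed=0):
    """Return the response times reinforced under a variable interval schedule.

    Assumption: each wait is drawn uniformly from 0..(2*mean_interval),
    so waits average `mean_interval`.
    """
    rng = random.Random(seed)
    reinforced = []
    interval_start = 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t - interval_start >= wait:
            reinforced.append(t)  # first response after the wait has elapsed
            interval_start = t
            wait = rng.uniform(0, 2 * mean_interval)  # wait changes
    return reinforced


print(variable_ratio(40, 4))  # variable ratio 4 over 40 responses
print(variable_interval([float(t) for t in range(1, 601)], 120))  # variable interval 2 mins
```

The learner cannot predict which response will pay off, only that reinforcement comes after about 4 responses or about 2 mins on average.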