Lecture 4 Flashcards
operant vs classical
classical relies on reflexive associations and applies to involuntary responses
operant is based on consequences and influences the tendency to perform voluntary behaviour
operant conditioning
reward = increase in behaviour
punishment = decrease
Thorndike box
cat put in box and escapes by pulling string and turning latch
law of effect
tendency to perform action is increased if rewarded and weakened if not
Skinner box and shaping
Skinner had a rat in a box that pressed a lever to get a reward. To get the rat to press the lever, shaping is used: the rat is rewarded for behaviour that resembles pressing the lever until it actually presses the lever. Helps animals adapt their behaviour to the environment
Superstition
Skinner delivered the reward at fixed intervals, regardless of what the animal was doing. The animal would repeat whatever action it thought had earned the reward, so it ended up doing odd rituals. Athletes do the same thing
animal training techniques
baiting: placing a reinforcer at a location so the animal moves there, then reinforcing the animal for going to that location
mimics: motivation to copy the behaviour of other animals or the trainer in order to get rewarded
sculpting: physically guiding the animal through a behaviour and reinforcing it, so that the animal will complete the behaviour when no longer forced
instructions
Chaining
Behaviours that are made up of smaller steps. Reinforce each step; the chain can be taught backwards or forwards. e.g. tying shoelaces
positive/negative rewards/punishment
positive reward - gaining something good
negative reward - losing something bad (good outcome)
positive punishment - gaining something bad
negative punishment - losing something good (bad outcome)
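The four cases above form a 2x2 grid: gain or lose, good or bad stimulus. A minimal sketch of that grid follows. The function name classify and its arguments are made up for illustration and do not come from the lecture.

```python
def classify(stimulus_change, stimulus_is_good):
    """Name the consequence type (hypothetical helper, for illustration only).

    stimulus_change: "gain" (positive) or "lose" (negative)
    stimulus_is_good: True if the stimulus is something the animal wants
    """
    if stimulus_change not in ("gain", "lose"):
        raise ValueError("stimulus_change must be 'gain' or 'lose'")
    sign = "positive" if stimulus_change == "gain" else "negative"
    # Gaining good or losing bad is a good outcome -> reward (behaviour increases).
    # Gaining bad or losing good is a bad outcome -> punishment (behaviour decreases).
    good_outcome = (stimulus_change == "gain") == stimulus_is_good
    kind = "reward" if good_outcome else "punishment"
    return f"{sign} {kind}"


for change in ("gain", "lose"):
    for good in (True, False):
        print(change, "good" if good else "bad", "->", classify(change, good))
```

Positive/negative only says whether something is added or removed. Reward/punishment says whether the behaviour goes up or down.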
Bridging
reinforcing when not right next to the animal. Similar to classical conditioning: the animal learns a signal that tells it that it did something good, and it will come back to collect the reinforcement
schedules of reinforcement
continuous = reinforced every time the behaviour is done
partial = reinforced only some of the time
ratio = based on the number of instances of the behaviour
interval = based on the time elapsed
fixed interval, fixed ratio, variable interval, variable ratio
post-reinforcement pause for fixed schedules
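The four partial schedules can be sketched as a small simulation. This is only an illustration under stated assumptions: the class Schedule, the labels "FR"/"VR"/"FI"/"VI", and the choice of drawing variable requirements uniformly around the mean are all made up here. Interval schedules are assumed to measure time since the last reward, with the first response after that time being reinforced.

```python
import random


class Schedule:
    """Hypothetical reinforcement schedule: FR, VR, FI or VI with (mean) size n."""

    def __init__(self, kind, n, seed=0):
        if kind not in ("FR", "VR", "FI", "VI"):
            raise ValueError("kind must be FR, VR, FI or VI")
        self.kind = kind
        self.n = n
        self.rng = random.Random(seed)
        self.count = 0        # responses since the last reward (ratio schedules)
        self.last_reward = 0  # time of the last reward (interval schedules)
        self.requirement = self._next_requirement()

    def _next_requirement(self):
        if self.kind in ("FR", "FI"):
            return self.n  # fixed: always the same requirement
        # Variable (assumption): uniform on 1..2n-1, so the mean is n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, now):
        """The animal performs the behaviour at time `now`; True if rewarded."""
        if self.kind in ("FR", "VR"):
            self.count += 1
            earned = self.count >= self.requirement
        else:
            earned = now - self.last_reward >= self.requirement
        if earned:
            self.count = 0
            self.last_reward = now
            self.requirement = self._next_requirement()
        return earned


# Continuous reinforcement is the special case FR with n = 1.
fr3 = Schedule("FR", 3)
print([fr3.respond(t) for t in range(1, 7)])
```

With a variable schedule the animal cannot predict which response will pay off, whereas with a fixed schedule the next reward is never available right after the last one.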
which schedule is best, why?
variable ratio, because the animal does not know when the reward will come, so it keeps performing the behaviour consistently while waiting for it
effective punishment
continuous (every time), no delay from action to punishment, don't give pity reinforcement, reinforce the opposite behaviour
Reward variables
drive, must want the reward
size, bigger reward = quicker acquisition but quicker extinction
delay, give reward immediately
three term contingency
discriminative stimulus = animal must discriminate when it is the right time/environment to execute the behaviour
operant response = behaviour itself
outcome = consequence
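The three terms can be written as a small data structure. This is a sketch only: the names Contingency and consequence, and the light/lever/food example, are hypothetical and not from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    discriminative_stimulus: str  # cue signalling when the behaviour pays off
    operant_response: str         # the behaviour itself
    outcome: str                  # the consequence


def consequence(contingency, stimulus_present, response):
    """The outcome follows the response only when the stimulus is present."""
    if stimulus_present and response == contingency.operant_response:
        return contingency.outcome
    return None


# Hypothetical example: lever pressing is only rewarded while a light is on.
rule = Contingency("light on", "press lever", "food pellet")
print(consequence(rule, True, "press lever"))
print(consequence(rule, False, "press lever"))
```

The same response gives different results depending on the stimulus, which is why the animal has to learn to discriminate.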