Learning Flashcards
What is learning?
A long-lasting change in behaviour caused by experience with the environment
What is the law of effect?
proposed by Thorndike
a response followed by a favourable consequence will be more likely to occur again
What is a response?
an item of behaviour
What is a reinforcer?
a favourable consequence
What does the law of effect say will happen?
A response followed by a reinforcer will increase in frequency, whereas all other, unsuccessful responses will decrease and eventually disappear
What is an experimental chamber?
A chamber where controlled experiments on animals occur
What is needed in a control environment?
a response (the bar to be pulled), a reinforcer (the food) and a stimulus (the sound of the hopper)
What is operant or instrumental behaviour?
behaviour key to the law of effect
operant behaviour is behaviour controlled by its own consequences
What is operant conditioning?
learning an association between a response and its consequence
What is a contingency?
the response causes the reinforcer. Put the other way round, the reinforcer is contingent on the response
What is shaping?
the method of successive approximations.
Reinforcing closer and closer approximations of the response that we want.
What are some hints for shaping?
identify a suitable reinforcer
reinforce as immediately as possible
too many reinforcers and the animal will satiate
What is positive reinforcement?
Adding a favourable event after a response
What is a positive contingency?
The event is presented if the response occurs
What is a negative contingency?
The event is withdrawn if the response occurs
What is punishment?
an aversive event follows the response. The response is less likely to occur again
What is reinforcement?
a favourable event follows the response. The response is more likely to occur again
Escape and avoidance are two types of what?
negative reinforcement
What is escape?
response terminates ongoing aversive event
What is avoidance?
response prevents onset of aversive event
What is the Miller-Mowrer box?
the box where the dog jumps to avoid getting shocked. Initially the jump is escape, but as the dog learns it becomes avoidance
What is the avoidance paradox?
The response seems to be reinforced by nothing happening. One explanation is that you are not really avoiding the consequence itself but reducing the conditioned anxiety that comes with the consequence
What is a primary reinforcer?
stimuli that reinforce behaviour because of their innate biological significance
What is a secondary reinforcer?
previously neutral stimuli that have acquired reinforcing properties by being paired with primary reinforcers in the organism’s history. Also called conditioned reinforcers because reinforcing property is learned not innate.
What is continuous reinforcement?
every response is reinforced
What is intermittent reinforcement?
not continuous: only some responses are reinforced, usually according to a schedule of reinforcement. Invented by Skinner so he could go away for the weekend
What is the partial reinforcement effect?
responding maintained by intermittent reinforcement persists much longer during extinction than one maintained by continuous reinforcement
What are schedules of reinforcement?
rules that specify when a response will be reinforced
What is a ratio schedule?
a schedule based on the number of responses emitted
What is an interval schedule?
A schedule based on the time that has elapsed.
What is a cumulative record?
a graph of the cumulative number of responses against time. The slope shows the response rate, so each schedule produces its own pattern
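A cumulative record can be sketched as a running count of responses over time. The example below is only an illustration, assuming responses are logged as timestamps in seconds; the function name and the sample data are made up for the example.

```python
from bisect import bisect_right

def cumulative_record(response_times, sample_times):
    """Return the cumulative number of responses at each sample time.

    response_times: times (s) at which responses occurred (hypothetical data).
    sample_times: times (s) at which the record is read.
    """
    ordered = sorted(response_times)
    # bisect_right counts the responses that occurred at or before time t
    return [bisect_right(ordered, t) for t in sample_times]

# Illustrative data only: five responses, with a pause between 3 s and 8 s
times = [1.0, 2.0, 3.0, 8.0, 9.0]
record = cumulative_record(times, [0, 2, 4, 6, 8, 10])
print(record)
```

A flat stretch in the record means no responding (a break); a steep stretch means a high response rate (a burst).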
What is a fixed ratio schedule?
reinforcement is contingent on the last of a fixed number of responses emitted since the last reinforcer. Has a burst and break pattern with the break after every reinforcer
What is a variable ratio schedule?
reinforcement is contingent on the last of a variable number of responses emitted since the last reinforcer. VR 10 means every 10th response on average is reinforced. High constant rate of responding
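The two ratio schedules can be sketched as small counters that decide whether each response is reinforced. This is a minimal sketch, not a standard implementation: the class names are hypothetical, and the rule for drawing the variable requirement (uniform between 1 and 2n - 1, which averages n) is an assumption made for illustration.

```python
import random

class FixedRatio:
    """FR n: the n-th response since the last reinforcer is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0  # start counting again after the reinforcer
            return True
        return False

class VariableRatio:
    """VR n: the required number of responses varies, averaging n."""

    def __init__(self, mean, rng=None):
        self.mean = mean
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform requirement from 1 to 2*mean - 1 (average = mean)
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # new, unpredictable requirement
            return True
        return False

# FR 5: every fifth response is reinforced
fr = FixedRatio(5)
outcomes = [fr.respond() for _ in range(10)]
```

On FR the animal can tell how far away the next reinforcer is; on VR the requirement changes after every reinforcer, so it cannot.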
What is a fixed interval schedule?
the first response made after a fixed amount of time has elapsed since the last reinforcer is reinforced. Like FR but with a smoother transition from break to burst, so the cumulative record becomes scalloped. The fixed intervals are on the x axis.
What is a variable interval schedule?
The first response made after a variable amount of time has elapsed since the last reinforcer is reinforced. Like VR but usually with a lower response rate. Responding is smooth because, again, the animal cannot tell when reinforcers are due.
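The interval schedules can be sketched in the same way, with a clock instead of a response counter. This is a sketch under stated assumptions: the names are hypothetical, time is measured in seconds from the start of the session, and the variable intervals are drawn from an exponential distribution, which is one common choice and not the only one.

```python
import random

class IntervalSchedule:
    """Reinforces the first response made after the required wait has elapsed."""

    def __init__(self, next_interval):
        self.next_interval = next_interval  # callable returning the next wait (s)
        self.last_reinforcer = 0.0
        self.required = next_interval()

    def respond(self, now):
        if now - self.last_reinforcer >= self.required:
            self.last_reinforcer = now
            self.required = self.next_interval()
            return True
        return False  # responses made too early are not reinforced

def fixed_interval(t):
    """FI t: the wait is always t seconds."""
    return IntervalSchedule(lambda: t)

def variable_interval(mean, rng=None):
    """VI mean: the wait varies, averaging `mean` seconds."""
    rng = rng if rng is not None else random.Random()
    # Assumption: exponentially distributed waits with the given mean
    return IntervalSchedule(lambda: rng.expovariate(1.0 / mean))

# FI 30 s: responses at 10, 20, 30, 35, 59 and 61 s
fi = fixed_interval(30)
results = [fi.respond(t) for t in (10, 20, 30, 35, 59, 61)]
```

Extra responses during the interval earn nothing, which is why interval schedules usually support lower response rates than ratio schedules.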
What is differential reinforcement of other behavior (DRO)?
A procedure for getting rid of a behaviour. A reinforcer is given when a fixed amount of time has elapsed since the last response.
explicitly reinforces not responding.
Also called omission training
eliminates behaviour more effectively than extinction
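The DRO rule can be sketched as a timer that every response resets. The function below is a hypothetical illustration, not a standard procedure: it assumes the timer also restarts after each reinforcer, and the response times are made up.

```python
def dro_reinforcer_times(response_times, interval, session_end):
    """Times (s) at which a DRO schedule delivers reinforcers.

    A reinforcer is delivered whenever `interval` seconds pass without a
    response. `interval` must be greater than zero.
    """
    delivered = []
    pending = sorted(response_times)
    timer_start = 0.0
    i = 0
    while True:
        due = timer_start + interval
        if i < len(pending) and pending[i] < due:
            timer_start = pending[i]  # a response resets the timer
            i += 1
        elif due <= session_end:
            delivered.append(due)
            timer_start = due  # assumption: timer restarts after a reinforcer
        else:
            break
    return delivered

# Illustrative data only: responses at 2, 5 and 20 s, DRO 10 s, 40 s session
print(dro_reinforcer_times([2, 5, 20], 10, 40))
```

Each response postpones the next reinforcer, so the only way to earn reinforcers quickly is to stop responding.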
What is a Fixed Time schedule?
a reinforcer is delivered when a fixed amount of time has elapsed since the last reinforcer. Like FI but no response is required, so it is non-contingent, or response-independent.
What happens when there is no response/reinforcer contingency?
If by chance a response is followed by a reinforcer you get superstitious responding, with adventitious reinforcement of whatever response was occurring just before the reinforcer.
So superstition may be a failure to discriminate a lack of contingency from a real contingency
What is the complex version of the law of effect?
When a response is followed by a favourable consequence it becomes more likely that the response will occur again in the context in which it was reinforced
What are the likely results of stimulus control?
the amount of generalisation depends on how close the test stimulus is to the training stimulus (e.g. the pigeon and the different coloured keys).
What is stimulus control?
the extent to which stimuli that precede or accompany operant behavior come to affect the rate or probability of that behavior.
The more stimulus control the more….
discrimination
The less generalisation the more ….
stimulus control
What are generalisation gradients?
graphs of the amount of responding to each test stimulus along a stimulus dimension. The slope of the gradient measures stimulus control
What is a generalisation test?
After reinforcing responses to a particular stimulus, present several different stimuli, in extinction, and measure responses to each
What are incremental gradients?
Gradients from a generalisation test given after punishment at the original stimulus. The graph goes the opposite way: responding is lowest at the original stimulus and increases with distance from it. Incremental gradients are often flatter than decremental gradients
What is discrimination training?
making the context relevant by reinforcing responses in the presence of one stimulus (S+) but not in the presence of another (S-). Across a cumulative record, responding to the two stimuli becomes more and more different
What is intradimensional discrimination training?
S+ and S- are on the same dimension. e.g red key vs green key
What is interdimensional discrimination training?
S+ and S- are not on the same dimension. E.g. yellow key vs white key with a black vertical line.
Is the gradient steeper after single stimulus training or interdimensional training?
After interdimensional training. There is more stimulus control so the gradient is steeper
What happens to the gradient after intradimensional training?
there is even more control because more attention is given to the dimension that is being tested. There is also a positive peak shift: the most responses are not at S+ but at a stimulus on the opposite side of S+ from S-
What is concept formation?
A concept is formed if an animal learns both that it should peck the matching stimulus and that it should not peck the non-matching stimulus
What is MTS?
Matching to sample
What is DMTS?
delayed matching to sample. Where a delay is introduced between the sample stimulus and the comparison stimuli.
accuracy decreases as delay increases. Typical decay curve
What is respondent conditioning?
Learning an association between two stimuli.
aka classical or Pavlovian conditioning
What is an unconditioned stimulus?
A stimulus that elicits an unconditioned response naturally, without prior learning
What is a neutral stimulus?
A stimulus that does not initially elicit the response but, when paired with an unconditioned stimulus, becomes a conditioned stimulus that elicits a conditioned response
What is the difference between multiple schedules and concurrent schedules?
multiple schedules have two simple schedules of reinforcement available alternately, one at a time.
Concurrent schedules have two simple schedules available simultaneously
What is the matching law?
The percentage of responses on one alternative matches the percentage of reinforcers obtained from that alternative.
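The matching relation is simple arithmetic, so a worked example may help. The function name and the reinforcer counts below are made up for illustration; the sketch covers the basic two-alternative case only.

```python
def predicted_response_share(reinforcers_a, reinforcers_b):
    """Share of responses predicted on alternative A by the matching law.

    Predicted share of responses on A = share of reinforcers obtained from A.
    """
    total = reinforcers_a + reinforcers_b
    if total == 0:
        raise ValueError("no reinforcers obtained, so the share is undefined")
    return reinforcers_a / total

# Illustrative numbers only: 40 reinforcers from key A, 20 from key B
share_a = predicted_response_share(40, 20)
print(f"{share_a:.0%} of responses predicted on key A")
```

With two thirds of the reinforcers coming from key A, matching predicts about two thirds of the responses on key A.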