Learning Flashcards

1
Q

Pavlovian conditioning

A

The acquisition of a new behavioural (or physiological)
response to a previously neutral stimulus as a result of
experiencing a predictive relationship between it and a
biologically relevant stimulus.

2
Q

Unconditioned Stimulus (US)

A

A stimulus that has natural (unlearned) relevance, eliciting a response without prior training

3
Q

Conditioned Stimulus (CS)

A

A stimulus that gains its relevance through learning

4
Q

Explain the Pavlovian experiment

A

Pavlovian conditioning was first described by the physiologist Ivan Pavlov (hence
“Pavlovian”). He was conducting experiments on the reflexive control of digestion in dogs
when he came across a confound in his experiment. Instead of salivating when presented with
food, the dogs were starting to salivate as soon as the experimenter walked through the door.
Pavlov hypothesized that the dogs had come to associate the entrance of the experimenter with
the imminent arrival of food. He went on to conduct an experiment in which the signal that
predicted the food was the sound of a bell. Just as he had predicted, after a number of pairings
the dogs began to salivate as soon as they heard the bell, suggesting that they had successfully
associated the bell with food. Because the change in the dog’s behaviour was conditional on
the learning experience, Pavlov named the signal (in this case the bell) the conditioned
stimulus (“CS”), while the outcome (in this case, the food) is called the unconditioned
stimulus (“US”) because its effect on the dog’s behaviour was not conditional on the learning
experience. He named the response to the CS after training (in this case, salivation to the bell)
the conditioned response (“CR”) because the ability of the CS to elicit this response was
conditional on the learning experience.
Pavlovian conditioning can be appetitive or aversive, i.e. an association can be formed between a CS and a pleasant US (e.g. food) or between a CS and an unpleasant US (e.g. electric shock). In both cases, there is an increase in the vigour of the CR as the association between the CS and US strengthens, because whatever the natural response to the US would be (approach, fear, disgust, salivation, etc.) becomes the response to the CS – i.e. it becomes the CR.

5
Q

Necessary Conditions for learning

A

(1) Awareness of the CS-US relationship
(2) Biological Preparedness
(3) Neurobiological Dissociations
(4) Temporal Contiguity
(5) Food Aversions
(6) Blocking

6
Q

Awareness of the CS-US relationship

A

In animals, we are mostly only able to assess the degree to which learning has occurred by
investigating a change in outward behaviour. Human predictive learning, however, can be
assessed by two measures: 1) a change in the behavioural (or physiological) response to the
CS; 2) a change in the cognitive expectancy of the US following presentation of the CS
(usually measured by verbal self-report). Often there is good concordance between these two
measures, raising the question of whether there is a causal relationship between cognitive
expectation and the acquisition of the behavioural response. Lovibond (Psychophysiology, 1992, 29, 621) demonstrated this concordance. He paired some pictures of plants with electric shock
(CS+) and presented other plant pictures alone (CS-). CS+ and CS- trials were intermixed,
and participants were asked to rate the likelihood of shock to each CS. Those who rapidly
learnt this discrimination additionally showed a skin conductance response (SCR) on CS+
trials. No SCR was observed in participants who were unaware of the relationship between
CS and US.
Evidence that the cognitive expectation causes the SCR comes from Hugdahl and Ohman (J. Exp. Psychol: Human Learning and Memory, 1977, 3, 608), who instructed participants that there would be no more shock in an extinction phase (repeated presentation of the CS without the US). In extinction, responding typically declines steadily across trials. However, instructed extinction resulted in the immediate disappearance of the SCR. This is an
example of explicit learning, whereby awareness of the relationship between CS and US
causes the presence or absence of the behavioural response.

7
Q

Biological Preparedness

A

There are some demonstrations of Pavlovian conditioning which appear to result from
implicit learning. For example, Hugdahl and Ohman also used fear relevant stimuli such as
snakes and spiders as CS+ and CS-. In contrast to the fear irrelevant stimuli used before
(flowers, houses), instructions to the participants that these stimuli would no longer be followed
by shock had little impact on extinction. This appears to be evidence of implicit learning:
behavioural (or in this case physiological) change that occurs independent of cognitive
expectation. This type of learning may be facilitated in situations when the CS is highly
biologically relevant – for example signalling a potentially dangerous aversive US.

8
Q

Neurobiological Dissociations

A

Bechara et al (Science, 1995, 269, 1115) studied fear learning in three patients with different brain lesions, all involving bilateral damage to subcortical structures in the temporal lobe: one with damage to the amygdala (AMG), one to the hippocampus (HC), and one to both (AMG+HC). The AMG and AMG+HC patients failed to acquire an SCR to a stimulus paired with an aversive US (electric shock), whereas the HC patient showed normal conditioning. However, when questioned afterwards about which CSs were or were not followed by the US, the patient with AMG damage was able to report the associations accurately, whereas the patients with HC and AMG+HC damage were not. These findings demonstrate a
double dissociation between implicit SCR learning mediated by AMG, and explicit cognitive
learning mediated by HC.

9
Q

Temporal Contiguity

A

One factor that determines learning is the temporal contiguity (closeness in time) of the
events involved in the learning episode – the longer the interval between the events, the less is
learned about their relationship.

10
Q

Food Aversions

A

A common phenomenon that breaks the “rule” of temporal contiguity is conditioned food
aversion, whereby the taste of the food (or drink) and subsequent nausea are separated by an
interval of hours. Andrykowski and Otis (Appetite, 1990, 14, 145) interviewed patients about the types of food consumed prior to a chemotherapy session and took preference ratings. Many patients developed aversions to new foods tasted prior to chemotherapy administration. However, there was no relationship between conditioned food aversions and (a) the time between consuming that food and chemotherapy, or (b) the time between consuming that food and vomiting. Thus, temporal contiguity is not always necessary.

11
Q

Blocking

A

Contiguous pairings of events are also not always sufficient to bring about learning. This is
illustrated by the phenomenon of blocking. Blocking experiments tend to follow a particular procedure. In stage 1, a stimulus (A) is paired with a US (notated “A+”) while another stimulus (B) is not (“B-”), over repeated trials. In a subsequent stage, A is paired with a new stimulus (X) and B is paired with another new stimulus (Y); both compounds are paired with the US (“AX+” and “BY+”) over multiple trials.
X and Y are then presented alone, and the behaviour of the participant observed. What is
normally found is that conditioned responding is seen to Y, but not to X. That is to say, the
participant has learned about Y, but not about X, despite having equal exposure to both. The
presence of A along with X in the second stage blocks an association forming between X and
the US. This difference can also be seen in neurological indices of learning. In an fMRI
experiment by Tobler et al (Journal of Neurophysiology, 2006 95 301), activation changes in
ventral striatum were seen more on BY+ trials than on AX+ trials. Furthermore, they found that activity in an anterior region of orbitofrontal cortex during BY+ learning trials correlated with the degree of behavioural difference.
Blocking can be thought of as resulting from predictive learning. In the first stage, over
multiple trials, A became a perfect predictor of the outcome. In the presence of A, X is not
learned about in the second stage, because the US is already fully predicted by A. This
“prediction error” account suggests that learning occurs when there is a discrepancy between
how much a US is expected (given the CSs present) and whether or not the US actually occurs.
We will explore this idea more in the next lecture.
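
As a rough illustration of this prediction-error account (using the Rescorla-Wagner update covered in the next lecture; the learning rate and trial counts are assumptions for illustration, not values from any experiment above):

```python
# Minimal Rescorla-Wagner sketch of blocking (illustrative parameters).
# V holds the associative strength of each cue; lr is the learning rate.
V = {"A": 0.0, "B": 0.0, "X": 0.0, "Y": 0.0}
lr = 0.3          # combined learning rate (alpha * beta), assumed
LAMBDA = 1.0      # asymptote of learning when the US is present

def trial(cues, us_present):
    """Update every presented cue by the shared prediction error."""
    error = (LAMBDA if us_present else 0.0) - sum(V[c] for c in cues)
    for c in cues:
        V[c] += lr * error

for _ in range(20):               # Stage 1: A+ and B- trials
    trial(["A"], True)
    trial(["B"], False)
for _ in range(20):               # Stage 2: AX+ and BY+ compound trials
    trial(["A", "X"], True)
    trial(["B", "Y"], True)

# V["Y"] ends well above V["X"] (about 0.5 vs about 0): because A already
# fully predicts the US, there is no error left to drive learning about X.
print(V)
```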

12
Q

Neural Basis of Pavlovian Conditioning

A

There is evidence that the dopamine (DA) system plays an important role in the acquisition of
associations, particularly when the learning is appetitive. In particular, midbrain areas such as
the ventral tegmental area (VTA) and the substantia nigra (SN) have been the focus of much
research. Animals will work for intracranial self-stimulation (ICSS) of these areas (Olds &
Olds, 1963), and the administration of dopamine agonists increases rate of responding during
ICSS while dopamine antagonists attenuate responding (Gallistel & Karras, 1984).
Dopaminergic neurons in the SN and VTA in the midbrain project to many forebrain structures, particularly the striatum, where they may facilitate
neural processing by releasing DA as a neurotransmitter. The VTA can be thought of as
projecting mostly to the ventral striatum, while the SN projects to the dorsal striatum.
O’Doherty et al (Neuron, 2002, 33, 815) conditioned (trained) people to expect an appetitive glucose solution following one picture CS and an aversive salt solution following another CS, and conducted fMRI imaging during the presentation of the two CSs.

The substantia nigra and VTA were more active during the CS associated with the appetitive glucose US relative to the CS paired with the aversive salt US.

Interestingly, the ventral striatum was more active during the anticipation of the reward than the receipt of it. There have been many different theories as to the exact role of dopamine in learning.

One hypothesis is that dopamine mediates the
hedonic or reward value of a stimulus (Wise, 1985).

Alternatively, the dopamine system may
be involved in incentive motivation, and could play a role during the anticipation of reward that corresponds to a motivational state of wanting or craving (Berridge, 1996).

Another hypothesis is that dopamine functions as a prediction error signal during reward learning (Schultz et al., 1997).

Evidence for the latter hypothesis comes from single-cell neurophysiological recordings: in the absence of a learned predictive cue, dopamine neurons respond to the delivery of the reward itself, but after learning about a predictive cue they shift their responses and fire instead to the presentation of the cue (Mirenowicz and Schultz, 1994).

13
Q

Mechanisms of associative learning

A

Rescorla-Wagner rule.

14
Q

Expectation and Surprise

A

Learning requires that the occurrence of the US is unexpected or surprising. Learning proceeds in a negatively accelerated curve: with each CS-US pairing, expectation of the US increases and surprise decreases. This is captured by an elegant and simple learning algorithm – the Rescorla-Wagner rule. This rule states that the increase in the associative strength of the CS (i.e. the amount of expectation) is determined by the degree to which the current associative strength deviates from perfect learning (i.e. the amount of surprise). This deviation is known as the prediction error. The Rescorla-Wagner rule is captured by the following equation:

ΔV = αβ(λ – ∑V)

where ΔV is the change in the associative strength of the CS on a trial, λ is the maximum associative strength the US can support (perfect prediction), ∑V is the summed associative strength of all CSs present on the trial, and α and β are learning-rate parameters for the CS and US respectively.
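
A minimal sketch of the resulting negatively accelerated acquisition curve for a single CS (parameter values are assumptions for illustration):

```python
# Rescorla-Wagner acquisition for one CS: delta_V = lr * (LAMBDA - V).
lr, LAMBDA, V = 0.3, 1.0, 0.0
for t in range(1, 11):
    error = LAMBDA - V        # prediction error ("surprise")
    V += lr * error           # expectation grows as surprise shrinks
    print(f"trial {t:2d}: error = {error:.3f}, V = {V:.3f}")
# Early trials produce large errors and large increments; later trials
# produce ever smaller ones, giving the negatively accelerated curve.
```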

15
Q

Excitatory and Inhibitory associative strength

A

Contiguous pairings of events are not sufficient to bring about learning (see the example of blocking in the previous lecture). Learning also requires that the occurrence of the outcome is surprising or unexpected. The Rescorla-Wagner rule formalizes this principle. It accounts for the acquisition by the CS of both excitatory and inhibitory associative strength. Thus, the
change in associative strength of a CS is determined by the discrepancy between perfect prediction (λ) and the sum of the current associative strengths of all CSs present on that trial (∑V).

When the US is present, λ = 1, and the associative strength acquired is excitatory (anticipation of the presence of a US). When the US is absent, λ = 0, which presents the opportunity for inhibitory associative strength to accrue (anticipation of the absence of a US). According to Rescorla-Wagner, this happens when a CS (e.g. CSB) is presented with an excitatory CS+ (e.g. CSA) that has been fully learnt about previously, and no US is presented. Under these conditions, λ is zero (no US), but ∑V is positive because the presence of CSA predicted that there would be a US. Therefore CSB will acquire inhibitory associative strength (predicting the absence of a US that you would otherwise expect to be there).
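
A minimal sketch of inhibitory strength accruing under this account, following the A+ then AE- design (learning rate and trial counts are assumed):

```python
# Conditioned inhibition under Rescorla-Wagner (illustrative parameters).
V = {"A": 0.0, "E": 0.0}
lr = 0.3

for _ in range(20):                       # Stage 1: A+ (lambda = 1)
    V["A"] += lr * (1.0 - V["A"])
for _ in range(20):                       # Stage 2: AE- (lambda = 0)
    error = 0.0 - (V["A"] + V["E"])       # US expected but absent
    V["A"] += lr * error
    V["E"] += lr * error

print(V)  # V["E"] ends negative: E now predicts the absence of the US
```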

Lovibond et al (Behaviour Research and Therapy, 2000 38 967) demonstrated that a
conditioned inhibitor (E-) protected a conditioned excitor (C+) from extinction in human
participants. In the first phase, A was paired with shock on some trials (A+), while a compound of A and E was presented without shock (AE-). This provides the conditions for inhibitory associative strength to accrue to E. The control stimulus, B-, was never paired with shock or with an excitatory CS, and therefore predicted neither the presence nor the absence of the US. Two more stimuli, C+ and D+, were also trained as conditioned
excitors. Following this training, C and D were presented in an extinction phase (no US).
However, C was presented in a compound with E (CE-), while D was presented alone.
The following test phase showed that participants produced both SCR responding and an
expectation of the shock following C, but no response to D. Thus, the presence of the
conditioned inhibitor E with C in the extinction phase fully predicted the absence of
shock, so protecting C from losing excitatory associative strength.

16
Q

Superlearning

A

Superlearning is a symmetrically opposite phenomenon to blocking. Turner et al (Cerebral
Cortex, 2004 18 872) asked participants to imagine that they were allergists whose task it
was to establish specific allergies in their patients. They were presented with images of
different foods, and pressed one button if they thought it was an allergen and another if
they thought it was neutral. They first trained one food picture (banana) with an allergic
response before pairing that food with another (mushroom) and omitting the allergic
outcome. This resulted in the mushroom gaining inhibitory associative strength. In a third
stage, the mushroom was paired with a novel pear stimulus and followed by an allergic
outcome. The effect was superlearning: in the presence of the conditioned inhibitor, the allergic outcome was even more surprising than it would have been had the novel pear been conditioned by itself, so the pear gained associative strength faster. Moreover, right PFC activation was observed during Stages 2 and 3, when the prediction errors should have been large.
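
A rough worked example of why the error is amplified (the strength values are assumptions for illustration):

```python
# Superlearning as an amplified prediction error (illustrative values).
lr = 0.3
V_mushroom = -0.5   # inhibitory strength carried over from Stage 2, assumed
V_pear = 0.0        # novel cue

# Stage 3: mushroom + pear followed by the allergic outcome (lambda = 1)
error = 1.0 - (V_mushroom + V_pear)   # = 1.5, versus 1.0 for pear alone
V_pear += lr * error
print(V_pear)       # 0.45 after one trial, versus 0.30 had pear been alone
```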

17
Q

Attention and Learning

A

What the Rescorla-Wagner theory does not explain is the effect of simply preexposing a
CS on subsequent conditioning of that CS. For example, Nelson and Sanjuan (Quarterly
Journal of Experimental Psychology 2006 59 1003) trained participants on a video game
in which they had to learn not to fire at invaders in the presence of a warning light
(conditioned suppression). They found that simply preexposing stimulus A prior to pairing it with the US retarded learning about A compared to a non-preexposed stimulus B. This is known as latent inhibition (LI). The term is a misnomer, because A does not acquire inhibitory associative strength during the preexposure phase (inhibitory strength would reduce suppression). When A was tested in a summation test, in compound with another excitatory cue that elicited suppression, it did not reduce suppression relative to a non-preexposed cue (Gen Dec on the graph).
Because there is no prediction error during pre-exposure (an outcome is neither predicted nor occurs), LI cannot be explained by the Rescorla-Wagner theory. Pearce and Hall (1980) suggest that LI is due to a loss of attention to the pre-exposed stimulus. They argue
that attention to a stimulus is necessary for learning about it and that it is not necessary to
attend to a cue when the outcome is perfectly predicted, because there is nothing more to
learn. This is true during stimulus preexposure and therefore participants stop attending to
the cue. This loss of attention retards subsequent acquisition of the CS-US relationship
following CS preexposure.
The Pearce-Hall theory predicts that sustained attention should be observed when a
cue is unreliably associated with an outcome. To test this prediction, Hogarth et al (Quarterly Journal of Experimental Psychology, 2008, 61, 1658) presented localized visual cues that perfectly predicted either the occurrence (AX+) or the absence (CX-) of a loud white-noise outcome. Once the participants had learned to anticipate the outcomes associated with A and C, they showed relatively few visual fixations on those cues. However, when a stimulus B was paired with the outcome on only half of the trials (B+/-), so that the outcome was always uncertain, the subjects showed many fixations on the cue, indicating sustained attention to it.
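
A minimal sketch of the Pearce-Hall idea, simplifying the model so that attention on each trial equals the absolute prediction error of the previous one (the parameters and the 50% schedule are illustrative):

```python
# Pearce-Hall-style attention: alpha tracks how surprising the cue's
# outcome has been; learning is gated by alpha (simplified sketch).
import random
random.seed(1)

def run(p_us, trials=40, lr=0.3):
    V, alpha = 0.0, 1.0
    for _ in range(trials):
        lam = 1.0 if random.random() < p_us else 0.0
        error = lam - V
        V += lr * alpha * error       # learning gated by attention
        alpha = abs(error)            # surprising outcomes sustain attention
    return alpha

print(run(1.0))   # consistently reinforced cue: alpha decays (few fixations)
print(run(0.5))   # 50% reinforced cue: alpha stays high (many fixations)
```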

18
Q

Instrumental (operant) conditioning

A

Because CRs are simply natural reflexes that the individual learns to produce
to a CS, it is assumed that Pavlovian CRs are innately specified and that their form is
a consequence of evolutionary processes: salivation for food, flinching for pain, etc.
However, in order to control important events, an animal or person must also learn to change their behaviour in response to its consequences or outcome – not just learn to anticipate an outcome and behave in response to that anticipation. Thorndike
(Animal Intelligence, 1911) was the first to demonstrate instrumental (or operant)
conditioning when he showed that cats could learn to press a lever in order to get
certain outcomes. Instrumental conditioning occurs when experiencing a causal relationship between a behaviour (response) and its outcome changes the likelihood of that behaviour (the level of responding).

19
Q

Principles of reinforcement

A

We saw previously in the case of Pavlovian conditioning that, irrespective of
whether the US is appetitive or aversive, conditioning results in an increase in the
vigour of the conditioned response. In instrumental conditioning, if a response is
followed by a pleasant outcome - positive reinforcement - the probability of the
response increases. If the response is followed by the removal of an unpleasant
stimulus - negative reinforcement - the probability of the response increases. If the
response is followed by an unpleasant outcome - positive punishment - or a pleasant outcome is omitted - negative punishment - the probability of the response decreases. Note that “positive” and “negative” here do not mean pleasant or unpleasant but are used in the mathematical sense of a stimulus being added or removed, while “reinforcer” and “punisher” denote the “niceness” of the outcome.

20
Q

Schedules of reinforcement

A

There are many ways of arranging a response-outcome relationship. A ratio schedule arranges that, on average, each response produces a reinforcer with a certain probability, so that the ratio of the number of responses to the number of reinforcers is fixed. Under a ratio schedule, a higher rate of responding yields a higher rate of reinforcement. However, many organic resources deplete and only replenish after a certain time has elapsed, so that there is a limit on the rate of reinforcement. These
resources are modeled by an interval schedule under which the next reinforcer only
becomes available at a certain time after the last one (no matter how much you
respond in the meantime). To compare the performance on the two types of schedule,
Matthews et al (Journal of the Experimental Analysis of Behaviour, 1977, 27, 453)
rewarded lever pressing with a monetary reward on a variable ratio (VR) 30 schedule.
Every time a reward was delivered on the VR schedule, a reward became available for
another participant, thereby generating a yoked variable interval (VI) schedule with a
similar rate of reinforcement. They demonstrated that the VR schedule maintained a much higher rate of responding than the yoked VI schedule.
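
A rough simulation sketch of the two feedback functions (all parameter values are assumptions for illustration, not Matthews et al's actual procedure):

```python
# Why ratio schedules reward high response rates and interval schedules
# do not: reward rate scales with responding under VR but is capped by
# the programmed availability rate under VI (illustrative parameters).
import random
random.seed(0)

def vr30(responses):
    """Variable ratio: each response pays off with p = 1/30."""
    return sum(random.random() < 1 / 30 for _ in range(responses))

def vi(responses, session_s=600, mean_interval_s=30):
    """Variable interval: a reward arms on average every 30 s and is
    collected by the next response; responses are evenly spaced."""
    gap = session_s / responses
    rewards, t = 0, 0.0
    armed_at = random.expovariate(1 / mean_interval_s)
    for _ in range(responses):
        t += gap
        if t >= armed_at:
            rewards += 1
            armed_at = t + random.expovariate(1 / mean_interval_s)
    return rewards

for rate in (300, 600, 1200):            # responses per 10-minute session
    print(rate, vr30(rate), vi(rate))    # VR pays ~rate/30; VI plateaus ~20
```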

21
Q

Law of Effect.

A

In his “Law of Effect”, Thorndike (1911) proposed that positive (and negative) reinforcers strengthen (i.e. literally reinforce) an association between the stimuli present when the response is performed and the response itself – stimulus-response or S-R learning. This establishes a habit of producing that response when in the presence of that stimulus. There is evidence that habitual responding is mediated by the dorsal striatum, which is part of the sub-cortical motor system. O’Doherty et al (Science,
2004 304 452) arranged a discrimination in which touching one of two
simultaneously presented visual stimuli yielded a fruit juice outcome with a higher
probability than the other stimulus. To determine to what extent the fMRI BOLD
signal was due to the instrumental choice, other participants just received yoked pairings of the stimuli with the outcome without making the choice response (i.e. they just learned a Pavlovian association between the stimulus and outcome). A contrast
between the two tasks showed more activity in the dorsal striatum during the
instrumental task.

22
Q

Goal-directed behaviour.

A

The S-R/reinforcement process establishes instrumental habits that are not mediated
by any knowledge of the outcome of the response – it is simply a connection between
a stimulus and a response. Consequently, this process does not allow the purposeful
selection of an instrumental action on the basis of the agent’s current goal. Whether
an instrumental response is the result of an S-R habit or is a goal-directed action can
be determined by an outcome devaluation test. For example, Klossek et al (Journal
of Experimental Psychology: General, 2008 137 39) trained young children to touch
one icon stimulus for a clip of one TV show (Pingu) as the outcome and another icon
for a different show (Teletubbies). One of the cartoons was then devalued by
prolonged exposure to induce boredom with it. The children were then tested on a
choice between touching the two icons in extinction - without the shows being played.
If the original training had simply established icon-touch response habits but
no knowledge of the outcome of each response (the specific show), then the children
should have performed both responses equally during the test. However, if the
children had learned which cartoon was produced by each response - such that
responses were goal-directed - then on test they should have performed the response
that had produced the (now) more valuable outcome rather than the one that produced
the devalued (now boring) show.
The older children (>27 months) showed this outcome devaluation effect,
thereby demonstrating that their responses were goal-directed. By contrast, the
younger children (<27 months) performed both responses equally.
The absence of a devaluation effect in younger children was not because of a
failure of the exposure to devalue the cartoon outcome in this age group: For two
other groups of children the test was performed with outcomes still available (rather
than in extinction) and here the younger children chose the response producing the
still-valued outcome in preference to the response producing the now-devalued
outcome.

23
Q

Neural correlates of goal directed and habit learning

A

De Wit et al (Journal of Neuroscience, 2009, 29, 11330) designed a task which pitted goal-directed learning against S-R habit learning and used fMRI to examine the cortical regions involved in each type of learning. Participants learned three types of discrimination:

1) Control: stimulus and outcome are different, and the discrimination can be learned using goal-directed S-O-R chains or habit S-R associations;

2) Congruent: stimulus and outcome are the same, and the discrimination can be learned using goal-directed S-O-R chains or habit S-R associations;

3) Incongruent: the stimulus of one problem is the same as the outcome of the other. Goal-directed learning will activate the S-O-R chain of one problem and the O-R link of the other, thereby causing response conflict. No such conflict occurs if the task is learned as two S-R associations, so habit learning will “work” while goal-directed learning will be confused.

They then gave subjects an instructed devaluation phase, whereby two open boxes with outcomes inside were presented. One box had a cross, showing that this outcome no longer produced points. Performance on trials involving the stimuli and outcomes of the goal-directed control and congruent conditions should not be affected (i.e. they should easily learn to stop responding to stimuli that are no longer rewarded). Performance should decrease on trials that involved stimuli and outcomes of the incongruent condition (that required S-R learning). The behavioural results supported these inferences – participants continued selecting the unrewarded stimuli in the incongruent condition, but easily stopped doing so in the congruent and control conditions.
The scanning results during learning showed greater ventromedial PFC activation during goal-directed learning compared to S-R learning; in contrast, there was greater dorsomedial PFC activation on S-R trials. In the devaluation phase, there was greater activation of ventromedial PFC on goal-directed than on S-R trials.
In contrast to De Wit et al’s study, Valentin et al (Journal of Neuroscience, 2007, 27, 4019) manipulated the incentive value of the reinforcer directly (i.e. by experience rather than instruction) by sating participants on a particular food reward in a scanning experiment. Subjects first learned an association between a visual stimulus and an action that produced a particular outcome (chocolate or tomato) with a probability of 0.4. Additionally, both actions produced a common outcome (orange juice) with a probability of 0.3. Therefore one action produced a food outcome with a high probability of 0.7 and another produced a food outcome with low probability of 0.3.

There were also control trials where the outcome was a tasteless liquid. After being trained in the scanner, subjects consumed one outcome (either chocolate or tomato) to satiation, decreasing the incentive value of that outcome. They were returned to the scanner and given the same instrumental choice procedure in extinction.

Scanning results during initial learning of the discriminations showed greater vmPFC (orbitofrontal) activation in the rewarding trials compared to neutral outcome trials. Similarly, after devaluation by satiation, greater medial and central PFC activity was observed during high probability choices in the still-valued compared to the devalued condition. These results suggest that this region is sensitive to the incentive value of reinforcers.

24
Q

Generalisation

A

So far we’ve been talking about how associations are formed between individual
stimuli. However, learning can be made more efficient if learning systems are able to
take into account similarity between a new stimulus and those previously learned
about. Stimulus similarity will allow responding to generalize from previously
learned associations to new but similar stimuli. Generalisation should decline with
distance (in terms of similarity) from the trained stimulus. Associative learning theory
predicts that the degree of generalization of associative strength of a CS depends on
the interaction of associative strength of all other similar CSs. We can think of
different CSs as lying along a dimension – e.g. colour. If one stimulus (e.g. red) is
reinforced (S+) and another (e.g. yellow) is not (S-), excitatory and inhibitory
gradients will overlap for a stimulus lying half way between (orange). What if you instead train orange+ (O+) and yellow- (Y-)? A stimulus close to O+ on the colour dimension but further away from Y- (e.g. reddish orange, RO) would receive positive generalization from O but less negative inhibition from Y, and thus counterintuitively might end up with more net excitatory associative strength than the original trained stimulus (O). Responding is therefore greater to RO than to O. We see
this “Peak Shift” in pigeons, but usually not humans. Humans tend to use a relational
rule that operates in the opposite way to that predicted by associative theory - the
further along the dimension a new stimulus is from the trained stimulus, the more
likely the person is to regard it as an example of the rule.
Is it the case that for complex learning problems, humans use a cognitive system for a
solution, rather than an associative one? Or is it that the cognitive system will search
for a solution and – if it finds one - dominate over an associative system? This was
assessed by using a procedure that obscured the rule structure, making a cognitive
solution impossible. Wills & Mackintosh (1998) used an artificial dimension – a set
of icons. E.g. S+ contained icons D, E, F, G and H and S- contained E, F, G, H and I.
In order to prevent people identifying D and I as the discriminating icons, the number
of instances of each icon type appearing in a stimulus on any particular trial was
allocated with a certain probability. This ensured a high variability of the number of
instances of a particular icon from trial to trial, thereby preventing rule-based learning
about particular icons. However, across the many training trials of the experiment,
there were an average number of instances of each icon in the S+ and S- (see slide).
This allowed excitatory and inhibitory associative strength to accrue to the training
stimuli. People were tested on new stimuli, containing some icons from further along
the dimension. Their performance demonstrated the peak shift effect.
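
A minimal sketch of how overlapping gradients produce the effect, assuming Gaussian generalization gradients (their shapes and heights are illustrative assumptions):

```python
# Peak shift from overlapping excitatory and inhibitory gradients.
import math

def gradient(x, centre, height, width=1.0):
    """A Gaussian generalisation gradient along the colour dimension."""
    return height * math.exp(-((x - centre) ** 2) / (2 * width ** 2))

# Train O+ at position 0 and Y- at position 1; net associative strength
# is excitation generalising from O minus inhibition generalising from Y.
def net(x):
    return gradient(x, 0.0, 1.0) - gradient(x, 1.0, 0.6)

for x in (-1.0, -0.5, 0.0, 0.5, 1.0):    # reddish orange lies at x < 0
    print(f"{x:+.1f}: net strength {net(x):+.3f}")
# The maximum falls left of the trained O+ (towards reddish orange),
# not at O+ itself: the peak shift.
```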

25
Q

Categorisation

A

Categories are formed on the basis of the similarity between a set of exemplars. We
can think of a set of exemplars as having a central tendency – those features of an
exemplar that are most diagnostic of the category, or “prototypical”. Exemplars with
small variation from the prototype (“low distortion”) will be regarded as highly
typical members of the category while those with large variation (“high distortion”)
will be regarded as less typical members. Experiments manipulating the distortion
from the prototype produce the prototype effect (better classification of the prototype during the test phase, even if it has never been seen before, than of other new exemplars) and the typicality effect – less accurate classification of high-distortion exemplars compared to low-distortion exemplars. One theory (exemplar theory) suggests that
these effects result from an explicit comparison of a new exemplar to stored memories
of all previously learned exemplars. However, this comparison might equally be
conducted implicitly. In a categorization experiment, Squire & Knowlton (Proc. Natl.
Acad. Sci. USA, 1995, 92, 12470-12474) observed that a severely amnesic patient
E.P. with bilateral hippocampal damage demonstrated the prototype effect without
being able to recognize any trained stimuli. This can be explained by prototype
theory, which states that during training on a set of exemplars, the prototype is
abstracted and generalization to new exemplars will be based on similarity to the
prototype.
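
A minimal sketch contrasting the two accounts, using made-up binary feature vectors and a simple proportion-of-matching-features measure of similarity:

```python
# Prototype vs exemplar accounts of categorisation (illustrative stimuli).
def similarity(a, b):
    """Proportion of matching features between two stimuli."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

exemplars = [(1, 1, 1, 0, 1, 1),   # trained category members
             (1, 0, 1, 1, 1, 1),   # (distortions of the prototype)
             (0, 1, 1, 1, 1, 0)]
prototype = (1, 1, 1, 1, 1, 1)     # central tendency, never shown in training

def prototype_score(probe):        # similarity to the abstracted prototype
    return similarity(probe, prototype)

def exemplar_score(probe):         # mean similarity to all stored exemplars
    return sum(similarity(probe, e) for e in exemplars) / len(exemplars)

for name, probe in (("prototype", prototype),
                    ("low distortion", (1, 1, 1, 1, 1, 0)),
                    ("high distortion", (0, 0, 1, 1, 0, 1))):
    print(f"{name:16s} prototype acct: {prototype_score(probe):.2f}  "
          f"exemplar acct: {exemplar_score(probe):.2f}")
# Both accounts rank the never-seen prototype highest (prototype effect)
# and low-distortion probes above high-distortion ones (typicality effect),
# which is why behavioural data alone struggle to separate them.
```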

26
Q

Rule Learning

A

There are learning problems that cannot be solved associatively by the processes we’ve discussed – for example, those requiring the extraction of a general rule during learning. Shanks & Darby (J Exp Psychol Anim Behav Proc, 1998, 24, 405) trained complex discrimination
problems where participants had to learn which “meals” caused an allergy (the
outcome). In the positive patterning (+ve) discrimination, although each of two
foods did not cause an allergy by themselves, they did so in compound. In the
negative patterning (-ve) discrimination, both foods caused the allergy outcome by
themselves but not in compound. These discriminations are difficult to solve
associatively because the compound predicts the opposite outcome to the elements.
However, the discriminations could be solved by learning a general +ve or –ve rule.
To test whether the subjects had learned the rule, they also received training on only half of two transfer discriminations: two foods presented alone, one paired with the outcome and one not. They were then presented with the compounds of the transfer discriminations. If participants generalized the rule (e.g. given A+ and B+, respond AB-), they should rate the compounds as producing the opposite outcome to the two elements they had learned about separately. The experiment showed that those who had learned well during training (the “high” group) responded according to the rules, whereas poor learners (the “low” group) tended to respond according to the summed values of the elements, i.e. associatively.
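
A minimal sketch, with assumed parameters, of why an elemental associative learner fails at negative patterning:

```python
# An elemental Rescorla-Wagner learner cannot solve negative patterning
# (A+, B+, AB-), because the compound's prediction is forced to be
# V_A + V_B. Learning rate and trial counts are illustrative.
V = {"A": 0.0, "B": 0.0}
lr = 0.2

for _ in range(200):
    for cues, lam in ((["A"], 1.0), (["B"], 1.0), (["A", "B"], 0.0)):
        error = lam - sum(V[c] for c in cues)   # shared prediction error
        for c in cues:
            V[c] += lr * error

print(V, "compound prediction:", V["A"] + V["B"])
# The elements settle at a moderate positive value and the compound is
# predicted at roughly double that, never at the trained value of zero,
# so the model responds more to AB than to either element - a rule (or a
# configural representation) is needed to solve the problem.
```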

27
Q

Cognition vs association

A

The previous experiments demonstrate that associative processes can produce very
different solutions to cognitive processes trained on the same information. Should
humans rely on cognitive process? Are these more reliable in producing the “correct”
solution? It depends on the situation. Perruchet (1985) made use of the gambler’s fallacy, a cognitive bias that people adopt under conditions of uncertainty: with each successive “win” a person assumes that their luck is about to run out, and is therefore less likely to place a bet following several wins; similarly, with each successive loss, they become more likely to place a bet in the belief that their luck will change. Perruchet
there were 1-, 2-, 3- or 4-trial runs of CS+ (acquisition) and 1-, 2-, 3- or 4-trial runs of
CS- (extinction). When both explicit and implicit measures were taken, the eye-blink
(implicit) CR behaved associatively, and the explicit expectancy ratings obeyed the
gambler’s fallacy.