Lecture 5 - The neural basis of goal-directed and habit-based behaviour Flashcards
1
Q
How does an action become learned?
A
- Thorndike's puzzle box: learning is a gradual process. the animal first produces the correct action by accident through trial and error, then repeats the action because it has been reinforced
- the shape of the learning curve suggests learning is an incremental process reflecting the strength of an association between stimulus and response
- Thorndike's law of effect: if, in the presence of a stimulus, a response is followed by a satisfying outcome, the S-R connection is strengthened. if it is followed by an annoying/aversive outcome, the connection is weakened
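A minimal sketch of the incremental learning curve described above. The linear update rule, the `rate` parameter and the function names are illustrative assumptions, not a model given in the lecture.

```python
def update_strength(strength, satisfying, rate=0.2):
    """Return the new S-R association strength (kept between 0 and 1)."""
    if satisfying:
        # Satisfying outcome: move a fraction of the way towards the maximum (1.0).
        return strength + rate * (1.0 - strength)
    # Annoying/aversive outcome: move a fraction of the way back towards 0.
    return strength - rate * strength


def learning_curve(n_trials, rate=0.2):
    """Strength after each of n_trials reinforced trials, starting from 0."""
    strength = 0.0
    curve = []
    for _ in range(n_trials):
        strength = update_strength(strength, satisfying=True, rate=rate)
        curve.append(strength)
    return curve
```

Calling `learning_curve(10)` gives a curve that rises on every trial with smaller and smaller gains, which is the gradual shape the puzzle box data suggest.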
2
Q
S-R vs R-O associations
A
- A stimulus-response association results in responding without any expectation that the response leads to an outcome (habit based learning)
- a response-outcome association results in responding because there is the expectation that the response will result in the outcome (goal directed)
- an outcome devaluation test tells them apart: is there knowledge and expectation of the outcome? if yes, the behaviour is goal-directed.
> train with a reward
> devalue the reward by giving massive amounts of it (satiation)
> test whether they still press the lever
> if they continue to press = habitual; if they press less = goal-directed
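A rough sketch of the decision rule in the test above. The comparison against a non-devalued baseline, the `threshold` value and the function name are hypothetical choices for illustration; real studies compare groups statistically rather than with a fixed cut-off.

```python
def classify_control(presses_valued, presses_devalued, threshold=0.8):
    """Label behaviour from lever presses in an outcome devaluation test.

    presses_valued:   presses when the outcome has not been devalued (baseline)
    presses_devalued: presses after the outcome has been devalued
    threshold:        hypothetical cut-off on the devalued/valued ratio
    """
    if presses_valued <= 0:
        raise ValueError("baseline pressing must be greater than zero")
    ratio = presses_devalued / presses_valued
    # Pressing clearly drops after devaluation -> the outcome was expected (R-O).
    # Pressing stays about the same -> responding ignores the outcome (S-R).
    return "goal-directed" if ratio < threshold else "habitual"
```

For example, `classify_control(100, 30)` returns `"goal-directed"` and `classify_control(100, 95)` returns `"habitual"`.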
3
Q
Baxter and Murray 2002
A
- in training task 1 the monkey gets cherries for choosing the correct object and nothing for the incorrect object
- in training task 2 the monkey gets peanuts for choosing the correct object and nothing for the incorrect object
- prior to the test the monkey is allowed cherries or peanuts until sated. in the test the monkey is allowed to choose one of the two correct objects
- monkey chooses the object associated with the non-devalued reward
4
Q
Adams and Dickinson 1981
A
- LiCl (lithium chloride) is the taste aversion stimulus
- when sucrose is devalued, they avoid lever 1 (paired with sucrose) and massively press lever 2 instead, and vice versa = instrumental learning
- other factors are also involved: training length was manipulated
- even after taste aversion, the extended training group does not show a devaluation effect at test; they press each lever equally
- limited training = goal directed
- extended training = more habitual
- instrumental learning is initially sensitive to outcome devaluation (goal-directed); if overtrained, it is no longer sensitive to outcome devaluation
- the expression of different components of the association underlying behaviour changes with repetition
- outcome devaluation procedures useful in:
> OCD: patients show insensitivity to outcome devaluation
> stress: less sensitivity to outcome devaluation
> actions learned under the influence of alcohol are less sensitive to outcome devaluation = more habitual
> alcohol-seeking behaviour becomes insensitive to devaluation if overtrained, e.g. as seen in alcohol misuse disorder
5
Q
cortico-striatal system
A
- dorsomedial striatum - goal directed behaviour
- basolateral amygdala - incentive value
- medial PFC - infralimbic vs prelimbic, encoding associations, S-R learning
6
Q
dorsomedial striatum
A
- the role of posterior dorsomedial striatum in R-O learning:
- Yin et al (2005)
> sham group still had an active dorsomedial striatum and showed classic R-O behaviour
> lesion group pressed the levers less overall, with no difference between devalued and non-devalued levers: dorsomedial striatum is important in goal-directed learning
> extinction test: if the animal doesn't make the response for the devalued outcome (but does for the non-devalued one), it must be that it expects the response to cause the outcome (goal-directed)
> but if an animal is impaired on the extinction test, it may be because it cannot discriminate between rewards
> in the extinction test, any change in lever pressing after devaluation is due to expectation & knowledge of the outcome (R-O)
> in the rewarded test, any change in lever pressing is caused by the impact of the devalued outcome during the test phase on S-R learning - dorsomedial inactivation
> control shows reward evaluation in both tests
> lesion: in the extinction test (requires expectation) they failed to change lever pressing, but they can learn that the outcome of pressing = reward, so they can tell the difference between rewards
> dorsomedial striatum is needed for goal-directed R-O learning. rats that received inactivation were not sensitive to outcome devaluation in the extinction test. it is outcome expectation that this region is important for: rats that received inactivation behaved in a habitual manner
7
Q
basolateral amygdala
A
- represents the sensory-specific properties of an outcome (can discriminate between rewarded outcomes)
- not necessary for acquiring instrumental responding suggesting S-R learning occurs independent of BLA function
- is necessary for knowing the current incentive value of an outcome
- Balleine et al (2003) - shams and BLA-lesioned animals were able to learn lever pressing and chain pulling to the same degree. the lever may lead to a pellet, the chain to a sweeter reward.
- when one reward is devalued, shams perform the valued action rather than the devalued one, but BLA-lesioned animals perform both similarly
- BLA is necessary for the outcome representation and for using sensory-specific info to learn which outcomes are rewarded and which are not. BLA-lesioned animals cannot tell which outcome is rewarded.
8
Q
medial pfc
A
- consists of the infralimbic cortex and the prelimbic cortex. these have dissociable roles.
- Killcross & Coutureau (2003) - PFC is linked to overtraining leading to habit learning. compared infralimbic vs prelimbic cortex lesions and shams after low training (goal-directed) and high training (habit learning) in an outcome devaluation test
- after low training shams are goal-directed; the IL group shows the same goal-directed behaviour, but PL-lesioned animals respond habitually
- after high training shams respond habitually, but IL lesions continue to show devaluation effects (goal-directed behaviour), while PL lesions still respond habitually
- lesions to PL = always habit-based
- lesions to IL = always goal-based. the IL cortex is needed to transition to habitual behaviour
- damage to the prelimbic region of the PFC impairs sensitivity to outcome devaluation. the prelimbic area is important for encoding associations
- damage to the infralimbic region impairs the development of habits. inactivation of IL after overtraining still results in goal-directed behaviour
- IL region is not important for S-R learning but for suppressing the influence of S-O-R associations
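The pattern of results on this card can be laid out as a small lookup table. This is only a restatement of the bullets above; the dictionary and function names are made up for illustration.

```python
# True = sensitive to outcome devaluation (goal-directed), False = insensitive (habitual).
# Keys are (group, amount of training), taken from the summary on this card.
DEVALUATION_SENSITIVE = {
    ("sham", "low"): True,
    ("sham", "high"): False,
    ("IL lesion", "low"): True,
    ("IL lesion", "high"): True,   # habits fail to develop without IL
    ("PL lesion", "low"): False,   # goal-directed control lost without PL
    ("PL lesion", "high"): False,
}


def control_mode(group, training):
    """Return which system controls behaviour for a given group and training level."""
    sensitive = DEVALUATION_SENSITIVE[(group, training)]
    return "goal-directed" if sensitive else "habitual"
```

For example, `control_mode("IL lesion", "high")` returns `"goal-directed"`, while `control_mode("sham", "high")` returns `"habitual"`.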
9
Q
what do these regions tell us about behaviour?
A
- behaviour is dependent on both R-O associations and S-R associations.
- the amount of experience determines whether behaviour is controlled primarily by one or the other learning system
- when behaviour becomes habitual it is not because the R-O association has been lost
- rather, goal-directed behaviour has been suppressed
10
Q
LTP and spatial learning
A
- GluA1 knockout mice:
> reduced no. and function of AMPA receptors
> impaired synaptic plasticity (early, rapid form of LTP)
> impaired STM (short-term habituation)
> intact associative LTM
> intact water maze learning
> T-maze preference task - spatial WM impaired
> failed to show novelty preference with short pre-exposure trials. intact when pre-exposure was 24hr before
> shows long-term associative memory is intact after 24 hours