Lecture 5 - The neural basis of goal-directed and habit-based behaviour Flashcards

1
Q

How does an action become learned?

A
  • Thorndike's puzzle box: learning is a gradual process. The animal first produces the action accidentally through trial and error, then repeats it because it has been reinforced
  • the shape of the learning curve suggests learning is incremental, reflecting the strength of an association between stimulus and response
  • Thorndike's law of effect: if, in the presence of a stimulus, a response is followed by a satisfying outcome, the S-R connection is strengthened. If it is followed by an annoying/aversive outcome, the connection is weakened
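  • Illustration only (not from the lecture): the incremental strengthening and weakening described by the law of effect can be sketched with a simple linear update. The function names, the learning rate of 0.2, and the update rule itself are assumptions chosen for illustration.

```python
def update_strength(strength, satisfying, rate=0.2):
    """One trial of a hypothetical law-of-effect update.

    strength   -- current S-R association strength, between 0 and 1
    satisfying -- True if the response was followed by a satisfying outcome,
                  False if it was followed by an annoying/aversive outcome
    rate       -- assumed learning rate (illustrative value)
    """
    target = 1.0 if satisfying else 0.0
    # Move a fixed fraction of the way towards the target on every trial.
    return strength + rate * (target - strength)


def learning_curve(outcomes, rate=0.2, start=0.0):
    """Return the S-R strength after each trial in a sequence of outcomes."""
    strengths = []
    strength = start
    for satisfying in outcomes:
        strength = update_strength(strength, satisfying, rate)
        strengths.append(strength)
    return strengths


# Ten reinforced trials: strength rises gradually rather than in one step.
curve = learning_curve([True] * 10)
```

  • With these assumed values the strength climbs a little on each reinforced trial and falls after an aversive one, which gives the gradual curve rather than sudden insight.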
2
Q

S-R vs R-O associations

A
  • A stimulus-response association results in responding without any expectation that the response leads to an outcome (habit based learning)
  • a response-outcome association results in responding because there is the expectation that the response will result in the outcome (goal directed)
  • can tell them apart with an outcome devaluation test: is there knowledge and expectation of the outcome? If yes, behaviour is goal-directed.
    > train with reward
    > devalue the reward, e.g. by feeding it to satiety
    > test whether they still press the lever
    > if they continue to press = habitual, if they press less = goal-directed
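  • Illustration only (not from the lecture): the decision rule of the devaluation test can be written as a small function. The function name, the ratio measure, and the 0.8 cut-off are hypothetical; real studies compare groups statistically rather than with a fixed threshold.

```python
def classify_control(presses_devalued, presses_nondevalued, threshold=0.8):
    """Label behaviour from lever presses in the test phase.

    presses_devalued    -- presses on the lever whose outcome was devalued
    presses_nondevalued -- presses on the comparison (non-devalued) lever
    threshold           -- assumed cut-off on the devalued/non-devalued ratio
    """
    if presses_nondevalued <= 0:
        raise ValueError("need at least one press on the non-devalued lever")
    ratio = presses_devalued / presses_nondevalued
    # Pressing much less for the devalued outcome implies the outcome is expected.
    return "goal-directed" if ratio < threshold else "habitual"


# Example with made-up counts: far fewer presses on the devalued lever.
label = classify_control(10, 40)
```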
3
Q

Baxter and Murray 2002

A
  • in training task 1 the monkey gets cherries for choosing correct object and nothing for incorrect object
  • in training task 2 the monkey gets peanuts for choosing correct object and nothing for incorrect.
  • prior to the test the monkey is allowed to eat cherries or peanuts until sated. In the test the monkey is allowed to choose one of the two correct objects
  • the monkey chooses the object associated with the non-devalued reward
4
Q

Adams and Dickinson 1981

A
  • LiCl (lithium chloride) is the stimulus used to induce taste aversion
  • when sucrose is devalued, they avoid lever 1 (paired with sucrose) and press lever 2 far more instead, and vice versa = instrumental learning
  • other factors also involved: training length was manipulated
  • even after taste aversion, the extended training group shows no devaluation effect at test: they press each lever equally
  • limited training = goal directed
  • extended training = more habitual
  • instrumental learning is initially sensitive to outcome devaluation (goal directed) if overtrained it is no longer sensitive to outcome devaluation
  • the expression of different components of the association underlying behaviour changes with repetition
  • outcome devaluation procedures useful in:
    > OCD: patients show insensitivity to outcome devaluation
    > stress: reduced sensitivity to outcome devaluation
    > actions learned under the influence of alcohol are less sensitive to outcome devaluation = more habitual
    > alcohol-seeking behaviour becomes insensitive to devaluation if overtrained, e.g. as seen in alcohol misuse disorder
5
Q

cortico-striatal system

A
  • dorsomedial striatum - goal directed behaviour
  • basolateral amygdala - incentive value
  • medial PFC - infralimbic vs prelimbic, encoding associations, S-R learning
6
Q

dorsomedial striatum

A
  • the role of posterior dorsomedial striatum in R-O learning:
  • Yin et al (2005)
    > sham group still had an active dorsomedial striatum and showed classic R-O behaviour
    > lesion group pressed the levers less overall, with no difference between devalued and non-devalued levers: dorsomedial striatum is important in goal-directed learning
    > extinction test: if the animal doesn't make the response for the devalued outcome (but does for the non-devalued one), it must expect the response to cause the outcome (goal-directed)
    > but if an animal is impaired on the extinction test, it may be because it cannot discriminate between the rewards
    > in the extinction test, any change in lever pressing due to devaluation reflects expectation and knowledge of the outcome (R-O)
    > in the rewarded test, any change in lever pressing is caused by the impact of the devalued outcome, experienced during the test phase, on S-R learning
  • dorsomedial inactivation
    > control shows sensitivity to reward value in both tests
    > lesion: in extinction (requires expectation) failed to change lever pressing, but can learn that the outcome of pressing = reward, so can tell the difference between rewards
    > dorsomedial striatum is needed for goal-directed R-O learning. Rats that received inactivation were not sensitive to outcome devaluation in the extinction test. It is the outcome expectation that this region is important for
  • rats that received inactivation behaved in a habitual manner
7
Q

basolateral amygdala

A
  • represents the sensory-specific properties of an outcome (can discriminate between rewarded outcomes)
  • not necessary for acquiring instrumental responding, suggesting S-R learning occurs independently of BLA function
  • is necessary for knowing the current incentive value of an outcome
  • Balleine et al (2003) - shams and BLA-lesioned animals learned lever pressing and chain pulling to the same degree, e.g. the lever leads to a pellet and the chain leads to a sweeter reward
  • when one reward is devalued, shams perform the action for the valued reward and not the devalued one, but BLA-lesioned animals perform both similarly
  • BLA is necessary for the outcome representation and for using sensory-specific information to learn which outcomes are rewarded and which are not. BLA-lesioned animals cannot tell which outcome is rewarded.
8
Q

medial PFC

A
  • consists of the infralimbic (IL) cortex and the prelimbic (PL) cortex. These have dissociable roles.
  • Killcross & Coutureau (2003) - PFC is linked to overtraining leading to habit learning. Compared infralimbic lesions, prelimbic lesions and shams after low training (goal-directed) and high training (habit learning) in an outcome devaluation test
  • after low training, shams are goal-directed and the IL group shows the same goal-directed behaviour, but PL-lesioned animals respond habitually
  • after high training, shams respond habitually, but IL-lesioned animals continue to show devaluation effects (goal-directed behaviour), while PL-lesioned animals respond habitually
  • lesions to PL = always habit-based
  • lesions to IL = always goal-based. The IL cortex is needed to transition to habitual behaviour
  • damage to the prelimbic region of the PFC impairs sensitivity to outcome devaluation. The prelimbic area is important for encoding associations
  • damage to the infralimbic region impairs the development of habits. Inactivation of IL after overtraining still results in goal-directed behaviour
  • the IL region is not important for S-R learning but for suppressing the influence of S-O-R associations
9
Q

what do these regions tell us about behaviour?

A
  • behaviour is dependent on both R-O associations and S-R associations.
  • the amount of experience determines whether behaviour is determined primarily by one or the other learning system
  • when behaviour becomes habitual it is not because the R-O association has been lost
  • rather, goal-directed behaviour has been suppressed
10
Q

LTP and spatial learning

A
  • GluA1 knockout mice:
    > reduced number and function of AMPA receptors
    > impaired synaptic plasticity (early, rapid form of LTP)
    > impaired STM (short-term habituation)
    > intact associative LTM
    > intact water maze learning
    > T-maze preference task: spatial working memory impaired
    > failed to show novelty preference with short pre-exposure trials, but intact when pre-exposure was 24 hr before
    > shows long-term associative memory is intact after 24 hours