L6 Habits vs Goal directed Flashcards by Liam Brant

What is the difference between a habit and goal directed behaviour?

Habit = a response triggered directly by environmental stimuli. The stimulus-response link operates independently of the goal.

Goal-directed behaviour = action-outcome learning, where actions are driven by the outcome/goal and by the value of that outcome.

Skills differ from habits. They involve _________ and _________ through repeated ____________, whereas habits are more ____________ and less influenced by the goal or ___________.

Skills involve precision/accuracy and are mastered through repetition and practice. Habits are more automatic and less influenced by the goal/outcome.

Why might habits have formed, from an evolutionary perspective?

As a way to free cognitive resources for more novel and important tasks. Habitual behaviour therefore comes about automatically, rather than the action having to be performed consciously every time.

What are the two main ways of testing if habits have formed, experimentally?

Contingency Degradation
Goal devaluation

Contingency degradation involves ____________________________________________________ to see if ____________________.

Contingency degradation involves worsening the contingency between the response and the outcome to see if the behaviour still persists (persistence suggests a habit).

Goal devaluation involves devaluing the ___________ either by __________ or perhaps by _______________________, to see if __________________.

Goal devaluation = devaluing the outcome by reducing it, or by making it aversive, to see if animals still act when they see the stimulus. This tells us whether a habit has formed.

Model based learning is ____________________________________________,
Whereas model free learning is _____________________________________.

Model based = building cognitive (mental) maps with flexible alternatives based on outcomes; able to adjust and plan ahead.
Model free = bases decisions on previous instances, i.e. win-stay, lose-shift evaluations; trial and error learning.
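
As a rough illustration of the win-stay, lose-shift rule mentioned in the answer, here is a minimal Python sketch. The function names, the two-option setup ("left"/"right") and the fixed reward table are assumptions made for illustration; they do not come from the lecture.

```python
def win_stay_lose_shift(last_action, last_rewarded, actions=("left", "right")):
    """Model-free rule: repeat a rewarded action, otherwise switch.

    Assumes exactly two options (hypothetical names).
    """
    if last_rewarded:
        return last_action  # win: stay
    # lose: shift to the other option
    return actions[1] if last_action == actions[0] else actions[0]


def run(reward_by_action, start="left", trials=6):
    """Apply the rule repeatedly; reward_by_action maps action -> True/False."""
    action, history = start, []
    for _ in range(trials):
        rewarded = reward_by_action[action]
        history.append((action, rewarded))
        action = win_stay_lose_shift(action, rewarded)
    return history
```

Note that the rule only looks at the last trial's result. It never represents what each option leads to, which is the sense in which it is model free.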

What are the two main neuropsychiatric disorders which may arise from a dysfunction of habit learning?

A Addictions and Depression
B OCD and Anxiety
C OCD and Addictions
D Psychopathy and Sociopathy

C OCD and Addictions

There is evidence that habits can be ____________ to goal directed learning, and certain factors may ____________ this ____________. It is more likely that Skills/goal directed learning and habits have _____________ circuits rather than completely ___________ circuits.

Habits can be transferred to goal directed learning, with certain factors modulating this transfer. Therefore it is more likely that habits and goal directed learning have overlapping neural circuits rather than completely distinct pathways.

What are the main 3 types of associative learning, between stimuli, responses, and outcomes?

Stimulus-outcome learning is classical (Pavlovian) conditioning, where the learner learns to associate a stimulus with an outcome.

Stimulus-response learning is habit learning, where a stimulus automatically evokes a response.

Response-outcome learning is goal-directed learning, where the learner learns which outcome will follow a response/action.

Instrumental learning is defined as a change in ____________ due to the _____________ relationship between the _________ and a _____________ important stimulus. Provide an example?

Change in behaviour due to causal relationship between behaviour and a biologically important stimulus.

A rat learns that jumping will lead to food. Its behaviour therefore changes (it jumps more) due to the causal relationship with food.

Thorndike's law of effect states what about positive vs negative reinforcers?

Positive reinforcers (food) strengthen the relationship between stimulus and response.
Negative reinforcers (shock) weaken the relationship between stimulus and response.

In instrumental actions, the actor has an _________ to execute the behaviour, a _________ about the effect of the behaviour on the __________, and a ________ for the outcome.

Intention to act
Belief about causal relationship between action and outcome
Desire for the outcome

How can researchers study if animals act on desire or purely automatically out of habit?

Test desire by reducing the desirability of the outcome, e.g. by making it aversive. If animals still respond for the outcome, the behaviour is habitual and no desire for the outcome is involved.

What were the 3 phases of Dickinson and Adams' (1981) outcome devaluation experiment?
What were the results?

1 Instrumental learning - rats learned that a sugar pellet was contingent on lever pressing, while a food pellet was non-contingent (presented regardless of lever pressing).
2 Outcome devaluation - one group had the contingent sugar pellet paired with LiCl, whereas the other group had the non-contingent food pellet paired with LiCl.
3 Extinction - responding was tested with lever presses producing no outcome.
Results: the group whose lever-contingent outcome had been devalued pressed much less in extinction, i.e. showed greater sensitivity to outcome devaluation. This shows that they had learned goal-directed behaviour.

What are the two main methods of achieving outcome devaluation?

By making reward outcome aversive - animal will show devaluation by responding less than before
By satiating the reward outcome - giving animal large quantity of reward makes reward less desirable and less valuable, rat will respond less than before.

Both satiation and outcome devaluation experiments show how rats have __________-___________ behaviours, as by reducing the ________ of an outcome, rats are less willing to __________ for it, and have reduced ________ for it.

Outcome devaluation and satiation show goal-directed behaviours in rats - a change or reduction in the value of an outcome reduces the behaviour that achieves it, as the outcome is no longer as desirable.

How does contingency degradation (Hammond, 1980) show rats have a belief about their actions?

When the contingency is worsened by providing the outcome without any response, rats reduce their behaviour. This shows that they have a belief about the probability that their behaviour will lead to a reward outcome. If the contingency is degraded, they believe that their response is less likely to produce the outcome, so they no longer perform it.
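
One common way to put a number on "contingency" is the difference between the probability of the outcome given a response and its probability given no response (often written ΔP). The flashcard does not name this measure, so treat the sketch below, including the function name, as an assumption added for illustration.

```python
def delta_p(p_outcome_given_response, p_outcome_given_no_response):
    """Contingency as the difference between two conditional probabilities.

    Positive: responding makes the outcome more likely.
    Zero: the outcome is just as likely without responding (fully degraded).
    """
    return p_outcome_given_response - p_outcome_given_no_response


# Before degradation: reward only ever follows a lever press.
before = delta_p(0.5, 0.0)

# After degradation: free rewards arrive just as often without a press.
after = delta_p(0.5, 0.5)
```

Here the probability of reward after a press is unchanged; only the free rewards differ. An animal that tracks the action-outcome relationship should still respond less once the free rewards are added.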

What are the 4 main experimental manipulations/factors which may modulate habits or goal directed learning?

Amount of training
Schedule of reinforcement
Choice
Contiguity

After a small _________of ______________, behaviours are _________-________, yet after many many trials, behaviour may become _____________ and automatic.

After a small amount of training, behaviours are goal directed, yet after many many trials, behaviours may become automatic and habitual.

How did Adams show the difference between habits and goal directed learning, by manipulating amount of training (100 vs 500 trials)?

Had 2 groups, one which had 100 pellet training trials, and another which had 500.
Then carried out outcome devaluation using LiCl.
Then did an extinction test. Found that the 500-trial group pressed the lever more despite devaluation with LiCl, showing habit formation, whereas the 100-trial group showed goal-directed learning through reduced lever pressing.

In the context of reward learning, what is the difference between ratio and interval reinforcement schedules?

Ratio schedules - an environment where resources are constantly replenished. More visits = more rewards, as each outcome is dependent on an action.

Interval schedules - an environment where resources deplete, but regenerate after some time. Can be either a fixed or a variable interval schedule. More visits does not = more rewards.
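
The contrast between the two schedules can be sketched with a small deterministic example in Python. This assumes fixed-ratio and fixed-interval versions for simplicity; the function names and the numbers used are illustrative assumptions, not values from the lecture.

```python
def rewards_fixed_ratio(n_responses, ratio):
    """Fixed ratio: one reward for every `ratio` responses."""
    return n_responses // ratio


def rewards_fixed_interval(response_times, interval):
    """Fixed interval: a response is rewarded only if at least `interval`
    seconds have passed since the last reward (or since the session start)."""
    rewards, last_reward_time = 0, 0
    for t in sorted(response_times):
        if t - last_reward_time >= interval:
            rewards += 1
            last_reward_time = t
    return rewards


# Ratio: doubling the number of responses doubles the rewards.
ratio_low = rewards_fixed_ratio(10, ratio=5)
ratio_high = rewards_fixed_ratio(20, ratio=5)

# Interval (60 s session, reward available every 10 s):
# responding every 5 s vs every 1 s earns the same number of rewards.
interval_low = rewards_fixed_interval(range(5, 61, 5), interval=10)
interval_high = rewards_fixed_interval(range(1, 61), interval=10)
```

Under the ratio schedule, reward count tracks response count closely. Under the interval schedule, responding five times as often earns nothing extra, so the experienced action-outcome correlation is weak.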

What did Dickinson (1983) find when studying the difference between habits and goal directed learning, with ratio vs interval schedules of reinforcement?

Tested two different groups, ratio vs interval schedules of reinforcement. Then did outcome devaluation, then extinction tests.
Ratio schedules - the higher action-outcome correlation led to more goal-directed learning, with outcome devaluation reducing responding.
Interval schedules - responding was naturally lower, as the action-outcome correlation is low. However, outcome devaluation did not reduce responding, suggesting that responding was out of habit.

True or false: Rescorla and Colwill failed to replicate Adams's findings on amount of training when rats had a choice.

True

Explain the 2 different groups in Kosaki and Dickinson's 'choice' study?

Choice group - had a choice between two different levers. One lever led to one type of food, say sugar, and the other lever led to another type of food, say a food pellet. Both rewards were contingent on lever pressing.
Non-contingent group - one food (the food pellet) was dependent on lever pressing, but the sugar was non-contingent and given for free. There was no choice between them.
Both groups had one reward left intact and one devalued, and were then tested in extinction.

What were the findings of Kosaki and Dickinson's choice vs no choice study, after devaluation and extinction test was carried out in both groups?

For the choice group, devaluation reduced responding for the devalued reward. This suggests that, given the choice of where to put effort, animals choose the valued over the devalued outcome, showing goal-directed learning. For the non-contingent group, devaluation did not reduce responding, suggesting habitual behaviour.

In real life actions are not always ________ met with ___________________, and consequences take ___________

Actions not always instantly met with outcomes, and consequences take time.

Urcelay and Jonkman showed that in low contiguity group, outcome devaluation ___________________________ responding, suggesting _____________, whereas in high contiguity group, outcome devaluation ___________ responding, suggesting _________-_____________ learning.

Low contiguity - outcome devaluation had no effect on responding, suggesting habits. High contiguity - outcome devaluation reduced responding, suggesting goal directed learning.

When is an action said to become goal directed, and how is this shown in experimental manipulations?

When an actor has the opportunity to experience correlations between its actions and the occurrence of their outcome.

The dual process computational account by Perez and Dickinson says that the _________-___________- system and the __________ system operate simultaneously, and the animal computes ______________ about the response being ______________ and outcome _______________

The goal-directed system and the habit system operate simultaneously. In the goal-directed system, the animal computes the response-outcome rate correlation, together with the reward, to decide whether to act or not. In the habit system, habit strength is accumulated by reward-prediction error.
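
The two quantities named in the answer can be sketched very loosely in Python. This is a toy illustration, not Perez and Dickinson's published equations: the time-bin representation, the use of a Pearson correlation, the simple error-driven update, and all function names are assumptions.

```python
def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series does not vary."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def goal_directed_strength(responses_per_bin, outcomes_per_bin, outcome_value):
    """Assumed form: experienced response-outcome rate correlation
    scaled by the current value of the outcome."""
    return pearson(responses_per_bin, outcomes_per_bin) * outcome_value


def update_habit_strength(habit, reward, learning_rate=0.1):
    """Assumed form: habit strength moves towards the received reward
    in proportion to the prediction error (reward - habit)."""
    prediction_error = reward - habit
    return habit + learning_rate * prediction_error
```

In this sketch, devaluing the outcome (setting `outcome_value` to 0) removes the goal-directed contribution but leaves habit strength unchanged. That is the pattern devaluation tests look for.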

A dual process model accounts for factors like ___________ and __________ of _________ which may determine when a behaviour is ____________ or _________-__________.

Account for things like correlation and amount of training, which determine when a behaviour is habitual or goal directed.

How did Gillan et al study learning and outcome devaluation in OCD and healthy populations, comparing extended vs small amounts of learning?

Participants were linked up to a shock and could carry out an avoidance response, pressing a button, to avoid the shock. The avoidance button was then unplugged (outcome devaluation), and participants learnt this. There was also a reacquisition phase. This was the normal amount of training. The same procedure was then repeated, unplugging the avoidance button to devalue the outcome of the response, but this time after extended training.

What were the main 2 findings from Gillan et al in their outcome devaluation study in OCD vs healthy, comparing extended vs short training? What does this suggest for OCD patients?

Both learning (acquisition) and devaluation after a small amount of training were the same for OCD patients and healthy people, and both groups reduced responding once the response no longer led to shock avoidance. However, unlike the healthy population, OCD patients were unaffected by outcome devaluation following extended training. The avoidance button was still pressed even though it did not avoid the shock. This suggests that people with OCD are less sensitive to outcome devaluation, and are more likely to form habits given extended training.

Everitt et al (2001) suggests that addictions may form as a _______________ from goal directed behaviour to ____________, which are exacerbated by the drugs/stimulus effects on _________ circuits involving release of ____________.

Addictions form as a transfer from goal-directed behaviour to habits, which are exacerbated by the drug's/stimulus's effects on reward circuits and dopamine release.

What is true of the study by Corbitt et al, testing devaluation of alcohol vs glucose using satiation?

A Over time, devaluation effect of alcohol wore off, unlike glucose, which was still devaluated after 8 weeks. Suggests habit forms easier with alcohol.
B Over time, devaluation effect of glucose wore off, unlike alcohol, which was still devaluated after 8 weeks. Suggests habit forms easier with glucose.
C Administering alcohol with glucose, reduces devaluation effects of glucose, and also speeds up habit formation of glucose.
D both A and C

D both A and C