goal-directed behaviour and habits Flashcards

1
Q

stimuli to response learning

A

S-R habit learning

2
Q

stimuli to outcomes learning

A

S-O pavlovian learning

3
Q

response to outcome learning

A

A-O goal-directed learning

4
Q

instrumental learning

A

a change in beh produced by a causal relationship between the beh & a biologically important stim

5
Q

S-R habit learning

A
  • thorndike’s “law of effect”
  • +ve reinforcers strengthen the connection between a stim & the response
  • -ve outcomes (punishers) weaken the connection between a stim & the response
  • thorndike developed this theory based on how quickly cats learned to escape from a puzzle box
  • presentations of the stim elicit the instrumental action as a response
6
Q

instrumental actions

A
  • action outcome learning
  • “human actions are those behaviours that persons have chosen to perform, and perform for a reason” (Greve, 2001)
  • actor has an intention to execute the beh
  • has a belief about the causal relation between the action and the outcome
  • beh produces a desired outcome
  • an intention arises when the belief and the desire are both present
  • cog theories implement this as a practical inference
7
Q

adams & dickinson (1981)

outcome value

A
  • instrumental training - rats were trained to press a lever to obtain one reward (brown pellet) whilst a second reward (sugar pellet) was presented noncontingently
  • outcome devaluation - for some rats (D-N) the brown pellet, which had been earned by lever pressing during training, was paired with LiCl, whilst for other rats (N-D) the noncontingent pellet was paired with LiCl
  • extinction test - rats in both groups were tested for lever press beh in extinction
  • during extinction, rats in the D-N group pressed less than rats in the N-D group –> sensitivity to outcome devaluation
  • reinforced tests revealed that rats in both groups had learned the aversion in the devaluation phase
8
Q

satiety specific outcome devaluation

A
  • 3 phases
  • training - rats pressed a lever to obtain one type of pellet
  • devaluation - before the extinction test, rats were pre-fed either the same pellets they had been working for or a diff pellet
  • test - lever presses during extinction showed less instrumental beh when rats had been pre-fed the pellets experienced during training relative to the diff pellets
9
Q

contingency degradation

A
  • baseline: each time the animal produces the action, a pellet follows
  • the relation can be degraded by leaving some actions unfollowed by a pellet
  • e.g. a 50% relationship between action and pellet
  • or by giving free (unpaired, noncontingent) outcomes
10
Q

hammond (1980)

contingency degradation

A
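  • instrumental responding is sensitive to the probability of an outcome given the action
  • it also depends on the probability of free outcomes (outcome given no action): as free outcomes become more probable, responding declines
  • animals behave as if they calculate both probabilities
  • instrumental conditioning is linked to the relation between the two probabilities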
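
The two probabilities on this card are often combined into a single contingency index, ΔP = P(O | A) − P(O | no A). The snippet below is a minimal sketch of that arithmetic, assuming this standard index; the function name `delta_p` and the example probabilities are illustrative and are not taken from Hammond (1980).

```python
def delta_p(p_outcome_given_action: float, p_outcome_given_no_action: float) -> float:
    """Contingency index: P(O | A) minus P(O | no A)."""
    for p in (p_outcome_given_action, p_outcome_given_no_action):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_outcome_given_action - p_outcome_given_no_action


# Illustrative values only (assumptions, not data from the study).
intact = delta_p(0.5, 0.0)    # outcomes only ever follow the action
degraded = delta_p(0.5, 0.5)  # free outcomes are as likely as earned ones
print(intact, degraded)
```

When free outcomes are as likely as earned ones, the index falls to zero even though P(O | A) is unchanged, which is the sense in which the contingency has been degraded.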
11
Q

variables that determine whether beh is G-D or habitual

A
  • amount of training
  • schedules of reinforcement
  • choice
  • contiguity
12
Q

S-R and A-O learning

A
  • A-O and S-R learning can be revealed by diff amounts of training
  • adams (1982) trained rats to lever press for sucrose pellets and varied the amount of training (100 vs 500 lever presses)
  • –> he then devalued the outcome & tested in extinction & in reacquisition
  • during test - rats with minimal training (100) showed goal-directed beh: less pressing in the devalued group relative to the non-devalued control
  • rats that received extended training (500) showed no ev of a devaluation effect, indicative of habits
13
Q

schedules of reinforcement

A
  • ratio schedules model an environment in which resources are constantly replenished (unlimited)
  • interval schedules model an environment with depleting resources that regenerate after a fixed or variable interval
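
The contrast between the two schedule types can be sketched with deterministic (fixed) versions of each. This is an illustration only: the function names, the 60 s session, the ratio of 5 and the 10 s interval are all assumptions chosen for the example, not parameters from any study on these cards.

```python
def rewards_fixed_ratio(n_responses: int, ratio: int) -> int:
    """Every `ratio`-th response is rewarded, so rewards scale with responding."""
    return n_responses // ratio


def rewards_fixed_interval(response_times, interval: float) -> int:
    """A reward becomes available `interval` seconds after the last one was
    collected and waits until the next response collects it."""
    rewards = 0
    next_available = interval
    for t in sorted(response_times):
        if t >= next_available:
            rewards += 1
            next_available = t + interval
    return rewards


def evenly_spaced(n_responses: int, session_length: float):
    """Hypothetical helper: n responses spread evenly over the session."""
    return [session_length * (i + 1) / n_responses for i in range(n_responses)]


SESSION = 60.0  # seconds (assumed)
slow, fast = evenly_spaced(30, SESSION), evenly_spaced(60, SESSION)

ratio_slow = rewards_fixed_ratio(len(slow), ratio=5)
ratio_fast = rewards_fixed_ratio(len(fast), ratio=5)
interval_slow = rewards_fixed_interval(slow, interval=10.0)
interval_fast = rewards_fixed_interval(fast, interval=10.0)
print(ratio_slow, ratio_fast, interval_slow, interval_fast)
```

In this toy session, doubling the response rate doubles the rewards earned on the ratio schedule but leaves the interval total unchanged, because the resource has to regenerate before it can be collected again.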
14
Q

dickinson et al. (1983)

schedules of reinforcement

A
  • compared random ratio and variable interval schedules of reinforcement
  • used an outcome devaluation procedure in which they paired the O with sickness (LiCl)
  • outcome devaluation - pellet present during training was paired with LiCl in groups devalued (D) but not in control groups (N)
  • followed by a test of lever pressing on extinction
  • only rats trained with a ratio schedule showed an outcome devaluation effect (goal directed)
  • rats trained with an interval schedule responded at a lower rate (though not 0) and showed no devaluation effect, i.e. they were habitual
15
Q

amount of training - choice

A
  • Colwill & Rescorla (1985) attempted to replicate Adams' (1982) findings & failed to find an effect of amount of training
  • they concurrently trained two levers, each resulting in the presentation of a different outcome
  • they tested in a choice procedure
  • after training, one of the outcomes was devalued
  • they varied the amount of training given on the different levers
  • two levers & two rewards: choice appears to minimise habit formation
16
Q

kosaki & dickinson (2010)

choice

A
  • noncontingent group only had one lever
  • both groups received the same total amount of reinforcer, obtained in diff ways
  • rate of responding was lower when rats had a choice
  • all rats learned, though
  • training with a single lever replicated the results of Adams (1982): responding was not affected by devaluation (habitual)
  • training with choice - beh was still goal-directed
17
Q

contiguity - temporal closeness between A-O

A
  • most research investigating action-outcome learning has focused on immediate consequences
  • actions followed by delayed outcomes may better capture many of the decisions and actions that we make in our day-to-day activities
  • foraging animals make choices for consequences delayed in time
  • humans save for retirement, or for children’s education
  • scientists have ideas, pursue findings, do the research, and then publish the research many years down the line
18
Q

Urcelay & Jonkman (2019)

contiguity

A
  • assessed whether A-O contiguity has an effect on sensitivity to OD (satiety-specific)
  • trained to press lever in 2 diff contexts with diff levers and pellets
  • one context: pellets followed immediately after lever press
  • other context: pellets presented 20s after a lever press
  • during outcome devaluation - 1 of the pellets was pre-fed & rats were immediately tested for lever pressing on extinction
  • repeated 4 times so that all rats were prefed with same or diff pellets in each context
  • outcome devaluation effect in the immediate context
  • no devaluation effect in the delayed context –> delayed outcomes facilitate habit formation
19
Q

computational accounts

perez & dickinson (2020)

A
  • goal-directed system: computes the response-outcome rate correlation, which determines current G-D strength
  • habit system: accumulates habit strength through summed reward prediction errors
  • the two strengths are summed
  • the sum determines whether a beh is emitted or not
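
The structure of this account can be sketched in a few lines. This is a simplified toy version and not the authors' implementation: scaling the rate correlation by outcome value, the delta-rule habit update, the learning rate and all function names are assumptions made for illustration.

```python
from statistics import mean


def rate_correlation(responses, outcomes):
    """Pearson correlation between per-time-sample response and outcome counts."""
    if len(responses) != len(outcomes) or len(responses) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_r, mean_o = mean(responses), mean(outcomes)
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(responses, outcomes))
    var_r = sum((r - mean_r) ** 2 for r in responses)
    var_o = sum((o - mean_o) ** 2 for o in outcomes)
    if var_r == 0 or var_o == 0:
        return 0.0  # one rate never varies, so no correlation can be computed
    return cov / (var_r * var_o) ** 0.5


def goal_directed_strength(responses, outcomes, outcome_value):
    """Assumption: G-D strength is the rate correlation scaled by outcome value."""
    return rate_correlation(responses, outcomes) * outcome_value


def update_habit(habit, reward, learning_rate=0.1):
    """Assumption: delta rule adding a fraction of the reward prediction error."""
    return habit + learning_rate * (reward - habit)


def response_strength(goal_directed, habit):
    """The strengths of the two systems are summed."""
    return goal_directed + habit


# Hypothetical counts per time sample: outcomes track responses closely.
responses, outcomes = [1, 2, 3, 4], [1, 2, 3, 4]
habit = 0.0
for _ in range(3):  # three rewarded experiences
    habit = update_habit(habit, reward=1.0, learning_rate=0.5)

valued = response_strength(goal_directed_strength(responses, outcomes, 1.0), habit)
devalued = response_strength(goal_directed_strength(responses, outcomes, 0.0), habit)
print(valued, devalued)
```

In this toy version, devaluing the outcome removes the goal-directed term but leaves the habit term intact, so responding after devaluation persists in proportion to the accumulated habit strength.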
20
Q

habit account of OCD

A
  • Gillan et al. (2014) used an avoidance task & compared OCD patients & controls in their sensitivity to outcome devaluation
  • devaluation was tested after a small amount of training
  • and again after extended training
  • training: no difference in skin conductance across the stimuli between controls and OCD patients
  • early devaluation test: both groups reduced responding on the side that was disconnected, showing that they understood the task & devaluation procedure
  • devaluation after extended training: smaller devaluation effect in OCD patients, who were more likely to have an urge to respond
21
Q

a habit account of drug addiction

A
  • drugs of abuse exert their effects through neural systems involved in feeding & sexual behs
  • everitt et al. (2001) proposed that drug addiction can (in part) be understood as a transition from goal-directed to habitual beh that is exacerbated by the drug’s effects on similar neural systems as food
  • dopamine release in ventral striatum
  • develops into habit quicker than food reinforcer
22
Q

corbit et al. (2012)

habit account of drug addiction

A
  • trained rats to press lever for small amount of beer
  • kept training with either beer or sucrose
  • after one week they were goal directed
  • after two weeks they were still goal directed
  • after four weeks they were no longer goal directed
  • over time behaviour became habitual
  • even without training in between (bottom two graphs) this pattern is still seen
  • so not because they are used to devaluation task
23
Q

non-contingent alcohol facilitating habit formation for sucrose

corbit et al. (2012)

A
  • press for sucrose
  • behaviour goal-directed after two and eight weeks
  • another group worked for sucrose but was also given non-contingent ethanol after training
  • after 8 weeks: responding for sucrose in the group given ethanol had become habitual