L3: Instrumental conditioning part 2 Flashcards

1
Q

conditioned reinforcers?

A
  • Primary reinforcers: satisfy basic drives (e.g. food)
  • Secondary reinforcers: conditioned reinforcers, acquired through classical conditioning
  • Secondary reinforcers increase the generalisability of instrumental conditioning
    E.g.: an animal in a cage presses a lever and gets food, satisfying a basic drive (primary reinforcer).
    A sound played before the food is delivered comes to predict the food (secondary reinforcer), increasing the generalisation of instrumental conditioning.
    As in classical conditioning, there is an acquisition phase, an extinction phase, and spontaneous recovery.
2
Q

negative reinforcers (primary and secondary)?

A

Negative reinforcement is getting rid of something:
* Which is innately aversive (primary negative reinforcers)
* Which has become aversive through learning (secondary negative reinforcers)

Primary negative reinforcer e.g.: the electric shock used with the dogs in the last lecture.
Secondary negative reinforcer e.g.: a child being nagged by a carer. Removing the nagging when the child does what the carer wants reinforces that behaviour.

3
Q

punishment

A

Positive punishment: adding something aversive, e.g. shouting at your boyfriend.
Negative punishment: removing something pleasant, e.g. withholding hugs from your boyfriend.

Punishment

To work, punishment must be immediate, inevitable, and severe, but…
* It suppresses behaviour yet is ineffective in the long run
* It indicates what one should not do, not what one should do
* When the threat is removed, the behaviour is reinstated
* It has emotional by-products (e.g. becoming fearful, with the fear generalising to contextual stimuli)
* It justifies inflicting pain on others
* It replaces one undesirable response with another, e.g. it elicits aggression towards the punishing agent and others

4
Q

factors affecting reinforcer effectiveness?

A

Amount of reinforcement: in the animal example, too small an amount of food may not elicit the behaviour.
Motivational need for the reinforcer (drive): an experiment using food as the reinforcer will not work well with a sated animal.
Contrast effect: reinforcers are evaluated relative to one another. You may be happy with your salary until you learn a colleague earns more, and then become upset about yours. In the lab: the experimenter establishes a baseline of lever pressing for a small amount of food, gives a large amount on a different day, then returns to the small amount; the animal is now less motivated than before.
Delay of reinforcement: timing matters. With a long delay, the reinforcer may not strengthen the right behaviour.

5
Q

schedules of reinforcement?

A

Fixed-ratio: reinforcement after a fixed number of responses. E.g. the animal gets food after every 5 lever presses.
Fixed-interval: reinforcement for the first response after a fixed time has elapsed. E.g. pressing the lever is rewarded only once 5 minutes have passed.
Variable-ratio: reinforcement after a random number of responses. Keeps the animal highly engaged in the behaviour.
Variable-interval: reinforcement for the first response after a variable amount of time.
E.g. variable-ratio: gambling on a slot machine. The reward can come at any time, so the person keeps playing and playing.
E.g. fixed-interval: checking with the porter for mail. You don't check every 5 minutes, maybe once a day.
E.g. variable-interval: emails. You keep refreshing to see if you have more; a new one can arrive in 1 minute, 5 minutes, an hour, etc.
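The four schedules can be sketched as simple decision rules that decide, response by response, whether to deliver the reinforcer. This is a hypothetical simulation, not from the lecture; the function names and parameters are made up for illustration:

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def decide(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # deliver food
        return False
    return decide

def variable_ratio(mean_n):
    """Reinforce after a random number of responses (mean_n on average)."""
    count, target = 0, random.randint(1, 2 * mean_n - 1)
    def decide(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return decide

def fixed_interval(interval):
    """Reinforce the first response at least `interval` time units
    after the previous reinforcer."""
    last = 0
    def decide(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False
    return decide

def variable_interval(mean_interval):
    """Like fixed_interval, but the required wait is drawn at random."""
    last, wait = 0, random.randint(1, 2 * mean_interval - 1)
    def decide(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, random.randint(1, 2 * mean_interval - 1)
            return True
        return False
    return decide

# One lever press per time unit under a fixed-ratio-5 schedule:
fr5 = fixed_ratio(5)
rewards = [t for t in range(1, 21) if fr5(t)]
print(rewards)  # -> [5, 10, 15, 20]
```

Swapping in `variable_ratio(5)` makes the reward times irregular, which is exactly why responding under that schedule is so persistent.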

  • Once a behaviour is established, it can be maintained by a schedule of partial reinforcement
  • Extinction after maintenance on partial reinforcement is harder than after continuous reinforcement
  • E.g. pigeons on a fixed-interval schedule (food every 5 min) may peck 6000 times per hour – it takes days to extinguish the behaviour
  • The most frequent and reliable responses are elicited when reinforcers are unpredictable and irregular (i.e. variable-ratio) – the hardest schedule to extinguish

See the lecture notes for the graphs.

6
Q

the relativity of reinforcement?

A

The relativity of reinforcement (David Premack)
* Responses (behaviours) can also be used as reinforcers
* Premack principle: if one activity occurs more frequently than another, it can be used to reinforce the activity that occurs less often
* Stage 1: observe the organism behaving freely and create a hierarchy of behaviours
* E.g. a rat: eating > drinking > running in a wheel > grooming > looking out of the cage
* Stage 2: use the more frequent activity as a reinforcer for a less frequent activity
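The two stages can be sketched as a simple rule over observed baseline frequencies. A hypothetical sketch: the function name and the activity counts are illustrative, not data from the lecture:

```python
def premack_reinforcers(baseline, target_activity):
    """Stage 2 of the Premack principle: any activity that occurs more
    often at baseline than `target_activity` can reinforce it."""
    return [activity for activity, freq in baseline.items()
            if freq > baseline[target_activity]]

# Stage 1: hierarchy observed for a freely behaving rat (made-up counts)
rat = {"eating": 50, "drinking": 30, "wheel running": 20,
       "grooming": 10, "looking out of the cage": 5}

print(premack_reinforcers(rat, "grooming"))
# -> ['eating', 'drinking', 'wheel running']
```

Note that the rule is relative: the same activity (e.g. wheel running) is a potential reinforcer for grooming but not for eating.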

The relativity of reinforcement
* Premack (1959): tested 31 first grade children
* Stage 1: Children allowed to play a pinball machine or operate a candy
dispenser
* Classified children as “manipulators” or “eaters”
* Stage 2: Each group was placed on manipulate – eat contingencies (had to play
the pinball machine to be allowed to eat) and eat – manipulate contingencies
(the reverse)
* What was the effect of each contingency for each group of children?
Implications:
* Reinforcer is a very personal and dynamic thing
Issues:
* Ambiguity in definition of reinforcer (e.g. food)

Playing was used to reinforce eating, and eating to reinforce playing: each contingency worked only for the group whose more frequent activity was the reinforcer.
For a "manipulator", playing worked as a reinforcer for eating; for an "eater", candy worked as a reinforcer for playing.
So for one child candy is an amazing reinforcer, while another couldn't care less about it.

7
Q

disequilibrium hypothesis?

A

Disequilibrium hypothesis (William Timberlake; Timberlake, 1980)
* The proportional distribution of activities that an organism engages in constitutes an equilibrium
* Any activity can be a reinforcer if a contingency schedule restricts the animal's access to that activity
* Describes the conditions under which a specific activity can be reinforcing or punishing

E.g. a running wheel can be used as a reinforcer as long as it is removed from the rat's cage: any activity becomes a reinforcer when access to it is restricted. With the wheel out of the cage, the experimenter can use access to it to reinforce the desirable behaviour.

8
Q

shaping?

A

The reinforcement of successive approximations to a final desired behaviour.
For instance, training a pigeon to peck a button:
1. give the pigeon food when it turns towards the button
2. give it food as it walks towards the button
3. give it food when it raises its head to the height of the button
4. give it food when it taps the button with its beak
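The successive-approximation loop can be sketched as a criterion that tightens after each reinforced response. A hypothetical sketch: the "distance" scores standing in for how close each response is to the target behaviour are made up:

```python
def shape(target, responses):
    """Reinforce each response that is at least as close to `target`
    as the best response so far; the criterion tightens after every
    reinforcer, so only successive approximations earn food."""
    criterion = float("inf")      # start by reinforcing anything
    reinforced = []
    for r in responses:
        distance = abs(r - target)
        if distance <= criterion:
            reinforced.append(r)  # deliver food
            criterion = distance  # next response must be as close or closer
    return reinforced

# Pigeon drifting towards the button at position 10 (made-up positions):
print(shape(10, [4, 6, 5, 8, 9, 10]))  # -> [4, 6, 8, 9, 10]
```

Note how the response at position 5 goes unreinforced: it is further from the button than the previous reinforced response, so it no longer counts as an approximation.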

Shaping and chaining: the Brelands’ animal behaviour enterprises
“Their success led to “Priscilla the Fastidious Pig.” According
to an article about the Breland’s training methods, not only
could Priscilla push a vacuum, she could turn on the radio,
put clothes in a hamper, eat breakfast at a table, answer quiz
questions, and select her favorite food—Larro, of course—
from that of competitors. From 1948 to 1950 Priscilla
performed her act at feed-stores, county fairs, and on
television. Except it wasn’t the same Priscilla. Every few
months, as Priscilla grew in size, she was replaced by a new
trained pig, one that was smaller and easier to ship.”

All of this was the result of shaping and chaining – and the Brelands were in fact training multiple pigs.

9
Q

biological boundaries on learning?

A

Instinctive drift: behavioural predispositions limit the range of novel behaviours
* E.g. raccoons and dropping coins in a piggy bank
“Now the raccoon really had problems (and so did we).
Not only could he not let go of the coins, but he spent
seconds, even minutes, rubbing them together (in a most
miserly fashion), and dipping them into the container. He
carried on this behavior to such an extent that the
practical application we had in mind - a display featuring a
raccoon putting money in a piggy bank - simply was not
feasible. The rubbing behavior became worse and worse
as time went on, in spite of nonreinforcement.” (Breland &
Breland, 1961)

Biological boundaries on learning (Petrinovich and Bolles, 1954)

A maze with two arms, right and left. The animal is placed at the starting point; its natural behaviour is to explore. The experimenter puts food in the right arm, and the animal eventually finds it. You would expect the animal to go to the right arm on the next trial, after the learning phase (conditioning), but instead it goes to the left. In nature, once an animal has found and eaten the food in one place, that place is depleted, so it explores another area. This natural disposition makes it hard to condition the animal to return to the arm where the food was.
Now think about water. The experimenter puts a water source in the right arm; the animal finds it and is returned to the start. In nature, water sources (rivers, puddles, etc.) do not move around, so on the next trial the animal goes back to the right arm.
These are biological constraints on learned behaviour.

10
Q

cognitive factors?

A

Cognitive factors
* Instrumental conditioning occurs only when the organism perceives a contingency between a response and an outcome, and that the reinforcement is under its control
* Seligman and learned helplessness: "yoked" dogs learned that they were helpless in Phase 1 → no escape/avoidance behaviour in Phase 2

Tolman and Honzik (1930): cognitive maps
Latent learning: animals learn even though their behaviour may not change (immediately) in a corresponding way
Group 1:
* rats were rewarded for finding their way through a maze
* rats became faster at solving the maze over a few days
Group 2:
* rats were not rewarded for exploring the maze
* little improvement in solving the maze (no reinforcement)
Test:
* when a reward was introduced, the performance of Group 2 almost caught up with that of Group 1
➔ The rats had developed a cognitive map
➔ Learning occurred even when the animals were not reinforced
