Learning Part 2: Operant Conditioning Flashcards

1
Q

Operant Conditioning

A

Goal-directed behaviour

Operant conditioning is concerned with how environmental stimuli shape complex goal-directed behaviours

2
Q

Edward Thorndike

A

His experiments, conducted at the turn of the 20th century, paved the way for a behaviourist account of voluntary behaviour

He worked with different animals: e.g. chicks, cats and dogs

He wanted to find out whether animals use reasoning to solve problems

Famous for Thorndike’s puzzle box

3
Q

Thorndike’s puzzle box

A

Thorndike’s puzzle box: a cat was placed inside a puzzle box and food was placed outside the box
Could the cat work out the mechanism that opens the door of the box to obtain the food?

Results:
The cat learned by trial and error (and success): its first attempts were random, then it stumbled across the solution

Cats became faster on subsequent trials in the same puzzle box

Cats learned to associate a response with its rewarding consequence

Consequences shape behaviour: unsuccessful responses are gradually eliminated

The conclusion is that cats learn simple stimulus-response (S-R) associations rather than using complex reasoning processes

4
Q

Law of Effect

A

Responses followed by a satisfying state of affairs are strengthened and are more likely to occur again (rewards)

Responses followed by an annoying or unsatisfactory state of affairs are weakened and are less likely to occur again (punishment)
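
The two statements above can be read as a simple update rule. The sketch below is only an illustration, not Thorndike's own formulation: the function name `apply_law_of_effect`, the 0 to 1 strength scale and the step size of 0.1 are all assumptions made for the example.

```python
def apply_law_of_effect(strength, satisfying, step=0.1):
    """Return the updated response strength after one consequence.

    strength   -- current tendency to repeat the response (assumed 0..1 scale)
    satisfying -- True for a satisfying outcome, False for an annoying one
    step       -- assumed size of each change
    """
    change = step if satisfying else -step
    # Clamp to the assumed 0..1 range.
    return min(1.0, max(0.0, strength + change))


strength = 0.5
for outcome in [True, True, False]:  # two rewards, then one punishment
    strength = apply_law_of_effect(strength, outcome)
print(strength)  # ends above the starting value of 0.5
```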

5
Q

B.F. Skinner (1904-1990)

A

He was influenced by Thorndike’s work and described voluntary human behaviour using basic S-R associations, without resorting to mentalistic concepts

“Behaviour operates on the environment to generate consequences.”

Organisms learn which behaviours to emit in order to earn rewards or avoid punishments

Operant describes any active (voluntary) behaviour that is produced in order to generate consequences, or is instrumental in generating consequences

Essentially everyone is trying to gain something desired or avoid something unpleasant

6
Q

B.F. Skinner (consequences shape behaviour)

A

Consequences shape behaviour: unsuccessful responses are gradually eliminated

7
Q

Reinforcement:

A

Reinforcement occurs when the consequences of an action increase the likelihood of the action being repeated

Reinforcement increases or strengthens the occurrence of a behaviour in the future

8
Q

Positive reinforcement +

A

Stimulus or event which, when presented as a consequence of a behaviour, increases the likelihood of that behaviour recurring in the future

9
Q

Negative reinforcement -

A

Stimulus or event which, when reduced or terminated, increases the likelihood that an associated behaviour will recur

10
Q

Continuous reinforcement

A

Each response is reinforced

11
Q

Partial reinforcement

A

Reinforcement is given only for some correct responses

Generates behaviour that persists longer: learners keep “testing” for a reward

12
Q

Fixed ratio schedule

A

Rewarded after a fixed number of correct responses

High rate of responding

Faster responses yield quicker payoffs (“bursts”)
e.g. paid for producing a specific number of items
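
A minimal sketch of a fixed ratio schedule, assuming responses are simply counted and every `ratio`-th correct response is rewarded. The function name `fixed_ratio_rewards` and its parameters are made up for this illustration.

```python
def fixed_ratio_rewards(n_responses, ratio):
    """Return the 1-based numbers of the responses that earn a reward."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


# Paid once for every 5 items produced, over 10 items.
print(fixed_ratio_rewards(10, 5))  # [5, 10]
```

Because the payoff depends only on the count, responding faster brings each reward sooner.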

13
Q

Variable ratio schedule

A

Rewarded after a varying number of correct responses, which averages out to a set number

High, persistent rate of responding

People and animals hope that the next response will bring a reward
e.g. gambling
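
A minimal sketch of a variable ratio schedule. It assumes each required count is drawn uniformly from 1 to `2 * mean_ratio - 1`, so that the requirement averages `mean_ratio`. That distribution, the function name `variable_ratio_rewards` and the fixed seed are assumptions for the illustration only.

```python
import random


def variable_ratio_rewards(n_responses, mean_ratio, seed=0):
    """Return the 1-based numbers of the responses that earn a reward."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    rewarded = []
    # Assumed distribution: uniform on 1..(2 * mean_ratio - 1), mean = mean_ratio.
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= required:
            rewarded.append(i)
            count = 0
            # The learner cannot know how many responses the next reward needs.
            required = rng.randint(1, 2 * mean_ratio - 1)
    return rewarded


rewards = variable_ratio_rewards(1000, mean_ratio=5)
print(len(rewards))  # roughly 1000 / 5 rewards, at unpredictable positions
```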

14
Q

Fixed interval schedule

A

Rewarded for the first correct response after a fixed time period

Flurry of responding right before a reward is due
e.g. test scheduled every four weeks
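
A minimal sketch of a fixed interval schedule. It assumes the timer restarts at each rewarded response, which is one common convention. The function name `fixed_interval_rewards` and the sample response times are made up for the illustration.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the response times that earn a reward."""
    rewarded = []
    next_available = interval  # the first reward is available after one interval
    for t in sorted(response_times):
        if t >= next_available:
            rewarded.append(t)
            # Assumption: the interval restarts from the rewarded response.
            next_available = t + interval
    return rewarded


# Responses at these times, with a reward available every 4 time units.
print(fixed_interval_rewards([1, 3, 4, 5, 9, 10, 11], interval=4))  # [4, 9]
```

Responses made before the interval has elapsed earn nothing, which is why responding tends to cluster just before a reward is due.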

15
Q

Variable interval schedule

A

Rewarded for the first correct response after a varying time period, which averages out to a set length

Less predictable

Slow but steady pattern of responding (“testing”)
e.g. surprise quizzes
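
A minimal sketch of a variable interval schedule. It assumes each waiting time is drawn uniformly between 0 and `2 * mean_interval`, so that it averages `mean_interval`. That distribution, the function name `variable_interval_rewards` and the fixed seed are assumptions for the illustration only.

```python
import random


def variable_interval_rewards(response_times, mean_interval, seed=0):
    """Return the response times that earn a reward."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    rewarded = []
    # Assumed distribution: uniform on 0..(2 * mean_interval).
    next_available = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t >= next_available:
            rewarded.append(t)
            # The next wait is unpredictable, so steady checking pays off.
            next_available = t + rng.uniform(0, 2 * mean_interval)
    return rewarded


# One response per time unit for 1000 units, rewards about every 10 units.
rewards = variable_interval_rewards(range(1, 1001), mean_interval=10)
print(len(rewards))
```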

16
Q

Shaping

A

Learning more complex behaviours by reinforcing successive approximations to the desired behaviour:

Reinforce a high-frequency component of the desired response

Drop reinforcement – behaviour becomes more variable again

Await a response that is still close to the desired response – then reintroduce reinforcement

Keep cycling: closer and closer approximations are achieved

Shaping can establish behaviour which is not in the animal’s natural repertoire
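
A simplified sketch of reinforcing successive approximations. It assumes behaviour can be scored as a single number and that the criterion tightens by a fixed `step` after each reinforced response. The function name `shape` and the sample values are made up for the illustration, and the sketch does not model the increase in variability when reinforcement is dropped.

```python
def shape(responses, target, start_criterion, step):
    """Return (response, reinforced) pairs under a tightening criterion."""
    criterion = start_criterion  # largest allowed distance from the target
    log = []
    for r in responses:
        reinforced = abs(r - target) <= criterion
        log.append((r, reinforced))
        if reinforced:
            # Demand a closer approximation before the next reward.
            criterion = max(0, criterion - step)
    return log


for response, reinforced in shape([6, 9, 7, 10, 8, 10], target=10,
                                  start_criterion=4, step=2):
    print(response, reinforced)
# 6 True, 9 True, 7 False, 10 True, 8 False, 10 True
```

A response of 7 would have been rewarded at the start, but not once closer approximations have been reinforced.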

17
Q

Extinction

A

Extinction occurs when reinforcement is withheld

It is not an immediate process: there is often a brief increase in responding first

Partially reinforced responses are harder to extinguish

18
Q

Punishment

A

The use of aversive consequences to reduce undesirable behaviour

Any event which decreases the likelihood that the behaviour it follows will recur

19
Q

Positive punishment +

A

Behaviour is followed by the presentation of an aversive stimulus

A stimulus is added to the situation
e.g. electric shock

20
Q

Negative punishment -

A

Behaviour is followed by the withdrawal of a rewarding stimulus

A stimulus is taken away
e.g. removal of toys

21
Q

Problems associated with punishment

A

Punishment is more effective when it is swift (no delay) and consistent (not just administered sometimes)

It is less effective than reinforcement because no desired behaviour is established

It does not cause long-term behaviour change: it only suppresses the behaviour

When the threat of punishment is removed, the behaviour returns (e.g. speed cameras)

It produces negative feelings and does not promote new learning

It may even teach the recipient to use punishment towards others

It is useful if behaviour is dangerous and must be changed/suppressed quickly

22
Q

Operant Conditioning: Children

A

Reinforce alternative behaviour that is incompatible with the undesirable behaviour (e.g. respond to normal voice only, not to screaming)

Identify the crucial reinforcer (maintaining the behaviour) and stop reinforcing the problem behaviour (extinction)

Reinforce the non-occurrence of the undesirable behaviour

Remove the opportunity for positive reinforcement

Use strongly reinforcing stimuli, but use variety (e.g. praise, privileges)

Reinforce immediately after the preferred behaviour

Start by reinforcing every time (continuous), then switch to intermittent reinforcement

Encourage self-reinforcement through pride and a sense of self-control

23
Q

Martin Seligman (Learned Helplessness)

A

He investigated the effects of exposure to uncontrollable shock on escape/avoidance learning in dogs

1/3 of dogs exposed to unavoidable shock failed to learn to avoid or escape from an unpleasant or aversive stimulus

First phase: Classical Conditioning
- shock paired with light
Second phase: Operant Conditioning
- learn to jump to the other side of the box when the light is switched on

24
Q

Basic Principles of Learned Helplessness

A

Learned helplessness might explain behaviour after abuse and in depression

When the traumatic event first occurs it causes a heightened state of emotionality, which has been called “fear”

Fear continues until the subject learns that he can or cannot control the trauma

“If subject learns that he cannot control the traumatic event, fear decreases and is replaced with depression.” (Seligman, 1979)