3A: Operant Conditioning and Schedules of Reinforcement Flashcards

1
Q

Principles of operant conditioning

A

Involves the strengthening or weakening of a behaviour as a result of the consequences.
Behaviours are voluntary or goal-directed.
The consequence of the behaviour affects future occurrences of that behaviour.
Reinforcers strengthen behaviours; punishers weaken them.

2
Q

Law of effect

A

Behaviour is controlled by its consequences.
Behaviours that result in pleasant consequences will be more likely in the future.
Behaviours that result in unpleasant consequences will be less likely in the future.

3
Q

Two types of behaviours

A

Reflexive: involuntary; known as respondent behaviour.

Operant: voluntary, behaviours controlled by consequences.

4
Q

Operant antecedents

A

Discriminative stimulus (S^D): Indicates that a response will be followed by a consequence (reinforcer or punisher), e.g. a light signals that pressing a lever will now produce food.

5
Q

Positive reinforcement

A

When behaviour is strengthened because it is followed by a reinforcing or rewarding stimulus
e.g. smile at someone (R) -> person smiles back (S^R)

6
Q

Negative reinforcement

A

When behaviour is strengthened because it is followed by the removal of an aversive stimulus
e.g. Take a panadol (R) -> eliminate a headache (S^R)

7
Q

Escape learning

A

Learning of a response that allows a subject to escape an aversive stimulus (e.g. switch off an electric shock).

8
Q

Avoidance learning

A

Learning of a response that allows a subject to avoid an aversive stimulus.
e.g. learning that when a light comes on the shock is about to start, and that they must press the bar to prevent the shock.

9
Q

Operant learning

A

Any procedure or experience in which a behavior becomes stronger or weaker (e.g., more or less likely to occur), depending on its consequences. Also called instrumental learning.

10
Q

Using positive reinforcement

A

It is important for learning that an organism wants to take part in activities and learns new skills via desired behaviours, not because it is scared of a consequence or of being punished.

11
Q

Primary reinforcers

A

Unlearned
Inherently reinforcing because they satisfy a biological need (e.g. food, water)
Unconditioned reinforcers

12
Q

Secondary reinforcers

A

Conditioned reinforcers

Are learnt or become reinforcers after being associated with primary reinforcers (e.g. money)

13
Q

Natural reinforcers

A

Any reinforcer that is the spontaneous consequence of a behavior. Also called automatic reinforcer.
e.g. brush your teeth in the morning and morning breath goes away.

14
Q

Contrived reinforcer

A

Any reinforcer that is provided by someone for the purpose of changing behavior.

15
Q

Contingency

A

The extent to which the behaviour and the consequence are correlated.
The stronger the correlation, the more effective the reinforcer is likely to be.
e.g. if food is just as likely to arrive without pressing the lever, the animal won't continue pressing the lever.
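
One common way to put a number on this correlation is the difference between how likely the consequence is after the behaviour and how likely it is without the behaviour. This measure is not part of the card itself, and the function name and probabilities below are illustrative assumptions only.

```python
def contingency(p_outcome_given_response: float,
                p_outcome_given_no_response: float) -> float:
    """Difference in probability of the consequence with vs. without the behaviour.

    Values near 1 mean the consequence depends strongly on the behaviour;
    values near 0 mean the behaviour makes no difference.
    (Hypothetical helper for illustration; not from the flashcards.)
    """
    return p_outcome_given_response - p_outcome_given_no_response


# Food follows 90% of lever presses and almost never arrives otherwise:
strong = contingency(0.9, 0.05)

# Food is equally likely whether or not the lever is pressed:
none = contingency(0.5, 0.5)
```

In the second case the result is zero, which corresponds to the lever example above: pressing earns nothing extra, so pressing is not expected to persist.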

16
Q

Contiguity

A

The gap between a behaviour and its consequence.
In general, the shorter the interval the faster learning occurs.
Usually, if the delay is too long, the link between the behaviour and its consequence becomes unclear.
Some learning can occur despite a delay, however.

17
Q

Reinforcer characteristics

A

Some reinforcers work better than others.
The size and the strength of the reinforcer can impact conditioning.
Generally, a large reinforcer will be more effective than a small one. BUT frequent small reinforcers may work better.

18
Q

Behaviour characteristics

A

Certain aspects of a behaviour may be easier to learn than others.
Remember: task difficulty will vary with species, and it is easier to train/teach behaviours that are somewhat aligned to an animal's natural behaviour.

19
Q

Motivating operations

A

Anything that changes the effectiveness of a consequence - either in terms of increasing or decreasing its effectiveness.

20
Q

Establishing operations

A

Increase the effectiveness of a consequence.

The greater the deprivation the more powerful the reinforcer e.g. food.

21
Q

Abolishing operations

A

Decrease the effectiveness of a consequence.

22
Q

Drive reduction theory

A

An event is reinforcing if it is associated with a reduction of a physiological drive (primary reinforcer).
Not comprehensive enough; an unsatisfactory explanation of reinforcers.

23
Q

Premack’s principle

A

Helps us understand what can be used as a reinforcer.
High probability behaviour can be used to reinforce a low probability behaviour.
Reinforcers as behaviours and reinforcement as a sequence of two behaviours:
1. Behaviour being reinforced
2. Behaviour that is the reinforcer
e.g. a thirsty rat runs (low probability behaviour) to get the chance to drink (high probability behaviour).

24
Q

Response deprivation hypothesis

A

The theory of reinforcement that says a behavior is reinforcing to the extent that the organism has been deprived (relative to its baseline frequency) of performing that behavior.
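
The deprivation condition is often written as an inequality. The form below is a sketch of that usual statement; the symbols are assumptions introduced here, not notation from the card: $I$ is the amount of the behaviour to be reinforced that the schedule requires, $C$ is the amount of the restricted behaviour it earns, and $O_I$, $O_C$ are the baseline (free-access) levels of each.

```latex
\[
\frac{I}{C} > \frac{O_I}{O_C}
\]
% Illustrative numbers (hypothetical): baseline is 10 min homework and
% 60 min gaming per evening, so O_I / O_C = 10/60 \approx 0.17.
% A rule of "10 min homework earns 20 min gaming" gives I / C = 10/20 = 0.5.
% Since 0.5 > 0.17, gaming is held below its baseline and is predicted
% to reinforce homework.
```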

25
Q

Behavioural bliss point approach

A

An organism that has free access to alternative activities will organise its behaviour to maximise its overall (optimal) reinforcement.

26
Q

Escape behaviour (theory of avoidance)

A

Performing a behaviour stops an aversive stimulus, and as such strengthens that behaviour

27
Q

Avoidance behaviour (theory of avoidance)

A

Performing a behaviour prevents an aversive stimulus from happening, and as such strengthens that behaviour.

28
Q

Two-process theory of avoidance

A

The view that avoidance and punishment involve two processes: classical conditioning (fear response) and operant learning (negative reinforcement).

29
Q

Two-process theory of avoidance problem

A

Avoidance responses can be extremely persistent.
Possible explanation: anxiety conservation hypothesis - avoidance behaviours occur so quickly that there is insufficient exposure to the CS for extinction to take place.

30
Q

One-process theory

A

Escape and avoidance behaviours are reinforced by the reduction of the aversive stimulus.

31
Q

Acquisition

A

The initial stage of learning - learning a pattern of responding or the association between behaviour and reinforcer.
A gradual process that requires shaping.

32
Q

Shaping

A

The reinforcement of closer and closer approximations of the desired behaviour.
Important when subject does not on its own perform the desired behaviour.

33
Q

Extinction

A

The gradual weakening and elimination of the response tendency.
Achieved through halting the reinforcement. The time this takes depends on how resistant the subject is to extinction.
If the response ceases, it has been extinguished.

34
Q

Chaining

A

Training a person or animal to perform a sequence of behaviours.
Involves breaking down a behaviour or sequence into its components using task analysis. Then reinforcing the performance of each component.

35
Q

Forward chaining

A

Reinforce the first component. Once it is performed, add the second component and reinforce performance of the two together until this is completed without hesitation, then add the third, and so on.

36
Q

Backward chaining

A

Starting with the last link in the chain and building towards the first component.
This is often the more efficient and easier approach.

37
Q

Continuous reinforcement (CRF)

A

Every occurrence of an operant response is followed by a reinforcer.

38
Q

Intermittent reinforcement

A

Only some occurrences of the operant response are followed by a reinforcer.
Closely resembles how reinforcement occurs in everyday life.
Steady-state behaviours emerge once there has been considerable exposure to the schedule.

39
Q

Fixed ratio schedule (FR)

A

Reinforcement depends upon a fixed/predictable number of responses emitted since the last reinforcer.
FR4 = every 4th response is followed by reinforcement.
Post-reinforcement pause.
Low resistance to extinction

40
Q

Variable ratio schedule (VR)

A

Reinforcement depends upon a variable/unpredictable number of responses emitted since the last reinforcer.
High and steady response rates
Little or no post-reinforcement pauses.
High resistance to extinction.
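
The difference between the FR and VR rules can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names are hypothetical, and drawing the VR requirement uniformly from 1 to 2×mean−1 is just one simple way to get an unpredictable count with the intended average.

```python
import random


def fixed_ratio(n):
    """FR n: return a function to call once per response; True means reinforced."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:          # the n-th response since the last reinforcer
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR mean: the required count is redrawn after every reinforcer.

    Assumption: requirement drawn uniformly from 1..(2*mean - 1),
    which averages to `mean`.
    """
    count = 0
    required = rng.randint(1, 2 * mean - 1)

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


fr4 = fixed_ratio(4)
fr4_outcomes = [fr4() for _ in range(8)]
# → [False, False, False, True, False, False, False, True]

vr5 = variable_ratio(5, random.Random(0))
vr5_reinforcers = sum(vr5() for _ in range(1000))  # roughly 1000 / 5 on average
```

Under FR the subject can learn exactly when the next reinforcer is due, which fits the post-reinforcement pause; under VR any response might be the reinforced one.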

41
Q

Fixed interval schedule (FI)

A

A response is reinforced when a fixed/predictable period of time has elapsed since the last reinforcer.
Scallop pattern of responding
Post-reinforcement pause
Low resistance to extinction

42
Q

Variable interval schedule (VI)

A

A response is reinforced when a variable/unpredictable period of time has elapsed since the last reinforcer.
Moderate-steady rate of responding
No post-reinforcement pause
High resistance to extinction
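
Interval schedules differ from ratio schedules in that elapsed time, not response count, makes the reinforcer available; a response is still needed to collect it. The sketch below illustrates that rule. The class name, the explicit time argument, and the uniform draw for VI are all assumptions made for illustration.

```python
import random


class IntervalSchedule:
    """FI/VI sketch: the first response after the interval elapses is reinforced."""

    def __init__(self, mean_interval, variable=False, rng=None):
        self.mean = mean_interval
        self.variable = variable
        self.rng = rng or random.Random()
        # Time at which the next reinforcer becomes available (clock starts at 0).
        self.available_at = self._next_interval()

    def _next_interval(self):
        if self.variable:
            # Assumption: VI intervals drawn uniformly from 0..2*mean.
            return self.rng.uniform(0, 2 * self.mean)
        return self.mean

    def respond(self, t):
        """Register a response at time t; return True if it is reinforced."""
        if t >= self.available_at:
            self.available_at = t + self._next_interval()
            return True
        return False


fi10 = IntervalSchedule(10)                     # FI 10 (e.g. seconds)
fi_outcomes = [fi10.respond(t) for t in (5, 10, 15, 25, 30)]
# → [False, True, False, True, False]

vi10 = IntervalSchedule(10, variable=True, rng=random.Random(1))
```

Responses made before the interval has elapsed earn nothing, which is consistent with the pause and scallop seen on FI; on VI the subject cannot tell when the interval has ended, so steady responding pays off.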

43
Q

Extinction burst

A

Temporary increase in the frequency and intensity of responding when extinction is first implemented.

44
Q

Side effects of extinction

A

Extinction burst
Increase in variability
Emotional behaviour
Aggression
Resurgence
Depression

45
Q

Resurgence

A

Reappearance of a previously successful behaviour during extinction. Unusual, but similar to regression.

46
Q

Spontaneous recovery

A

Reappearance of extinguished response after rest period

Repeated effects required for learning due to presence of discriminative stimulus.

47
Q

Differential reinforcement of other behaviour (DRO)

A

Simultaneously extinguish behaviour while reinforcing alternative behaviour
The subject is not deprived of reinforcement in the setting, which reduces the likely side effects of extinction; the desired outcome is still achieved, but through alternative behaviours.

48
Q

Duration schedules

A

A behaviour must be performed continuously for a period of time (either fixed or variable).

49
Q

Time schedules

A

A reinforcer is delivered after a period of time (either fixed or variable) regardless of what behaviour occurs.

50
Q

Progressive schedules

A

The requirement for the reinforcement increases in a predetermined way following each reinforcement
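
As a small illustration, the sketch below generates the response requirements for a progressive ratio schedule. The card only says the increase is "predetermined"; the arithmetic step used here, and the function name, are assumptions chosen for simplicity.

```python
from itertools import islice


def progressive_ratio(start=1, step=2):
    """Yield the number of responses required for each successive reinforcer.

    Assumption: the requirement grows by a fixed `step` after every
    reinforcer; other predetermined progressions are equally possible.
    """
    required = start
    while True:
        yield required
        required += step


first_five = list(islice(progressive_ratio(start=1, step=2), 5))
# → [1, 3, 5, 7, 9]
```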

51
Q

Chained schedules

A

Sequence of simple schedules in a specific order

52
Q

Multiple schedules

A

Two or more simple schedules alternate, each signalled by its own distinctive stimulus.

53
Q

Mixed schedules

A

Two or more simple schedules alternate, as in a multiple schedule, but no stimulus signals which schedule is in effect.

54
Q

Cooperative schedules

A

Reinforcement is contingent on the behaviour of two or more individuals

55
Q

Concurrent schedules

A

Two or more schedules are available at once and the individual or animal must choose between them

56
Q

The discrimination hypothesis

A

Extinction takes longer after intermittent reinforcement because it is harder to discriminate between an intermittent schedule and extinction, than it is to discriminate between continuous reinforcement and an extinction procedure.

57
Q

The frustration hypothesis

A

Non-reinforcement of a previously reinforced behaviour is frustrating and as frustration is an aversive state, anything that reduces frustration will be reinforcing.
In a partial reinforcement schedule, performing the behaviour becomes a reinforcer for reducing frustration and as such continues during a phase of extinction.

58
Q

The sequential hypothesis

A

The idea that the partial reinforcement effect occurs because the sequence of reinforced and nonreinforced behaviors during intermittent reinforcement becomes a signal for responding during extinction.

59
Q

The response unit hypothesis

A

The partial reinforcement effect is due to differences in the definition of a behaviour during intermittent and continuous reinforcement.

60
Q

Matching law

A

The principle that, given the opportunity to respond on two or more reinforcement schedules, the rate of responding on each schedule will match the reinforcement available on each schedule.
e.g. if an option provides 10% of the available reinforcement, it receives about 10% of the responses.
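
For two schedules, the matching law is usually written as a proportion. The symbols below are assumptions introduced for this sketch: $B_1$, $B_2$ are the response rates on each schedule and $R_1$, $R_2$ are the reinforcement rates obtained from each.

```latex
\[
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
\]
% Illustrative numbers (hypothetical): if schedule 1 delivers 30 reinforcers
% per hour and schedule 2 delivers 10, then R_1 / (R_1 + R_2) = 30/40 = 0.75,
% so about 75% of responses are predicted to go to schedule 1.
```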

61
Q

Melioration Theory

A

Distribution of behaviour in a choice situation shifts toward those alternatives that have higher value regardless of the long-term effect on the overall amount of reinforcement