Ch. 4: Reinforcement Flashcards

1
Q

Define Operant, Learning & Operant Learning

A

Operant: A class of behaviour that operates on the environment to produce a common environmental consequence.
Learning: A change in behaviour due to experience.
Operant Learning: A change in a class of behaviour as a function of the consequences that follow it.

Learning = conditioning

2
Q

Reinforcing consequences

A

Increase the frequency, duration, intensity, speed (quickness), or variability of behaviour.

3
Q

Two Ways of Reinforcing

A
  1. Add a stimulus (positive reinforcement)
  2. Remove a stimulus (negative reinforcement)
4
Q

Reward vs. Reinforcer

A

Reward does not equate to reinforcer

Ex. You give a dog a treat for rolling over, but when you later ask the dog to roll over, it doesn’t; the reward did not function as a reinforcer.

5
Q

Plateau

A

The maximum amount of behaviour that can conceivably be emitted. Behaviour can never exceed a probability of one.

6
Q

Is Reinforcement a theory?

A

No, reinforcement is not a theory; it is a functional description. It is not circular.

7
Q

What are the two types of reinforcement?

A
  1. Unconditional (Primary) Reinforcer
  2. Conditional (Secondary) Reinforcer
8
Q

Unconditional (primary) reinforcer

A

A motivating stimulus that does not need to be learned, such as food, water, warmth, oxygen, shelter, etc.
-Depends on some amount of deprivation
-Often species-specific

9
Q

Conditional (Secondary) Reinforcer

A

Stimuli, objects, or events that become reinforcing based on their association with a primary reinforcer.

Ex. A dog isn’t born responding to the cue “sit,” but when the cue is paired with primary reinforcers such as treats or social interaction, the cue itself becomes a secondary reinforcer.

10
Q

Liberman et al. (1973)

A

Institutionalized patients with schizophrenia: “rational talk” was reinforced through positive interactions with the patients, while “irrational talk” was not reinforced and was met with negative interactions instead.

11
Q

Conditional Reinforcement

A

Conditional reinforcement occurs when a stimulus becomes reinforcing because it is linked to an established reinforcer.

For example, if you give a dog a treat every time you click a button, the dog will start to like the sound of the click because it knows a treat is coming. The click becomes a conditional reinforcer.

12
Q

Contingency

A

The degree of correlation between a behaviour and its consequence.
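
One common way to quantify this correlation (not named on the card) is the ΔP statistic: the probability of the consequence given the behaviour minus its probability in the absence of the behaviour. A minimal Python sketch; the function name and example numbers are illustrative:

```python
def delta_p(p_given_behaviour, p_given_no_behaviour):
    """Degree of contingency: +1 = perfectly positive, 0 = none, -1 = perfectly negative."""
    return p_given_behaviour - p_given_no_behaviour

# A treat follows sitting 90% of the time, but appears only 10% of the time otherwise.
print(delta_p(0.9, 0.1))  # 0.8 -> strong positive contingency
```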

13
Q

Reinforcement variables: Contiguity

A

Nearness of events in time (temporal contiguity) or space (spatial contiguity).
High contiguity is often referred to as “pairing.”
Less contiguity (longer delays) between the operant response and the reinforcer diminishes the effectiveness of the reinforcer.

14
Q

Reinforcement variables: Temporal vs spatial contiguity

A

Temporal contiguity: two things happen close together in time. In learning, it refers to how closely in time a behavior and its consequence (like a reward or punishment) are linked. The closer they happen, the stronger the connection the brain makes between them. For example, if you give a dog a treat right after it sits, it’s more likely to connect sitting with getting the treat because the two events are closely timed.

Spatial contiguity: two things are close together in space. In learning, it refers to how close things are physically when you’re trying to learn something. For example, if words and pictures are shown next to each other on a page, it’s easier to learn because they’re close together. The brain connects them more easily when they’re near each other.

15
Q

Hyperbolic decay function

A

Describes how something (like a reward or value) becomes less important or less impactful as time passes, but not in a straight line. Instead, it drops quickly at first, then slows down over time.

In simple terms, it’s like saying, “The longer you wait for something, the less you care about it, but that drop in how much you care happens fast at first and then slows down later.”

For example, if you’re waiting for a reward, you might be really excited at first, but the longer you wait, the less excited you get, though your excitement doesn’t disappear completely.
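
The usual hyperbolic form is V = A / (1 + kD), where A is the undelayed value of the reinforcer, D is the delay, and k controls how steeply value drops. A minimal Python sketch; the parameter values below are made up for illustration:

```python
def hyperbolic_value(amount, delay, k=0.1):
    """Value of a delayed reinforcer: falls quickly at first, then levels off."""
    return amount / (1 + k * delay)

# Value of a 100-unit reinforcer at increasing delays (arbitrary time units)
for d in (0, 1, 5, 10, 30, 60):
    print(d, round(hyperbolic_value(100, d), 1))
# 0: 100.0, 1: 90.9, 5: 66.7, 10: 50.0, 30: 25.0, 60: 14.3 -- fast early drop, slower later
```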

16
Q

Reinforcer Characteristics: Reinforcer Magnitude

A

Generally, larger reinforcers are more reinforcing than smaller reinforcers.
-Relation between size and effectiveness is not linear.
-Generally, the more you increase magnitude, the less benefit you get from the increase.
-Effectiveness of unconditional reinforcers tends to diminish quickly.

17
Q

Reinforcer Characteristics: Specific reinforcer

A

Some specific reinforcers are more effective than others. Ex: chocolate is yummier than sunflower seeds.

18
Q

Reinforcer Characteristics: Task characteristics

A

Ex: getting a pigeon to peck for food vs. getting a hawk to peck for food (the same task suits some species better than others).

19
Q

Reinforcer Characteristics: Motivating operations

A

Establishing: increases effectiveness
Ex: deprivation

Abolishing: decreases effectiveness
Ex: satiation

20
Q

Reinforcer Characteristics: Competing contingencies

A

When there are two or more possible outcomes or consequences for a behavior, and you have to choose between them. Each outcome might have different rewards or punishments, and you weigh which one is more important to you.

CHOICE: Allocation of time among two or more activities.

Ex: Should I watch YouTube or study?

21
Q

Premack Principle

A

High-probability behaviour reinforces low-probability behaviour.

The idea that a more enjoyable activity can be used as a reward for doing a less enjoyable activity. In simple terms, it means “If you do something you don’t like first, you get to do something you really enjoy afterward.”

For example, if a child doesn’t like doing homework but loves playing video games, you can say, “First finish your homework, then you can play video games.” The fun activity (video games) motivates them to do the less fun activity (homework).

22
Q

Problems with Premack Principle

A

-Doesn’t account for conditional reinforcement effects.
-Low probability behaviour can reinforce high probability behaviour when the organism has been deprived of the low probability behaviour

23
Q

Schedule of Reinforcement

A

A rule that describes the delivery of reinforcement. Different schedules produce unique, predictable schedule effects, which occur across numerous species.

24
Q

Cumulative Record

A

A plot of cumulative responses (y-axis) over time (x-axis).

It is a visual way of showing how often a behavior happens over time. Imagine a graph where the line goes up every time the behavior is done. The steeper the line, the more the behavior is happening. If the line is flat, it means the behavior isn’t happening at all.
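
A small Python sketch of how a cumulative record is built; the response times below are invented for illustration:

```python
# Times (in seconds) at which a response occurred during a session.
response_times = [2, 3, 4, 9, 10, 11, 12, 30]

# Cumulative record: at each moment, the total number of responses made so far.
for t in range(0, 31, 5):
    total_so_far = sum(1 for rt in response_times if rt <= t)
    print(f"t={t:2d}s  cumulative responses={total_so_far}")
# Steep stretches (around t=10) mean rapid responding; flat stretches mean no responding.
```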

25
Q

Frequency vs. Cumulative Frequency

A

-Frequency= How many times something happens in a single period.
-Cumulative frequency= The total number, adding up each period’s frequency.

Frequency is the number of times something happens in a specific time or group. For example, if you count how many times you eat pizza in a week, that number is the frequency.

Cumulative frequency is the running total of how often something happens, adding up each time. For example, if you track how many times you eat pizza each week for a month, and add up each week’s total as you go, that’s cumulative frequency.
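
The pizza example as a quick calculation (the weekly counts are made up):

```python
from itertools import accumulate

weekly_pizza = [2, 0, 3, 1]                     # frequency: pizzas eaten each week
running_total = list(accumulate(weekly_pizza))  # cumulative frequency: running total

print(running_total)  # [2, 2, 5, 6] -- it never decreases, it only rises or stays flat
```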

26
Q

Schedules of Reinforcement: Continuous Reinforcement (CRF) Schedule

A

-Behaviour is reinforced each time it occurs
-Rate of behaviour increases rapidly (good for new behaviours)
-Rare in the natural environment

Ex. A child is praised every time they clean

27
Q

Intermittent Reinforcement Schedule

A

When a reward or reinforcement is given only sometimes after a behavior, not every time.

Ex: gambling

4 main types:
-Fixed-ratio (FR)
-Variable-ratio (VR)
-Fixed-Interval (FI)
-Variable-Interval (VI)

28
Q

Intermittent reinforcement: Fixed-Ratio Schedule

A

Behaviour is reinforced after a fixed number of responses.
Ex: FR-120… the pigeon has to peck 120 times for each reinforcement.

Generates a post-reinforcement pause (PRP), which typically increases with ratio size and reinforcer magnitude.

Generates steady run rates following the PRP.

29
Q

Intermittent Reinforcement: Variable-Ratio Schedule

A

-Ratio requirement varies around an average.
Ex: VR-360: the number of responses required varies from one reinforcer to the next (e.g., 739 on one occasion), but the mean is 360.
-Works with shuffled ordering.
Ex. The ratio requirements are presented in shuffled (random) order, so the pigeon may have to peck 30 times for the first reinforcer, then 720, then 180…
-Post-reinforcement pauses are very rare and short.
-Produces higher response rates.
-Common in natural environments.

Has two common variations: random ratio and progressive ratio
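
A rough Python simulation of a VR schedule: ratio requirements are drawn so they average out to the schedule value and are delivered in shuffled order. The function name and numbers are illustrative, not from the lecture:

```python
import random

def variable_ratio_session(mean_ratio=30, reinforcers=5):
    """Each reinforcer requires a different number of responses,
    but the requirements average out to roughly mean_ratio."""
    requirements = [random.randint(1, 2 * mean_ratio - 1) for _ in range(reinforcers)]
    for i, requirement in enumerate(requirements, start=1):
        # The subject keeps responding; the reinforcer arrives on the final response.
        print(f"Reinforcer {i}: delivered after {requirement} responses")
    print("Mean ratio this session:", sum(requirements) / len(requirements))

variable_ratio_session()
```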

30
Q

Intermittent Reinforcement: Variable-Ratio Schedule (Random-Ratio)

A

-The schedule is controlled by a random number generator
-Produces similarly high rates of responding
-Type of ratio used in casino games & video games

31
Q

Intermittent Reinforcement: Variable-Ratio Schedule (Progressive ratio)

A

-Ratio requirements move from small to large.
-Ex: 1, 2, 3, 4, 5, 6, 7, 8, 9…
-Post-reinforcement pauses (PRPs) increase with ratio size.
-Creates a “break-point”: a measure of how hard the organism will work.

32
Q

Intermittent Reinforcement: Fixed-Interval Schedule

A

-Behaviour is reinforced when it occurs after a given period of time has elapsed.
Ex: FI-4 minutes… the reinforcer is delivered for the first response after 4 minutes.
-Produces a post-reinforcement pause.
-Responding increases gradually, producing a “scallop” shape on the cumulative record.
-Uncommon in the natural environment.

33
Q

Intermittent Reinforcement: Variable-Interval Schedule

A

-Interval varies around an average.
Ex: VI 3-min: intervals average 3 minutes (180 s) but vary from one reinforcer to the next.
-PRPs are rare and short.
-Steady rates of responding, though not as high as on a VR schedule.
-Common in natural environments.

34
Q

Fixed/Variable Duration

A

-Reinforcer is contingent on continuous performance for a period of time.
Ex: practicing guitar for 30 min
-Many people use these schedules but provide no reinforcer.