Ch. 4: Reinforcement Flashcards

1
Q

Define Operant, Learning & Operant Learning

A

Operant: A class of behaviour that operates on the environment to produce a common environmental consequence.
Learning: A change in behaviour due to experience.
Operant Learning: A change in a class of behaviour as a function of the consequences that follow it.

Learning = conditioning

2
Q

Reinforcing consequences

A

Increase the frequency, duration, intensity, speed (quickness), or variability of behaviour.

3
Q

Two Ways of Reinforcing

A
  1. Add a stimulus (positive reinforcement)
  2. Remove a stimulus (negative reinforcement)
4
Q

Reward vs. Reinforcer

A

Reward does not equate to reinforcer

Ex. You give a dog a treat for rolling over, but when you later ask the dog to roll over, it doesn’t; the reward did not function as a reinforcer.

5
Q

Plateau

A

The maximum amount of behaviour that can conceivably be emitted. Behaviour can never exceed a probability of one.

6
Q

Is Reinforcement a theory?

A

No, reinforcement is not a theory; it is a functional description. It is not circular.

7
Q

What are the two types of reinforcement?

A
  1. Unconditional (Primary) Reinforcer
  2. Conditional (Secondary) Reinforcer
8
Q

Unconditional (primary) reinforcer

A

A motivating stimulus that does not need to be learned, such as food, water, warmth, oxygen, shelter, etc.
-Depends on some amount of deprivation
-Often species-specific

9
Q

Conditional (Secondary) Reinforcer

A

Stimuli, objects, or events that become reinforcing based on their association with a primary reinforcer.

Ex. A dog isn’t born responding to the cue “sit,” but when the cue is paired with primary reinforcers such as treats or social interaction, the cue itself becomes a secondary reinforcer.

10
Q

Liberman et al. (1973)

A

Institutionalized patients with schizophrenia: “rational talk” was reinforced through positive interactions with the patients, while “irrational talk” was not reinforced and was met with negative interactions instead.

11
Q

Conditional Reinforcement

A

Conditional reinforcement occurs when a stimulus becomes reinforcing because it is linked to an established reinforcer.

For example, if you give a dog a treat every time you click a button, the dog will start to like the sound of the click because it knows a treat is coming. The click becomes a conditional reinforcer.

12
Q

Contingency

A

The degree of correlation between a behaviour and its consequence.
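
One common way to quantify this correlation (not named on the card) is the ΔP statistic: the probability of the consequence given the behaviour minus its probability in the absence of the behaviour. A minimal Python sketch; the function name and example numbers are illustrative:

```python
def delta_p(p_given_behaviour, p_given_no_behaviour):
    """Degree of contingency: +1 = perfectly positive, 0 = none, -1 = perfectly negative."""
    return p_given_behaviour - p_given_no_behaviour

# A treat follows sitting 90% of the time, but appears only 10% of the time otherwise.
print(delta_p(0.9, 0.1))  # 0.8 -> strong positive contingency
```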

13
Q

Reinforcement variables: Contiguity

A

Nearness of events in time (temporal contiguity) or space (spatial contiguity).
High contiguity is often referred to as “pairing.”
Less contiguity (longer delays) between the operant response and the reinforcer diminishes the effectiveness of the reinforcer.

14
Q

Reinforcement variables: Temporal vs spatial contiguity

A

Temporal contiguity: two things happen close together in time. In learning, it refers to how closely in time a behavior and its consequence (like a reward or punishment) are linked. The closer they happen, the stronger the connection the brain makes between them. For example, if you give a dog a treat right after it sits, it’s more likely to connect sitting with getting the treat because the two events are closely timed.

Spatial contiguity: two things are close together in space. In learning, it refers to how close things are physically when you’re trying to learn something. For example, if words and pictures are shown next to each other on a page, it’s easier to learn because they’re close together. The brain connects them more easily when they’re near each other.

15
Q

Hyperbolic decay function

A

Describes how something (like a reward or value) becomes less important or less impactful as time passes, but not in a straight line. Instead, it drops quickly at first, then slows down over time.

In simple terms, it’s like saying, “The longer you wait for something, the less you care about it, but that drop in how much you care happens fast at first and then slows down later.”

For example, if you’re waiting for a reward, you might be really excited at first, but the longer you wait, the less excited you get, though your excitement doesn’t disappear completely.
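
The usual hyperbolic form is V = A / (1 + kD), where A is the undelayed value of the reinforcer, D is the delay, and k controls how steeply value drops. A minimal Python sketch; the parameter values below are made up for illustration:

```python
def hyperbolic_value(amount, delay, k=0.1):
    """Value of a delayed reinforcer: falls quickly at first, then levels off."""
    return amount / (1 + k * delay)

# Value of a 100-unit reinforcer at increasing delays (arbitrary time units)
for d in (0, 1, 5, 10, 30, 60):
    print(d, round(hyperbolic_value(100, d), 1))
# 0: 100.0, 1: 90.9, 5: 66.7, 10: 50.0, 30: 25.0, 60: 14.3 -- fast early drop, slower later
```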

16
Q

Reinforcer Characteristics: Reinforcer Magnitude

A

Generally, larger reinforcers are more reinforcing than smaller reinforcers.
-Relation between size and effectiveness is not linear.
-Generally, the more you increase magnitude, the less benefit you get from the increase.
-Effectiveness of unconditional reinforcers tends to diminish quickly.

17
Q

Reinforcer Characteristics: Specific reinforcer

A

Some specific reinforcers are more effective than others. Ex: chocolate is yummier than sunflower seeds.

18
Q

Reinforcer Characteristics: Task characteristics

A

Ex: getting a pigeon to peck for food vs. getting a hawk to peck for food (the same task suits some species better than others).

19
Q

Reinforcer Characteristics: Motivating operations

A

Establishing: increases effectiveness
Ex: deprivation

Abolishing: decreases effectiveness
Ex: satiation

20
Q

Reinforcer Characteristics: Competing contingencies

A

When there are two or more possible outcomes or consequences for a behavior, and you have to choose between them. Each outcome might have different rewards or punishments, and you weigh which one is more important to you.

CHOICE: Allocation of time among two or more activities.

Ex: Should I watch YouTube or study?

21
Q

Premack Principle

A

High-probability behaviour reinforces low-probability behaviour.

The idea that a more enjoyable activity can be used as a reward for doing a less enjoyable activity. In simple terms, it means “If you do something you don’t like first, you get to do something you really enjoy afterward.”

For example, if a child doesn’t like doing homework but loves playing video games, you can say, “First finish your homework, then you can play video games.” The fun activity (video games) motivates them to do the less fun activity (homework).

22
Q

Problems with Premack Principle

A

-Doesn’t account for conditional reinforcement effects.
-Low probability behaviour can reinforce high probability behaviour when the organism has been deprived of the low probability behaviour

23
Q

Schedule of Reinforcement

A

A rule that describes the delivery of reinforcement. Different schedules produce unique, predictable schedule effects, which occur across numerous species.

24
Q

Cumulative Record

A

A plot of cumulative responses (y-axis) over time (x-axis).

It is a visual way of showing how often a behavior happens over time. Imagine a graph where the line goes up every time the behavior is done. The steeper the line, the more the behavior is happening. If the line is flat, it means the behavior isn’t happening at all.
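
A small Python sketch of how a cumulative record is built; the response times below are invented for illustration:

```python
# Times (in seconds) at which a response occurred during a session.
response_times = [2, 3, 4, 9, 10, 11, 12, 30]

# Cumulative record: at each moment, the total number of responses made so far.
for t in range(0, 31, 5):
    total_so_far = sum(1 for rt in response_times if rt <= t)
    print(f"t={t:2d}s  cumulative responses={total_so_far}")
# Steep stretches (around t=10) mean rapid responding; flat stretches mean no responding.
```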

25
Q

Frequency vs. Cumulative Frequency

A

-Frequency= How many times something happens in a single period.
-Cumulative frequency= The total number, adding up each period’s frequency.

Frequency is the number of times something happens in a specific time or group. For example, if you count how many times you eat pizza in a week, that number is the frequency.

Cumulative frequency is the running total of how often something happens, adding up each time. For example, if you track how many times you eat pizza each week for a month, and add up each week’s total as you go, that’s cumulative frequency.
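
The pizza example as a quick calculation (the weekly counts are made up):

```python
from itertools import accumulate

weekly_pizza = [2, 0, 3, 1]                     # frequency: pizzas eaten each week
running_total = list(accumulate(weekly_pizza))  # cumulative frequency: running total

print(running_total)  # [2, 2, 5, 6] -- it never decreases, it only rises or stays flat
```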

26
Q

Schedules of Reinforcement: Continuous Reinforcement (CRF) Schedule

A

-Behaviour is reinforced each time it occurs
-Rate of behaviour increases rapidly (good for new behaviours)
-Rare in the natural environment

Ex. A child is praised every time they clean

27
Q

Intermittent Reinforcement Schedule

A

When a reward or reinforcement is given only sometimes after a behavior, not every time.

Ex: gambling

4 main types:
-Fixed-ratio (FR)
-Variable-ratio (VR)
-Fixed-Interval (FI)
-Variable-Interval (VI)

28
Q

Intermittent reinforcement: Fixed-Ratio Schedule

A

Behaviour is reinforced after a fixed number of responses.
Ex: FR-120… the pigeon has to peck 120 times for each reinforcement.

Generates a post-reinforcement pause (PRP), which typically increases with ratio size and reinforcer magnitude.

Generates steady run rates following the PRP.

29
Q

Intermittent Reinforcement: Variable-Ratio Schedule

A

-Ratio requirement varies around an average.
Ex: VR-360: the number of responses required varies from one reinforcer to the next (e.g., 739 on one occasion), but the mean is 360.
-Works with shuffled ordering.
Ex. The ratio requirements are presented in shuffled (random) order, so the pigeon may have to peck 30 times for the first reinforcer, then 720, then 180…
-Post-reinforcement pauses are very rare and short.
-Produces higher response rates.
-Common in natural environments.

Has two common variations: random ratio and progressive ratio
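
A rough Python simulation of a VR schedule: ratio requirements are drawn so they average out to the schedule value and are delivered in shuffled order. The function name and numbers are illustrative, not from the lecture:

```python
import random

def variable_ratio_session(mean_ratio=30, reinforcers=5):
    """Each reinforcer requires a different number of responses,
    but the requirements average out to roughly mean_ratio."""
    requirements = [random.randint(1, 2 * mean_ratio - 1) for _ in range(reinforcers)]
    for i, requirement in enumerate(requirements, start=1):
        # The subject keeps responding; the reinforcer arrives on the final response.
        print(f"Reinforcer {i}: delivered after {requirement} responses")
    print("Mean ratio this session:", sum(requirements) / len(requirements))

variable_ratio_session()
```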

30
Q

Intermittent Reinforcement: Variable-Ratio Schedule (Random-Ratio)

A

-The schedule is controlled by a random number generator
-Produces similarly high rates of responding
-Type of ratio used in casino games & video games

31
Q

Intermittent Reinforcement: Variable-Ratio Schedule (Progressive ratio)

A

-Ratio requirements move from small to large.
-Ex: 1, 2, 3, 4, 5, 6, 7, 8, 9…
-Post-reinforcement pauses (PRPs) increase with ratio size.
-Creates a “break-point”: a measure of how hard the organism will work.

32
Q

Intermittent Reinforcement: Fixed-Interval Schedule

A

-Behaviour is reinforced when it occurs after a given period of time has elapsed.
Ex: FI-4 minutes… the reinforcer is delivered for the first response after 4 minutes.
-Produces a post-reinforcement pause.
-Responding increases gradually, producing a “scallop” shape on the cumulative record.
-Uncommon in the natural environment.

33
Q

Intermittent Reinforcement: Variable-Interval Schedule

A

-Interval varies around an average.
Ex: VI 3-min: intervals average 3 minutes (180 s) but vary from one reinforcer to the next.
-PRPs are rare and short.
-Steady rates of responding, though not as high as on a VR schedule.
-Common in natural environments.

34
Q

Fixed/Variable Duration

A

-Reinforcer is contingent on continuous performance for a period of time.
Ex: practicing guitar for 30 min
-Many people use these schedules but provide no reinforcer.