Task 6 - Instrumental Conditioning Flashcards

1
Q

Operant conditioning

A

process whereby organisms learn to make or refrain from making certain responses in order to obtain or avoid certain outcomes
example: Thorndike's puzzle box

2
Q

Reinforcement

A

the process of providing an outcome for a behaviour that increases the probability of that behaviour

3
Q

when deciding whether a paradigm is operant or classical

A
  • -> focus on the outcome
  • when the outcome happens regardless of what the organism does –> classical
  • when the outcome only happens if the organism makes a particular response –> operant
4
Q

Free-operant paradigm

A

paradigm in which the animal can operate the apparatus freely, whenever it chooses (e.g., when Skinner added a return ramp to his maze, so the rat could run back to the start by itself)

5
Q

Discrete trials paradigm

A

paradigm in which the trials are controlled by the experimenter, who decides when each trial begins and ends

6
Q

Skinner box

A

a cage devised by B. F. Skinner – with a trough in one wall through which food could be delivered automatically

7
Q

Cumulative recorder

A

A device that draws a learning curve: a pen moves across a roll of paper at a steady rate, increasing its vertical height by a fixed amount for every response of an organism, such as a lever press by a rat in a Skinner box or a peck by a pigeon at an illuminated plastic key – e.g., like the odometer in a car, which only ever counts upwards
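
The pen-and-paper description above can be sketched in a few lines of Python. This is only an illustration: the function name `cumulative_record`, the `step` parameter and the example response times are assumptions made up for this sketch, not part of the original notes.

```python
def cumulative_record(response_times, duration, step=1.0):
    """Return (time, height) points of a cumulative record.

    response_times: times (in seconds) at which a response occurred.
    duration: total session length in seconds.
    step: how often the curve is sampled (hypothetical parameter).
    """
    times = sorted(response_times)
    points = []
    height = 0  # pen height = total number of responses so far
    i = 0
    for k in range(int(duration // step) + 1):
        t = k * step
        # Each response that has occurred by time t raises the pen by one unit.
        while i < len(times) and times[i] <= t:
            height += 1
            i += 1
        points.append((t, height))
    return points


# Hypothetical session: four lever presses in five seconds.
record = cumulative_record([0.5, 1.2, 1.8, 4.0], duration=5)
heights = [h for _, h in record]
```

A steep stretch of the curve means fast responding, and a flat stretch means the animal paused; the curve can never go down.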

8
Q

Discriminative Stimuli

A

stimuli that signal whether a particular response will lead to a particular outcome
–> they help the learner discriminate or distinguish the conditions where a response will be followed by a particular outcome

10
Q

Shaping (??)

A

technique in which successive approximations to the desired response are reinforced

11
Q

Chaining (–> backward chaining)

A

technique in which organisms are gradually trained to execute sequences of discrete responses

  • related technique to shaping
  • -> sometimes it is more effective to train the steps in the reverse order (backward chaining)
13
Q

Reinforcer

A

a consequence of behavior that leads to an increased likelihood of that behavior in the future

14
Q

Primary reinforcers

A

they are of biological value to the organism, and therefore organisms will tend to repeat behaviors that provide access to these things
- examples: food, water, sleep, a comfortable temperature, and sex

15
Q

Drive reduction theory (Clark Hull)

A

proposed that all learning reflects the innate, biological need to obtain primary reinforcers
–> complication: primary reinforcers are not always reinforcing

16
Q

secondary reinforcers

A

reinforcers that initially have no biological value, but that have been paired with (or predict the arrival of) primary reinforcers (can be as strongly reinforcing as primary reinforcers)
– Example: money

17
Q

Token economies

A

often used in prisons, psychiatric hospitals, and other institutions where the staff has to motivate inmates or patients to behave well and to perform chores such as making beds or taking medications

  • tokens function in the same way as money does in the outside world
  • Animals will also work for secondary reinforcers
18
Q

negative contrast:

A

organisms given a less-preferred reinforcer in place of an expected and preferred reinforcer will respond less strongly for the less-preferred reinforcer than if they had been given that less-preferred reinforcer all along
– e.g., the monkey that throws away the cucumber, its less-preferred food, once it has seen the grapes

19
Q

Punishment

A

the process of providing outcomes for behaviour that decrease the probability of that behaviour – the response decreases

20
Q

Punishers or negative outcomes

A

common punishers for animals include pain, confinement, and exposure to predators (or even the scent of predators)

21
Q

Four most important factors that determine how effective the punishment will be

A
  1. Punishment leads to more variable behaviour.
  2. Discriminative stimuli for punishment can encourage cheating
  3. Concurrent reinforcement can undermine the punishment
  4. Initial intensity matters
22
Q

Differential reinforcement of alternative behaviors (DRA)

A

a method in which, rather than delivering punishment each time the unwanted behaviour is exhibited, preferred alternative behaviours are rewarded

23
Q

Reinforcement schedules

A

the rules determining when outcomes are delivered in an experiment

24
Q

Timing affects learning

A

Normally, immediate outcomes produce the fastest learning

Delays undermine the punishment's effectiveness, and may weaken learning

25
Q

Response consequence delay

A

the longer one waits to punish something/someone, the weaker the association that is made between the punishment and the …

26
Q

Self-control

A

an organism’s willingness to forego a small immediate reward in favor of a larger future reward

27
Q

Positive (reinforcement)

A

positive does not mean good → instead it means added

28
Q

Positive reinforcement

A

the desired response causes the reinforcer to be added to the environment

29
Q

Positive punishment

A

an undesired response causes a punisher to be added to the environment

30
Q

Negative reinforcement

A

behaviour is encouraged (reinforced) because it causes something to be subtracted from the environment – over time the response becomes more frequent
– sometimes called avoidance training

31
Q

Negative punishment

A

something is subtracted (negative) from the environment, and this subtraction punishes the behavior
– sometimes called omission training

32
Q

Negative (reinforcement)

A

negative does not mean bad, it means subtraction in a mathematical sense

33
Q

Reinforcement / Punishment
Positive / Negative
(Definition)

A

The terms reinforcement and punishment describe whether the response increases (reinforcement) or decreases (punishment) as a result of training. The terms positive and negative describe whether the outcome is added (positive) or taken away (negative).
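
The two distinctions combine into four cases, which can be written as a small lookup. This is a minimal sketch; the function name `classify` and its string arguments are made up for this illustration.

```python
def classify(outcome_change, response_change):
    """Name the operant paradigm from its two defining features.

    outcome_change: "added" or "removed" (what the response does to the outcome).
    response_change: "increases" or "decreases" (what training does to the response).
    """
    sign = {"added": "positive", "removed": "negative"}[outcome_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[response_change]
    return f"{sign} {kind}"


# The four combinations covered by the cards above.
table = {
    (o, r): classify(o, r)
    for o in ("added", "removed")
    for r in ("increases", "decreases")
}
```

For example, `classify("removed", "increases")` names the case in which a response becomes more frequent because it takes something away.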

34
Q

Partial reinforcement schedules

A

patterns in which an outcome follows a response less than 100 percent of the time
– Example: Becky has to clean her room seven days in a row to obtain her weekly allowance (seven responses for one reinforcement)

35
Q

Four types of partial reinforcement:

A
  1. Fixed-ratio (FR) schedule
  2. Fixed-interval (FI) schedule
  3. Variable-ratio (VR) schedule
  4. Variable-interval (VI) schedule
36
Q
Fixed-ratio (FR) schedule
A

In operant conditioning, a reinforcement schedule in which a specific number of responses is required before a reinforcer is delivered; for example, FR 5 means that reinforcement arrives after every fifth response

37
Q

Postreinforcement pause

A

In operant conditioning with a fixed-ratio (FR) schedule of reinforcement, a brief pause following a period of fast responding leading to reinforcement.
It just happens – the animal takes a break – the more responses the organism has to make for each reinforcement, the longer the pause will be

38
Q
Fixed-interval (FI) schedule
A

an FI schedule reinforces the first response after a fixed amount of time

39
Q
Variable-ratio (VR) schedule
A

a VR schedule provides reinforcement after a certain average number of responses
–> as a result, there is a steady, high rate of responding even immediately after a reinforcement is delivered, because the very next response just might result in another reinforcement

40
Q
Variable-interval (VI) schedule
A

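a VI schedule reinforces the first response after an interval that averages a particular length of time – VI schedules tend to produce higher and steadier rates of responding than FI schedules, because the next reinforcer could become available at any moment
Note: ratio schedules generally produce higher response rates than interval schedules, because responding faster brings reinforcement sooner only on a ratio schedule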
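
The four partial reinforcement schedules (FR, FI, VR, VI) differ only in the rule that decides whether a given response is reinforced. The sketch below makes those rules explicit. The class names, the `respond(t)` method and the uniform distributions used for the variable schedules are assumptions chosen for this illustration; real experiments may draw the variable requirements differently.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: reinforced after a varying number of responses that averages n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2n-1, which has mean n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI x: the first response after x seconds have passed is reinforced."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """VI x: like FI, but the interval varies around an average of x seconds."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2x seconds, which has mean x.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


# Hypothetical sessions with one response per second.
fr5 = FixedRatio(5)
fr_outcomes = [fr5.respond(t) for t in range(1, 11)]
fi10 = FixedInterval(10)
fi_outcomes = [fi10.respond(t) for t in range(1, 31)]
```

On the ratio schedules the time argument is ignored, so responding faster earns reinforcers sooner; on the interval schedules extra responses before the interval has elapsed earn nothing.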

41
Q

Concurrent reinforcement schedules

A

schedules in which the organism can make any of several possible responses, each leading to a different outcome
– Linked to behavioural economics –> how organisms use their time and resources

42
Q

Matching law of choice behaviour

A

the principle that an organism, given a choice between multiple responses, will make a particular response at a rate proportional to how often that response is reinforced relative to the other choices
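
For two responses, the matching law is usually written as B1 / (B1 + B2) = R1 / (R1 + R2), where B is the rate of each response and R is the rate at which it is reinforced. A minimal sketch of that proportion follows; the function name and the pigeon-key numbers are hypothetical, chosen only to show the arithmetic.

```python
def predicted_response_shares(reinforcement_rates):
    """Share of responding the matching law predicts for each option.

    reinforcement_rates: dict mapping an option name to how often
    responses on that option are reinforced (e.g., reinforcers per hour).
    """
    total = sum(reinforcement_rates.values())
    if total == 0:
        raise ValueError("at least one option must be reinforced")
    return {option: rate / total for option, rate in reinforcement_rates.items()}


# Hypothetical example: key A pays off twice as often as key B.
shares = predicted_response_shares({"key_A": 40, "key_B": 20})
```

With these numbers the law predicts about two thirds of the responses on key A and one third on key B.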

43
Q

Behavioural economics

A

the study of how organisms allocate their time and resources among possible options
– economic theory predicts that each consumer will allocate resources in a way that maximizes her “subjective value,” or relative satisfaction. (In microeconomics, the word utility is used instead of subjective value.) The value is subjective because it differs from person to person
– Example: a pigeon could get either one reinforcer after 1 minute or two pellets after 2 minutes

44
Q

Bliss point

A

the particular allocation of resources that provides maximal subjective value to an individual
- Changes depending on context

45
Q

Premack principle

A

The theory that the opportunity to perform a highly frequent behavior can reinforce a less frequent behavior; later refined as the response deprivation hypothesis.

    • Example: if you have been studying for several hours straight, the idea of “taking a break” to clean your room or do the laundry can begin to look downright attractive
    • Example: rats will work for the opportunity to run on their wheel
46
Q

Response deprivation hypothesis

A

a refinement of the Premack principle stating that the opportunity to perform any behaviour can be reinforcing if access to that behaviour is restricted → want something because you can’t have it

47
Q

Basal ganglia

A

a collection of ganglia (clusters of neurons) at the base of the forebrain; information from the sensory cortex to the motor cortex can also travel via this indirect route
One part of the basal ganglia is the dorsal striatum – which can be further subdivided into the caudate nucleus and the putamen

48
Q

dorsal striatum

A

receives highly processed stimulus information from sensory cortical areas and projects to the motor cortex, which produces a behavioral response
– Plays a critical role in operant conditioning, particularly if discriminative stimuli are involved
Rats with lesions of the dorsal striatum can still learn operant responses (e.g., when placed in a Skinner box, lever-press R to obtain food O). But if discriminative stimuli are added (e.g., lever-press R is reinforced only in the presence of a light SD), the lesioned rats are markedly impaired – similar to people whose striatum is disrupted by Parkinson’s disease or Huntington’s disease
→ the dorsal striatum appears necessary for learning SD → R associations based on feedback about reinforcement and punishment

49
Q

Orbitofrontal cortex

A

appears to contribute to goal-directed behavior by representing predicted outcomes

    • receives inputs conveying the full range of sensory modalities (sight, touch, sound, etc.) and also visceral sensations (including hunger and thirst), allowing this brain area to integrate many types of information;
    • outputs from the orbitofrontal cortex travel to the striatum, where they can help determine which motor responses are executed
50
Q

?? is this also right ??

A

Information first projects from the sensory cortex (stimulus) → to the orbitofrontal cortex (prediction) → then to the basal ganglia (SD → R association) → then to the striatum (motor learning) → then to the motor cortex (reaction)

51
Q

wanting and liking in the brain

A

later studies identified that rats would work for electrical stimulation in several brain areas, including the ventral tegmental area (VTA)

52
Q

Ventral tegmental area (VTA)

A

a small region in the midbrain of rats, humans, and other mammals – its neurons produce dopamine (linked to wanting something) – stimulating the VTA can have the same effect as a reinforcer

53
Q

“pleasure centers”

A

some researchers inferred that the rats “liked” the stimulation, and the VTA and other areas of the brain where electrical stimulation was effective became informally known as “pleasure centers.”

54
Q

Anhedonia hypothesis

A

the hypothesis that wanting and liking are the same thing and that dopamine underlies both – contradicted by the incentive salience hypothesis

55
Q

Hedonic value

A

the subjective “goodness” of a reinforcer, or how much we like it

56
Q

Motivational value

A

how much we “want” a reinforcer and how hard we are willing to work to obtain it

57
Q

Incentive salience hypothesis

A

The hypothesis that dopamine helps provide organisms with the motivation to work for reinforcement – the role of dopamine in operant conditioning is to signal how much the animal “wants” a particular outcome, i.e., how motivated it is to work for it

58
Q

Endogenous opioids

A

brain chemicals that are naturally occurring neurotransmitter-like substances (peptides) with many of the same effects as opiate drugs

59
Q

how do “wanting” and “liking” interact

A

Possible way that the two brain systems (of liking and wanting) interact: differences in the amount of endogenous opioid released, and in the specific opiate receptors they activate, may help determine an organism’s preference for one reinforcer over another

60
Q

Pathological addiction

A

a strong habit that is maintained despite harmful consequences

Addiction may involve not only seeking the “high” but also avoiding the adverse effects of withdrawal from the drug. In a sense, the high provides positive reinforcement and the avoidance of withdrawal symptoms provides negative reinforcement, and both processes reinforce the drug-taking responses

61
Q

Behavioural addictions

A

addictions to behaviours, rather than drugs, that produce reinforcements or highs, as well as cravings and withdrawal symptoms when the behaviour is prevented
– Perhaps the most widely agreed-upon example of a behavioral addiction is compulsive gambling

62
Q

Detoxification

A

taking a different drug instead – like drinking alcohol-free beer

63
Q

Extinction

A

if response R stops producing outcome O, the frequency of R should decline

64
Q

Distancing

A

avoiding the stimuli that trigger the unwanted response

65
Q

Differential reinforcement of alternate behaviours (DRA)

A

reinforce yourself (for example, with a spa day) if you didn’t use the drug, or punish yourself if you did use the drug

66
Q

Delayed reinforcement

A

whenever the smoker gets the urge to light up, she can impose a fixed delay (e.g., an hour) before giving in to it

67
Q

most effective treatments

A

combine cognitive therapy (including counseling and support groups) with behavioral therapy based on conditioning principles—and medication for the most extreme cases

68
Q

Protestant ethic effect (NOT sure if this is right)!!!

A

delusional – would rather work for their food than get it freely – you think that you do something for an effect – vs. habit slip

70
Q

Reward prediction hypothesis (ask about this again)

A

concerns the firing of dopamine neurons

– the phasic activity of dopaminergic neurons in the midbrain signals a discrepancy between the predicted and the actually experienced reward of a particular event
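
The "discrepancy" can be written as a prediction error: experienced reward minus predicted reward. The sketch below is only an illustration under stated assumptions. The simple update rule (nudging the prediction towards the reward by a fixed learning rate), the function name `run_trials` and the trial sequence are not taken from the notes.

```python
def run_trials(rewards, learning_rate=0.2):
    """Track the prediction error over a series of trials.

    rewards: the reward actually experienced on each trial.
    learning_rate: assumed step size for updating the prediction.
    """
    prediction = 0.0
    errors = []
    for reward in rewards:
        # Prediction error: experienced reward minus predicted reward.
        error = reward - prediction
        errors.append(error)
        # Assumed update: move the prediction a fraction of the way to the reward.
        prediction += learning_rate * error
    return errors, prediction


# Hypothetical sequence: 20 rewarded trials, then the reward is omitted once.
errors, prediction = run_trials([1.0] * 20 + [0.0])
```

In this sketch the error is large when the reward is unexpected, shrinks as the reward becomes predicted, and turns negative when an expected reward fails to arrive.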