Chapters 5-9 Flashcards
What did Thorndike test for?
Do animals possess intelligence?
The study of consequence
- did Cat & Puzzle Box experiments
- the cat learned from the consequences of its actions
- THE LAW OF EFFECT:
- any action has a consequence (positive or negative)
The Law of Effect
4 Key Elements:
Any action has a consequence (can be positive or negative)
4 Key elements (cause and effect chain of events):
- have environment
- have behavior
- change in environment after the behavior
- change in the behavior after the change in environment
- DEMONSTRATES LEARNING HAS OCCURRED
_ex:_ puzzle box, push lever, door opens –> cat more likely to push lever
What did Skinner say about behavior?
- behavior is either **strengthened** or **weakened** by its consequences
- behavior operates on the environment–> behavior affects the environment
Said that there are 4 TYPES OF EXPERIENCES:
- 2 THAT STRENGTHEN BEHAVIOR
- reinforcement
- 2 THAT WEAKEN BEHAVIOR
- punishment
types of operant learning
2 that strengthen behavior: REINFORCEMENT
2 that weaken behavior: PUNISHMENT
what is reinforcement?
an increase in the strength of a behavior due to its consequences
to qualify as reinforcement:
- behavior must have a consequence
- behavior must increase in strength
- the increase must be a result of the consequence
positive reinforcement
a behavior causes the appearance or increase in intensity of a stimulus
stimulus: positive reinforcer (something animal seeks out)
“reward learning”
“add” something to increase behavior
negative reinforcement
behavior strengthened by the removal/ decrease in intensity of the stimulus
stimulus: _negative reinforcer_ (usually something we want to avoid)
“escape learning”
“escape avoidance learning”
how do we measure the strength of a behavior?
frequency
or
probability
of the behavior occurring
- or any other feature of the behavior (as long as it’s reinforced)
(ex: duration, form, intensity, latency)
what is behavioral momentum?
behavior persists even AFTER punishment and other reinforcers
it’s hard to erase the learning
(“learning keeps building up”)
kinds of reinforcers
- primary: innately effective; don’t have to learn to like them (food, water, social)
vs.
- _secondary:_ a result of learning; conditioned reinforcer (applause, money)
- have no value on their own, we give them value
- generalized reinforcers: paired w/ different reinforcers to be used in different situations
- natural: events that spontaneously follow from a behavior
- contrived: events provided by someone in order to modify behavior; do not occur naturally; manipulation
what is satiation?
(“say-she-ay-shun”)
when a reinforcer loses its ability to be effective due to changes in the environment, or the value of the reinforcer
ex: secondary reinforcers satiate more slowly because they tend to be weaker and aren’t as disruptive; provide instant gratification (money)
operant conditioning
behavior causes an effect (event contingent on behavior)
the behavior is voluntary/flexible (can be manipulated)
classical conditioning
events connected to behavior
(event 1 is contingent on event 2)
behavior is reflexive
what is contingency?
X and Y occur together or not at all
the amount of correlation between behavior and its consequence
(how reliably the reinforcer follows behavior)
_in terms of reinforcement:_ increase the likelihood of behavior happening again
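One way to put a number on "how reliably the reinforcer follows behavior" is to compare how often the consequence shows up with the behavior versus without it. The sketch below assumes that definition (a difference of two probabilities), which these notes do not state; the function name and the trial data are made up for illustration.

```python
def contingency(trials):
    """Return P(consequence | behavior) - P(consequence | no behavior).

    `trials` is a list of (behavior_occurred, consequence_occurred) booleans.
    This difference-of-probabilities measure is an assumed definition,
    not one taken from the notes.
    """
    with_behavior = [c for b, c in trials if b]
    without_behavior = [c for b, c in trials if not b]
    if not with_behavior or not without_behavior:
        raise ValueError("need trials both with and without the behavior")
    p_given_behavior = sum(with_behavior) / len(with_behavior)
    p_given_none = sum(without_behavior) / len(without_behavior)
    return p_given_behavior - p_given_none


# Hypothetical data: a lever press is always followed by food,
# and food never arrives without a press (perfect contingency).
perfect = [(True, True)] * 5 + [(False, False)] * 5
# Food arrives half the time whether or not the lever is pressed.
unrelated = [(True, True), (True, False), (False, True), (False, False)]

print(contingency(perfect))    # 1.0
print(contingency(unrelated))  # 0.0
```

A value near 1 means the consequence depends heavily on the behavior; a value near 0 means the behavior makes no difference.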
what is contiguity?
the time gap between behavior and its consequence
shorter gap = faster learning
after a delay between the behavior and the consequence, you may inadvertently reinforce other behaviors
_ex:_ press lever, then cat chases tail, then door opens–> cat thinks chasing tail will open door
characteristics of reinforcers
size: larger is better
qualitative differences (individual differences can determine effectiveness)
ex: ice cream not a good reinforcer for someone who is lactose intolerant
behavior characteristics
some behaviors are easier to learn than others
what we’re trying to teach influences how quickly & easily it’s learned
what are motivating operations?
anything that changes the effectiveness of a consequence
2 types
- establishing operations: increase effectiveness
- abolishing operations: decrease effectiveness
neuromechanics of reinforcement
Olds & Milner–1950s
ESB (electrical stimulation of the brain)
stimulated rats’ brains with electrical current
looked at reward pathway (pathway that deals w/ reinforcement; limbic system)
found that the reward pathway is dopamine rich w/ endorphins
stimulating dopamine receptors triggers a reward
THEORIES OF REINFORCEMENT:
positive
Drive reduction theory
drive: motivational states
reinforcers: events that reduce drives
- pros: works with primary reinforcers
- cons: doesn’t work as well for secondary reinforcers or ones that are hard to classify
THEORIES OF REINFORCEMENT:
positive
Relative value theory
reinforcers aren’t things–>they’re BEHAVIORS
ex: the reinforcer isn’t the food, it’s EATING the food
behaviors have different relative values (“rather be doing X or Y?”)
something with a higher relative value will reinforce better than something with a low relative value
**comparing different behaviors and how much you’d rather be doing them**
pros: no need for internal “drives”
cons: doesn’t consider secondary reinforcers; sometimes low probability behavior will still be reinforcing under normal conditions
THEORIES OF REINFORCEMENT:
positive
Response Deprivation Theory
compare behaviors to themselves, not to each other
_baseline:_ amount of time spent engaging in a behavior under normal conditions
when does a behavior become a reinforcer?
- when the behavior is held below its baseline level (access is restricted, so you get to do it less than you normally would)
- cons: still issues w/ praise
THEORIES OF REINFORCEMENT:
negative
Two-Process Theory
**2 processes occurring**
BOTH operant and Pavlovian conditioning occur
escape from an aversive stimulus to learn
when CS loses its aversiveness, avoidance persists; extinction fails to occur
THEORIES OF REINFORCEMENT
negative
One-process Theory
keep operant conditioning (because it can explain everything)
the reduction in the shock is reinforcing
stop avoidance behavior by forcing it to stop
Shaping
a type of learning
reinforce simple behaviors close to what you want/the desired behavior
used to shape behavior that won’t occur spontaneously
behaviors vary: useful behaviors get selected & rewarded
(like natural selection; select for traits)
*rat & basketball video example*
how to shape
- don’t require too much at one time
- provide immediate reinforcement/rewards (latency=bad–> can reinf. wrong behavior)
- give small rewards
- reinforce the closest approximation of the end behavior
- back up when necessary
what is chaining?
forward and backward?
teaching individual to perform a behavior chain in order
behavior chain: a series of connected actions
**forward chaining:** reinforce FIRST action, then SECOND, etc
**backward chaining:** reinforce LAST action first, then second to last, etc
**the LAST ACTION in the chain is the most important**
steps in chaining:
same process as shaping BUT reward individual after successfully completing EACH step
analyze chain–what are its parts?
break it into pieces
schedules of reinforcement
variation in reinforcement contingencies that follow a specific rule (specific pattern)
fixed vs. variable
ratio vs. interval
ratios: # of behaviors to the # of reinforcements;
ratio increase=more # of behaviors you have to perform to get reinforcement
_interval:_ how much time has gone by between when you got the 1st reward and the 2nd reward; still have to perform behavior to get reward (time elapsed between a reinforcement and the next reinforced behavior)
fixed ratio schedule (FR)
behavior reinforced when it has occurred a fixed # of times
produces post-reinforcement pauses: pauses b/t behavior after reinforcement given
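A minimal sketch of the FR rule, assuming responses are simply counted from the start of the session. The function name and the numbers are hypothetical.

```python
def fixed_ratio(n_responses, ratio):
    """Return the response numbers that earn a reinforcer on an FR schedule.

    Every `ratio`-th response is reinforced (FR 4 = every 4th response).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


print(fixed_ratio(12, 4))  # [4, 8, 12]
```

With a ratio of 1, every response is reinforced, which is continuous reinforcement.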
variable ratio schedule (VR)
reinforcement given based on an average ratio of behaviors to reward
produce steady performance: pauses are rare/short
*never know when the reward is coming*
main difference from fixed ratio schedule: the pauses (no pauses in variable ratio schedule because you don’t know when reward is coming, so you keep doing behavior to get reward)
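A rough sketch of the VR rule. It assumes each requirement is drawn uniformly from 1 up to twice the mean minus 1, so the requirements average out to the stated ratio; that sampling choice, the function name, and the seed are illustrative assumptions, not something these notes specify.

```python
import random


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Return the response numbers that earn a reinforcer on a VR schedule.

    Each requirement is drawn uniformly from 1 .. 2*mean_ratio - 1
    (an assumed sampling rule), so it averages `mean_ratio`.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)  # seeded so the example is repeatable
    reinforced = []
    count = 0
    required = rng.randint(1, 2 * mean_ratio - 1)
    for response in range(1, n_responses + 1):
        count += 1
        if count == required:
            reinforced.append(response)
            count = 0
            # The next requirement is unpredictable from the learner's side.
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


print(variable_ratio(50, 5))
```

The gaps between reinforced responses vary from one reward to the next, which is why there is no obvious point at which pausing makes sense.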
fixed interval schedule (FI)
behavior reinforced ONLY if it occurs after a particular, constant interval
after a certain time interval, the next behavior is reinforced; any behavior in the meantime doesn’t count/doesn’t get rewarded
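A small sketch of the FI rule, assuming the clock starts at time 0 and restarts at each reinforcer. The function name and the response times (in seconds) are made up.

```python
def fixed_interval(response_times, interval):
    """Return the response times that earn a reinforcer on an FI schedule.

    The first response made at least `interval` seconds after the previous
    reinforcer (or after the session start at time 0) is reinforced;
    responses before that earn nothing.
    """
    reinforced = []
    last_reward = 0.0
    for t in sorted(response_times):
        if t - last_reward >= interval:
            reinforced.append(t)
            last_reward = t
    return reinforced


print(fixed_interval([2, 5, 11, 12, 19, 25, 31], interval=10))  # [11, 25]
```

The responses at 2, 5, 12, 19, and 31 seconds fall inside an interval and go unrewarded.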
variable interval schedule (VI)
behavior is reinforced after an interval, BUT the interval varies around a particular time
after a time interval, the next behavior is reinforced
produces steady run rates (but not as steady as variable ratio schedules)
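A rough sketch of the VI rule. It assumes each waiting time is drawn uniformly between 0 and twice the mean interval; that sampling rule, the function name, and the seed are illustrative assumptions.

```python
import random


def variable_interval(response_times, mean_interval, seed=0):
    """Return the response times that earn a reinforcer on a VI schedule.

    After each reinforcer a new waiting time is drawn uniformly from
    0 .. 2*mean_interval (an assumed sampling rule); the first response
    after that wait has elapsed is reinforced.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    reinforced = []
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t - last_reward >= wait:
            reinforced.append(t)
            last_reward = t
            wait = rng.uniform(0, 2 * mean_interval)
    return reinforced


# Hypothetical session: one response per second for a minute.
print(variable_interval(range(1, 61), mean_interval=10))
```

Responding faster does not bring the reward sooner here, since only elapsed time makes a reinforcer available.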
extinction
have a behavior that has been reinforced, but STOP reinforcing it for good
“FR infinity” (fixed ratio for forever)
ex: pigeon flaps wings=reward–> pigeon flaps wings=>no more rewards
extinction bursts
at first, behavior will suddenly increase
(pigeon flaps wings more!)
because they can’t figure out why they are no longer receiving rewards
resurgence
previously reinforced behaviors reappear (e.g., from childhood)
- variability increases after extinction
ex: adult throws temper tantrum
spontaneous recovery
extinguished behavior reoccurs after time outside of the operant chamber
less common schedule types
duration (fixed or variable)
noncontingent schedules
progressive schedule (stretching ratio)
ratio strain
breaking point (behavior stops completely)
compound schedules (more than one schedule combined)
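A toy sketch of a progressive (stretching) ratio and its breaking point. It assumes the requirement grows by a fixed step after each reinforcer and that the learner gives up once a requirement exceeds some maximum effort. The `max_effort` parameter is a hypothetical simplification, and reporting the last completed ratio as the breaking point is one convention among several.

```python
def breaking_point(start_ratio, step, max_effort):
    """Return the last ratio completed before responding stops.

    The requirement starts at `start_ratio` and grows by `step` after each
    reinforcer. `max_effort` (hypothetical) is the largest number of
    responses the learner will make for a single reinforcer.
    Returns None if even the first requirement is too high.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    ratio = start_ratio
    last_completed = None
    while ratio <= max_effort:
        last_completed = ratio  # this requirement was met and reinforced
        ratio += step           # stretch the ratio for the next reinforcer
    return last_completed


print(breaking_point(start_ratio=5, step=5, max_effort=32))  # 30
```

Stretching the ratio too quickly relative to what the learner will tolerate is what the notes call ratio strain.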
partial reinforcement effect
behavior reinforced on a partial (intermittent) schedule is MORE RESISTANT to extinction than continuously reinforced behavior
thinner reinforcement effect
bigger ratio/longer interval # = thinner ratio
something less reinforcing overall
discrimination hypothesis
with a thinner schedule, it is MORE DIFFICULT to distinguish between the schedule and the extinction process
frustration hypothesis
not getting a reward is frustrating
(because reinforcers reinforce behavior AND **feeling** too)
thinner schedule = greater frustration
sequential hypothesis
not getting reinforced every time is another cue to keep going because eventually the behavior will be reinforced
same as frustration hypothesis, BUT the cue to continue the behavior for reinforcement is from the environment (not frustration)
response-unit hypothesis
combination of behavior is ONE response unit
rewards happen after every unit
BUT units are defined differently (grouped) (*box thingy*)
eliminate partial reinforcement effect
matching law
how one performs a behavior is directly related to how they get rewarded for it
thicker reinforcement schedule = rewarded more often
relative frequency of a behavior is directly related to the relative frequency of the reinforcement available for it
learn both ratio schedules and choose the richer one to be rewarded more often
only switch between the 2 to see if you’re about to get rewarded
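The matching law is usually written as a proportion: the share of responses given to option A equals the share of reinforcers earned on option A. A minimal sketch for two options follows; the function name and the reinforcer counts are hypothetical.

```python
def predicted_choice_share(reinforcers_a, reinforcers_b):
    """Predicted share of responses on option A under the matching law.

    B_A / (B_A + B_B) = r_A / (r_A + r_B), where B is the amount of
    behavior and r is the amount of reinforcement on each option.
    """
    total = reinforcers_a + reinforcers_b
    if total == 0:
        raise ValueError("at least one option must deliver reinforcement")
    return reinforcers_a / total


# Hypothetical: option A pays off 30 times per hour, option B 10 times.
print(predicted_choice_share(30, 10))  # 0.75
```

So an option that delivers three quarters of the reinforcers is predicted to get about three quarters of the responses.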
what is punishment
the decrease in a behavior because of its consequences (opposite of reinforcement)
Thorndike concluded that we don’t learn from our failures
positive punishment
behavior weakened with the appearance or increase in intensity of an aversive stimulus
ADD something to the environment
ex: yelling, arrest, ticket, water squirt
negative punishment
behavior weakened with the REMOVAL or REDUCED intensity of a pleasurable stimulus
take something away from the environment
ex: time-out
VARIABLES AFFECTING PUNISHMENT:
contingency
how dependent punishment is on behavior
more contingency = faster behavior changes
higher contingency = faster extinction/fewer changes overall
VARIABLES AFFECTING PUNISHMENT:
contiguity
interval between behavior and punishing consequence
longer interval = less effective punishment
VARIABLES AFFECTING PUNISHMENT:
punisher intensity
more intense punisher = more reduction in behavior
VARIABLES AFFECTING PUNISHMENT:
introductory level
(level of intensity)
2 options for levels of intensity:
- start with WEAK, aversive stimulus and gradually increase it until the behavior stops
- can allow people to learn to tolerate stimuli better (not good)
- start with STRONG, aversive stimulus and build up if you need to
- larger punishment may be necessary to learn the association and stop behavior much faster
- hard to determine the level that’s “just right”
VARIABLES AFFECTING PUNISHMENT:
previous reinforcement
how strongly a behavior was reinforced before it was punished
the more well-learned/reinforced a behavior is, the harder it is to punish it
VARIABLES AFFECTING PUNISHMENT:
alternative sources of reinforcement
if the behavior is the ONLY way to get reinforcement then punishment won’t work very well
if there’s another way to get reinforcement (food, water, attention), then punishment is very effective
(a way to get around previous reinforcement)
VARIABLES AFFECTING PUNISHMENT:
motivating operations
punishment is MORE effective if the REWARD is not as good
(ex: wheat vs. candy)
quality of previous reward vs. punishment
VARIABLES AFFECTING PUNISHMENT:
qualitative features (of the punishment itself)
some punishers are just BETTER than OTHERS
depends on the individual and what they like
THEORIES OF PUNISHMENT:
TWO process theory
both operant and classical conditioning influence performance
THEORIES OF PUNISHMENT:
ONE process theory
only involves operant
low probability behavior should punish high probability behavior
“the way to go in terms of punishment”
PROs of punishment
reinforcing
FAST
can permanently change behavior
CONs of punishment
has to be consistent; must punish every instance of behavior
can create extinction bursts
can be avoided; try to escape–> can lead to negative reinforcement
physical punishment
creates aggression toward the punisher (when escape is impossible)
displace aggression toward innocent others
imitation of the punisher
suppressed behavior if escape is impossible
ALTERNATIVES to PUNISHMENT:
response prevention
change the environment so that behavior cannot occur in the first place
ALTERNATIVES to PUNISHMENT:
extinction
must remove all reinforcers
ALTERNATIVES to PUNISHMENT:
differential reinforcement
3 types of differential reinforcement
- DRA (differential reinforcement of alternative behavior)
- specifically reinforce something else
- DRI (differential reinforcement of incompatible behavior)
- reinforce specific incompatible behavior (can’t do 2 behaviors at 1 time)
- DRL (differential reinforcement of low rates)
- reinforce a lower rate of behavior
- reinforce someone for doing it less than they would normally do it
INFLUENCES OF OPERANT TREATMENTS:
home environment
development of secure attachment
neglect in orphanages or in general damages attachment formation
(environment in orphanages is non-responsive, so operant procedures not in effect; lose contingency)
crying–>not being picked up = learn that “no one is there”
in an ideal environment –> needs will be met
- associative learning: cry = help
- ex: learning to speak via encouragement
INFLUENCES OF OPERANT TREATMENTS:
school environment
reinforcement works to encourage **good behavior/performance in school**
move away from punishment–> reinforce good behavior and ignore bad behavior
_DRL_ (children should learn how to earn attention)
reinforcement is the basis for internet-based learning
INFLUENCES OF OPERANT TREATMENTS:
clinic environment
self injurious behavior:
- before reinforcement, children were restrained
- use punishment as a treatment to decrease behavior
- alternatives to punishment: DRI; reinforcement can promote positive behavior
Delusions (false beliefs): can get worse with positive reinforcement; reinforce doubts in those false beliefs & provide alternative options
Transient paralysis: short term paralysis of limb
- neuroplasticity: brain recovers by rewiring itself so that other parts of brain can work affected limb
- constraint-induced movement therapy (remove learning of pain; shaping makes it work faster)
INFLUENCES OF OPERANT TREATMENTS:
work environment
improve worker performance
improve productivity with performance feedback
rewarding safety practices = fewer work accidents
INFLUENCES OF OPERANT TREATMENTS:
zoo environment
improve the veterinary care of animals with operant conditioning
helps with humane treatment of animals
allow animals to “earn” food through operant reinforcement like they would in their natural habitat; reduces boredom and reinstates wilderness conditions
shape and reinforce behavior to care for animals