Learning Flashcards
Skinner & Thorndike
- operant conditioning
- behaviours are initially emitted randomly
- eventually, behaviours that are followed by pleasurable consequences will occur more, and behaviours followed by unpleasant consequences will occur less
Pavlov & Watson
- classical conditioning
- we learn new responses when things are connected or paired
unconditioned response
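unlearned and automatic; occurs naturally in response to an unconditioned stimulus without prior conditioning (example - salivation to meat powder)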
conditioned response
a learned response
backward conditioning (classical conditioning)
the unconditioned stimulus precedes the neutral stimulus
(example - meat powder first, then the bell tone)
*no learning will occur, ineffective.
Suggests the contingency of stimuli is what accounts for classical conditioning.
standard conditioning
the conditioned stimulus precedes the unconditioned stimulus by a short interval (0.5 seconds) and overlaps
stimulus generalization
the subject automatically generalizes from a conditioned stimulus to another similar neutral stimulus (ex. Little Albert’s startle response to other furry creatures, not just white rats).
So.. (bell) CS » CR (salivation), but in this case stimuli similar to the CS (doorbell) also » CR (salivation) without ever being paired with the US (meat powder)
response generalization
performing a behaviour that is similar to but not identical to the one that has been previously reinforced
classical extinction
occurs by repeatedly presenting the conditioned stimulus without the unconditioned stimulus
(example - the bell tone is repeatedly presented without the meat)
operant extinction (operant conditioning; behavioural contrast and extinction burst)
stopping the reinforcement of a behaviour that has been previously reinforced
Can lead to extinction burst (temp increase)
Also, when 2 behaviours are being reinforced and reinforcement stops for one behaviour, the one that is still reinforced will increase » behavioural contrast
spontaneous recovery (classical, internal inhibition)
during extinction trials, following a rest period, the conditioned response to the conditioned stimulus often briefly reappears
Why? Physiological processes involved in extinction INHIBIT the association (they don’t eliminate it). Pavlov dubbed this internal inhibition*
response/extinction burst (operant conditioning)
during operant extinction, at first, withholding reinforcement will usually result in an increase in the behaviour
stimulus discrimination
an animal learns to discriminate between two similar stimuli because one has been paired with the unconditioned stimulus and the other has not. OPPOSITE of stimulus generalization.
(example - a 500Hz tone vs. a 100Hz tone)
If discrimination is too hard/stimuli are too similar » experimental neurosis (due to conflict between excitatory and inhibitory processes in the CNS)
experimental neurosis
if stimulus discriminations are too difficult (500Hz vs. 450Hz), this can result in experimental neurosis
higher order conditioning
a deliberate process in which a conditioned stimulus (CS) is paired with a neutral stimulus that is typically unrelated, until eventually the new stimulus also becomes a conditioned stimulus
(example - after the bell tone has been paired with meat powder, a flashing light is paired with the bell tone until the light alone elicits salivation)
*when it involves a second CS= second order, when it involves a 3rd CS= third order
*different than stimulus generalization because it is INTENTIONAL
reinforcement
increases the target behaviour, brings the subject into a more desirable state
punishment
decreases the target behaviour, brings the subject into a less desirable state
positive reinforcement
REWARD
- after the target behaviour is performed, something of value is given to the subject
example: a child is praised after they make their bed
negative reinforcement
RELIEF
- after the target behaviour is performed, something annoying or aversive is removed
example: fastening your seat belt to make the dinging sound stop
positive punishment
PAIN
- after the target behaviour is performed, something aversive is added
example: a child is scolded after they spit
negative punishment
LOSS
- after the target behaviour is performed, something valuable is removed
example: a child loses TV time after swearing
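The four cards above come down to two yes/no questions: was something added or removed, and did the target behaviour go up or down? A minimal sketch of that lookup follows; the function name and its parameters are made up for illustration and are not from any library.

```python
def classify_consequence(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the operant procedure from two yes/no questions (hypothetical helper)."""
    # "positive" = something is added; "negative" = something is removed
    sign = "positive" if stimulus_added else "negative"
    # "reinforcement" = behaviour increases; "punishment" = behaviour decreases
    effect = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {effect}"


# The four examples from the cards above
print(classify_consequence(True, True))    # praise after making the bed
print(classify_consequence(False, True))   # seat belt dinging stops
print(classify_consequence(True, False))   # scolded after spitting
print(classify_consequence(False, False))  # loses TV time after swearing
```

Note that "positive" and "negative" describe adding vs. removing a stimulus, not whether the outcome is pleasant.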
continuous reinforcement
reinforcing every occurrence of the behaviour
intermittent reinforcement
the subject is not reinforced for every occurrence of the behaviour, instead it is scheduled
fixed interval
reinforcement occurs the first time the behaviour is performed after the fixed time interval has elapsed
example: once every 15 minutes
*results in a response rate that is low
variable interval
reinforcement occurs the first time the target behaviour is emitted after a variable, unpredictable interval time has elapsed
example: once at 15 minutes, then at 10 minutes, then at 20 minutes
*results in a response rate that is low-moderate
fixed ratio
reinforcement occurs after a certain unchanging number of responses are emitted
example: after every 10 times the behaviour occurs
*results in a rate of responding that is moderate-high
variable ratio
reinforcement occurs after an unpredictable number of responses are emitted
- another way it can be worded: the number of responses between reinforcers varies unpredictably from trial to trial. Key word: unpredictable
example: slot machines
- results in the greatest operant strength
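The schedule rules above can be sketched as small functions that say which responses get reinforced. This is only an illustration under stated assumptions: the function names are hypothetical, the variable ratio draws its requirement uniformly around a mean, and the fixed interval clock is assumed to restart at each reinforced response. It models when reinforcement is delivered, not the response rates the cards describe.

```python
import random


def fixed_ratio(n_responses: int, ratio: int) -> list[int]:
    """Response numbers (1-based) that earn a reinforcer on an FR schedule."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses: int, mean_ratio: int, seed: int = 0) -> list[int]:
    """Like FR, but the required count is redrawn after every reinforcer.

    Assumption: requirement is uniform on 1..(2 * mean_ratio - 1),
    which averages out to mean_ratio.
    """
    rng = random.Random(seed)
    reinforced = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= needed:
            reinforced.append(i)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times: list[float], interval: float) -> list[float]:
    """Times of the responses that earn a reinforcer on an FI schedule."""
    reinforced = []
    available_at = interval  # nothing can be earned before one interval has elapsed
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)          # first response after the interval is reinforced
            available_at = t + interval   # assumption: clock restarts at that response
    return reinforced


print(fixed_ratio(30, 10))                             # FR-10 over 30 responses
print(fixed_interval([5, 14, 16, 20, 31, 33], 15.0))   # FI-15 minutes
print(variable_ratio(50, 10, seed=1))                  # VR-10, unpredictable spacing
```

A variable interval schedule would follow the same pattern as `fixed_interval`, with the interval redrawn after each reinforcer.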
superstitious behaviour
- results from accidental reinforcement or non-contingent reinforcement
- reinforcement is applied in an arbitrary and inconsistent fashion that is not linked to the emission of the target behaviour
example: wearing “lucky socks” because one time something lucky happened to you while you were wearing them
pseudoconditioning
- occurs accidentally
- a neutral stimulus that was not deliberately paired with either the unconditioned stimulus or conditioned stimulus comes to elicit the conditioned response
chaining
stringing together specific behaviours to accomplish a goal
shaping
teaching a subject to emit a desired behaviour by providing reinforcement as the person gets closer and closer to the desired behaviour
premack principle
using a behaviour that is freely performed at a high frequency as a reinforcer
thinning
after acquisition of a behaviour, the schedule of reinforcement is best changed from continuous to intermittent, or made leaner (e.g. changing FR-10 to FR-20).
Thinning once a behaviour reaches its desired level helps to increase resistance to extinction
fading
a gradual reduction in prompting (cueing the subject about what behaviour to perform)
behavioural contrast
between two behaviours, the one that is being reinforced more frequently will increase and the one that is no longer being reinforced will decrease
reciprocal determinism
a person’s behaviour both influences and is influenced by personal factors and the social environment (in other words - consequences from the environment can condition an individual’s behaviour, and that behaviour in turn changes the environment)
satiation
when a primary reinforcer loses its reinforcing qualities from being presented too much
example: M&M’s stop being motivators for kids after a while
habituation
when a subject gets used to (or “habituates” to) an unconditioned stimulus and no longer reacts to it
Explains why punishment becomes less effective over time: the person habituates, so the intensity must increase for the same effect to be had. (Can be dangerous/abusive)
Negative discriminative stimulus (S-delta) vs positive discriminative stimulus (SD)
Negative discrim: signals that a behaviour will NOT be reinforced. Ex: red light means you can’t cross the road
Positive discrim: signals that a behaviour WILL BE reinforced. Ex: green light means you can cross
distinction: one signals no reinforcement (negative) the other signals reinforcement (positive)
In operant conditioning, these stimuli give individuals clear signals about the consequences of their actions, making it more likely that they will engage in behaviours that lead to reinforcement (SD; positive) and less likely that they will engage in those that lead to non-reinforcement (S-delta)
Explain two factor theory/learning and how it relates to avoidance conditioning
Two factor learning: combines two types of learning, operant conditioning (learning via consequences) and classical conditioning (learning through associations). It explains how avoidance conditioning works: the two processes together lead individuals to avoid aversive consequences.
- Classical conditioning: the aversive stimulus (US), like an electric shock, naturally elicits a fear response. Over time, a previously neutral stimulus (NS) such as a blinking light becomes a CS that predicts the aversive stimulus (US). So the individual associates the CS (light) with impending shock, leading to fear (CR)
- Operant conditioning: the individual learns to perform a specific behaviour (jumping over a barrier) as a way to escape/avoid the aversive stimulus (shock). This behaviour is negatively reinforced bc it leads to the removal or avoidance of an aversive stimulus, thus strengthening the avoidance response.
Implosive therapy (what it is and how it relates to classical extinction)
Implosive therapy: used to address anxiety & phobias. Conducted solely through imagination and incorporates psychodynamic elements.
Works by: amplifying anxiety by asking client to exaggerate mental image of feared object or situation.
Classical extinction tie in: the anxiety arousing stimulus (a CS) is repeatedly presented to the client in imagination, but without the aversive consequences that triggered the initial fear response. So, by consistently presenting the stimulus w/o reinforcement, the therapy aims to weaken the conditioned fear or anxiety response (CR) over time.
Thorndike’s research with what led to his development of the law of effect?
Cats in a puzzle box
*law of effect: behaviours that are followed by pleasant consequences are more likely to be repeated than behaviours followed by unpleasant ones
Thorndike’s law of effect vs premack principle
Thorndike’s law of effect focuses on the relationship between a behaviour and its consequences, while premack addresses how preferred behaviours can be used to reinforce less preferred behaviours, increasing the likelihood of the less preferred behaviours.
Ex of premack: if a kid enjoys playing video games (pref behav.) and dislikes doing chores (less pref), a parent can use the opportunity to play video games as a reward for completing chores.
Ex of law of effect: if a child receives praise for completing chores and finds praise satisfying they are more likely to continue doing chores in the future
Overshadowing vs blocking
Overshadowing: occurs when one cue becomes more prominent and distracts from other cues making it harder to learn about the less prominent one (aka- learning is harder but it’s still happening)
Ex: teaching a dog to sit while waving a bright flashy toy in front of them. The dog might be more focused on the toy, so the sit command is learned less well. (Think Overshadowed by something flashier but can still learn)
Blocking: occurs when one cue has already been associated with an outcome and then another cue is introduced but it doesn’t add new information bc it doesn’t change what’s expected.
Ex: if a dog has already learned that the sound of the bell means food is coming, introducing a whistle alongside the bell doesn’t add new information, so the dog continues to associate only the bell with food. (Think block so there is no change)
Premack principle vs differential reinforcement
Premack is about using a high frequency behaviour to reinforce a low frequency behaviour, while differential reinforcement is about reinforcing specific behaviours while not reinforcing others, without the requirement of using a high frequency behaviour as a reinforcer.
Ex for DRA: reinforce child’s polite request with praise while not reinforcing whining.
Ex for premack: a parent tells kid they can play video games after completing HW
Least susceptible reinforcer to satiation?
Generalized reinforcer ($, tokens)
Bc they can be exchanged for a variety of back up reinforcers.
Wolpe, reciprocal inhibition & systematic desensitization
Wolpe’s use of reciprocal inhibition to reduce experimental neurosis in cats led to his development of systematic desensitization to treat phobic anxiety in humans.
Reciprocal inhibition: anxiety/fear can be reduced by replacing it with a relaxation response
Systematic desensitization: uses reciprocal inhibition to help people learn how to stay relaxed in stressful situations
Internal inhibition & Pavlov
II= even if a stimulus used to trigger a response, sometimes the response won’t happen, because an internal process behind the scenes can stop the conditioned response
Ex: a dog salivates to a bell because it associates it with food. But sometimes, even when the bell rings, the dog’s internal processes may prevent salivation.
Premack principle vs response cost
Premack: encourage a less desirable behaviour to happen by offering a more desirable one as a reward. Ex: allow kid to play video games after they finish HW
response cost: you discourage an unwanted behaviour by taking something enjoyable away. Ex: take away video game when they misbehave
Serial position effect & primacy & recency effect
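The first and last items on a list are recalled better than the middle items. The primacy effect is due to the transfer of the first words from short term to long term memory; the recency effect is due to the last words still being in short term memory. Primacy= long term. Recency= short term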
Keyword method is most useful for what?
Method involves creating an image that links two words or links a word and its definition. It’s particularly useful for learning a second language
Matching law vs premack principle
Matching law: how people divide their time between different activities based on the reward each activity offers. They’ll do what brings better rewards
Premack: uses a high frequency behaviour to reinforce a lower freq behaviour
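The matching law card is verbal, but it is usually written as a proportion: the share of responses given to an option matches the share of reinforcement that option delivers. A minimal sketch, assuming the simple two-option (strict matching) form; the function name and the numbers in the example are hypothetical.

```python
def matching_law_share(reinforcers_a: float, reinforcers_b: float) -> float:
    """Predicted proportion of responses given to option A.

    Strict matching (assumed form): B_a / (B_a + B_b) = R_a / (R_a + R_b)
    """
    total = reinforcers_a + reinforcers_b
    if total == 0:
        raise ValueError("at least one option must deliver reinforcement")
    return reinforcers_a / total


# Hypothetical example: option A pays off 30 times per hour, option B 10 times per hour
print(matching_law_share(30, 10))  # share of responses predicted for option A
```

Unlike premack, nothing here is used as a reinforcer for something else; the law only describes how behaviour is divided between the available options.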
Covert sensitization
A type of aversive counterconditioning conducted in imagination
Without rehearsal, info remains in short term memory for how long?
20-30 seconds
Watson used what conditioning procedure to establish Little Albert’s fear response to rats?
Delay conditioning
Why would the use of operant extinction not be a good punishment for violent/dangerous misbehaviours?
Bc of extinction burst. It may cause an increase in the behaviour before it decreases.
Systematic desensitization is to ___ as exposure therapy is to ___?
Counterconditioning; extinction
Kohler’s research with chimpanzees demonstrated what?
Insight learning (AHA experience)
What maximizes the use of cue exposure treatment for alcohol use disorder?
CET: exposes the person to cues associated with drinking.
It’s maximized when combined with coping skills training
Response generalization vs stimulus generalization
Response generalization: the behaviour changes or generalizes to other behaviours with the same goal
Ex= child rewarded for saying “please” now also says “thank you” when requesting something
Stimulus generalization: response stays the same but it extends to other stimuli
Ex= Little Albert fearing not only a white rat but also other similar objects/animals with white furry characteristics
Latent inhibition vs internal inhibition
II= process within the person that inhibits (reduces) a conditioned response as it happens/after.
Latent inhibition: occurs when PRIOR EXPOSURE* to a stimulus makes it harder to form new associations with that stimulus later on. Like your brain becomes immune to paying attention to something you’ve seen before. (So.. neutral stimulus (bell) can’t become a CS now)
Differential Reinforcement (DRA, DRI, DRO, DRL)
Differential reinforcement: combines extinction and positive reinforcement to weaken an undesirable behaviour and strengthen a more desirable/alternative behaviour.
DRI: differential reinforcement of incompatible behaviour. Reinforces a specific desirable behaviour that is incompatible with the undesirable behaviour.
- Example: teacher reinforces child for staying in seat for each 30 minute interval to reduce frequent seat leaving.
DRA: differential reinforcement of alternative behaviour. Reinforces 1 or more alternative behaviours instead of the undesirable one.
- Example: teacher reinforces student for raising hand instead of shouting to reduce disruptive behaviour in class.
DRO: differential reinforcement of other behaviour. Reinforces any behaviour other than the undesirable one.
- Example: parent ignores hand-flapping and reinforces kid with ASD for engaging in appropriate activities during 30 min intervals.
DRL: differential reinforcement of low rates of behaviour. Reinforces engaging in the target behaviour at or below a specific rate.
- Example: teacher reinforces child for asking 3 or fewer q’s during every 60 min interval to reduce disruptions.
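The interval-based rules (DRO and DRL) can be sketched as end-of-interval checks. This is a rough illustration with hypothetical function names; the DRO check assumes the common rule that the whole interval must be free of the undesirable behaviour, which goes slightly beyond what the card states.

```python
def dro_earned(target_count: int) -> bool:
    """DRO: reinforce only if the undesirable behaviour did not occur in the interval.

    Assumption: any occurrence during the interval forfeits the reinforcer.
    """
    return target_count == 0


def drl_earned(target_count: int, max_allowed: int) -> bool:
    """DRL: reinforce if the behaviour occurred at or below the allowed rate."""
    return target_count <= max_allowed


# DRL example from the card: 3 or fewer questions per 60 min interval
print(drl_earned(target_count=2, max_allowed=3))  # within the limit
print(drl_earned(target_count=5, max_allowed=3))  # over the limit
# DRO example from the card: no hand-flapping during the 30 min interval
print(dro_earned(target_count=0))
```

DRL aims to lower a behaviour, not eliminate it, which is why a count above zero can still earn reinforcement.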
Fading vs thinning
Fading: gradual removal of visual or auditory hints or cues (prompts)
Thinning: reduction of reinforcers (i.e. cut back on something good like cake)
Positive practice is most similar to what?
Positive practice is a component of overcorrection. It involves having the individual practice appropriate behaviours that are alternatives to their inappropriate behaviours.
It is most similar to habit reversal training** = the individual practices an alternative, usually incompatible, behaviour.
Latent inhibition is due to what?
CS pre-exposure
Trace conditioning (classical conditioning)
CS is presented and finishes BEFORE US is presented.
Ex: bell rings and stops ringing and then meat powder is presented
Simultaneous conditioning
CS & US are presented and finish at the same time.
Ex: bell rings at the exact same time the meat powder appears
Delay conditioning
CS comes slightly before and overlaps with the presentation of the US
(Bell rings > 1/2 second later > meat powder)
*most effective
*optimal when it’s a 1/2 second delay.
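The timing procedures (backward, simultaneous, delay, trace) differ only in when the US starts relative to the CS. A minimal sketch of that comparison; the function and parameter names are hypothetical, times are in seconds, and treating a US that starts exactly as the CS ends as trace is an assumption, since the cards don’t cover that edge case.

```python
def classify_pairing(cs_on: float, cs_off: float, us_on: float) -> str:
    """Name the classical conditioning procedure from stimulus timing (hypothetical helper)."""
    if us_on < cs_on:
        return "backward"      # US comes first: ineffective
    if us_on == cs_on:
        return "simultaneous"  # CS and US start together
    if us_on < cs_off:
        return "delay"         # CS starts first and is still on when the US arrives
    return "trace"             # CS has finished before the US starts (assumed to include us_on == cs_off)


print(classify_pairing(cs_on=0.0, cs_off=1.0, us_on=0.5))  # bell, then meat powder 1/2 second later
print(classify_pairing(cs_on=0.0, cs_off=1.0, us_on=2.0))  # bell stops, then meat powder
print(classify_pairing(cs_on=1.0, cs_off=2.0, us_on=0.0))  # meat powder first, then bell
```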
Compound conditioning (classical; blocking, overshadowed)
Occurs when 2 or more stimuli are presented together —> blocking and overshadowing.
Blocking= pairing occurs between (bell)CS & (meat powder)US» CR(salivation). Then a neutral stimulus (light) is presented with CS (bell). Here, the CS blocks the neutral stimulus from becoming another CS and triggering CR.
Overshadowing: 2 neutral stimuli are presented together from the START (before either becomes a CS). After repeated pairings only the more salient one becomes a CS. Why? Because the less salient stimulus is overshadowed by the more salient stimulus.
Types of positive reinforcers (operant conditioning, primary, secondary and generalized reinforcers)
Primary: inherently reinforcing because they satisfy needs related to basic survival (water, food)
Secondary: neutral stimuli become reinforcing bc of association with primary reinforcers/experiences (praise, bell, tokens)
Generalized: when secondary reinforcers are associated with a variety of back up primary reinforcers ($, bc it can be exchanged for a variety of backup reinforcers)
Stimulus control (operant conditioning; two factor learning)
Behaviour occurs in the presence of one stimulus but not in the presence of another stimulus. Occurs as a result of two factor learning = 1) performance is due to positive reinforcement (operant conditioning) 2) performance in presence of positive discriminative stimulus but not negative discriminative stimulus is the result of discrimination training (classical conditioning)
Prompts (operant conditioning; fading)
Cues or instructions that help initiate performance of a behaviour.
Gradual removal of prompts once desired level is reached is known as fading.
Stimulus generalization (operant conditioning)
Same as in classical conditioning
Response generalization (operant conditioning)
Reinforcing a specific behaviour not only increases that behaviour but likelihood that similar ones will occur
Ex. Praised for saying “please”, now also says “thank you” when requesting something
Escape and avoidance conditioning (operant conditioning, two factor learning)
Both applications of negative reinforcement.
Escape= behaviour occurs bc it allows the individual to escape (terminate) an unpleasant stimulus that is already present. (Seatbelt on to escape beeping)
Avoidance= results from two factor learning. 1) anticipates shock with blinking light (positive discriminative stimulus) » classical conditioning 2) jumps over barrier to avoid shock on other side (negative reinforcement) » operant conditioning
Classical Interventions that use extinction (ERP, cue, implosive, EMDR)
1) ERP = anxiety occurs bc a stimulus has been conditioned to cause anxiety, and avoidance doesn’t allow the client to process it. So in ERP the client = exposed to CS (elevators) + not allowed to avoid = extinction + new CR
2) cue exposure therapy = expose clients to cues (CS’s) associated with the substance while prohibiting them from using. Maximized when combined with coping strategies (ie. remember negative consequences of smoking)
3) implosive therapy = imaginal exposure + psychodynamic elements
4) EMDR = Shapiro believes it’s got something to do with way traumatic memories are processed in brain (maladaptive or not at all). EMDR helps with adaptive processing of trauma memories.
Classical interventions that use counterconditioning (systematic, aversion)
1) systematic desensitization = PMR + exposure (graded with hierarchy) @ the same time. Research shows: effective due to classical extinction. CS (anxiety arousing stimuli) + US (PMR) = CR (relaxation vs fear)
2) aversion: also known as aversive counterconditioning. In it, a stimulus associated with the problematic behaviour is paired with a US that naturally produces an unpleasant response. Ex: fetish patient: present fetish object (CS) + electric shock (US) = pain (CR). End therapy with relief scene (client imagines facing the stimulus associated with the problem behaviour but refrains from engaging in the behaviour)
Interventions based on operant conditioning/positive reinforcement (shaping, chaining & premack principle)
1) shaping
2) chaining (forward chaining & backward chaining)
3) premack principle
Interventions based on operant conditioning/punishment interventions
1) overcorrection
2) response cost
3) timeout
4) extinction