learning Flashcards
associative vs non-associative learning
Associative Learning: any learning process in which a new response becomes associated with a particular stimulus
* Classical
* Operant
Non-associative Learning: learning that results in a change in the frequency or amplitude of a behaviour/response after repeated exposure to a single stimulus
* Sensitisation
* Habituation
Associative learning stages
Acquisition Phase:
* During acquisition, repeated CS-US pairings lead to increased learning. As a result, the CS can produce the CR.
Extinction:
* If the CS is presented without the US, eventually the CR is eliminated. This process is called extinction.
Spontaneous Recovery:
* Later, if the CS is presented alone, it will produce a weak CR, known as spontaneous recovery.
No Response:
* This CR gets weaker with every spontaneous recovery and will eventually be extinguished if the CS is continually presented without the US.
Stimulus Substitution Theory:
- Pavlov thought that the CS became a substitute for the US
- Innate US-UR reflex pathway
- CS substitutes for the US in evoking the same response
- CR and UR are produced by the same neural region
- Jenkins and Moore: the form of pigeons' responses to a light CS depended on the stimulus (US) it had been paired with
Sign tracking vs Goal tracking
- Sign trackers direct their behaviour at the CS even at the expense of the US.
- Only sign trackers attribute incentive value to the cue (CS)
- Goal trackers' behaviour is directed at the US.
Preparatory Response Theory
- Kimble’s (1961,1967) theory proposed that the CR is a response that serves to prepare the organism for the upcoming US
Generalisation and discrimination
- Stimulus Generalisation: a tendency to respond to stimuli that are similar, but not identical, to a conditioned stimulus
- Transfer of training: being able to apply knowledge gained in one situation to a similar one
- Stimulus Discrimination: the learned ability to respond differently to similar stimuli, e.g. a pigeon picking the better painting
Higher-order conditioning
(CS-US > CS-CS)
Higher-order conditioning:
* A new CS comes to elicit the CR without ever being paired directly with the US
Higher-order (or second-order) conditioning refers to conditioned responses that involve neutral stimuli (stimuli that are not directly threatening or rewarding)
Limitation of High-order Conditioning
The CR to the second-order CS (CS2) is weaker than the CR to the first-order CS (CS1)
* About 50% as strong as the initial response
Sensory Preconditioning
(CS-CS > CS-US)
As in second-order conditioning, one CS is capable of becoming associated with another CS; here the CS-CS pairings come first, before the CS-US pairings.
Overshadowing
- Conditioning with a compound CS, i.e. two stimuli presented simultaneously
- Test the elements of the compound CS separately: the more salient element (a bright light) overshadows the less salient one (a metronome), so little or no conditioned response develops to the metronome
Blocking
If an existing, learned association appears to provide all the information needed to predict the occurrence of the US, this existing association will serve to block the learner from developing a new association.
- From the learner's perspective, there is no new information about when the US will appear, so there is no need to learn anything.
Latent Inhibition
- The dog has prior experience with the stimulus presented on its own (e.g. a metronome)
- Familiar stimuli are more difficult to condition than novel stimuli
Temporal Relationships Between Pairings
Trace interval: the time between the end of the CS and the start of the US
Long delay:
* Onset of the CS precedes the US by at least several seconds
* The CS continues until the US is presented
* Pavlov's dogs: once they got used to the 10-second stimulus, their response occurred only late in the stimulus (around seconds 8-9).
Backward Conditioning
(CS presented after US)
* Level of conditioning markedly lower
* Order is important
* Predictive principle: the onset of the CS signals a period in which the US will be absent
Spontaneous recovery
After a period of time has passed since extinction, the CS will again produce a weak response.
The Rescorla-Wagner Model of Classical Conditioning
Contingency: refers to the predictability of the occurrence of one stimulus from the presence of another
- Increasing the delay between the CS and US results in the CS becoming less useful as a predictor of the US
- Contingency is the probabilistic relationship with the US given that a CS has occurred
p(US | CS) > p(US | no CS)
The probability (p) of a US occurring given that (|) a CS is presented is greater than (>) the probability of a US given that no CS is present.
Contingency theory:
A conditioned response develops when the conditioned stimulus is able to predict the occurrence of the unconditioned stimulus
Negative Contingency Between CS and US
If the CS reliably predicts the absence of the US, then the CS and US are negatively correlated
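The contingency comparison on these cards can be worked through with simple trial counts. This is only a minimal sketch: the function names and all the counts are made up for illustration and do not come from any study.

```python
def contingency(us_with_cs, cs_trials, us_without_cs, no_cs_trials):
    """Return p(US | CS) - p(US | no CS) from simple trial counts."""
    p_us_given_cs = us_with_cs / cs_trials
    p_us_given_no_cs = us_without_cs / no_cs_trials
    return p_us_given_cs - p_us_given_no_cs


def describe(delta_p):
    """Label the contingency as positive, negative or zero."""
    if delta_p > 0:
        return "positive: the CS predicts the US"
    if delta_p < 0:
        return "negative: the CS predicts the absence of the US"
    return "zero: the CS carries no information about the US"


# Hypothetical counts, for illustration only.
positive = contingency(us_with_cs=8, cs_trials=10, us_without_cs=2, no_cs_trials=10)
negative = contingency(us_with_cs=1, cs_trials=10, us_without_cs=9, no_cs_trials=10)
zero = contingency(us_with_cs=5, cs_trials=10, us_without_cs=5, no_cs_trials=10)

for value in (positive, negative, zero):
    print(round(value, 2), describe(value))
```

A positive difference matches the case where a conditioned response is expected to develop; a negative difference matches the negative contingency card.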
Blocking: Rescorla-Wagner
When two CSs are presented together, the subject's expectation of the US is based on the summed expectation from both CSs. When each CS is presented on its own, the response is not of the same strength.
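A small simulation can make the shared-expectation idea concrete. The sketch below uses the standard Rescorla-Wagner update, in which every cue present on a trial changes by a learning rate times (maximum strength minus the summed strength of all cues present). The learning rate of 0.3, the maximum of 1.0, the 30 trials per phase and the cue names are assumptions chosen for illustration, and the model's two rate parameters are collapsed into the single value `alpha`.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0, strengths=None):
    """Apply the Rescorla-Wagner update over a list of trials.

    Each trial is (cues_present, us_present). All cues present on a trial
    share one prediction error: lam (or 0 if no US) minus their summed strength.
    """
    v = dict(strengths or {})
    for cues, us_present in trials:
        target = lam if us_present else 0.0
        total = sum(v.get(c, 0.0) for c in cues)
        error = target - total
        for c in cues:
            v[c] = v.get(c, 0.0) + alpha * error
    return v


# Blocking group: light alone is trained first, then light + tone together.
phase1 = [(("light",), True)] * 30
phase2 = [(("light", "tone"), True)] * 30
blocked = rescorla_wagner(phase2, strengths=rescorla_wagner(phase1))

# Control group: only the compound phase, with no prior training of the light.
control = rescorla_wagner(phase2)

print(f"blocking group tone strength: {blocked['tone']:.3f}")
print(f"control group tone strength:  {control['tone']:.3f}")
```

With these assumed values the tone ends close to 0 in the blocking group, because the light already predicts the US and leaves almost no error to learn from. In the control group the light and tone split the available strength, so each alone supports a weaker response than the compound.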
Shaping or successive approximations:
When using a shaping technique, each behaviour that approximates the desired behaviour is reinforced, while behaviours that are not approximations of the desired behaviour are not reinforced.
Fixed ratio:
- Behaviour/reinforcement: a fixed number of responses per reinforcer (e.g. 100:1, 50:1)
- Response rate: the higher the ratio, the faster the responding
- Resistance to extinction: Low
Behaviour Patterns in Fixed Ratio Schedules
- Post-reinforcement pause: a brief pause after reinforcement before recommencing the behaviour
- Ratio run: the rate of responding increases closer to the reward
- Goal gradient: as you get closer to the reward, the task feels easier, and responding increases because you know the reward is near.
Variable ratio
- Behaviour/reinforcement: random/unpredictable number of responses between reinforcements
- Response rate: Fast
- Behaviour: work hard and at a steady rate
- Resistance to extinction: High
This is because it is harder to discern whether there is still a chance of getting a reward
Variable-interval Schedule
- Response rate: Slow
- Behaviour: Work at a steady rate
- Resistance to extinction: High
Fixed interval Schedule
- Response: Scalloped
- Behaviour: responding is high just before reinforcement, with a long pause after
- Lowest rate of responding
- Resistance to extinction: Low
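The four schedules differ only in the rule that decides whether a given response earns a reinforcer. The sketch below writes each rule as a small function. Everything in it is an illustrative assumption: the function names, the schedule values of 10, the spread of the variable schedules, and a simulated subject that responds once per second regardless of schedule.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(_time):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have passed since the last reinforcer."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_reinforcers(schedule, n_responses=100, seconds_between=1.0):
    """Feed evenly spaced responses to a schedule and count the reinforcers earned."""
    return sum(schedule((i + 1) * seconds_between) for i in range(n_responses))


rng = random.Random(0)  # seeded so the sketch is repeatable
fr = count_reinforcers(fixed_ratio(10))
fi = count_reinforcers(fixed_interval(10))
vr = count_reinforcers(variable_ratio(10, rng))
vi = count_reinforcers(variable_interval(10, rng))
print(fr, fi, vr, vi)
```

The sketch only captures when a reinforcer becomes available. It does not model the behaviour patterns on the cards (post-reinforcement pauses, scalloping, response rates), which describe how real subjects adapt to each rule.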
Differential Reinforcement
A behaviour modification technique that involves rewarding desired behaviours while withholding reinforcement for undesired behaviours
- Differential Reinforcement of Other behaviour (DRO)
* The subject periodically receives the positive reinforcer provided it is engaging in other behaviours
- Differential Reinforcement of Low rates of responding (DRL)
- Differential Reinforcement of Incompatible behaviour (DRI)
- Differential Reinforcement of Alternative behaviour (DRA): the alternative behaviour is not necessarily incompatible
Establishing Operations (EOs)
Reinforcer Magnitude
* Reward magnitude is often 'in the eye of the beholder'
Contrast effect
* Shifting the value of the reward 'mid-stream' is effective in changing behaviour
Gradient of delay: the delay decreases the contiguity between response and outcome
Delay of reinforcement: The delay decreases the contiguity between response and outcome
Response-Reinforcer Contingency
- The greater the consistency between the reinforcer and the response, the quicker/more effective the conditioning.
Primary & Secondary
Primary reinforcer: a stimulus that is reinforcing even without previous training. Primary reinforcers are biologically relevant stimuli or events
A conditioned (secondary) reinforcer is an arbitrary event (such as a tone, clicker or token) that increases the frequency of an operant response through its association with a primary reinforcer
Premack's theory
The Premack Principle is usually stated simply: high-probability behaviour reinforces low-probability behaviour
Two Main Chaining Techniques
Forward Chaining: Using forward chaining, the behaviour is taught in its naturally occurring order
Each step of the sequence is taught and reinforced when completed correctly. Once the first step is mastered, move on to the next step
Backward chaining: the final behaviour in the sequence is taught first; when the learner performs it at the predetermined criterion level, reinforcement is delivered.
Response Chain Consideration
- If a link breaks, all behaviours prior to the broken link will be extinguished
- Each reinforcer does not have equal value
Extinction in Operant Conditioning
Extinction is new inhibitory learning
Recovery after extinction
- Reinstatement
- Reinstatement occurs when the conditioned response returns after the unconditioned stimulus is presented again, even without the conditioned stimulus.
- Renewal
- Renewal occurs when the CR returns after extinction because the context changes
- Example: a person who successfully completes therapy in a clinic no longer fears dogs. When they encounter a dog in a different context, their fear response returns.
- Spontaneous recovery
- Spontaneous recovery is the reappearance of the CR after a period of time has passed since extinction
- Rapid Reacquisition
- Rapid reacquisition is the faster relearning of the conditioned response after extinction when the conditioned stimulus and unconditioned stimulus are paired again.
- Resurgence
- Resurgence occurs when a previously extinguished behaviour reappears after a different behaviour that was reinforced during extinction is also extinguished
- Rats were given a choice after conditioning; they tended to choose the better reward over the conditioned reward
- However, a delay changed the effect, bringing each choice to the same level
Extinction by punishing the response
Extinction can occur by changing the consequence of the response to a punishment, creating inhibitory learning.
Extinction by contingency degradation:
- If the CS reliably predicts the absence of the US, then the CS and US are negatively correlated
Escape and Avoidance and Punishment
Escape: getting away from an aversive stimulus in progress
* Escape behaviour results in the termination of an aversive stimulus.
Avoidance: the behaviour occurs before the aversive stimulus, preventing its delivery
* Negative contingency between response and aversive stimulus
* Results in an increase in operant responding (behaviour) that is maintained by negative reinforcement
Neurosis, Learned Helplessness
Uncontrollable events lead to a perceived lack of control, which triggers generalised helpless behaviour.
- Passive Avoidance
- Learning not to make a response in order to avoid the event, e.g. staying quiet to avoid conflict.
- OCD typically involves an active avoidance response
- Phobic behaviour typically involves a passive avoidance response
Secondary negative reinforcement of avoidance
- Not getting "punished" or "injured" is rewarding only if punishment is expected, i.e. only if the subject is anxious or fearful and this expectation is in some way reduced.
Mowrer's two-factor theory of avoidance
- First, the subject learns to associate the warning stimulus with the aversive stimulus; this is the first, classical conditioning process
- Now the subject can be negatively reinforced during the warning stimulus; this is the second, operant conditioning process
Two-factor theory and extinction of response
- Every successful avoidance puts the CS on extinction
- With extinction, fear drops, so motivation to avoid decreases
- This results in more shocks, which strengthens the CR again and increases the avoidance response
Sidman Free-Operant Avoidance
Avoidance can be learned without a warning CS
- Shocks at random intervals
- A response gives safe time
- Extensive training, but rats learn avoidance (errors, high variability across subjects)
- This is explained by one-factor theory
One-factor theory
Avoidance is negatively reinforced by the lower rate of aversive stimulation with which it is associated
Rescorla and LoLordo (1965)
When the CS+ was presented, the rate of jumping doubled
When the CS- was presented the rate of responding fell to almost zero
CS+ can amplify avoidance
CS- can reduce avoidance
cognitive theory of avoidance
Cognitivists believe avoidance responding is based not on fear but on the subjects’ expectation that a response will avoid shock:
When the animal eventually jumps over the barrier to avoid shock, a new expectation forms (shock does not occur if the response is made)
Cognitive theory and the absence of extinction
The theory says that avoidance depends on two expectations:
- in the absence of a response shock will occur; BUT
- if the response is made shock will not occur
Early in extinction, the dog holds both of these expectations and therefore responds.
With each new trial, the second expectation (no shock if the response is made) receives further confirmation, so the jump should be strengthened.
Types of negative Punishment
- Timeout
* Loss of access to positive reinforcers following problem behaviour
- Response cost (aka Omission Training)
* Removal of a reinforcer for inappropriate behaviour
Intrinsic vs extrinsic punishment
Intrinsic punishment: the behaviour being performed is inherently punishing (e.g., you are less likely to lift a heavy object if you experienced pain the last time the object was lifted)
Extrinsic punishment: the event that follows the behaviour is punishing (e.g., being chastised after posting an inappropriate message on a discussion board)
Lang and Melamed (1969)
Published a study in which they used punishment to stop psychogenic vomiting and rumination in a 9-month-old boy who was hospitalised for frequent vomiting
Punishment: Conditioned Suppression Account
Punishment does not directly weaken a behaviour, but instead produces an emotional response that interferes with the occurrence of the behaviour.
Avoidance Account of Punishment
Punishment actually involves avoidance learning in which the avoidance response consists of any behaviour other than the behaviour being punished