Week 2 (learning & cognition 1) Flashcards
classical conditioning, operant conditioning, reinforcement and punishment, antecedent and discriminant stimuli
Pavlov quote about classical conditioning
“The normal animal must respond not only to stimuli which themselves bring immediate benefit or harm, but also to those that only signal the approach of these stimuli; though it is not the sight and sound of the beast of prey which is in itself harmful…..but its teeth and claws.” -Ivan Pavlov
Classical conditioning:
Essentially, the brain learns to associate stimuli in order to predict the likelihood of events. This phenomenon is so powerful across species because it is a fundamental process of human nature.
Learning
The set of biological, cognitive and social processes through which organisms make meaning from their experiences, producing long-lasting changes in their behaviour, abilities, and knowledge.
* Learning helps us to predict the future from our past experiences and use these predictions to guide adaptive behaviours.
Stimuli
- Biologically significant stimuli relate to survival:
3 Reasons why Stimuli Relate to Survival
- Stimuli that naturally cause either defensive (fight, flight, freeze) or appetitive (approach) reflex responses.
- That is, stimuli that are naturally punishing (aversive) or rewarding (appetitive)
- In the language of conditioning, these are called “unconditioned stimuli” or “reinforcers”.
2 Facets of Non-associative learning:
- Sensitisation
- Habituation
Sensitisation
the temporary state of heightened attention and responsivity that accompanies sudden and surprising events. The learner remains alert to potentially threatening stimuli in the environment and has an increased response to subsequent stimuli.
Habituation
the gradual diminishing of attention and responsivity that occurs when a stimulus persists.
Explain how an octopus uses the 2 facets of non-associative learning
An octopus comes with a reflex mechanism and is hard-wired to produce an inking response when it feels nervous. This is unlearned behaviour, adjusted by sensitisation (S) and habituation (H).
S: its nervous system is sensitised to react to possibly “scary” stimuli from a loud noise.
H: its nervous system learns to dampen the stimulation if the noise continues
Associative Learning
learning associations (relationships) between stimuli, and/or between stimuli and behavioural responses.
Is conditioning non-associative or associative learning?
Associative: it involves learning the causal structure of the environment: if X, then Y
CLASSICAL CONDITIONING DEFINITION
Learning a predictive relationship between an originally neutral environmental event and a biologically significant event that naturally causes a reflex response, so that the previously neutral event becomes a meaningful stimulus that produces the reflex response on its own.
Explanation of Classical Conditioning
In other words, a classically conditioned response is a learned reflex response to a stimulus that would not usually cause it.
Pavlov (1897)
Pavlov was interested in reflex responses. He noticed that dogs salivated before food was presented, which suggested that the response had become conditioned through learning.
* Pavlov began to control the stimuli that occurred before the food.
* He used the ‘bell’ sound of a metronome as his neutral stimulus.
* He presented the sound immediately before he presented the food.
* The food naturally causes a reflex salivation response.
* Pavlov wanted to see if the bell could come to cause salivation on its own through repeated association with food during learning.
* He described three phases of this process.
Three Phases of Classical Conditioning
- The conditions that exist before conditioning (prior learning)
- During conditioning (learning associations)
- After conditioning
- Prior learning (Three Phases of Classical Conditioning)
a) The innate reflex responses of the learner that occur to stimuli that are naturally rewarding (appetitive) or punishing (aversive or threatening)
b) the neutrality of stimuli that have not been associated with appetitive or aversive stimuli
- Learning Associations (Three Phases of Classical Conditioning)
Experiencing a predictive relationship between a neutral stimulus and a biologically relevant stimulus
- After Conditioning (Three Phases of Classical Conditioning)
The previously neutral stimulus becomes able to produce a learned reflex response in preparation/expectancy of a biologically relevant stimulus
UCS
Unconditioned Stimulus (e.g. dog food)
Always produces an associated reflex response
UCR
Unconditioned Response (e.g. dog drools)
UCS + UCR =
Reflex
NS
Initially Neutral Stimulus (e.g. bell before conditioning)
NS + UCS + UCR =
Working on building a conditioned response
CS
Conditioned Stimulus (the bell after training)
CR
Conditioned Response (now dog drools at bell)
CLASSICAL CONDITIONING DEFINITION (in phase terms)
The learner learns a predictive relationship between an originally NS and a UCS and its UCR, so that the previously NS becomes a CS that can cause a CR
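The three phases can be sketched as a toy simulation. This is only an illustration: the `Learner` class, the linear `step` increase per pairing, and the `threshold` at which the sound starts to act as a CS are hypothetical choices made for the example, not part of Pavlov's account or these notes.

```python
class Learner:
    """Toy learner: tracks how strongly one neutral stimulus predicts food."""

    def __init__(self, step=0.25, threshold=0.5):
        self.association = 0.0      # starts neutral (NS)
        self.step = step            # hypothetical gain per pairing
        self.threshold = threshold  # hypothetical level at which the NS acts as a CS

    def pair_with_ucs(self):
        """During conditioning: the sound is presented just before the food (UCS)."""
        self.association = min(1.0, self.association + self.step)
        return "salivates (UCR to food)"

    def present_alone(self):
        """Present the sound with no food and report the response."""
        if self.association >= self.threshold:
            return "salivates (CR)"
        return "no salivation"


dog = Learner()
before = dog.present_alone()   # phase 1: before conditioning
for _ in range(4):             # phase 2: four NS + UCS pairings
    dog.pair_with_ucs()
after = dog.present_alone()    # phase 3: after conditioning
print(before, "->", after)
```

The same stimulus gives a different response before and after the pairings, which is the sense in which the NS has become a CS.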
Stimulus Generalisation
Pavlov demonstrated that the classically conditioned salivation response would generalise (transfer) to other similar stimuli. (e.g. any bell would cause drooling)
Stimulus Discrimination
To isolate a conditioned response so that it is not generalised (e.g. only call bell not bicycle bell)
Extinction
Being able to extinguish a conditioned response
this is the learned inhibition of the CS-UCS association
Rapid Reacquisition
Bringing back a conditioned response after a sustained extinction
The return of a conditioned response after a period of extinction when the conditioned stimulus is presented on its own.
Relationship between Extinction and Rapid Reacquisition
This explains why extinction does not extinguish behaviour in all cases: the reflex is hardwired into the nervous system. If the extinction learning occurs in a specific context, its effect may only last in that situation.
Watson (1919)
Watson viewed Pavlov’s study as the epitome of objective science and believed humans are conditioned solely by their environment (NURTURE side)
Watson quote about classical conditioning
“Give me a dozen healthy infants… and my own specified world to bring them up in and I’ll guarantee to take any one at random and train him (sic) to become any type of specialist I might select – doctor, lawyer, artist….even beggar man and thief, regardless of his talents, tendencies, abilities, vocations and race of his ancestors”. – J.B. Watson, 1919.
Operant Conditioning
Behaviour is shaped by the learner’s history of experiencing rewards and punishments for their actions.
Skinner quote on operant conditioning
“Behaviour operates on the environment to generate consequences.” - B.F. Skinner (1904-1990)
* Burrhus Frederic Skinner took the School of Behaviorism beyond classical conditioning by investigating the processes by which voluntary behaviours are shaped by their consequences.
* Skinner also focused more on reinforcement than on punishment.
The Skinner Box
Redmond draws a parallel between the lever in the Skinner box and the reinforcement from gambling
* Skinner developed the Skinner Box, an experimental apparatus that can control animals within a laboratory setting. It is basically a small box with a lever that a rat eventually realises it can use.
* The lever was connected to a computer, which also controlled the stimuli the rat was exposed to.
* Rats were kept hungry, eating 2/3 of their required daily calories.
* Pressing the lever was the target behavior, which could be strengthened through reinforcement and weakened through punishment.
* Reinforcement = food
SKINNER’S DEFINITION OF REINFORCEMENT
A behavior is reinforced (strengthened) whenever a desirable outcome is the consequence.
* Behaviours that are reinforced are more likely to be repeated.
* A reinforcer is any consequence of a behaviour that makes that behaviour more likely to recur in future.
* Reinforcers can be either positive (+) or negative (-).
Positive Reinforcement
An animal will learn to reproduce a behaviour if the consequence is receiving something pleasant.
Negative Reinforcement
An animal will learn to reproduce a behavior if the consequence is that something unpleasant will stop.
Positive Reinforcer
something pleasant that is added to increase behavior
Negative Reinforcer
something unpleasant that is removed to increase behaviour
Explain Continuous Reinforcement
- Continuous reinforcement rarely occurs in the natural environment
- Behaviour is usually reinforced on a partial “schedule”
- Partial reinforcement leads to more persistent learning because the learner becomes accustomed to reinforcement occurring on some occasions and not others
- Continuous reinforcement leads to rapid extinction once the reinforcer is withheld.
Explain the Extinction of Reinforced Behaviour
- Extinction of an operantly conditioned behaviour occurs when reinforcement is withheld.
- Not immediate: sometimes there is a brief increase in responding, referred to as an extinction burst, followed by a decrease in the trained behaviour.
- The figure shows that responses that are reinforced partially will be harder to extinguish than those reinforced continuously
Discuss Skinner’s progression to Successive Approximations
- Shaping reinforces successive approximations to the desired behaviour (reinforcing small steps).
- Start by reinforcing a high frequency component of the desired response.
- Then drop this reinforcement – behaviour becomes more variable again.
- Await a response that is still closer to the desired response – then reintroduce the reinforcer.
- Keep cycling through as closer and closer approximations to the desired behaviour are achieved.
- Enables the molding of a response that is not normally part of an animal’s repertoire
- Skinner was able to teach pigeons how to play ping pong for rewards (he did many animal experiments)
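The cycle of reinforcing, withholding, and waiting for a closer approximation can be sketched in code. This is a simplified illustration under stated assumptions: behaviour is reduced to a single numeric score (higher means closer to the desired response), and the function name `shape` and its parameters are hypothetical, not Skinner's procedure as published.

```python
def shape(responses, target, start_criterion, step):
    """Return the responses that would be reinforced under shaping.

    responses: observed behaviour scores, in order (higher = closer to target).
    The criterion starts low and is raised by `step` after each reinforced
    response, so only a closer approximation earns the next reinforcer.
    """
    criterion = start_criterion
    reinforced = []
    for r in responses:
        if r >= criterion:
            reinforced.append(r)                       # deliver the reinforcer
            criterion = min(target, criterion + step)  # then demand a closer step
    return reinforced


observed = [1, 1, 2, 1, 3, 2, 4, 5]
print(shape(observed, target=5, start_criterion=1, step=1))
```

Responses that merely repeat an already reinforced approximation go unrewarded, which mirrors dropping the reinforcer until behaviour moves closer to the goal.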
Punishment
A behavior is punished (weakened) whenever the learner experiences an undesirable consequence for that behaviour.
* Behaviours that are followed by punishment are less likely to be repeated.
Punisher
A punisher is any consequence of a behavior that makes that behaviour less likely to recur in future
* Punishers can also be either positive (+) or negative (-).
Positive Punishment
An animal will stop producing a behaviour if the consequence is the presentation of an unpleasant stimulus.
Negative Punishment (response cost)
An animal will stop producing a behaviour if the consequence is the removal of a pleasant stimulus.
Positive Punisher
An unpleasant stimulus that weakens behaviour when presented as a consequence of the behaviour
Negative Punisher
A pleasant stimulus that weakens behaviour when removed as a consequence of the behaviour
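Reinforcement and punishment each come in a positive (+) and a negative (-) form, giving four cases: "positive" or "negative" says whether a stimulus is added or removed, and "reinforcement" or "punishment" says whether the behaviour becomes more or less likely. A minimal sketch of that lookup follows; the function name and the string labels are hypothetical, chosen only for this example.

```python
def classify_consequence(stimulus_change: str, behaviour_effect: str) -> str:
    """Name the operant procedure from two observations.

    stimulus_change: "added" (+) or "removed" (-) after the behaviour.
    behaviour_effect: "increases" or "decreases" (future likelihood of the behaviour).
    """
    signs = {"added": "positive", "removed": "negative"}
    kinds = {"increases": "reinforcement", "decreases": "punishment"}
    if stimulus_change not in signs or behaviour_effect not in kinds:
        raise ValueError("unexpected label")
    return f"{signs[stimulus_change]} {kinds[behaviour_effect]}"


# Food is delivered after a lever press and pressing becomes more frequent.
print(classify_consequence("added", "increases"))
# Something pleasant is taken away and the behaviour becomes less frequent.
print(classify_consequence("removed", "decreases"))
```

Note that whether the stimulus is pleasant or unpleasant is not an input: it follows from the other two. A stimulus whose removal strengthens behaviour must have been unpleasant.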
When is Punishment Effective (CCC)
- Contingency
- Contiguity
- Consistency
Contingency (CCC)
the relationship between the behaviour and the punisher must be clear
Contiguity (CCC)
the punisher must follow the behaviour swiftly
Consistency (CCC)
the punisher needs to occur for every occurrence of the behaviour
Explain the Drawbacks of Punishment
- Positive punishment rarely works for long-term behaviour change.
* It tends only to suppress behaviour; it does not teach a more desirable behaviour.
- If the threat of punishment is removed, the behaviour returns.
- Produces negative feelings in the learner, which do not promote new learning.
- Harsh punishment may teach the learner to use such behaviour towards others (social learning).
Explain the Alternatives to Punishment
- Stop reinforcing the problem behaviour (extinction).
- Reinforce an alternative behaviour that is both constructive and incompatible with the undesirable behaviour.
- Reinforce the non-occurrence of the undesirable behaviour.
Antecedents
a stimulus that cues an organism to perform a learned behavior.
Explain Antecedents
- Stimuli in the environment can become antecedents for operantly conditioned behaviours
* An antecedent is a ‘cue’ that signals the availability of a reinforcer.
* Note that the antecedent-reinforcer relationship is based on a classically conditioned association
* Classically conditioned associations become cues for operant behaviours.
An example of Antecedents with mobile phones
- For example, the sight of my mobile-phone is associated with the rewarding consequences of scrolling through social media
* The phone becomes a cue (antecedent) for the voluntary behaviour of scrolling social media and its attendant rewards.
* This can lead to habits and addictions
The ABC model of Operant Conditioning
Antecedent → Behaviour → Consequence
Discriminant Stimuli
the antecedent stimulus that has stimulus control over behavior because the behavior was reliably reinforced in the presence of that stimulus in the past. Discriminative stimuli set the occasion for behaviors that have been reinforced in their presence in the past.
Skinner’s work with Discriminant Stimuli
- Skinner taught pigeons to turn circles counter-clockwise to receive a reward when in one box, and clockwise to receive a reward in another box
- The pigeons learned that each box provided a distinct discriminant stimulus for each behaviour.
Partial Reinforcement
Partially reinforced behaviour tends to last longer than continuously reinforced behaviour