Apply learning theory Flashcards
Ivan Pavlov’s experiment
Classical conditioning:
An unconditioned stimulus (meat) elicits an unconditioned response (salivation), normally a reflex controlled by the autonomic nervous system.
Neutral stimulus (bell) elicits no response at first.
The conditioning stage of the experiment repeatedly pairs the neutral stimulus with the unconditioned stimulus, which elicits the unconditioned response.
After enough pairings, the neutral stimulus (bell) becomes a conditioned stimulus that elicits a conditioned response (salivation) on its own.
Little Albert experiment
A phobia is conditioned in an infant by pairing a loud noise (unconditioned stimulus) with a white rat (neutral stimulus). The noise elicits fear and crying (unconditioned response); once the phobia has developed, the rat alone elicits the same fear and crying (conditioned response).
The response also occurred when he was presented with other small, white, fluffy objects, demonstrating generalisation.
Classical conditioning with advertising
Advertisements also condition positive emotions by pairing products with stimuli that elicit positive emotions (pleasant environment, beautiful people, good times).
Processes in classical conditioning
Reinforcement:
* Temporal or spatial pairing of the two stimuli: the conditioned stimulus (CS) and the unconditioned stimulus (UCS).
Acquisition:
* “Initial stage of learning something”
* Usually several pairings of the CS and UCS are needed before the CR is fully developed.
* The first series of CS-UCS pairings, and the gradual appearance and strengthening of the conditioned response (CR), occur during the acquisition phase of the experiment.
* Proceeds more quickly if the intensity of the UCS increases
Extinction:
* This procedure produces a reduction and eventual disappearance of the CR.
* It involves repeatedly presenting the CS without the UCS
Spontaneous Recovery:
* The “reappearance of an extinguished response after a period of nonexposure” to the CS
* Extinction does not simply “erase” the previous learning, or permanently “destroy” the CS-UCS pairing.
Generalisation:
* After classical conditioning with a CS, similar stimuli will also elicit CRs, even though they have never been paired with the UCS
* The most similar stimuli will elicit the most CRs
Discrimination:
* This is the opposite of generalisation. That is, the subject learns to respond to one stimulus and not to a similar stimulus.
Edward Thorndike’s experiment
Thorndike investigated how voluntary (non-reflex) behaviours can be modified by experience (learning). He used two types of experimental apparatus: the puzzle box (from which cats had to escape) and mazes.
Thorndike formulated his Law of Effect from the puzzle-box results: Behaviour resulting in pleasant consequences is likely to be repeated in the same situation.
Difference between classical and operant conditioning
Classical: focuses on reflex or involuntary behaviours elicited by stimuli that precede the response.
Operant: focuses on voluntary behaviours emitted by the organism operating on the environment, not just reacting.
Three-term contingency
Reinforcement/punishment stimulus: Reinforcers and punishers are the consequences of behaviour and come to affect the subsequent frequency of that behaviour.
Operant response: A behaviour that operates on the environment.
Discriminative stimulus: Although an operant response is controlled by its consequences, stimuli that precede a response can also influence operant behaviour. If a behaviour is consistently followed by a reinforcer in the presence of a particular stimulus, then that stimulus can act as a ‘signal’ that reinforcement is available.
Reinforcement and punishment
Positive reinforcement: deliver pleasant stimulus to increase response rate
Positive punishment: deliver unpleasant stimulus to decrease response rate
Negative reinforcement: remove unpleasant stimulus to increase response rate
Negative punishment: remove pleasant stimulus to decrease response rate
Punishment suppresses unwanted behaviour without strengthening desirable behaviour.
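As a study aid, the four definitions on this card can be read as a two-by-two table: whether a stimulus is delivered or removed gives "positive" or "negative", and whether the response rate goes up or down gives "reinforcement" or "punishment". The Python sketch below only illustrates that reading; the function name `classify_consequence` and its string arguments are invented for this example.

```python
def classify_consequence(stimulus_change: str, effect_on_rate: str) -> str:
    """Name an operant consequence from the two-by-two table.

    stimulus_change: "deliver" or "remove" (what happens to the stimulus)
    effect_on_rate:  "increase" or "decrease" (what happens to the response rate)
    Both argument vocabularies are assumptions made for this illustration.
    """
    # Delivering a stimulus is "positive"; removing one is "negative".
    sign = {"deliver": "positive", "remove": "negative"}[stimulus_change]
    # A rising response rate means reinforcement; a falling one means punishment.
    kind = {"increase": "reinforcement", "decrease": "punishment"}[effect_on_rate]
    return f"{sign} {kind}"


# Removing an unpleasant stimulus so that the response becomes more frequent:
example = classify_consequence("remove", "increase")  # "negative reinforcement"
```

Note that "positive" and "negative" describe delivery versus removal, not whether the outcome is pleasant.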
Processes in operant conditioning
(a) Acquisition and Shaping
Acquisition is the initial stage of learning a new pattern of responding. It is usually a gradual process.
Shaping is needed if the organism does not, on its own, emit the desired response.
(b) Extinction
This procedure involves no longer following the operant response (bar-press) with a reinforcer (food pellet). This results in the gradual weakening and disappearance of the response. Often the number of responses increases initially, and then gradually decreases.
(c) Resistance to extinction
This occurs if the organism continues to make responses after reinforcement has been stopped.
(d) Spontaneous recovery
After a session of extinction, and time away from the Skinner box, the ‘extinguished’ bar-pressing response may reappear.
(e) Generalisation
This process refers to responses being made in the presence of stimuli that are similar to the original discriminative stimulus used in conditioning.
(f) Discrimination
This is the opposite of generalisation. It involves an organism’s lack of response to stimuli that are similar to the original discriminative stimulus used in conditioning.
(g) Delayed reinforcement
A favourable or positive outcome is more likely to strengthen a response if it immediately follows the response. Conditioning proceeds slowly if there is a delay between a response and the delivery of the reinforcer.
(h) Conditioned reinforcement
Through repeated pairings with a primary reinforcer (unconditioned), a secondary reinforcer (conditioned) can also act as a reinforcer.
Primary reinforcers satisfy biological needs (e.g., food for a hungry organism, or water for a thirsty one). Secondary reinforcers depend on learning; for humans they include money, tokens and material possessions.
Schedules of reinforcement
A continuous reinforcement schedule is one in which every response is reinforced, whenever it occurs. If reinforcement is not continuous, then the schedule of reinforcement is intermittent.
(a) FIXED-RATIO (FR)
On this schedule, the reinforcer is given after a fixed number of non-reinforced responses. Every nth response is reinforced. It produces a high, rapid rate of response until reinforcement occurs, which is then followed by a relatively long post-reinforcement pause.
(b) VARIABLE-RATIO (VR)
On this schedule, the reinforcer is given after a variable number of non-reinforced responses. On average, every nth response is reinforced, but the exact number of responses needed for reinforcement varies from one reinforcement to the next. Like FR schedule, it generates a high/rapid rate of response, but regular pausing is uncommon (no typical post-reinforcement pauses). Very hard to extinguish behaviour on this schedule (e.g. poker machines).
(c) FIXED-INTERVAL (FI)
This schedule reinforces the first response that occurs after a fixed period of time has elapsed. Cumulative record shows a typical ‘scalloped’ pattern of responding, with a post-reinforcement pause. Rate of response is lower than ratio schedules, except near the end of the interval (as the reinforcer approaches), where it accelerates.
(d) VARIABLE-INTERVAL (VI)
This schedule reinforces the first response that occurs after a variable period of time has elapsed since the previous reinforcer. The interval length varies around a predetermined average. Like the VR schedule, consistent post-reinforcement pauses of any length are rare. Response rates are moderate to low (depending on length of mean interval).
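To make the four rules concrete, here is a minimal Python sketch of each schedule as a function that decides whether a given response earns a reinforcer. It is an illustration under stated assumptions, not a model of real behaviour: the function names are invented, time is measured in seconds from the start of the session, and the variable schedules draw their requirement from a uniform distribution centred on the mean, which is just one simple way to "vary around an average".

```python
import random


def fixed_ratio(n):
    """FR: every nth response is reinforced."""
    count = 0

    def respond(t):  # t (time of response) is ignored on ratio schedules
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: the required number of responses varies, averaging mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)  # assumed uniform spread

    def respond(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: first response after a fixed time since the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the interval varies around mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)  # assumed uniform spread

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


# One response per second for 12 seconds.
response_times = range(1, 13)
fr = fixed_ratio(3)
fr_hits = [t for t in response_times if fr(t)]  # [3, 6, 9, 12]
fi = fixed_interval(5)
fi_hits = [t for t in response_times if fi(t)]  # [5, 10]
```

On the ratio schedules the reinforcer depends only on how many responses have been made, so responding faster earns more; on the interval schedules extra responses before the interval has elapsed earn nothing, which fits the lower response rates described on the FI and VI cards.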
Behaviour modification
The technique is based on operant conditioning. Behaviour that is positively reinforced is likely to be repeated, and behaviour that is ignored is likely to be extinguished.
Shaping by successive approximations
- Specify the target or goal ‘desired’ behaviour
- Identify a response to use as a starting point in working towards the goal behaviour
- Reinforce starting response, then require successively closer approximations, until the desired response eventually occurs
Unfortunately, harmful or ‘undesirable’ behaviours can also be shaped.
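The three shaping steps on this card can be sketched as a simple loop: reinforce any response that meets the current criterion, then move the criterion one step closer to the target. The Python below is a toy illustration only. It assumes responses can be scored on a single numeric scale where higher means closer to the goal behaviour, and the function name, parameters and sample numbers are all hypothetical.

```python
def shape_by_successive_approximations(responses, start_criterion, target, step):
    """Return (reinforced_flags, final_criterion) for a series of scored responses.

    responses:       numeric scores, higher = closer to the goal behaviour (assumed scale)
    start_criterion: score of the starting response that is reinforced first
    target:          score of the desired (goal) behaviour
    step:            how much the criterion is raised after each reinforced response
    """
    criterion = start_criterion
    reinforced_flags = []
    for score in responses:
        reinforced = score >= criterion
        reinforced_flags.append(reinforced)
        # After a reinforced response, require a closer approximation next time,
        # but never demand more than the target behaviour itself.
        if reinforced and criterion < target:
            criterion = min(target, criterion + step)
    return reinforced_flags, criterion


flags, final_criterion = shape_by_successive_approximations(
    responses=[1, 2, 2, 3, 5, 4, 6], start_criterion=1, target=5, step=1
)
# flags == [True, True, False, True, True, False, True]; final_criterion == 5
```

The third response (score 2) is not reinforced because the criterion has already moved to 3; that refusal to keep rewarding earlier approximations is what moves behaviour towards the goal.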
Self-improvement program
Step 1: Specify target behaviour
Step 2: Gather baseline data (initial rate of target response, identify possible controlling antecedents and consequences)
Step 3: Design program (select strategies to increase/decrease response strength)
Step 4: Execute and evaluate program
Step 5: Phase out or end program
Biofeedback training
This technique draws on principles of operant conditioning and seems to have potential for treating stress-related problems. Bodily functions (e.g., heart rate, blood pressure, brain-wave activity) are monitored, and information about them is fed back to the client, allowing her/him some control of these bodily functions.
Token economy programs
A form of behaviour modification often used in residential care settings and based on principles of secondary reinforcement. Rewards may be in the form of tokens, which can be exchanged later for primary/direct reinforcements (e.g., sweets, extra outings, watching favourite TV shows). Technique is usually used with adults or older children, who can make the association between the immediate but non-usable reinforcer, and the later more direct reinforcer.
Behaviourism
Assumption: behavioural processes being studied are the same/similar in all species.
Objectives of behaviourism:
* predict behaviour
* control behaviour
* consider whether behaviour is adaptive: does the behaviour facilitate survival?
Albert Bandura’s Bobo doll experiment
In the 1960s, Albert Bandura proposed his social learning theory. According to this theory, learning could also occur through the process of vicarious reinforcement.
Bandura’s social learning theory clearly requires cognitive processes to occur. Not all learned responses are performed. Reinforcement plays an important role in which responses are performed.
The experiment concluded that observation of a live or filmed aggressive model led to increased aggression in child observers.
Observational learning
Occurs when responding is influenced by others (models).
Classical and operant conditioning can occur through observational learning.
The four components of successful modelling for vicarious reinforcement
- Attention to the modelled response
- Retention in the memory of the elements of the modelled response
- Motor reproduction or the ability to carry out the modelled response, and
- Motivation or incentive to display the modelled response.
Modelling and aggression
Aggressive models appear in various places and situations such as mass media. How much violence should be allowed on TV is still a controversial subject.
Observational learning can account for the influence of mass media on behaviour and for why physical punishment can increase aggressive behaviour.
Behavioural enrichment’s two main purposes
- Provide behavioural and cognitive challenges that facilitate normal development and physical and psychological wellbeing, thereby enhancing ‘quality of life’.
- Promote retention of species-specific behaviours vital to survival (maintains behavioural diversity), which is important for zoos involved in reintroduction programs.
Difference between natural and captive environments
In natural environments, animals can usually escape from severe conflict situations. For example, to avoid fighting with a conspecific, the animal can offer appeasement/submissive gestures or flee.
In captive environments, animals cannot escape from conflict situations, which can lead to extreme stress and may result in stereotypic behaviours.
Zoos provide limited space and control for animals.
Stereotypic behaviours
- ‘abnormal’ or aberrant behaviours
- repetitive behaviour patterns
- no obvious function or goal; can be indicative of a welfare problem.
Examples include: “pacing, head flicking, weaving, bar gnawing, crib biting, wind sucking, spot pecking and many other normal behaviours which are performed for an excessive length of time or in inappropriate contexts”