Unit Three Flashcards
Shaping definition
A method for generating new behavior in which responses that are increasingly like the goal behavior are successively reinforced
Thorndike studied animal learning as a way of measuring what
Animal intelligence
To reinforce a behavior is to provide what for the behavior to increase its what
Consequences, strength
Positive and negative reinforcement have what in common
Both strengthen behavior
In the discrete trial procedure what ends the trial
The behavior
How did some operant conditioning occur with Albert and the white rat
Albert reached for the rat just before the loud noise occurred, which means operant conditioning also occurred
Weil wanted to separate the effects of what of reinforcement and what of reinforcers
Delay and number
In general, the more you increase the amount of a reinforcer, the ____ benefit you get from the increase
Less
According to the Premack principle, blank behavior reinforces blank behavior
High-probability (likely) behavior, low-probability (unlikely) behavior
According to the response deprivation theory school children are eager to move at recess because they have been deprived of the opportunity to
Move about / exercise
The two processes in two-process theory are blank and blank
Pavlovian conditioning and operant learning
Thorndike's chick in a maze
Thorndike put a chick in a maze; if it took the correct route, it would find food and other chicks. With succeeding trials the chick became more efficient and took the appropriate route
Thorndike's hungry cat in a puzzle box
Thorndike put a cat in a box with food out of reach; the cat needed to pull a wire loop to get the food. Eventually it would accidentally pull the loop. After that, the ineffective movements decreased dramatically
A steep learning curve shows what
Rapid learning and an easy task
Law of effect definition
Behavior is a function of its consequences, as defined by Thorndike
Connectionism
Thorndike speculated that reinforcement strengthened bonds or connections between neurons, a view that became known as connectionism
Operant learning definition
Behavior is strengthened by its consequences
Instrumental learning
The behavior is typically instrumental in producing these consequences
Two differences between operant and Pavlovian
Operant behavior is not reflexive like Pavlovian behavior and is often complex. In operant learning the organism acts on the environment and changes it, and the change then strengthens or weakens the behavior
Pavlovian learning is more passive, whereas operant learning is active
Contingency squares: how many are there
4: positive and negative reinforcement, and positive and negative punishment
Reinforcement definition
The procedure of providing consequences for a behavior that increase or maintain the strength of the behavior
Three characteristics of reinforcement
The behavior must have a consequence
The behavior must INCREASE in strength
The increase in strength must be due to the consequence
Positive reinforcement
Positive reinforcer
A behavior is followed by the appearance of, or an increase in the intensity of, a stimulus. This stimulus, called a positive reinforcer, is ordinarily something the organism seeks out.
The stimulus then strengthens the behavior
Reward training
Sometimes used to refer to positive reinforcement because rewards are often used as positive reinforcers. However, aversives can sometimes be used, e.g., electric shock
Negative reinforcement and negative reinforcer
A behavior is strengthened by the removal of, or a decrease in the intensity of, a stimulus. The stimulus is the negative reinforcer, something that the organism usually tries to escape or avoid
Escape training
Aka negative reinforcement, because the organism is escaping an aversive
Discrete trial
Measures
What Thorndike used. The behavior of the participant ends the trial, and the participant is then returned to the start
Measures: often time to complete or number of errors
Skinner box
Skinner created a box that had a lever rats had to push to get food
Free operant procedure
Dependent variables with it
In the free operant procedure the behavior may be repeated any number of times. Ex.: in Skinner's box the lever could be pushed any number of times
Dependent variable: usually the number of times a particular behavior, such as pecking, occurs per minute
Advantage of free operant procedure
More natural and less intrusive
Most important difference between Pavlovian and operant
In Pavlovian conditioning the US is contingent on another stimulus (the CS), whereas in operant learning a stimulus is contingent on behavior
Reflexive behavior usually involves what system, whereas voluntary behavior usually involves what
Reflexive behavior is usually associated with Pavlovian conditioning, the autonomic nervous system, and smooth muscles and glands. Voluntary behavior is usually associated with operant learning, the voluntary nervous system, and skeletal muscles
Is it easy to distinguish between operant and Pavlovian
No, it is often hard
Primary reinforcers
Naturally or innately reinforcing (usually); they are not dependent on their association with other reinforcers
Very powerful and limited in number
Ex.: food, water, sex, electrical stimulation of the brain, relief from heat and cold, and certain drugs
Secondary reinforcers
Dependent on their association with other reinforcers. Ex.: praise, recognition, smiles, and positive feedback. They are secondary to other reinforcers and ultimately dependent on primary reinforcers
Aka conditioned reinforcers
Four advantages of secondary reinforcers
Become less effective over time more slowly than primary reinforcers.
Often much easier to use to reinforce behavior than other reinforcers.
Less disruptive than primary reinforcers and take less time
Can be used in many different situations
Generalized reinforcer definition
Reinforcers that have been paired with many different kinds of reinforcers in a variety of situations. Ex.: money
Main disadvantage of secondary reinforcers
Their effectiveness depends on their association with a primary reinforcer. A secondary reinforcer may lose its effectiveness if that association breaks down, e.g., money becomes worthless
Why is shaping sometimes related to tantrums
The parent gradually demands more and more outrageous behavior before giving the child what the child wants
5 tips for shapers
1) reinforce small steps
2) immediate reinforcement
3) small reinforcers
4) reinforce the best approximation available
5) back up when necessary
Behavior chain: what is it and what is it called when you train for it
A connected sequence of behaviors, e.g., a gymnastics routine.
Training it is called chaining
First step of chaining
Break the task into its components; this is called task analysis
Two ways to chain after task analysis
1) Forward chaining: the first link in the chain is reinforced until it is performed without hesitation, then the next links are added in sequence
2) Backward chaining: begin with the last link of the chain and work backwards. Note that the chain is never performed backwards; training just starts one step closer to the beginning each time
The last step in the chain is usually
The last step usually produces a reinforcer that is often a primary reinforcer
Contingency for operant conditioning
Degree of correlation between the behavior and its consequence
The rate of learning depends on the contingency; ideally the consequence follows the behavior 100% of the time
Contiguity with operant conditioning
Gap between a behavior and its reinforcing consequence. Generally, a shorter gap means faster learning because the learner is less likely to be confused about which action is being reinforced
Weil showed that the gap was important by using a constant rate of reinforcement
Three characteristics of reinforcers
1) Small reinforcers given frequently result in faster learning than large reinforcers given infrequently. However, all else being equal, a larger reinforcer results in faster learning. Ex.: $100 vs. $5
2) However, the effect of reinforcer size (magnitude) is not linear. The more you increase it, the less benefit you get from the increase
3) Qualitative differences in reinforcers: identifying preferred reinforcers can make a difference
How do task characteristics affect operant learning
Easier tasks are learned more readily. Behavior involving smooth muscles is harder to modify than behavior involving skeletal muscles, but it has been done
How does deprivation level affect operant learning
The greater the level of deprivation, the more effective the reinforcer. Mainly important when the reinforcer alters a physical condition
Two other variables that affect operant learning
Learning histories
Competing contingencies - if the behavior also produces punishing consequences
Extinction with operant learning
Withholding the consequences that reinforce a behavior
Extinction burst with operant conditioning
An abrupt increase in the behavior at the start of extinction, generally followed by a decline in the behavior
Variability of behavior with operant conditioning
Increase in the variability of behavior, because the organism is trying to obtain the reinforcer it previously received
Emotional behavior with extinction
Often an increase in aggression
Resurgence
Reappearance of previously reinforced behavior during extinction
Not the same as spontaneous recovery, which is discussed with extinction in Pavlovian conditioning
Resurgence can be used to understand what
Regression: the tendency to return to primitive, infantile modes of behavior. Ex.: a man has a tantrum
Factors that affect the rate of operant extinction
Number of times the behavior was reinforced before it was extinguished
The effort the behavior requires
The size of reinforcers used during training
Behavior is learned how quickly and extinguished how quickly
Behavior is usually acquired quickly and extinguished slowly
Can a reinforced behavior ever be completely extinguished
Not really; the behavior will likely continue to occur at a rate above baseline
What did Thorndike conclude about practice
That practice is important only if it is reinforced
Hull’s drive reduction theory
Motivational states are called drives. A reinforcer is a stimulus that reduces one or more drives. This theory works well with primary reinforcers. However, secondary reinforcers do not necessarily satisfy physiological needs. Some reinforcers cannot be classified as primary or secondary. Ex.: male rats will take a mating opportunity even if they cannot ejaculate. This is a major criticism and weakness of the theory
Relative value theory
By Premack
Said that in any given situation some kinds of behavior have a greater likelihood of occurrence than others. Thus different behaviors have different relative values, and these values determine the reinforcing properties of a behavior.
Need to know the relative values of the activities.
Premack principle
High-probability behavior reinforces low-probability behavior. You can make the high-probability behavior contingent on the low-probability behavior and so increase the likelihood of the low-probability behavior.
Criticism for relative value theory
Does not explain why the word "yes" is reinforcing.
Low-probability behavior will reinforce high-probability behavior only if the low-probability behavior has been prevented from being performed for some time
Response deprivation theory
Behavior becomes reinforcing when the organism is prevented from engaging in it at its normal frequency (it falls below baseline level).
Fault with response deprivation theory
Does not explain why words like "yes" can be reinforcing. Does not explain why people performed better in Thorndike's experiment when blindfolded
The example where dogs learned to jump when the light went out before the shock came is an example of what
Escape-avoidance learning. The animal first learns to escape and then eventually to avoid the aversive stimulus
Two process theory
Holds that both Pavlovian and operant learning are involved. Escaping the shock is negatively reinforcing, but eventually the extinguished light becomes a CS for fear (the US for fear is the shock)
Therefore there is really only escape in this theory: the animal escapes the shock and then the dark chamber
Three issues with two process theory
1) Fear of the CS decreases as the animal learns to avoid the shock. This means escaping the CS should become less reinforcing and the avoidance behavior should extinguish
2) However, it does not extinguish
3) A study by Sidman showed that rats pressed a lever to decrease the rate of shock. This is an issue because there is no CS to escape
Sidman avoidance procedure
There is no preceding stimulus. Shocks occur at regular intervals, but rats can delay the next shock by 15 sec and so avoid it
Found that they pressed the lever to delay the shocks
It has been argued that time was the CS, but this is debated
One process theory
Proposes that avoidance involves only one process: operant learning. Both escape and avoidance behaviors are reinforced by the reduction in aversive stimulation
The reduction in exposure to shock is reinforcing.
Learn to jump to avoid shock
Allen Neuringer demonstrated that with reinforcement ____ could learn to behave randomly
Pigeons
In general the greater the number of reinforcements before extinction the
Greater the number of responses during extinction
Chaining is useful for wildlife?
True
Have efforts to reinforce contraction of individual muscle fibers failed?
No