Unit 3 - Ch. 5 Flashcards
Another name for negative reinforcement
escape training
Shaping
Shaping is the reinforcement of successively closer approximations of a desired behaviour.
In shaping, it is sometimes a good idea to back up, i.e., to reinforce earlier approximations of the desired behaviour.
Discrete trial procedure
Performance of a behaviour defines the end of a trial.
Operant training procedure.
The training procedure Thorndike used in his famous experiment with cats is best described as a discrete trial procedure.
The dependent variable is usually related to how long it takes a participant to reach the end, the number of errors before getting there, or the number of times a behaviour was performed within the time frame.
Positive reinforcement is also called
reward training
Response deprivation theory
A behaviour becomes reinforcing for an organism when the organism is prevented from engaging in that behaviour at its normal frequency.
Schoolchildren are eager to go to recess because they have been deprived of the opportunity to exercise.
Law of effect
The law of effect says that behaviour is a function of its consequences (behaviour changes as its consequences change).
Notation for reinforcement
B → S^R (a behaviour, B, is followed by a reinforcing stimulus, S^R)
Relative value theory
The reinforcement properties of an event depend on the extent to which the event provides access to high probability behaviour.
High probability behaviour can be used to reinforce low probability behaviour.
Limitation: a low probability behaviour can still act as a reinforcer if it is what the organism has been deprived of.
Premack’s name is most logically associated with relative value theory.
John Nevin says reinforcement gives behaviour
momentum
Premack is associated with
Relative value theory
Chaining
A chaining procedure is a series of steps to reinforce a behaviour chain. The first step is called task analysis: you break down the task into its component elements, identifying each link in the chain.
Chaining is a useful procedure for shaping behaviour in laboratory animals, and it is important in shaping the behaviour of wildlife.
Extinction procedure leads to:
increased variability of behaviour, increased irritability, and a short-term increase in the behaviour (an extinction burst)
Another word for operant
Instrumental.
Another word for operant is instrumental (the behaviour is instrumental in producing the consequences).
Resurgence
The reappearance of previously reinforced behaviour during extinction is called resurgence. For example, if flapping is put on extinction, a behaviour that was reinforced in the past, such as pecking, may reappear.
Sidman avoidance procedure
The distinctive characteristic of the Sidman avoidance procedure is that the aversive stimulus is not signalled.
Connectionism
Thorndike speculated that reinforcement strengthened bonds between neurons, a view that many cognitive scientists have now embraced and called connectionism.
Contingency square
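A contingency square is a 2 × 2 grid that crosses the change in the strength of a behaviour (increases or decreases) with the consequence (a stimulus is presented or removed).
Stimulus presented: Positive Reinforcement (behaviour increases), Positive Punishment (behaviour decreases)
Stimulus removed: Negative Reinforcement (behaviour increases), Negative Punishment (behaviour decreases)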
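The contingency square can also be read as a two-key lookup. The sketch below is only an illustration: the function name `classify_contingency` and the string labels it accepts are assumptions made for this example, not standard notation.

```python
def classify_contingency(stimulus: str, behaviour_change: str) -> str:
    """Return the cell of the contingency square for one consequence.

    stimulus: "presented" or "removed" (what happens to the stimulus)
    behaviour_change: "increases" or "decreases" (strength of the behaviour)
    """
    # Presenting a stimulus is "positive"; removing one is "negative".
    sign = {"presented": "positive", "removed": "negative"}[stimulus]
    # A behaviour that gets stronger was reinforced; one that weakens was punished.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


# Example: the rain stops (stimulus removed) and coming indoors becomes more likely.
print(classify_contingency("removed", "increases"))
```

"Positive" and "negative" describe only whether a stimulus is added or taken away. Whether the procedure counts as reinforcement or punishment depends only on what happens to the behaviour.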
3 essential features of reinforcement
behaviour must have a consequence; behaviour must increase in strength/occur more often; increase in strength must be a result of that consequence.
Tips for shaping behaviour
- Reinforce small steps
- Provide immediate reinforcement
- Provide small reinforcers. (Too much food takes too long to eat)
- Reinforce the best approximation available
- Back up when necessary
Operant learning
Behaviour operates on the environment.
Behaviour is strengthened or weakened by its consequences. The behaviour is typically instrumental in producing these consequences, so this type of learning is also called instrumental learning.
Reinforcement
Reinforcement is the procedure of providing consequences for a behaviour that increase or maintain the strength of that behaviour.
Escape training
Escape training is the reinforcement of a behaviour to end an aversive stimulus. For example, coming in out of the rain so you don’t get soaked.
Avoidance training
Avoidance training is the reinforcement of a behaviour that prevents or postpones an aversive stimulus. For example, not going out when you see or read that it's about to rain.
Free operant procedure
A free operant procedure is associated with Skinner. The behaviour may be repeated any number of times, so there isn’t an “end” in the same way there is with a discrete trial procedure. For example, a participant may push the lever in one of Skinner’s boxes many times for food within a single session/experiment.
Compare and contrast operant and Pavlovian conditioning.
In operant conditioning, a stimulus (the reinforcing or punishing consequence) is contingent on a behaviour. It usually involves voluntary behaviour.
Pavlovian conditioning involves one stimulus (the US) that is contingent on another stimulus (the CS). It mostly involves involuntary/reflexive behaviour.
Though different, Pavlovian and operant learning often happen together, which can make them hard to tell apart. In the case of Albert and the rat, Albert reached for the rat just before the loud noise occurred, which means that operant learning was involved in addition to Pavlovian conditioning.
What are primary reinforcers?
Primary (unconditioned) reinforcers are those that are not dependent on their association with other reinforcers. Examples would be food, water, sexual stimulation, stimulation of the brain’s “pleasure centres”, relief from hot/cold, and certain drugs.
What are secondary reinforcers?
Secondary reinforcers depend on their association with other reinforcers. Examples include praise, recognition, smiles, and positive feedback. These reinforcers are conditioned.
What are the advantages of secondary reinforcers over primary reinforcers?
- Primary reinforcers lose their reinforcing value quickly (once you're full, food works less and less well)
- It's easier to reinforce behaviour immediately with secondary reinforcers (e.g., a clicker versus walking over with food)
- Conditioned reinforcers are less disruptive (they don't take time to consume)
- Conditioned reinforcers can be used in many situations, including when the subject isn't hungry or thirsty
What are generalized reinforcers?
Generalized reinforcers are those conditioned reinforcers that have been paired with a number of primary reinforcers and can be used in a variety of situations (such as money).
What are the 2 types of chaining procedures?
Forward chaining starts with the first link and adds each successive link in order, always reinforcing the furthest link reached so far. You start by reinforcing step 1; once that is established, you reinforce only when the learner also completes step 2, and so on.
Backward chaining starts with the last link and works backward, adding earlier and earlier links.
With both, if the next behaviour isn't happening, you reinforce the closest approximation (called shaping) until the learner reaches the full behaviour for that step.
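The difference between the two chaining procedures is the order in which links are added during training. The sketch below lists what the learner performs at each training stage. The function names and the three-link example chain are hypothetical, chosen only to illustrate the idea.

```python
def forward_chaining_stages(links):
    """Stage 1 trains the first link; each later stage adds the next link."""
    return [links[:i] for i in range(1, len(links) + 1)]


def backward_chaining_stages(links):
    """Stage 1 trains the last link; each later stage adds the link before it."""
    return [links[-i:] for i in range(1, len(links) + 1)]


# Hypothetical behaviour chain identified by a task analysis.
chain = ["pick up brush", "apply paste", "brush teeth"]

for stage in forward_chaining_stages(chain):
    print("forward: ", stage)
for stage in backward_chaining_stages(chain):
    print("backward:", stage)
```

In both procedures, the links within a stage are performed in their normal order and reinforcement follows the final link of that stage. Only the starting point of training differs.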
What conditions affect the effectiveness of a reinforcer?
- Size (bigger is better, up to a point)
- Task characteristics (some things are just harder to reinforce)
- Deprivation level (how hungry?)
- Prior learning
- Competing contingencies
Hull’s drive-reduction theory
Empty explanatory concept.
According to Hull, a reinforcer is something that reduces a drive. Drives are motivational states caused by physiological needs, such as hunger, so the theory fits primary reinforcers well. The problem is that there are reinforcers that don't reduce physiological needs, but nonetheless work as reinforcers.
2-process Theory of Avoidance
Both Pavlovian and operant learning are involved in avoidance learning. Through Pavlovian conditioning, any trigger or sign that the aversive stimulus is coming becomes a CS for fear. Through operant learning, escaping from that CS is negatively reinforced by the reduction in fear.
1-process Theory of Avoidance
One-process theory says that avoidance involves only operant learning. The reinforcer in avoidance is the reduction in exposure to the aversive stimulus. Evidence for this is found in the fact that preventing the avoidance behaviour while the aversive stimulus is also withheld results in extinction of the avoidance behaviour.