Instrumental/Operant Conditioning Flashcards
what is instrumental/Operant conditioning?
learning through reinforcement, a learning process where voluntary behaviours are modified by association with the addition of reward or aversive stimuli
what is shaping?
a form of behavior modification based on operant conditioning. Through the process of successive approximation, behaviors that are closer and closer to a target behavior are progressively rewarded with positive reinforcement
what is a discriminative stimulus?
a specific environmental cue that signals to an individual that a particular behaviour will be reinforced or punished i.e. a signal that tells an individual what to do in a particular situation
what is an example of operant conditioning using appetitive USs (like food)?
Skinner and his box: rats are placed in a box, and when the lever is pressed food is dispensed into the box. Over time the rats learn to go straight to the lever
what is an example of operant conditioning using aversive USs (like shock)?
Thorndike and his puzzle box: Thorndike put hungry cats in cages with automatic doors that could be opened by pressing a button inside the cage, and timed how long it took each cat to escape. Food was placed outside the box. Over time the cats learnt to press the button to escape the box and get the food, which was rewarding
how does operant conditioning differ from classical conditioning?
the learner is in control
what is the Law of Effect? (Thorndike)
- Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will be more likely to recur
- those which are accompanied or closely followed by discomfort to the animal will be less likely to occur.
- The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond.
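The three statements above can be sketched as a simple update rule. This is only an illustration: the function name `update_strength`, the linear rule, and the `rate` parameter are assumptions made for the example, not Thorndike's own formulation.

```python
def update_strength(strength, outcome, rate=0.1):
    """Return the new strength of a stimulus-response bond.

    outcome > 0 stands for satisfaction, outcome < 0 for discomfort.
    The size of the outcome scales the size of the change
    (a hypothetical linear rule chosen for illustration).
    """
    new_strength = strength + rate * outcome
    # Clamp to [0, 1] so the value can be read as a response tendency.
    return max(0.0, min(1.0, new_strength))


start = 0.5
after_small_reward = update_strength(start, outcome=1.0)
after_large_reward = update_strength(start, outcome=3.0)
after_discomfort = update_strength(start, outcome=-1.0)

# Larger satisfaction strengthens the bond more; discomfort weakens it.
print(after_discomfort < start < after_small_reward < after_large_reward)
```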
how was Thorndike’s view wrong?
- there must always be a stimulus present when a response is conditioned
- Thorndike thought this stimulus was learned
- he also thought that the unconditioned stimulus acted as glue to form a stimulus-response association
- once the association is formed, the stimulus always elicits the response, but the animal doesn’t know why, as it hasn’t learned about the reward
differences between Thorndike and the modern view
Thorndike
- associate stimulus and response
- unconditioned stimuli not incorporated in learning
- respond because the stimulus is there so the value of the unconditioned stimulus is irrelevant
- it’s a habit
Modern View
- associate response and the unconditioned stimulus
- unconditioned stimuli incorporated in learning
- respond to get unconditioned stimulus because it has value
- a goal-directed action
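The contrast above can be sketched by asking what each view predicts when the unconditioned stimulus loses its value. The function names and the multiplicative rule below are assumptions made for illustration only; they are not taken from either account.

```python
def habit_response(stimulus_present, s_r_strength, us_value):
    """Thorndike-style habit: the stimulus drives the response.

    us_value is accepted but deliberately ignored, because on this
    view the unconditioned stimulus is not part of what is learned.
    """
    return s_r_strength if stimulus_present else 0.0


def goal_directed_response(r_us_strength, us_value):
    """Modern view: responding depends on the current value of the US.

    Multiplying the two terms is a hypothetical choice for the sketch.
    """
    return r_us_strength * us_value


# Same learned strength, before and after the US loses its value.
habit_before = habit_response(True, 0.8, us_value=1.0)
habit_after = habit_response(True, 0.8, us_value=0.0)
goal_before = goal_directed_response(0.8, us_value=1.0)
goal_after = goal_directed_response(0.8, us_value=0.0)

print(habit_before == habit_after)  # habit is unchanged by US value
print(goal_after < goal_before)     # goal-directed responding drops
```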
what is positive reinforcement?
- process of encouraging or establishing a pattern of behaviour by offering reward when the behaviour is exhibited
- getting something good e.g. food
- increases likelihood of desired behaviour
what is negative reinforcement?
- occurs when something unpleasant or uncomfortable is removed in order to increase the likelihood of the desired behaviour
- taking away something bad e.g. turning off a shock
what is positive punishment?
adding an aversive stimulus after an unwanted behaviour to discourage a person from repeating the behaviour e.g. adding a shock
what is negative punishment?
taking something good or desirable away to reduce the occurrence of a particular behaviour e.g. cancelling food
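The four terms above follow a two-by-two scheme: "positive" or "negative" says whether a stimulus is added or removed, and "reinforcement" or "punishment" says whether the behaviour becomes more or less likely. A minimal sketch, with a hypothetical function name `classify` and argument labels chosen for the example:

```python
def classify(stimulus_change, behaviour_effect):
    """Name the operant procedure.

    stimulus_change: 'added' or 'removed'
    behaviour_effect: 'increases' or 'decreases'
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_effect]
    return f"{sign} {kind}"


print(classify("added", "increases"))    # e.g. giving food
print(classify("removed", "increases"))  # e.g. turning off a shock
print(classify("added", "decreases"))    # e.g. adding a shock
print(classify("removed", "decreases"))  # e.g. cancelling food
```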
what are some operant conditioning techniques?
- aversion = Responses followed by aversive USs (e.g. shock)
- escape = Responses rewarded by removing aversive USs (e.g. shock) after they’ve begun
- avoidance = Responses rewarded by preventing aversive USs (e.g. shock) before they’ve begun
what is passive avoidance?
Often use a shuttle-box. These have two chambers; rat can move from one side to the other
- exploits the natural tendency of mice to enter dark environments
- rat must stay where it is to avoid shock i.e. must stay in light chamber
what is active avoidance?
Often use a shuttle-box. These have two chambers; rat can move from one side to the other
- mouse learns to avoid shock based upon the presentation of a light cue
- rat must move to other chamber to avoid shock
what is signalled avoidance?
Often use a shuttle-box. These have two chambers; rat can move from one side to the other
- explicit conditioned stimulus signal for shock e.g. buzzer
- whenever the mouse hears the buzzer, it must move to the other chamber to avoid shock
what maintains the avoidance response? (Kamin, 1956)
- Rat in chamber. Buzzer followed by shock; rat must respond to avoid the shock
- Four groups of rats in a shuttlebox; a buzzer signals the shock
- Group 1: Responses terminated buzzer and avoided shock
- Group 2: Responses avoided shock, no effect on buzzer
- Group 3: Responses terminated buzzer, no effect on shock
- Group 4: Matched buzzer-shock pairings, but responses do nothing.
results
- Animals learn most when the response terminates the buzzer and cancels the shock
- Learn least when responses are ineffective (so it must be operant conditioning)
- Learn something even when they only terminate the warning CS, not the shock!
- both types of reward play a role
what is the avoidance response an example of?
a conditioned inhibitor
Once trained, avoidance responses are very persistent (Solomon, Kamin & Wynne, 1953)
- Dogs in a shuttlebox: 1-s buzzer signalled shock, and barrier raised so they could jump over and avoid the shock. Even when the shock was cancelled, the dogs went on jumping
- Response is a conditioned inhibitor predicting absence of shock
this can prevent conditioned stimulus from extinguishing
Soltysik et al. (1983): cats; CSs signal mild shock.
Stage 1
- tone and clicker predict shock; light signals absence of expected shock
- Light is a conditioned inhibitor
Stage 2
- tone and clicker are presented without shock.
- This could allow extinction, but the tone is extinguished with an inhibitor present, while the clicker is extinguished on its own
- compare LOSS of fear to tone and clicker
Animals failed to lose fear to the tone when extinguished with an inhibitor
inhibitory light protected tone from extinction
what explains why avoidance responses persist?
- In an avoidance experiment the inhibitory response could protect the warning signal (buzzer) from extinction, even if there are no more USs (shocks)
- if response keeps on happening, buzzer stays frightening, keeps predicting shock, rat keeps on avoiding it
what is an everyday example of avoidance behaviour?
OCD
- people develop persistent avoidance responses
- maybe in the past they avoided something bad
- now the responses give relief even though there is no longer anything to avoid… and this can ruin lives
what is appetitive reinforcement?
Responses followed by appetitive USs (e.g. food, sucrose)
2 ways of doing this:
1. Can give reward for EVERY response (continuous reinforcement)
2. Can give reward for only SOME responses (partial reinforcement)
what is a fixed interval schedule?
reward becomes available after a fixed time, for example every minute
- reward a child for tidying their room by a trip to the chippy, but only on Fridays. Child tends to tidy their room on Thursday night!
- Fixed interval; responding occurs near the time of reinforcement
what is a variable interval schedule?
reward becomes available after a varying time, for example once per minute on average, but sometimes less and sometimes more
- reward a child for tidying their room by a trip to the chippy, but randomly throughout the week. Child will keep their room tidy all week
- produces low but steady rates of responding and is often used in experiments
what is a ratio schedule?
- reward after a certain number of responses
- can be fixed (e.g. every 10 responses)
- or variable (every 10 on average, but sometimes more and sometimes less)
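The interval and ratio schedules in the cards above can be sketched as small reward rules. This is a minimal sketch under stated assumptions: the function names are hypothetical, ratio schedules are taken to count responses while interval schedules track time in seconds, the variable ratio is modelled as a fixed reward probability per response, and the variable interval draws its waiting times from an exponential distribution.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reward each response with probability 1/mean_n (about every mean_n-th on average)."""
    def respond():
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(seconds):
    """Reward the first response made at least `seconds` after the last reward."""
    last_reward = 0.0

    def respond(t):
        nonlocal last_reward
        if t - last_reward >= seconds:
            last_reward = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reward the first response after a random wait that averages mean_seconds."""
    next_available = rng.expovariate(1.0 / mean_seconds)

    def respond(t):
        nonlocal next_available
        if t >= next_available:
            next_available = t + rng.expovariate(1.0 / mean_seconds)
            return True
        return False

    return respond


# Fixed ratio 10: thirty responses earn rewards on the 10th, 20th and 30th.
fr10 = fixed_ratio(10)
fr_rewards = [fr10() for _ in range(30)]

# Fixed interval 60 s: only responses at or after the interval pay off.
fi60 = fixed_interval(60)
fi_rewards = [fi60(t) for t in (10, 59, 60, 61, 119, 120)]
print(fi_rewards)
```

On the interval schedules extra responses before the reward becomes available earn nothing, whereas on the ratio schedules every response counts towards the next reward.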
how is superstitious behaviour explained?
sometimes accidental pairings of a response and a reward produce a change in behaviour even though there is no reliable relationship in the world