Lecture 6 - Operant Conditioning Flashcards
Operant conditioning
(also known as instrumental conditioning and trial-and-error learning)
is associating a voluntary behavior (‘operation on the environment’) with an outcome.
some action the animal chooses to do is associated with an outcome
Law of Effect
Animals learn that a behavior (or class of similar behaviors) predicts a particular outcome and seek the outcome by performing a particular behavior
Behaviors with good outcomes increase; behaviors with bad outcomes decrease.
(Thorndike, 1911)
Discrete trial paradigm
(Thorndike, 1911)
Cat opens the puzzle box and is reinforced with a food reward.
The cat learned that flipping the switch was responsible for getting out and getting food, so that escape behavior becomes more likely (and faster) in the future.
discrete because every time the cat got out counted as one trial, and for each new trial the cat had to be put back in
B. F. Skinner
free-operant paradigm
refined Thorndike’s method to allow the animal to respond repeatedly
==> allowed the animal to control the rate of responding ==> animal controls when they get the reward (food)
• Skinner Box
Skinner Box
little contraption in which everything was automated: it counted the number of times the lever was pressed and the number of times the reward was provided
made it easy to measure this activity over time
instead of recording trials you’re recording behaviors over time
• Behaviors could be automatically recorded in a Skinner box: count the number of behaviors and outcomes.
Acquisition
reinforcing behavior: giving a reward every time the rat presses the lever
the number of responses goes up
extinction
it keeps pressing the lever but no food comes
if you stop reinforcing the behavior then the behavior starts to go away
the number of responses decreases
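The acquisition and extinction cards can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the update rule, the rate parameters, and the names `simulate`, `learn_rate`, and `extinguish_rate` are hypothetical choices for illustration, not a model given in the lecture.

```python
def simulate(acquisition_trials=20, extinction_trials=20,
             learn_rate=0.2, extinguish_rate=0.2, start=0.05):
    """Return a list of response strengths (0..1), one per trial.

    Assumed toy rule (not from the lecture):
      - reinforced trial: strength moves a fraction of the way toward 1
      - unreinforced trial: strength decays by a fraction of itself
    """
    strength = start
    history = []
    for trial in range(acquisition_trials + extinction_trials):
        reinforced = trial < acquisition_trials  # food only during acquisition
        if reinforced:
            strength += learn_rate * (1.0 - strength)
        else:
            strength -= extinguish_rate * strength
        history.append(strength)
    return history


history = simulate()
end_of_acquisition = history[19]  # last reinforced trial
end_of_extinction = history[-1]   # last unreinforced trial
print(f"after acquisition: {end_of_acquisition:.2f}")
print(f"after extinction:  {end_of_extinction:.2f}")
```

Response strength climbs while lever presses are reinforced and falls once the food stops, which is the qualitative pattern the two cards describe.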
Basic elements of the free-operant paradigm:
- discriminative stimulus (S)
- behavioral response (R)
- outcome (O)
S –> R –> O
Through repeated trials, the animal learns that the outcome is contingent upon
the appropriate response.
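The S –> R –> O contingency can be sketched in a few lines of code. The names `Trial`, `outcome`, and `"press_lever"` are hypothetical and used only for illustration; the point is that the outcome is delivered only when the appropriate response occurs in the presence of the stimulus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    stimulus_present: bool  # S: e.g. the rat can see the lever
    response: str           # R: what the animal actually did


def outcome(trial: Trial, required_response: str = "press_lever") -> Optional[str]:
    """Return the outcome (O), or None if the contingency is not met."""
    if trial.stimulus_present and trial.response == required_response:
        return "food"
    return None


print(outcome(Trial(True, "press_lever")))  # contingency met
print(outcome(Trial(True, "groom")))        # wrong response, no outcome
```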
discriminative stimulus (S)
A cue that helps you select the appropriate behavior (e.g. the rat can see the lever).
the animal has to be able to identify something in the environment that it’s operating on
behavioral response (R)
A response, or class of similar responses, that is performed in response to the stimulus (e.g. the rat pushes the lever with either paw).
outcome (O)
follows that either reinforces or punishes the behavior (e.g. rat gets food, good outcome).
reinforcers
Outcomes that increase the likelihood of the behavior
primary reinforcers
secondary reinforcers
primary reinforcers
meet some innate need (e.g. food, water, sleep, and sex).
Note that these are not always reinforcing (e.g. you won’t work for water if already satiated).
Secondary reinforcers
have no intrinsic value, but predict or are associated with primary reinforcers (e.g. money, good grades, gold stars, etc.).
something that has no value by itself, but is learned to be valuable through its association with something that is
punishers
Outcomes that decrease the behavior
primary punisher
secondary punisher
Primary punisher
Pain (shock), nausea, loud noises, social disapproval (?), loss of freedom (jail).
basically just aversive things
Secondary punisher
Monetary fines, demerits, bad grades, etc.
You are about to press a button on your iClicker. When you see that you got the correct answer to the question, that acts as a ______________.
Secondary reinforcer
positive (+) conditioning
An outcome/consequence is added: you are given an outcome as a result of your behavior.
“Positive” here has nothing to do with “good” or “bad.”