final exam Flashcards
What kind of association (e.g., S-R, R-O, or S-O) did Thorndike believe was the most crucial in instrumental learning? Explain what he thought the role of R was in instrumental learning. Although these ideas are generally not well-supported at this time, how can Thorndike’s ideas inform the study of habit behaviour and/or drug addiction?
Thorndike thought that the S-R association was the most crucial in instrumental learning. The role of the reinforcer (O) was simply to strengthen the S-R association, not to be learned about itself. Thorndike's ideas can inform the study of habit behaviour and/or drug addiction because habits can be seen as automatic responses to stimuli in whose presence a goal was previously obtained. So in drug addiction, drug cues (S) trigger drug-taking (R) automatically, without regard to the outcome: it is the S-R without the O.
a) Explain why the S-O association in instrumental conditioning can be considered very similar to emotional (classical) conditioning. What expectation is formed by this association?
b) What are the two processes in Two-Process Theory? Explain why, according to this theory, classical conditioning mediates instrumental conditioning. Give a real-life example of instrumental conditioning and break it down into the two processes of Two-Process Theory. How do specific reward expectancies influence instrumental behaviour?
a) During instrumental conditioning, stimuli (S) become associated with the response outcome (O; the reinforcer) through classical conditioning (the S-O association), just as a CS becomes associated with a US. The expectation formed is an expectation of the reward (O), and the resulting emotional state (like a CR) motivates the instrumental behaviour.
b) The two processes in two-process theory are classical conditioning, which produces the S-O association, and operant/instrumental conditioning. Classical conditioning mediates instrumental conditioning because the emotional state conditioned to the S motivates the instrumental response. An example broken into the two processes:
S = favourite relative’s house
R = help favourite relative
O = a delicious meal
Organisms tend to acquire specific reward expectancies. For example, if a mouse is conditioned with a particular food pellet, it will expect that specific food pellet from then on, and switching to a different reward can disrupt the instrumental behaviour.
a) Is the R-O association actually important in instrumental learning? How could its importance be demonstrated experimentally?
b) Compare how the S-R and R-O associations can contribute to drug addiction.
a) Yes, it is important. Its importance can be demonstrated with reinforcer devaluation: if you devalue the reinforcer after conditioning, responding for it decreases and people will choose to work for other reinforcers instead, showing that they had learned which outcome the response produces.
b) S-R mechanisms are activated by the drug-related cues (habit; predominant later in addiction). R-O mechanisms represent drug-seeking behaviour (seeking reinforcement; predominant earlier in addiction - fade out later in addiction)
What is an S(R-O) association, and how is it different from an S-R association? It might be helpful to explain using an example.
S's activate R-O associations in addition to activating R's. For example, I see a vending machine (S), I put money in (R), and I expect the food (O). None of this would have happened if I hadn't seen the vending machine first (S).
Explain how you would need to use different approaches to produce stereotypy of responses (consistent/similar responses), vs. variability or creativity in responses, using reinforcement. Give a real-life example that demonstrates how either outcome could be achieved.
For stereotypy, you would need to design the reinforcement so that it lines up with natural behaviour and reinforce one consistent response form; for example, using food as a reinforcer when an animal is hungry works better than when it is not hungry. For promoting creativity and variability in responses, you would need to make variability itself the requirement for reinforcement, reinforcing only responses that differ from previous ones, and choose activities where creativity and variability are normal or beneficial, for example, art projects.
How can belongingness, instinctive drift and behaviour systems theory explain why reinforced behaviours can be different, or less, than those expected or desired?
a) belongingness = certain responses belong with the reinforcer because of evolutionary history
b) instinctive drift = animals may revert to automatic behaviour that interferes with attempts to reinforce other behaviours
c) behaviour systems theory = if an animal is food-deprived and might encounter food, its feeding system is activated, so reinforced responses tend to drift toward species-typical food-related behaviours
For each of the following examples, identify which option within would likely produce stronger reinforced behaviour and briefly explain why, including any relevant terms you have learned:
a) Being paid $25, versus being paid $100, for each “A” grade earned in school.
b) Giving praise such as “good work!” versus “it is clear you put a lot of effort into crafting the carefully-considered arguments in this paper.”
c) Giving a toddler a cookie 5 seconds after they shared a toy with another child, versus giving a cookie 6 hours after they shared the toy.
- What is the credit-assignment problem?
d) Receiving 1% bonus credit for a short assignment versus receiving 1% bonus credit for a short assignment after having received 2% bonus credit for the same short assignment on a different topic in the same class one month prior.
a) Being paid $100: the bigger the reward (reinforcer magnitude), the more likely a response will occur
b) The detailed praise: a higher-quality reinforcer is more effective than a lower-quality one
c) The cookie after 5 seconds: the faster the reward is given (temporal contiguity), the better
- The credit-assignment problem is figuring out which of the many responses made before a delayed reinforcer was actually responsible for it; long delays make this attribution difficult
d) The first option is better because receiving a reward that is smaller than one previously received produces negative contrast, making it less effective
What is a schedule of reinforcement? What is the difference between a continuous vs. a partial (intermittent) reinforcement schedule? Give real-life examples of continuous and partial reinforcement schedules. Are partial reinforcement schedules less effective due to using less reinforcement?
- schedule of reinforcement = a program or rule that determines which responses are followed by reinforcers
- continuous reinforcement schedule = every single response is followed by the outcome
- partial (intermittent) reinforcement = only some responses are followed by the outcome
- example of continuous reinforcement = turning on a kettle
- example of partial reinforcement = gambling
- partial reinforcement schedules are not less effective just because there is less reinforcement; partially reinforced behaviour is actually more persistent during extinction (the partial reinforcement extinction effect)
Explain how each of fixed ratio, variable ratio, fixed interval, and variable interval schedules of reinforcement are organized. Give a real-life example of each. Are these continuous, or partial, reinforcement schedules? How could you carry out an FR5 schedule? A VR3 schedule? An FI2 schedule? A VI4 schedule?
- fixed ratio = a set number of responses is required for each outcome
- variable ratio = reinforcement occurs after a number of responses that varies around an average
- fixed interval = the first response after a fixed number of seconds/minutes leads to an outcome
- variable interval = the first response after an interval produces the outcome, but the interval length varies (it has an average length)
- these are partial schedules of reinforcement
- FR5 = give an outcome for every 5th response
- VR3 = an average of 3 responses leads to an outcome; the person does not know the exact number
- FI2 = give an outcome for the first response after 2 minutes have elapsed
- VI4 = give an outcome for the first response after an interval averaging 4 minutes; the person does not know when
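The ratio schedules above can be sketched in code. This is a hypothetical illustration (not from the course material): `fr5` reinforces every 5th response, and `make_vr3` reinforces after a number of responses that varies around an average of 3.

```python
import random

def fr5(response_count):
    """FR5: every 5th response is reinforced."""
    return response_count % 5 == 0

def make_vr3(rng):
    """VR3: reinforce after a varying number of responses
    (here drawn from 1 to 5, which averages 3)."""
    required = rng.randint(1, 5)
    count = 0
    def respond():
        nonlocal required, count
        count += 1
        if count >= required:          # ratio requirement met
            count = 0
            required = rng.randint(1, 5)  # new, unpredictable requirement
            return True
        return False
    return respond

# Under FR5, responses 1..10 earn reinforcers only at the 5th and 10th.
print([n for n in range(1, 11) if fr5(n)])  # [5, 10]

# Under VR3, reinforcement is unpredictable response-to-response.
vr3 = make_vr3(random.Random(0))
outcomes = [vr3() for _ in range(30)]
```

Interval schedules would instead check elapsed time rather than a response count, which is why they cap how fast reinforcers can be earned.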
Explain why:
a) VR schedules tend to produce the highest rate of responding.
b) There are post-reinforcement pauses in FR schedules.
c) There are similar patterns of responding between FR and FI schedules, using the term “ratio run.”
d) Interval schedules are more limiting than ratio schedules with respect to the amount of reinforcement that can be obtained (feedback function).
a) the number of responses determines the amount of reinforcement (no waiting for an interval to expire), and there are few or no post-reinforcement pauses
b) after each reinforcer, a whole new ratio of responses must be completed before the next one, so the person/animal pauses; the larger the upcoming ratio requirement, the longer the pause (it is not just contentment with the reward)
c) both schedules make the next reinforcer predictable, so both show a pause after reinforcement followed by a steady burst of responding: the ratio run in FR, and an accelerating run toward the end of the interval in FI
d) interval schedules impose an upper limit on how many reinforcers may be obtained per unit of time, whereas on ratio schedules faster responding always earns more reinforcers
What are concurrent schedules? What does the Matching Law demonstrate? How can response bias interfere with what is predicted by the Matching Law?
- concurrent schedules = more than one reinforcement schedule is active
- matching law = when two or more variable interval schedules for different behaviours are presented simultaneously, the relative rate of responding on each alternative matches the relative rate of reinforcement earned on it
- response bias = one response or reinforcer is preferred over the other for reasons unrelated to reinforcement rate, so behaviour deviates from the proportions the Matching Law predicts (does not match)
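The Matching Law can be written as B1/(B1+B2) = R1/(R1+R2), where B is behaviour (response rate) and R is reinforcement earned. A minimal sketch, with hypothetical numbers:

```python
def predicted_response_share(r1, r2):
    """Share of responding the Matching Law predicts for alternative 1,
    given reinforcement rates r1 and r2 on the two alternatives."""
    return r1 / (r1 + r2)

# If a pigeon earns 30 reinforcers/hour on key 1 and 10 on key 2,
# matching predicts 75% of pecks go to key 1.
print(predicted_response_share(30, 10))  # 0.75
```

Response bias would show up as a systematic deviation from this predicted share, e.g. consistently pecking key 1 more than 75% of the time because that key or its reinforcer is preferred.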
What is the relationship between delay discounting and self-control? According to the concept of delay discounting, how valuable is a reward obtained immediately versus an equivalent reward obtained a month from now? How could you study delay discounting in a lab with animals?
- self-control is choosing a large delayed reward over an immediate small reward, and delay discounting is the idea that the value of a reinforcer declines as a function of how long you have to wait to obtain it; the more steeply someone discounts delayed rewards, the harder self-control becomes
- the longer you wait, the less enticing the reward is.
- in the lab, you could give rats a choice between two levers: one that delivers a small, immediate reward (1 food pellet) and one that delivers a larger reward (3 pellets) after some delay. With a short delay (0-5 seconds), the rats choose the larger reward on nearly 100% of choice trials, but with a longer delay (20-30 seconds), rats are less likely to wait and will take the smaller immediate reward.
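Delay discounting is commonly modelled with a hyperbolic function, V = A / (1 + kD), where A is the reward amount, D the delay, and k an individual-specific discounting rate. A small sketch with hypothetical values (k = 0.1 is an assumption for illustration):

```python
def discounted_value(amount, delay, k=0.1):
    """Subjective value of a reward delivered after `delay` time units,
    using the hyperbolic discounting model V = A / (1 + k*D)."""
    return amount / (1 + k * delay)

# An immediate $100 keeps its full value; waiting 30 days (at k = 0.1)
# drops its subjective value to $25.
print(discounted_value(100, 0))   # 100.0
print(discounted_value(100, 30))  # 25.0
```

This is why the equivalent reward a month from now is worth much less subjectively, and why a steep k makes the small immediate reward win the choice.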
Compare Premack’s Principle versus the Response-Deprivation Hypothesis. How does each explain what will be reinforcing? Give a real-life example of a reinforcer that can be explained by each. Can both approaches be used to explain why the same reinforcer works? Try it!
- Premack's principle = a higher-probability activity can reinforce a lower-probability instrumental response; anything a person does frequently can serve as a reinforcer for a behaviour that is less likely than it
- response-deprivation hypothesis = restricting access to an activity below its baseline level is what makes it reinforcing; this works for any activity, high- or low-probability
- example of Premack's principle = wash the dishes and you can watch some TV after (TV watching is the higher-probability activity)
- example of response deprivation = if you're not allowed to play a certain sport, you will want to play the sport more and more
- both approaches can explain the TV example: TV is a higher-probability activity than dishwashing (Premack), and while washing dishes you are deprived of TV below its baseline level (response deprivation)
With respect to the Response Allocation approach, define Behavioural Bliss Point and compare your own behavioural bliss point with that of a friend or family member – what is similar between you, and what is different?
behavioural bliss point = preferred response allocation (pattern of activities) when there are no restrictions
How do new reinforcement schedules interact with behavioural bliss point? What ELSE needs to be considered in addition to the one new reinforcement schedule in order to more accurately describe behaviour (hint: adding a new reinforcement doesn’t necessarily remove other possible reinforcements).
- introducing a response-reinforcer contingency can make attaining Bliss Point impossible
- but organisms try to redistribute behaviour to get as close to it as possible (compromise)
- alternative available reinforcers can undermine the new reinforcement schedule, since the organism can approach its bliss point through other activities
Briefly describe how psychology can be explored using economics principles within the field of Behavioural Economics in consideration of consumer demand. What is meant by elasticity of demand? What are three factors that help to determine elasticity of demand? Can you think of psychologically-related examples of each?
- economics - relationship between the money we have and prices in the marketplace
- psychology - relationship between time/effort available and time/effort required to earn reinforcements
- elasticity of demand = degree to which price influences demand, or the number of required responses influences the number of reinforcers earned; sometimes people continue to buy (or respond) even if the price goes way up (inelastic demand)
- three factors:
1. availability of substitutes (e.g., demand for one snack drops quickly if a similar snack is available)
2. price range (a small price increase may barely affect demand, but demand drops once the price gets high enough)
3. income level (with more income, or more time and effort available, demand is less affected by price increases)
Describe the basic approach to conducting extinction for classical conditioning and for operant conditioning. How is each form of extinction different from forgetting? Why isn’t extinction the opposite of conditioning?
- extinction for classical conditioning = repeated presentations of CS but no US
- extinction for instrumental conditioning = no longer providing the reinforcer after response
- forgetting is a decline in responding due simply to the passage of time; extinction instead requires active experience of the CS without the US (classical) or of responding without reinforcement (instrumental)
- extinction isn't the opposite of conditioning because it is a new form of learning layered on top of the original; you are not unlearning what you already learned
What is exposure therapy? Is it linked more closely to classical or to operant conditioning? What does it attempt to accomplish? Describe two real-life applications of exposure therapy.
- exposure therapy = lots of experience with a CS (feared stimulus) in absence of the US (actual aversive stimulus/danger) in order to bring about extinction of the fear response (CR)
- it is linked more closely to classical conditioning
- real-life applications: gradual exposure to spiders to treat a spider phobia; repeated exposure to social situations (e.g., asking people out) to reduce fear of rejection
What are two main behavioural effects of extinction in operant conditioning (OC)? What is a frequent (negative) emotional effect of extinction of a reinforcer? How can the emotional effect influence behaviour?
- two main behavioural effects:
- decrease target response
- increase response variability
- frustration is a frequent (negative) emotional effect of extinction of a reinforcer
- the emotional effect of extinction energizes behaviour, e.g., frustration at a vending machine that fails to deliver can lead to pressing the buttons harder or shaking the machine
define spontaneous recovery
- spontaneous recovery = after a rest period following extinction training, the extinguished responding comes back
what needs to happen between extinction and recovery to allow for this recovery to occur?
- nothing specific is done during rest period to produce recovery
example of spontaneous recovery
work on fear of spiders, take a break and then your fear may pop up again down the road
define renewal
recovery of conditioned responding after extinction due to change of context
what needs to happen between extinction and recovery to allow for this recovery to occur?
the CS must be encountered in a different context from the one where extinction occurred; the conditioned response returns in a new context (or when back in the original acquisition context)