Lecture 10 Flashcards

Question 1

Q

What is learning?

Answer

A

Not just a change in behaviour over time: learning is not the same as development, e.g. you get taller but you didn’t learn to get taller
Learning is a change in behaviour due to reinforcement, via observational learning and classical conditioning
Behaviour can also improve over time via evolution and survival of the fittest, but that is passive rather than active, as evolution does the work
In decision making, the goal is maximising reinforcement/utility
Learning is improved decision making over time

Question 2

Q

How to improve decision making?

Answer

A

Better probability estimation: matching subjective probability to objective reality (coherence and correspondence)
Better utility estimation: expected vs experienced utility, i.e. how much you estimate you will like something vs how much you actually liked it

Question 3

Q

What is an example of learning?

Answer

A

Two options: Act 1 or Act 2
You will either win or lose, and you repeat the decision over time: remove outcome uncertainty by gaining additional information

Question 4

Q

What is the linear model of learning?

Answer

A

Two possible behaviours, each with a probability of being reinforced
The model tracks choice probabilities at time t; as more trials pass, the probabilities for the reinforced behaviour get bigger
At a given time there is a probability that picking act A will be reinforced (you win or lose); lambda is the learning rate
On every trial there is an error between anticipation and reality, because the outcome (win/lose) is still probabilistic; the error gets smaller as you adjust the probability estimate
The error is the teacher value (what happened) minus the probability estimate, so with enough trials the estimated probability approaches the true probability = decision making improves over time
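
A minimal Python sketch of the update described above, assuming the usual linear-operator form: new estimate = old estimate + lambda * (teacher value - old estimate). The function names, the starting estimate of 0.5 and the true win probability of 0.8 are illustrative assumptions, not values from the lecture.

```python
import random


def linear_update(p, teacher, lam):
    """Move the estimate p toward the teacher value by a fraction lam of the error."""
    error = teacher - p          # error between reality and anticipation
    return p + lam * error


def simulate(true_p, lam=0.05, trials=1000, p0=0.5, seed=0):
    """Track the estimated reinforcement probability of one act over repeated trials."""
    rng = random.Random(seed)    # seeded so the run is repeatable
    p = p0
    history = [p]
    for _ in range(trials):
        teacher = 1.0 if rng.random() < true_p else 0.0   # 1 = win, 0 = lose
        p = linear_update(p, teacher, lam)
        history.append(p)
    return history


if __name__ == "__main__":
    estimates = simulate(true_p=0.8)   # 0.8 is a hypothetical true win probability
    print(round(estimates[-1], 2))
```

A larger lambda makes the estimate react faster to each outcome but also makes it noisier from trial to trial; a smaller lambda learns more slowly but settles closer to the true probability.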

Question 5

Q

What is the Rescorla-Wagner model with nonlinear response mapping?

Answer

A

Replace external response probabilities with internal response weights; the weights are adjusted based on reinforcement on each trial
Predicts nonnormative probability matching: the probability of a person choosing either option converges with the weighted probabilities, so the model can tell whether behaviour is probability matching or is learning (normative)
A nonlinear response function predicts maximising, e.g. picking one option 100% of the time, rather than probability matching
The response function replaces the probability estimate with a weight, which changes the overall choice probability
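
A rough Python sketch of the idea above: weights are learned from reinforcement, then a response function maps them to a choice probability. The power-ratio response rule and its exponent k are assumptions chosen for illustration (the lecture's exact response function is not given here), and the function names are hypothetical.

```python
def update_weight(w, reinforced, lam=0.1):
    """Adjust an internal response weight toward 1 (reinforced) or 0 (not reinforced)."""
    teacher = 1.0 if reinforced else 0.0
    return w + lam * (teacher - w)


def choice_probability(w_a, w_b, k=1.0):
    """Map two non-negative weights to P(choose A) with a power-ratio response rule.

    k = 1 is a linear mapping (probability matching); larger k pushes the
    choice toward the option with the larger weight (maximising).
    """
    a, b = w_a ** k, w_b ** k
    if a + b == 0:
        return 0.5               # no evidence for either option
    return a / (a + b)


if __name__ == "__main__":
    # Hypothetical learned weights: A reinforced more often than B.
    print(choice_probability(0.7, 0.3, k=1))    # linear mapping: matches the weights
    print(choice_probability(0.7, 0.3, k=10))   # nonlinear mapping: close to always A
```

With the same weights, only the response function differs between the two calls, which is how the model separates what was learned from how it is expressed in choice.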

Question 6

Q

What is calibration?

Answer

A

Forecasters were too cautious and overemphasised uncertainty, e.g. saying rain = 0.6 when the true probability of rain = 0.8
With feedback and more information, weather predictions became better calibrated = subjective probabilities matched objective probabilities
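
One way to check calibration, sketched in Python: group forecasts by the stated probability and compare each with how often the event actually happened. The function name and the toy data are made up for illustration; the data simply mirror the 0.6 vs 0.8 example above.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Return {stated probability: observed frequency of the event}."""
    groups = defaultdict(list)
    for stated, happened in zip(forecasts, outcomes):
        groups[stated].append(happened)
    return {stated: sum(hits) / len(hits) for stated, hits in sorted(groups.items())}


if __name__ == "__main__":
    # Made-up toy data: 1 = it rained, 0 = it did not.
    forecasts = [0.6] * 5 + [0.2] * 5
    outcomes = [1, 1, 1, 1, 0] + [0, 0, 0, 0, 1]
    print(calibration_table(forecasts, outcomes))
```

A forecaster is well calibrated when each stated probability is close to its observed frequency; here the 0.2 forecasts are calibrated, while the 0.6 forecasts are too cautious.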

Question 7

Q

What is a nonnormative learning example?

Answer

A

The inverse base-rate effect
Condition 1 = training trials with corrective feedback; condition 2 = test trials without corrective feedback (introduces novel combinations of symptoms)
There are two diseases: one of them is more common than the other
In this condition, a person can only have one of these diseases
Patients with the common disease have two symptoms: one is perfectly diagnostic of the common disease; the other is not diagnostic because people with the rare disease have it too (an imperfect predictor)
People learn to predict perfectly accurately whether patients have the common or the rare disease
After training, ppts are introduced to new patients who have both predictive symptoms, and they pick the rare disease. Both cues are valid but conflict with each other, so ppts should follow the base rate (the common disease)
Ppts learn to predict the common disease because it occurs a lot with its combination of cues; for patients with the rare disease, they pay attention to the predictive cue only. The combination of test cues does not match anything people have learned, but it matches their representation of the rare disease better than that of the common disease

Question 8

Q

Why does feedback not fix everything?

Answer

A

Inverse base-rate effect is more than base-rate neglect
Base-rate neglect predicts 50/50 responding, but the inverse base-rate effect shows more rare than common responses
An inaccurate representation leads to nonnormative response probabilities

Question 9

Q

How to have better utility estimation?

Answer

A

Update expected utilities based on actual utilities
If you want to know the utility of some event, don’t predict how you will feel, because predictions are biased; look at how someone else currently experiencing that event feels

(9 cards)