Learning curve Flashcards

1
Q

rule of 10 000 hours of practice

A

the idea that after 10 000 hours of practice you would become an expert in the field

no empirical data supports this theory

it has actually been refuted

2
Q

descriptive models

A

they just fit the data - they don't tell us anything about the underlying cognition

3
Q

cognitive models

A

the parameters actually mean something -> they describe the underlying mechanism in terms of cognition and are derived from psychological theory

4
Q

exponential model

A

P = 1 - exp(-u*t)

where
P - performance scaled between 0 and 1 (proportion correct)
t - trial number
u - learning rate
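
A minimal sketch in Python (assuming numpy; the value of u is illustrative) of how this curve is computed:

import numpy as np

u = 0.3                   # learning rate (illustrative value)
t = np.arange(0, 21)      # trial numbers
P = 1 - np.exp(-u * t)    # performance rises quickly, then plateaus near 1
print(np.round(P, 2))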

5
Q

what are the characteristics of the exponential model?

A

learning is very quick at the beginning, then it plateaus

6
Q

fitting a model to the data

A

estimate model parameters given the data
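
A minimal sketch of fitting the exponential model to observed proportions (assuming scipy; the data values are made up for illustration):

import numpy as np
from scipy.optimize import curve_fit

def exp_model(t, u):
    # exponential learning curve P = 1 - exp(-u*t)
    return 1 - np.exp(-u * t)

t = np.arange(1, 11)
P_obs = np.array([0.24, 0.41, 0.55, 0.62, 0.71, 0.78, 0.81, 0.86, 0.88, 0.91])  # made-up data

u_hat, _ = curve_fit(exp_model, t, P_obs, p0=[0.1])  # estimate the learning rate from the data
print(u_hat)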

7
Q

concave models of the law of practice

A

both power and exponential functions are concave -> decelerating curve!

8
Q

hyperbolic function

A

P = t/(t+d)
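
A minimal sketch in Python (assuming numpy; d = 5 is an illustrative value) comparing the hyperbolic curve with the exponential one; note that P = 0.5 exactly when t = d:

import numpy as np

t = np.arange(1, 21)
d = 5.0                          # constant: P reaches 0.5 at t = d
P_hyp = t / (t + d)              # hyperbolic learning curve
P_exp = 1 - np.exp(-0.2 * t)     # exponential curve for comparison (u = 0.2)
print(np.round(P_hyp, 2))
print(np.round(P_exp, 2))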

9
Q

What is Gaussian noise?

A

a type of statistical noise whose probability density function (PDF) is the normal distribution
can be simulated with stats.norm.rvs or random.normal
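
A minimal sketch of adding Gaussian noise to a learning curve (assuming numpy; the sd of 0.05 is illustrative):

import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 21)
P = 1 - np.exp(-0.2 * t)                             # noiseless exponential curve
noise = rng.normal(loc=0, scale=0.05, size=t.size)   # Gaussian noise, mean 0, sd 0.05
P_noisy = np.clip(P + noise, 0, 1)                   # keep proportions in [0, 1]
print(np.round(P_noisy, 2))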

10
Q

What did Estes assume about the exponential law of practice?

A

the change in performance over time depends on the total performance yet to be achieved - the elements still to be learned

dP/dt = u(Pmax - P)

dP/dt - change in performance over time
Pmax - maximum performance
P - current performance
u - learning rate
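
A minimal sketch (assuming numpy) checking numerically that integrating dP/dt = u(Pmax - P) from P(0) = 0 reproduces the closed-form curve P(t) = Pmax*(1 - exp(-u*t)):

import numpy as np

u, P_max, dt = 0.3, 1.0, 0.001
t = np.arange(0, 20, dt)

# Euler integration of dP/dt = u * (P_max - P), starting at P = 0
P = np.zeros_like(t)
for i in range(1, t.size):
    P[i] = P[i - 1] + dt * u * (P_max - P[i - 1])

P_closed = P_max * (1 - np.exp(-u * t))   # closed-form solution
print(np.max(np.abs(P - P_closed)))       # tiny discrepancy -> same curve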

11
Q

What is an alternative to concave models?

A

an s-shaped learning function!

12
Q

when should you use the concave exponential function?

A

P = 1 - exp(-ut)
when learning single words (items)

13
Q

when should you use the compound exponential function?

A

P = (1 - exp(-ut))^c

when one has to learn sets of c words (fragments)
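
A minimal sketch (assuming numpy; u and c are illustrative values) showing that for c > 1 the compound exponential is S-shaped rather than concave:

import numpy as np

u, c = 0.15, 5                           # illustrative learning rate and number of fragments
t = np.arange(0, 41)
P_single = 1 - np.exp(-u * t)            # concave curve for a single item
P_compound = (1 - np.exp(-u * t)) ** c   # S-shaped curve for a set of c fragments
print(np.round(P_compound[:10], 3))      # slow start, then acceleration, then plateau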

14
Q

maximum likelihood function

A

used to estimate the parameters of a probability distribution (pdf) by maximizing the likelihood function, so that under the assumed statistical model the observed data are most probable

in short: given the model, find parameters for which data are most probable

15
Q

probability

A

Prob(data | model, parameters)

data - people who pick option x

given

model - number of people asked
parameters - probability of picking option x

16
Q

likelihood

A

Likelihood(parameters | model, data)

parameters - probability of picking option x

given

model - number of people asked
data - people who pick option x
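
A minimal sketch in Python (assuming scipy; the counts are made up) contrasting the two readings of the same binomial formula:

from scipy.stats import binom

n, k = 100, 37                      # 100 people asked, 37 picked option x (made-up counts)

# probability: parameter fixed, ask how probable these data are
print(binom.pmf(k, n, 0.4))         # Prob(data | model, parameter p = 0.4)

# likelihood: data fixed, ask how plausible different parameter values are
for p in (0.2, 0.37, 0.5):
    print(p, binom.pmf(k, n, p))    # Likelihood(p | model, data); largest near p = k/n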

17
Q

what is log likelihood? why is it preferred for maximum likelihood calculations?

A

natural logarithm of likelihood function

-> it turns products into sums, making complex likelihood functions easier to deal with
(it pushes very large numbers down, avoiding overflow to infinity)

-> it smooths out numerical instability issues that may occur when multiplying small probabilities
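
A minimal sketch (assuming numpy) of the numerical problem the log transform avoids:

import numpy as np

probs = np.full(1000, 0.01)     # 1000 small per-observation probabilities
print(np.prod(probs))           # the raw product underflows to 0.0
print(np.sum(np.log(probs)))    # the log-likelihood stays a usable finite number (about -4605)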

18
Q

How to use optimization to find the maximum likelihood estimate?

A

the idea is that you want to find the optimum (max or min) - which is similar to finding the deepest point in a lake

you can use

derivatives = give you local information about the slope or direction of the function at a given point
positive derivative = the function is increasing in this direction
negative derivative = the function is decreasing in this direction

concavity (2nd derivative) = helps to tell whether you are in a convex region (bowl) or a concave region (hill)
- helps to get an idea of whether you are near a minimum/maximum

19
Q

What is local optimum?

A

it is an illusory ‘‘deepest point of the lake’’ - it is lower than its surroundings, but not the lowest point in the lake

you can use special algorithms like simulated annealing or genetic algorithms to avoid it

20
Q

What can we do instead of maximizing the likelihood?

A

You can minimize! -> you then minimize the NEGATIVE log-likelihood
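
A minimal sketch (assuming scipy; the trial outcomes are made up) of estimating the learning rate u of the exponential model by minimizing the negative log-likelihood of correct/incorrect responses:

import numpy as np
from scipy.optimize import minimize

t = np.arange(1, 11)
correct = np.array([0, 0, 1, 0, 1, 1, 1, 1, 1, 1])   # made-up correct/incorrect outcomes

def neg_log_likelihood(params):
    u = params[0]
    p = 1 - np.exp(-u * t)                 # predicted probability correct on each trial
    p = np.clip(p, 1e-9, 1 - 1e-9)         # avoid log(0)
    ll = correct * np.log(p) + (1 - correct) * np.log(1 - p)   # Bernoulli log-likelihood
    return -np.sum(ll)                     # minimize the NEGATIVE log-likelihood

result = minimize(neg_log_likelihood, x0=[0.1], bounds=[(1e-6, None)])
print(result.x)                            # maximum likelihood estimate of u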

21
Q

multiple regression
y = 3 - 0.2 x + 0.5 w
- what is B0?

A

3 = intercept!
baseline value of DV y when both x and w are zero

22
Q

multiple regression y = 3 - 0.2 x + 0.5 w
- what is B1?

A

-0.2 = slope/effect of predictor x on y (note the negative sign: y decreases as x increases)
how much y changes when x changes by one unit

23
Q

multiple regression y = 3 - 0.2 x + 0.5 w
- what is B2?

A

0.5 = slope/effect of predictor w on y
quantifies how much y changes when w changes by one unit
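
A minimal sketch in Python (assuming numpy; the data are simulated) recovering B0, B1 and B2 from the regression above with ordinary least squares:

import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
w = rng.normal(size=n)
y = 3 - 0.2 * x + 0.5 * w + rng.normal(scale=0.1, size=n)   # B0=3, B1=-0.2, B2=0.5 plus noise

X = np.column_stack([np.ones(n), x, w])        # design matrix with an intercept column
betas, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares estimates
print(np.round(betas, 2))                      # approximately [ 3.  -0.2  0.5]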

24
Q

what is sigma?

A

the standard deviation of the residuals
the model assumes that residuals follow a normal distribution (important because the likelihood is based on the assumption that the errors - the residuals - are normally distributed with standard deviation sigma)
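
A minimal sketch (assuming scipy and simulated data as above; the function name is illustrative) showing how sigma enters the likelihood: the regression log-likelihood sums the normal log-density of each residual with standard deviation sigma:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
x = rng.normal(size=100)
w = rng.normal(size=100)
y = 3 - 0.2 * x + 0.5 * w + rng.normal(scale=0.1, size=100)

def log_likelihood(b0, b1, b2, sigma):
    residuals = y - (b0 + b1 * x + b2 * w)
    return np.sum(norm.logpdf(residuals, loc=0, scale=sigma))  # normal residuals with sd sigma

print(log_likelihood(3, -0.2, 0.5, 0.1))   # log-likelihood at the true parameter values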