Statistics 2 Flashcards

Standard error, bootstrapping, correlation coefficient.

1
Q

What did Hermann Ebbinghaus pioneer?

A

Learning/experience curves.

2
Q

What shape does a learning curve of proficiency against effort expended take?

A

A curve that keeps rising but flattens as effort increases (diminishing returns).

3
Q

What is the reasoning for the shape of a learning curve?

A

Learning how to do things more efficiently typically requires geometrically increasing effort, so progress continues but with positive, ever-diminishing rewards.

4
Q

For power law:
y=ax^b

What returns are there when:
b > 1
b < 1

A

b > 1: increasing returns

b < 1: decreasing returns
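
A minimal sketch of the two cases, assuming a > 0, x > 0 and, for the second case, 0 < b < 1. The function names and the values of a, b and x are illustrative only. It compares the gain in y for each extra unit of x.

```python
def power_law(x, a, b):
    """Evaluate y = a * x^b."""
    return a * x ** b


def marginal_gains(a, b, xs):
    """Return the increase in y between consecutive x values."""
    ys = [power_law(x, a, b) for x in xs]
    return [y2 - y1 for y1, y2 in zip(ys, ys[1:])]


xs = [1, 2, 3, 4, 5]  # illustrative effort levels

# b > 1: each extra unit of x adds more than the one before.
increasing = marginal_gains(1.0, 2.0, xs)

# 0 < b < 1: each extra unit of x adds less than the one before.
decreasing = marginal_gains(1.0, 0.5, xs)

print("b = 2.0 gains:", increasing)
print("b = 0.5 gains:", decreasing)
```

A learning curve with diminishing rewards corresponds to the second case.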

5
Q

What does standard deviation measure?

A

The spread of values around their mean.

6
Q

What standard deviation was used in this PowerPoint?

A

Sigma (σ) = the population standard deviation.

7
Q

What is bootstrapping?

A

Repeatedly resampling (with replacement) from a known sample to explore uncertainty about the sample mean.
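
A minimal sketch of the idea, assuming the usual form of bootstrapping in which each resample is drawn with replacement and has the same size as the original sample. The data values, the function name and the number of resamples are made up for illustration.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each resample drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    return [
        statistics.mean(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    ]


# Made-up measurements, purely for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]

means = bootstrap_means(sample)

# The spread of the resampled means indicates how uncertain
# the original sample mean is.
spread = statistics.stdev(means)
print("sample mean:", statistics.mean(sample))
print("spread of bootstrap means:", spread)
```

If the resampled means vary widely, the sample mean is a less reliable estimate of the population mean.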

8
Q

What is suggested when the bootstrapping method demonstrates uncertainty associated with the sample mean (getting very different results each time)?

A

It may not be a good representation of the true mean for the whole population which the sample was taken from.

9
Q

How do you show the range of uncertainty for a value?

A

Standard error bars.

The amount of uncertainty is represented by the size of the error bar.

10
Q

How do you calculate standard error?

A

The standard deviation (sigma) / the square root of the sample size.
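
A short sketch of this calculation, assuming the population standard deviation (sigma) is used, as elsewhere in this deck. Many texts use the sample standard deviation instead (`statistics.stdev`). The data values and the function name are illustrative.

```python
import math
import statistics


def standard_error(values):
    """Standard error = sigma / sqrt(n), using the population SD."""
    sigma = statistics.pstdev(values)
    return sigma / math.sqrt(len(values))


# Illustrative data with a population standard deviation of 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

se = standard_error(values)
print("standard error:", se)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.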

11
Q

If a point appears anomalous and the sample mean lies within the standard error, what does this mean for the data point?

A

It is likely just a ‘blip’.

12
Q

How can you measure the linear trend of your data?

A

Using Pearson’s product-moment correlation coefficient (r).

13
Q

How do you work out r?

A

r = sum of ((x - mean of x)(y - mean of y)) / square root of (sum of (x - mean of x)^2 × sum of (y - mean of y)^2)
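
A minimal sketch of the calculation in plain Python. The denominator multiplies two separate sums of squared deviations, one for x and one for y. The function name and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of paired deviations.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of (sum of squared x deviations
    # times sum of squared y deviations).
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # perfectly straight upward line
falling = [10, 8, 6, 4, 2]  # perfectly straight downward line

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
print(r_rising, r_falling)
```

The function fails with a division by zero if either variable is constant, since r is undefined in that case.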

14
Q

What kind of data is finding r appropriate for?

A

Normally distributed data.

15
Q

What does r measure?

A

Linear correlation of x and y = the strength of a linear trend for paired data points from sample y plotted against sample x.

16
Q

What are the possible values of r?

A

-1 to +1

17
Q

What does it tell you if r is between -0.1 and +0.1?

A

Virtually no trend.

18
Q

What does it tell you if r is -0.3 or +0.3?

A

May be a trend.

19
Q

What shape is the line if r is -1 or +1?

A

Perfectly straight line.

20
Q

Even after calculating r and obtaining a value of +0.1, why should you always do a plot of the data when exploring relationships?

A

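A low Pearson’s correlation doesn’t always mean a lack of relationship: r only measures linear trends, so a strong non-linear pattern can still give a value near zero.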

21
Q

How can you check if a pattern occurred at random?

A

Randomise the data values (like in bootstrapping), recalculate r, and repeat many times to see how often a trend as strong as the observed one arises at random.

22
Q

What can you conclude about a piece of data if 20 randomisations show no trend as strong as r = +0.68?

A

There is less than a 1 in 20 chance of seeing a trend this strong at random.

p < 0.05
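
A minimal sketch of a randomisation check, assuming the y values are shuffled against fixed x values and that trend strength is judged by the magnitude of r. The data, the function names and the number of shuffles are illustrative. The p-value estimate counts the observed arrangement as one of the possible outcomes, so 20 shuffles with none as strong would give 1/21, which is below 0.05.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)


def randomisation_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Estimate how often shuffled data give a trend at least as strong."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real pairing of x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Count the observed arrangement itself as one outcome.
    return (hits + 1) / (n_shuffles + 1)


# Made-up data with a strong linear trend.
xs = list(range(1, 11))
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

p_value = randomisation_p_value(xs, ys)
print("estimated p-value:", p_value)
```

More shuffles give a finer estimate: with only 20, the smallest p-value this sketch can report is 1/21.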

23
Q

What is a p-value?

A

The probability of seeing a pattern at least as strong as the observed one purely at random.

24
Q

What p-values are we interested in?

A

p < 0.05; p < 0.01 is even better.

25
Q

If p > 0.05, what do you do?

A

Report exact value and conclude non-significant.

26
Q

If 0.05 > p > 0.01, what do you do?

A

Report the exact value and conclude significant.

27
Q

If p < 0.01, what do you do?

A

Don’t report exact value and conclude (highly) significant.

28
Q

What does the significance of a correlation depend on?

A

Its strength (r) and the size (n) of the sample.

29
Q

What combination of r and n gives a low p value?

A

A high magnitude of r (close to -1 or +1) and a high value of n.

30
Q

Correlation does not imply…

A

causation.

31
Q

When do you use a Mann-Whitney U test?

A

Aim = comparing samples.
Data = ranks/categories or measurements.

Data is not normally distributed (i.e. not symmetrical about the mean).

32
Q

When do you use a t-test?

A

Aim = comparing samples.
Data = measurements.
Data is normally distributed.

33
Q

When do you use a chi-squared test?

A

Aim = comparing samples.
Data = frequencies.

34
Q

When do you use a regression test?

A

Aim = identify a trend/relation.
Variables = manipulated vs dependent.

35
Q

When do you use a correlation test?

A

Aim = identify a trend/relation.
Variables = Independent vs dependent.

36
Q

What is an alternative hypothesis?

A

A statement in hypothesis testing that proposes a difference, relationship, or effect exists in the population, challenging the null hypothesis.

Investigators gather evidence that either supports it or fails to support it.

e.g. the player’s scores in consecutive games are correlated.

37
Q

What is a null hypothesis?

A

A foundational concept in hypothesis testing that proposes there is no effect, no difference, or no relationship in a population or between variables.

It serves as the default or starting assumption that researchers aim to challenge or test against.

e.g. the player’s scores in consecutive games are not correlated.

38
Q

What do we do instead of proving an alternative hypothesis to be true?

A

We support the alternative hypothesis by showing that the null hypothesis is unlikely to be true.

39
Q

What does the p-value show in terms of the null and alternative hypotheses?

A

How unlikely our observed results would be if the null hypothesis were true.

40
Q

If our result is very unlikely under the null hypothesis (e.g. p < 0.05), what do we do?

A

Reject the null hypothesis.

41
Q

What would 5 sigma indicate?

A

The observed statistic is more than 5 standard deviations from the mean.

For normal distribution, p < 0.0000003 (less than 1 in 3.5 million probability of seeing result by chance).
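
The quoted figure can be checked with a short sketch, assuming a standard normal distribution and a one-tailed probability (the chance of a result at least 5 standard deviations above the mean). The function name is illustrative.

```python
import math


def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_five_sigma = upper_tail_probability(5.0)
one_in = 1 / p_five_sigma

print("p for 5 sigma:", p_five_sigma)
print("roughly 1 in", round(one_in))
```

A two-tailed version, counting deviations in either direction, would double the probability.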