Statistics 2 Flashcards by Sophie Sandercock

What did Hermann Ebbinghaus pioneer?

Learning/experience curves.

What shape does a learning curve of proficiency against effort expended take?

What is the reasoning for the shape of a learning curve?

Learning to do things more efficiently typically requires geometrically increasing effort, so progress continues but with ever-diminishing returns.

For power law:
y=ax^b

What returns are there when:
b > 1
b < 1

b > 1: increasing returns
b < 1: decreasing returns
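As a rough illustration of the power-law card, the sketch below compares the gain in y per extra unit of x for one exponent above 1 and one below 1. The values a = 1, b = 2 and b = 0.5, and the helper names, are assumptions chosen only for this example.

```python
def power_law(x, a, b):
    """Return y = a * x**b."""
    return a * x ** b


def marginal_gains(a, b, xs):
    """Return the increase in y between consecutive x values."""
    ys = [power_law(x, a, b) for x in xs]
    return [y2 - y1 for y1, y2 in zip(ys, ys[1:])]


xs = [1, 2, 3, 4, 5]                         # illustrative effort levels
increasing = marginal_gains(1.0, 2.0, xs)    # b > 1: each step gains more
decreasing = marginal_gains(1.0, 0.5, xs)    # b < 1: each step gains less

print(increasing)
print(decreasing)
```

With b > 1 each extra unit of x adds more than the last; with b < 1 (and b > 0) each extra unit still adds something, but less than the one before.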

What does standard deviation measure?

The spread of values around their mean.

What standard deviation was used in this PowerPoint?

The population standard deviation (sigma).

What is bootstrapping?

Repeatedly resampling (with replacement) from a known sample to explore the uncertainty in the sample mean.

What is suggested when the bootstrapping method demonstrates uncertainty associated with the sample mean (getting very different results each time)?

The sample mean may not be a good representation of the true mean of the population from which the sample was taken.
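A minimal sketch of the bootstrapping idea in the two cards above, assuming resampling with replacement and a made-up sample. The function name, the number of resamples and the seed are all illustrative choices, not taken from the course material.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Resample with replacement and return the mean of each resample."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(sample)
    return [statistics.mean(rng.choices(sample, k=n))
            for _ in range(n_resamples)]


sample = [4.1, 5.0, 5.3, 4.8, 6.2, 5.5, 4.9, 5.1]   # made-up measurements
means = bootstrap_means(sample)
spread = statistics.stdev(means)       # how much the resampled means vary

print(f"sample mean = {statistics.mean(sample):.2f}")
print(f"spread of bootstrap means = {spread:.2f}")
```

A large spread in the resampled means suggests the sample mean is an uncertain estimate of the population mean.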

How do you show the range of uncertainty for a value?

Standard error bars.

The amount of uncertainty is represented by the size of the error bar.

How do you calculate standard error?

Standard error = sigma / sqrt(n): the standard deviation (sigma) divided by the square root of the sample size (n).
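The formula can be checked with a few lines of code. This sketch assumes the population standard deviation (sigma), as the earlier card states, and uses made-up values.

```python
import math
import statistics


def standard_error(values):
    """Return sigma / sqrt(n), using the population standard deviation."""
    sigma = statistics.pstdev(values)      # population SD (sigma)
    return sigma / math.sqrt(len(values))


values = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up data
se = standard_error(values)
print(f"standard error = {se:.3f}")
```

If the sample standard deviation is wanted instead, `statistics.stdev` could be swapped in for `statistics.pstdev`.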

If a point appears anomalous and the sample mean lies within the standard error, what does this mean for the data point?

It is likely just a ‘blip’.

How can you measure the linear trend of your data?

Using Pearson’s product-moment correlation coefficient (r).

How do you work out r?

r = sum of ((x - mean of x)(y - mean of y)) / square root of (sum of (x - mean of x)^2 × sum of (y - mean of y)^2), where each sum runs over the paired values of x and y.
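The calculation of r can be written out step by step. This is a plain sketch of the standard Pearson formula; the function name and the example data are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of (sum of squared x deviations
    # times sum of squared y deviations).
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator


r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # perfectly straight, rising
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])    # perfectly straight, falling
print(r_up, r_down)
```

The function divides by zero if either variable has no variation at all, which this sketch does not guard against.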

What kind of data is finding r appropriate for?

Normally distributed data.

What does r measure?

The linear correlation of x and y: the strength of a linear trend for paired data points, with sample y plotted against sample x.

What are the possible values of r?

-1 to +1

What does it tell you if r is between -0.1 and +0.1?

Virtually no trend.

What does it tell you if r is -0.3 or +0.3?

May be a trend.

What shape is the line if r is -1 or +1?

Perfectly straight line.

Even after calculating r and obtaining a value of +0.1, why should you always do a plot of the data when exploring relationships?

A low Pearson’s correlation doesn’t always mean a lack of relationship: r only measures linear trends, so a non-linear relationship can still give r close to 0.
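A small made-up example shows why plotting matters. Here y depends entirely on x (y = x^2), yet r comes out at or extremely close to 0 because the relationship is not linear. The data and helper are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for paired data (same formula as on the earlier card)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]       # a perfect, but curved, relationship
r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

A scatter plot of these points would show a clear U-shape that the value of r alone hides.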

How can you check if a pattern occurred at random?

Randomise (shuffle) the data values, as in bootstrapping, and recalculate r each time to see how often chance alone gives a trend as strong as the observed one.

What can you conclude about a data set if 20 randomisations show no trend as strong as r = +0.68?

There is less than a 1 in 20 chance of seeing a trend this strong at random.

p < 0.05
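The randomisation check described above can be sketched as follows. This is one possible version, assuming the y values are shuffled against fixed x values and that "as strong as" means the absolute value of r is at least as large as the observed one. The data, function names, number of shuffles and seed are all made up for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r for paired data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator


def randomisation_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Fraction of shuffles giving a trend at least as strong as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)                 # copy so the caller's data is untouched
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)           # break any real pairing of x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles


xs = list(range(1, 11))
ys = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.4, 8.9, 10.2]   # made-up data
p = randomisation_p_value(xs, ys)
print(f"estimated p = {p}")
```

With only 20 shuffles, as on the card, the smallest non-zero estimate is 1 in 20, which is where the p < 0.05 statement comes from; more shuffles give a finer estimate.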

What is a p-value?

The probability of seeing a pattern at least as strong as the one observed purely by chance.

What p-values are we interested in?

p < 0.05; p < 0.01 is even better.

If p > 0.05, what do you do?

Report exact value and conclude non-significant.

If 0.05 > p > 0.01, what do you do?

Report the exact value and conclude significant.

If p < 0.01, what do you do?

Don't report exact value and conclude (highly) significant.
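The three reporting rules above can be collected into one small helper. This is only a sketch: the function name and output wording are made up, and the cards do not say what to do when p is exactly 0.05 or 0.01, so the boundary handling here is an assumption.

```python
def report_p(p):
    """Return (how to report p, conclusion) following the three cards above."""
    if p < 0.01:
        # Do not report the exact value.
        return "p < 0.01", "highly significant"
    if p < 0.05:
        # Assumption: p exactly 0.01 falls in this band.
        return f"p = {p:.3f}", "significant"
    # Assumption: p exactly 0.05 is treated as non-significant.
    return f"p = {p:.3f}", "non-significant"


print(report_p(0.2))
print(report_p(0.032))
print(report_p(0.004))
```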

What does the significance of a correlation depend on?

Its strength (r) and the size (n) of the sample.

What combination of r and n gives a low p value?

A high absolute value of r (a strong correlation) and a high value of n.

Correlation does not imply...

causation.

When do you use a Mann-Whitney U test?

Aim = comparing samples. Data = ranks/categories or measurements. Data is not normally distributed (not symmetrical about the mean).

When do you use a t-test?

Aim = comparing samples. Data = measurements. Data is normally distributed.

When do you use a chi-squared test?

Aim = comparing samples. Data = frequencies.

When do you use a regression test?

Aim = identify a trend/relation. Variables = manipulated vs dependent.

When do you use a correlation test?

Aim = identify a trend/relation. Variables = independent vs dependent.

What is an alternative hypothesis?

A statement in hypothesis testing that proposes a difference, relationship, or effect exists in the population, challenging the null hypothesis. Investigators test it against the data. E.g. the player's scores in consecutive games are correlated.

What is a null hypothesis?

A foundational concept in hypothesis testing that proposes there is no effect, no difference, or no relationship in a population or between variables. It serves as the default or starting assumption that researchers aim to challenge or test against. e.g. the player's scores in consecutive games are not correlated.

What do we do instead of proving an alternative hypothesis to be true?

We support a hypothesis by showing the null hypothesis is unlikely to be true.

What does the p-value show in terms of the null and alternative hypotheses?

How unlikely our observed results would be if the null hypothesis were true.

If our result is very unlikely e.g. p < 0.05 what do we do?

Reject the null hypothesis.

What would 5 sigma indicate?

The observed statistic is more than 5 standard deviations from the mean. For normal distribution, p < 0.0000003 (less than 1 in 3.5 million probability of seeing result by chance).
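The 5 sigma figure can be reproduced from the normal distribution. This sketch assumes the quoted probability is the one-sided tail beyond 5 standard deviations; the function name is made up.

```python
import math


def one_sided_tail(z):
    """Probability that a standard normal value exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p = one_sided_tail(5.0)
one_in = 1 / p
print(f"p = {p:.3g}, about 1 in {one_in / 1e6:.1f} million")
```

Counting both tails (5 standard deviations above or below the mean) would double this probability.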

Standard error, bootstrapping, correlation coefficient. (41 cards)