Statistical inference Flashcards

1
Q

Goal of statistics

A

To make inferences about a population, even though you can never truly measure an entire population

Instead, you must use a sample and ensure that the sample, and the inferences drawn from it, actually generalize to the population

2
Q

Sampling variation

A

Refers to the inevitable chance differences between our sample and our population.

We want to minimize this as much as possible in order to have a more accurate sample

We have to think about getting the best possible estimate in the face of sampling variation

3
Q

Sampling bias

A

Refers to systematic forces in how the sample is selected that make it unrepresentative of the population
Ex. only using university students, or only people who pick up the phone and agree to do a study over the phone

4
Q

Frequentist statistics

A

Frequentist statistics asks: how likely is it that this effect of the manipulation would have occurred if there were no effect in the actual population?
I.e., what is the chance this result was due to sampling variation?

Is there another reason this effect has arisen? Manipulation vs. random chance? What would happen if we ran this study multiple times?
Determining the answer to this question is the point of significance testing

5
Q

What is more common: Frequentist or Bayesian stats?

A

Frequentist statistics is more common: t-tests, p-values, null hypothesis

6
Q

Sampling distribution

A

The distribution of a statistic (e.g., the sample mean) calculated from repeated samples of the same size drawn from the population; we generally focus on its shape

7
Q

Central limit theorem

A

States that for a large enough sample size, the sampling distribution of the sample mean will…

Be approximately normally distributed

Be centered on the population mean

Have variance equal to the population variance divided by the sample size (equivalently, an SD equal to the population SD divided by the square root of the sample size)

This is true regardless of the distribution of the actual population
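A minimal R sketch of the idea (simulated data, purely illustrative): draw many samples from a skewed population and look at the distribution of their means.

population <- rexp(100000, rate = 1)                    # skewed (exponential) population
sample_means <- replicate(5000, mean(sample(population, size = 50)))
hist(sample_means)                                      # roughly normal despite the skewed population
c(mean(sample_means), mean(population))                 # sample means center on the population mean
c(sd(sample_means), sd(population) / sqrt(50))          # spread ≈ population SD / sqrt(n)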

8
Q

Standard error

A

The standard deviation of the sampling distribution of the mean

It gets smaller as the sample size gets bigger, so it can be used as a measure of sampling variation / the precision of the sample mean

Essentially, the standard deviation is a measure of the spread of the data, while the standard error is a measure of how much sample means vary from sample to sample

We cannot actually know the true population distribution, but we can use the CLT and the standard error to make inferences
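A minimal R sketch (x is a hypothetical sample, not from the cards): the standard error of the mean is the sample SD divided by the square root of the sample size.

x <- rnorm(40, mean = 100, sd = 15)   # hypothetical sample
sd(x)                                 # standard deviation: spread of the data
sd(x) / sqrt(length(x))               # standard error: precision of the sample mean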

9
Q

Confidence intervals

A

Use the CLT and the standard error to infer how close the sample mean is likely to be to the population mean

Through repeated sampling (or sub-sampling), we can make interval estimates called confidence intervals

After getting the mean of many subsamples, you can see what percent of the means fall within a given number of SEs on either side of the overall mean

A 95% CI is the sample mean plus or minus roughly 1.96 standard errors; across repeated sampling, about 95% of intervals constructed this way will contain the population mean

A bigger sample size gives smaller confidence intervals, because a bigger sample size gives a smaller SE, and therefore the mean estimate is more precise
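A minimal R sketch with hypothetical data, using the normal approximation (mean plus or minus 1.96 SE) and the t-based interval that t.test reports for comparison.

x <- rnorm(40, mean = 100, sd = 15)        # hypothetical sample
se <- sd(x) / sqrt(length(x))
mean(x) + c(-1.96, 1.96) * se              # approximate 95% CI: mean +/- 1.96 SE
t.test(x)$conf.int                         # t-based 95% CI for comparison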

10
Q

Null hypothesis (h0)

A

assumes that results obtained are the result of random chance/sampling variation, and not an effect of the manipulation

11
Q

Alternative hypothesis (h1)

A

assumes the results obtained were due to an effect of the manipulation.

12
Q

p-value and null hypothesis

A

P-value refers to the probability of obtaining a test statistic this extreme given that the null is true, i.e., by random chance (a conditional probability)

If the p-value (the probability of getting this result by chance alone) is very small, you can reject the null hypothesis and conclude the result is an effect of the manipulation
I.e., if p is very small, H0 is probably wrong, so we should reject it

The p-value threshold is industry standard / arbitrarily set at 0.05 (if p is smaller than or equal to it, we usually reject the null; if bigger, we fail to reject the null)

13
Q

How to get p-value

A

The p-value is the favored tool for hypothesis testing

Essentially, create a null distribution: what would we expect to see if this study were repeated and there was no real effect?

To get the p-value, take the standard error and calculate how many SEs the sample mean is from the population mean

Do this with a z-score:
z = (sample mean - population mean) / SE

The probability of observing a sample mean at least this extreme by chance is the p-value

If smaller than 0.05, the result can be considered statistically significant
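A minimal R sketch of this calculation, assuming a hypothetical population mean of 100 under the null and a hypothetical sample:

x <- rnorm(50, mean = 105, sd = 15)     # hypothetical sample
pop_mean <- 100                         # assumed population mean under H0
se <- sd(x) / sqrt(length(x))
z <- (mean(x) - pop_mean) / se          # how many SEs the sample mean is from the population mean
2 * pnorm(-abs(z))                      # two-sided p-value from the standard normal distribution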

14
Q

Type 1 error

A

False positive, i.e., rejecting the null despite there being no real effect

Alpha is the probability of a type 1 error

Probability of a true positive is 1-beta
Probability of a false positive is alpha

15
Q

Type 2 error

A

False negative, i.e., failing to reject the null despite there being a real effect; can be combated by increasing power

Beta is the probability of a type 2 error

Probability of a true negative is 1-alpha
Probability of a false negative is beta

16
Q

Alpha

A

Alpha is the probability of a type 1 error

17
Q

Beta

A

Beta is the probability of a type 2 error

18
Q

Definition of power

A

Your ability to detect true positive results, also known as sensitivity; power = 1-beta

Standard / arbitrarily set at 80%

Power is a property of a statistical test, not a test itself

Low power is a waste of resources because there is a high chance of false negatives, and also a higher chance of overinflated effect size estimates (winner's curse)

19
Q

Winners curse

A

Small sample sizes have a higher chance of randomly large effect sizes being flagged as significant, so significant results from underpowered studies tend to overestimate the true effect

20
Q

Power is dependent on…

A

Sample size, the statistical test being used, the significance threshold, and the size of the effect you are looking for

While holding the other three factors constant…
One-sided tests have higher power
Larger samples have higher power
Larger effect sizes have higher power
Higher alpha values have higher power

21
Q

How to calculate sample size needed

A

We use a power analysis to calculate the sample size needed; a larger sample size lets us keep both alpha and beta low, as sketched below
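A hedged R sketch with pwr.t.test from the pwr package (the effect size, alpha, and power values are illustrative only): leaving n out makes the function solve for the sample size.

library(pwr)
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
           type = 'two.sample', alternative = 'two.sided')   # solves for n (roughly 64 per group)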

22
Q

One-sided test

A

Used when you are sure about / only care about one direction of the effect, i.e., testing whether the group mean is > (or <) a given value, rather than just different from it

23
Q

Two sided test

A

Used when the effect could go in either direction, i.e., you are testing for any difference rather than a difference in a predicted direction

When doing a two-sided test, the significance level is split between the two tails of the null distribution; these tail areas under the curve are the critical (rejection) regions

I.e., a test statistic that falls in either tail region is beyond the significance threshold, so the null is rejected
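A minimal R sketch of the critical values for a two-sided z-test at alpha = 0.05: the 5% is split into 2.5% in each tail.

alpha <- 0.05
qnorm(c(alpha / 2, 1 - alpha / 2))   # critical z values, about -1.96 and +1.96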

24
Q

T-test

A

Cannot do a z-test because we do not know the true SD of the population; must do a t-test using the sample SD as an estimate

But this t-statistic doesn't follow the normal distribution; it follows the t-distribution, which is slightly different (heavier tails)

The t-distribution changes depending on the degrees of freedom (N-1),
which accounts for the uncertainty that comes with a smaller sample size

The larger the sample size, the closer the t-distribution gets to the normal distribution
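A minimal R sketch with hypothetical data, computing the one-sample t statistic by hand from the card's logic (sample SD as the estimate, df = N-1):

x <- rnorm(20, mean = 52, sd = 8)                      # hypothetical sample
mu <- 50                                               # value assumed under H0
t_stat <- (mean(x) - mu) / (sd(x) / sqrt(length(x)))   # t = (sample mean - mu) / estimated SE
df <- length(x) - 1                                    # degrees of freedom
2 * pt(-abs(t_stat), df)                               # two-sided p-value from the t-distribution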

25
Q

One sample t-test

A

Compares the sample mean to a given value that is already known (e.g., a hypothesized population mean)
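A minimal usage sketch (hypothetical data; the known value of 50 is just an example):

x <- rnorm(20, mean = 52, sd = 8)   # hypothetical sample
t.test(x, mu = 50)                  # one-sample t-test against the known value 50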

26
Q

Two sample t-test or independent groups t-test

A

Compares means from two separate groups (between subjects) and assumes…

Data are normally distributed in each group

Observations between and within groups are independent of one another

Variance is the same in both groups, but only in Student's t-test; we usually use Welch's t-test, which does not assume this

27
Q

Paired t-test

A

Compares two means from the same sample (repeated measures / within subjects) and assumes…

You are comparing related groups or matched pairs / measuring the same individuals twice

The difference scores should be normally distributed

The t-test is run on the difference scores between the two measurements

28
Q

Cohen's d

A

measure of effect size between two means

d=0.2 is a small effect
d=0.5 is a medium effect
d=0.8 is a large effect

But these values are arbitrary and should not be interpreted rigidly

Interpret in the context of the study
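A minimal R sketch of the underlying formula with hypothetical, equally sized groups: d is the difference in means divided by the pooled SD.

g1 <- rnorm(30, mean = 10, sd = 2)            # hypothetical group 1
g2 <- rnorm(30, mean = 11, sd = 2)            # hypothetical group 2
sd_pooled <- sqrt((var(g1) + var(g2)) / 2)    # pooled SD (equal group sizes)
(mean(g2) - mean(g1)) / sd_pooled             # Cohen's d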

29
Q

R: Power test

A

pwr.t.test(n = # of participants, d = Cohen's d / effect size, sig.level = #, power = #, type = 'one.sample' or 'two.sample', alternative = 'two.sided')

Requires the pwr package (library(pwr)); leave exactly one of n, d, sig.level, or power unspecified and the function solves for it
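For example (illustrative numbers only), checking the power of a planned two-sample study with 30 participants per group and an expected d of 0.5:

library(pwr)
pwr.t.test(n = 30, d = 0.5, sig.level = 0.05,
           type = 'two.sample', alternative = 'two.sided')   # solves for power (below the usual 0.8 target here)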

30
Q

R: Effect size / Cohen's d

A

cohensD(group1, group2, method = 'unequal')

Requires the lsr package (library(lsr)); method = 'unequal' matches the Welch (unequal-variance) t-test

31
Q

R: QQ plot

A

qqnorm(dataset$column)
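A common follow-up with hypothetical data: adding a reference line makes departures from normality easier to see.

x <- rnorm(50)   # hypothetical data
qqnorm(x)        # sample quantiles vs theoretical normal quantiles
qqline(x)        # reference line; points close to it suggest approximate normality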

32
Q

R: Independent two-sided T.test

A

t.test(group1, group2, alternative = 'two.sided', var.equal = FALSE)

33
Q

R: Paired t-test

A

t.test(x = before, y = after, alternative = 'greater', paired = TRUE)

34
Q

R: get vector length

A

length(column)