L8 - Statistical Power P1 Flashcards

1
Q

Statistical Significance Testing requires that we have a hypothesis about something we are interested in.

What does hypothesis refer to in this sense?

A

A hypothesis is an assertion about some population parameter or property of the world that we are interested in

2
Q

At its core, what is significance testing?

A

The traditional means of deciding whether or not to reject a hypothesis

3
Q

Why is significance testing necessary to accept/reject a hypothesis?

A

This is because we test our hypothesis by sampling from the population

We cannot test the entire population.

4
Q

In significance testing, when would we reject the hypothesis?

A

If our sample statistic differs significantly from that specified by the hypothesis, we reject the hypothesis.

Otherwise, we continue to entertain the hypothesis

5
Q

What is the type of hypothesis that we conventionally use for significance testing?

A

The Null Hypothesis

6
Q

The null hypothesis (H0) states that there is ___ effect

A

No effect

7
Q

If the null hypothesis is rejected then the _______ hypothesis is supported

A

Alternative (H1) hypothesis

The alternative hypothesis is strictly NOT H0

8
Q

How do you frame the null hypothesis so you can test it?

A

You identify a set of values you would expect if the null hypothesis were true.

You test the data against this distribution of values.

Once you have seen your results, you can reject or retain the null hypothesis.

9
Q

Rejecting the null hypothesis means that your hypothesis is supported.

True or False

A

False

When you reject the null, you are supporting the alternative hypothesis.

The alternative hypothesis is strictly every possible hypothesis that is not the null.

It is not the specific account that you happen to believe is true of the world.

10
Q

What is the sampling distribution of the mean?

A

If you were to run your study over and over with new samples, the sample mean would vary from one sample to the next.

The distribution of those sample means is the sampling distribution of the mean, and it is approximately normal.

11
Q

What are some of the features of the sampling distribution of the mean?

A

Looks like a normal distribution

Symmetrical

Its mean is an estimate of the population mean

Its SD is referred to as the “standard error of the mean”

We can know the proportions of values we would expect at any part of the curve.
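
A minimal simulation sketch of these features, assuming a normally distributed population with made-up parameters (`POP_MEAN`, `POP_SD`, `N` and `REPS` are illustrative values, not from the cards). It draws many samples, keeps each sample mean, and compares the spread of those means with the standard error formula σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100.0, 15.0  # hypothetical population parameters
N = 25                          # size of each sample
REPS = 5000                     # number of repeated samples

# One sample mean per repetition: together these approximate the sampling distribution.
sample_means = [
    statistics.fmean([random.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
empirical_se = statistics.stdev(sample_means)
theoretical_se = POP_SD / N ** 0.5  # standard error of the mean = sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.2f}")
print(f"empirical SE: {empirical_se:.2f}  theoretical SE: {theoretical_se:.2f}")
```

The mean of the sample means should land close to the population mean, and the two standard errors should roughly agree.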

12
Q

How do we use sampling distribution of the mean to accept or reject the null hypothesis?

A

Because we know what the sampling distribution looks like if the null is true, we can check where our sample mean falls in it. If it falls where the null makes values likely, we retain the null.

Values that are close to the mean have high probability under the null. Values at the ends of the distribution have low probability. If our sample mean falls in the very low probability tails, we reject the null in favour of the alternative hypothesis.

13
Q

What is the critical region in NHST?

A

The set of values for which we reject the null hypothesis; its boundary is the cutoff

Typically 2.5% in each tail (α = .05, two-tailed)
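
As a small sketch, the two-tailed cutoff for α = .05 can be read off the standard normal distribution. This assumes a z-test; the helper `in_critical_region` is a hypothetical name used only for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05

# Upper cutoff leaves ALPHA / 2 in the right tail; the lower cutoff is its negative.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

def in_critical_region(z, cutoff=z_crit):
    """Return True when a z statistic falls in either 2.5% tail."""
    return abs(z) > cutoff

print(round(z_crit, 2))          # 1.96
print(in_critical_region(2.1))   # True
print(in_critical_region(-1.5))  # False
```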

14
Q

A 5% level of significance means that if our data falls within that 5% critical region (2.5% on each side) our results are…

A

Statistically significant, we reject the null hypothesis

15
Q

A z-score of 1 means…

A

The results are 1 standard deviation away from the mean.

A z-score expresses how many standard deviations a value lies from the mean
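
A short sketch of the calculation, using made-up numbers (a score of 115 on a scale with mean 100 and SD 15). The function name `z_score` is illustrative.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

z = z_score(115, mean=100, sd=15)

# Share of a normal distribution lying within 1 SD of the mean.
share_within_1_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(z)                            # 1.0
print(round(share_within_1_sd, 3))  # 0.683
```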

16
Q

When setting up null-hypothesis significance testing, are we trying to prove our hypothesis?

A

No, we are comparing our results to what we would expect if there were no effect.

If the results differ, we reject the null. This does not mean our conclusions are accurate, only that there is an effect somewhere.

17
Q

How do we determine the “critical region” where we reject the null hypothesis?

A

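The α level (the significance criterion).

It is a conventional, somewhat arbitrary threshold that we decide on before doing the study. The p value from our data is then compared against it.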

18
Q

In NHST there are 4 outcomes of whether we are right or wrong in our conclusions (2 right and 2 wrong).

What are they?

A
  1. We reject the null hypothesis when the null hypothesis is false, then we have drawn the correct conclusion.
  2. We fail to reject the null when the null hypothesis is true, then we have drawn the correct conclusion
  3. The null is true but we reject the null. Type 1 error
  4. The null is false, but we retain the null. Type 2 error
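
The four outcomes can be illustrated with a rough simulation. This sketch assumes a two-sided z-test of H0: mean = 0 with a known SD of 1; the sample size, the true effect of 0.5 and the helper `rejects_null` are hypothetical choices, not part of the cards.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is repeatable

ALPHA, N, REPS = 0.05, 25, 4000
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_null(true_mean):
    """Draw one sample (known SD = 1) and test H0: mean = 0 with a two-sided z-test."""
    sample = [random.gauss(true_mean, 1.0) for _ in range(N)]
    z = fmean(sample) / (1.0 / N ** 0.5)
    return abs(z) > Z_CRIT

# Null true: every rejection is a type 1 error.
type1_rate = sum(rejects_null(0.0) for _ in range(REPS)) / REPS
# Null false (true mean 0.5): every failure to reject is a type 2 error.
type2_rate = sum(not rejects_null(0.5) for _ in range(REPS)) / REPS

print(f"type 1 error rate: {type1_rate:.3f}")
print(f"type 2 error rate: {type2_rate:.3f}")
```

The type 1 rate should sit near α, while the type 2 rate depends on the effect size and sample size chosen.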
19
Q

Why are type 1 errors seen as worse than type 2 errors typically?

A

Type 1 errors mean that you believe there is an effect when there is none, so future researchers will be building on a false premise.

Type 2 errors are still bad, because we believe that there is no effect when there is one (e.g. when trying to cure a disease, we conclude the treatment doesn’t work when it does)

20
Q

What symbol do we use to denote the probability for a type one error?

A

alpha (α)

21
Q

What symbol do we use to denote the probability for a type two error?

A

β (beta)

22
Q

What is a type 1 error?

A

Rejecting a null hypothesis when it is really true

23
Q

What is a type 2 error?

A

Failing to reject a null hypothesis when it is really false

24
Q

Why can’t we make α tiny and reduce all type 1 errors?

A

Because there is a tradeoff: for a fixed sample size and effect size, reducing α reduces type 1 errors but increases type 2 errors
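
A sketch of the tradeoff under a normal approximation. It assumes a two-sided z-test and a hypothetical true effect that shifts the test statistic by 2.5 standard errors (`SHIFT`); the function name is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_sided_z(shift, alpha):
    """Power of a two-sided z-test when the true effect shifts z by `shift` standard errors."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail

SHIFT = 2.5  # hypothetical true effect, in standard-error units
betas = {}
for alpha in (0.05, 0.01, 0.001):
    betas[alpha] = 1 - power_two_sided_z(SHIFT, alpha)
    print(f"alpha = {alpha:<5}  beta = {betas[alpha]:.3f}")
```

With everything else held fixed, each smaller α gives a larger β.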

25
Q

How is α (alpha) determined?

A

The α level is set prior to analysis and is usually α = .05 (5%)

26
Q

“Statistical power is the complement of making a type 2 error”

What is statistical power?

A

The probability that you will correctly reject a false null hypothesis.

27
Q

“Power” is the probability of correctly rejecting the ____ when the ____ is false. (same phrase)

A

Null hypothesis (1-β)

We conclude that there is an effect in the population, when that effect exists

28
Q

(1-β) = P(reject H0|H0 is false) is the formula for?

A

Statistical Power

29
Q

What 3 things determine statistical power?

A

Significance criterion we adopt (α)

Sample size (N)

Population effect size (ES)

30
Q

Why does power analysis have “3 degrees of freedom”?

A

Because if we know any 3 of these (power, alpha level, sample size, effect size) the fourth is determined for us.

This is what power analysis is about

31
Q

NHST stands for?

A

Null hypothesis significance testing

32
Q

If we know the numbers for all 3 of the following, what is defined for us?

Significance criterion we adopt (α)

Sample size (N)

Population effect size (ES)

A

Power

33
Q

If we know any 3 of these, what can we say about the fourth?

power

alpha level

sample size

effect size

A

If we know any 3 of the points, the 4th is determined for us.

34
Q

According to this graph, what value will we have to see in order to reject the null hypothesis?

A

Over 5.96 or below 2.04.

The mean is 4, with an SD of 1. The lines are where the cutoff is.
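
The cutoffs quoted here follow from a normal null distribution with mean 4 and SD 1 and a two-tailed α of .05 (the α value is an assumption, since the graph itself is not reproduced). A quick check:

```python
from statistics import NormalDist

null_dist = NormalDist(mu=4, sigma=1)

# 2.5% of the null distribution lies below `lower` and 2.5% above `upper`.
lower = null_dist.inv_cdf(0.025)
upper = null_dist.inv_cdf(0.975)

print(round(lower, 2), round(upper, 2))  # 2.04 5.96
```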

35
Q

In this image, the top graph is the null hypothesis and the bottom is the alternative hypothesis.

Why is the type 2 error where it is?

A

Because a result that falls in that range, short of the cutoff, lies where the null and alternative distributions overlap, so we treat it as coming from the null distribution

Although there is an effect, the result is not in the critical region, so we don’t reject the null. That is a type 2 error.

36
Q

Do we assume we are sampling from the alternative distribution?

A

No.

We always assume we are sampling from the null. Only when our results are improbable enough under the null do we conclude that we are sampling from an alternative distribution.

37
Q

If our type 2 error rate is β = .15, what would our statistical power be?

A

power = 1 − β = .85

38
Q

Describe what a type 2 error rate of β = .15 means.

A

15% of the time when we are sampling from the alternative hypothesis distribution, we will fail to reject the null hypothesis.

39
Q

The proportion of the possible answers generated by the alternative hypothesis that are within the “critical region” (cutoff area) is called…

A

Statistical Power.

The proportion of possible answers drawn from the alternative distribution that fall outside the critical region is the type 2 error rate

40
Q

How can we determine sample size a priori (before the test)?

A

Given ES (effect size), α, and a desired level of power - we can determine how big our sample size should be.

If we know 3, we can understand the 4th
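
A rough sketch of an a priori calculation. It assumes a two-sided comparison of two equal-sized groups and uses a normal (z) approximation, which the cards do not specify; `n_per_group` is a hypothetical helper, and ES is taken to be Cohen’s d.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (two-sided, two-sample z approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n = n_per_group(0.5)  # medium effect, alpha = .05, power = .80
print(n)  # 63
```

An exact calculation based on the t distribution gives a slightly larger n, so treat this as a lower-bound estimate.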

41
Q

How can we determine power levels post hoc (after the test)?

A

If N (sample size), ES (effect size) and α are known, we can use power analysis to determine power for the study.

If we know 3, we can understand the 4th
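
The same relationship can be turned around to estimate power. This sketch again assumes a two-sided, two-sample z approximation with equal group sizes and Cohen’s d as the effect size; the function name and the example values are illustrative.

```python
from statistics import NormalDist

def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power for comparing two equal-sized groups (two-sided z approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5  # expected z under the alternative
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

power = power_two_sample(effect_size=0.5, n_per_group=64)
print(f"{power:.2f}")  # 0.81
```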

42
Q

How would we use power analysis to determine the effect size that a study could detect?

A

Given N, the desired power level, and α, we can determine what size of effect could be reliably detected.

If we know 3 things, we can determine the 4th
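
A sketch of this sensitivity calculation, under the same assumed two-sided, two-sample z approximation (equal group sizes, effect size as Cohen’s d). It ignores the negligible chance of rejecting in the wrong tail; `detectable_effect` is a hypothetical helper.

```python
from statistics import NormalDist

def detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable with the given power (two-sample z approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * (2 / n_per_group) ** 0.5

d_min = detectable_effect(n_per_group=64)
print(f"{d_min:.2f}")  # 0.50
```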

43
Q

Pearson’s correlation coefficient (r): what does it measure?

What is considered small, medium, large effect?

A

It measures the size of the effect

It measures the magnitude of the linear relationship between two continuous normally distributed variables

.1 = small; .3 = medium; .5 = large

These are just heuristics, not hard categories
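
A minimal sketch of how r is computed, written out by hand rather than with a library call. The data are made up for illustration.

```python
from statistics import fmean

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation of x and y, scaled by the spread of each."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx * dx for dx in dev_x)
    ss_y = sum(dy * dy for dy in dev_y)
    return cross / (ss_x * ss_y) ** 0.5

x = [1, 2, 3, 4, 5]  # hypothetical scores on one variable
y = [2, 1, 4, 3, 5]  # hypothetical scores on the other
r = pearson_r(x, y)
print(round(r, 2))  # 0.8
```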

44
Q

Cohen’s d measures what?

A

The difference between the means of the control vs. treatment groups, divided by the pooled standard deviation of the two groups.

Small = .2; Med = .5; Large = .8
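
A sketch of the calculation with the pooled standard deviation. The scores are invented for illustration and the function name is hypothetical.

```python
from statistics import fmean, variance

def cohens_d(treatment, control):
    """Difference between group means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (fmean(treatment) - fmean(control)) / pooled_var ** 0.5

control = [4, 5, 6, 5, 5]    # hypothetical control scores
treatment = [6, 7, 8, 7, 7]  # hypothetical treatment scores
d = cohens_d(treatment, control)
print(f"{d:.2f}")
```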

45
Q

Partial Eta Squared and Eta squared - what are they?

A

Effect size measures, used in the analysis of variance.

They are proportions of variance that can be attributed to one of the effects in an analysis of variance.

46
Q

Which one of these is partial eta squared, and explain the formula

A

The third image

Partial eta squared for the effect we have designated A equals the sum of squares for A divided by the sum of squares for A plus the pooled within-cell sum of squares, i.e. the variability that we attribute to subjects within each combination of A and B in the analysis of variance.

SS stands for sum of squares (the sum of squared deviations)

A and B refer to the two factors in the design
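
Since the images are not reproduced here, the verbal description can be written out as a formula. The notation below (a subscript S/AB for subjects within cells) is an assumed convention; plain eta squared is included for contrast.

```latex
\eta_p^2 = \frac{SS_A}{SS_A + SS_{S/AB}}
\qquad
\eta^2 = \frac{SS_A}{SS_{\text{total}}}
```

Partial eta squared leaves the other effects out of the denominator, so it is never smaller than eta squared for the same effect.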

47
Q

Name each of these 4 measures of effect sizes

A

Pearson’s correlation coefficient

Cohen’s d

Partial Eta Squared

Cohen’s f-squared (f²)

48
Q

If we are comparing two treatment groups, and we get a Cohen’s d of .8 - what does this mean?

A

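The two group means differ by 0.8 pooled standard deviations, which is a large effect. Assuming normally distributed scores with equal variances, about 79% of the treatment group will score above the mean of the control group.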
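
The percentage can be checked with the standard normal distribution. This sketch assumes both groups are normally distributed with equal variances, an assumption the card does not state.

```python
from statistics import NormalDist

d = 0.8  # Cohen's d from the card

# With the control mean at 0 and the treatment mean at d (in SD units),
# the share of the treatment group above the control mean is Phi(d).
share_above_control_mean = NormalDist().cdf(d)
print(round(share_above_control_mean, 3))  # 0.788
```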