Week 5 - Statistical Power Analysis Flashcards

Learning Objectives:
  • Understand null hypothesis significance testing
  • Identify Type I and Type II errors
  • Understand how alpha, sample size, and power are interrelated
  • What are some methods used to reduce Type I error?
  • How is power associated with Type II error?
  • Advantages and disadvantages of Open Science and pre-registration
  • What is an effect size?
  • How are confidence intervals useful?
  • How can we calculate the sample sizes required for our study?

1
Q

what is Null Hypothesis Significance Testing (NHST)?

A
  • H0 (null hypothesis): The two groups (experimental vs. control) are not different on the dependent variable.
  • H1 (alternative hypothesis): The two groups are different on the dependent variable.

We design our study to determine whether we can “reject” the null hypothesis.

2
Q

what is a Type I error?

A

We reject the null hypothesis when it is in fact true (groups are really not different on the dependent variable but we think they are).

  • we might advocate for a treatment that doesn’t really work

“I falsely think the hypothesis is true” (one false)

3
Q

what is a Type II error?

A

We fail to reject the null hypothesis when it is in fact false (groups really are different on the dependent variable but we conclude they are not).

  • we might neglect a treatment that does work.

“I falsely think the hypothesis is false” (two falses)

4
Q

what is a two group comparison?

A

A design comparing two independent groups (e.g., experimental vs. control) on the dependent variable, typically analysed with an independent-samples t-test.
5
Q

what is the alpha level?

A

Alpha (α) is the probability of making a Type I error (rejecting the null hypothesis when we shouldn’t). We only want to do this rarely – 5 times out of 100 when the two groups are not really different is the usual cut-off.

By tradition, statistical significance has been determined at p < .05. The p value is the probability that you would find the current result (or one more extreme) if the two groups really weren’t different.

6
Q

what is the credibility lab?

A

A platform for preregistering your study (the Wharton Credibility Lab runs AsPredicted.org).

7
Q

what is preregistration?

A

A design plan registered in advance – items are date-stamped, so when you write your publication it is clear that hypotheses were created before the results were known.

This helps prevent HARKing (Hypothesising After the Results are Known).

8
Q

what is the Centre for Open Science?

A

Supports open science.

Its mission is to “increase the openness, integrity, and reproducibility of scientific research”.

9
Q

what is the replication crisis?

A

The growing belief that the results of many scientific studies cannot be reproduced when researchers attempt to replicate them, and are thus likely to be wrong.

It has led to scientists engaging in more open science practices.

10
Q

why do some adjust the alpha levels?

A
  • Researchers occasionally try to make the case for a higher alpha level (p < .10, for example), or, more often, they call these effects “marginal” or a trend. However, this just increases the likelihood that a Type I error has been made (treating a group difference as real when it is not).
  • P values do not tell us anything about the size or magnitude of an effect – they just tell us, dichotomously, how likely the result would be if the groups really weren’t different.
  • When we conduct a large number of tests, particularly exploratory tests, it is traditional to reduce the alpha level to be more cautious about making a Type I error.
  • The Bonferroni correction divides the alpha level (.05) by the number of tests. A priori, theory-based predictions are not as risky. Also see the Benjamini-Hochberg procedure for controlling the False Discovery Rate.
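As an illustration (not from the slides; the p-values are made up), both corrections can be sketched in a few lines – Bonferroni tests each p against α/m, while Benjamini-Hochberg sorts the p-values and rejects up to the largest rank k with p(k) ≤ (k/m)α:

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Reject H0 for each test whose p-value is below alpha / (number of tests)."""
    m = len(pvals)
    return [p < alpha / m for p in pvals]

def benjamini_hochberg_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the False Discovery Rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Largest rank k (1-based) whose p-value is under its BH threshold (k/m)*alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max smallest p-values.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

pvals = [0.001, 0.008, 0.028, 0.041, 0.20]
print(bonferroni_reject(pvals))          # Bonferroni threshold here is .05/5 = .01
print(benjamini_hochberg_reject(pvals))  # BH is less strict: it also keeps p = .028
```

Note how BH rejects one more hypothesis than Bonferroni on the same p-values – it trades strict Type I error control for more power.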
11
Q

What is p-hacking?

A

The exploitation of data analysis in order to discover patterns that can be presented as statistically significant when, in reality, there is no underlying effect.

12
Q

what is experimental power and how does it relate to Type II error?

A

Beta (β) is the probability of making a Type II error (declaring a real difference between groups not to be there).
Power = 1 – β

In other words, the more you can reduce the likelihood of missing a real difference, the more power you will have to find effects.

power is higher when effects are larger (easier to find) – when the real difference between groups is very large. Moreover, because any given sampled group is normally distributed around the population mean, having a larger sample ensures a more accurate assessment of that group and thus more power/confidence that we have found a real effect. Small effects require larger samples in order to see the real group difference beyond sampling error.
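The relationship between effect size, sample size, and power can be illustrated by simulation (not from the slides; the effect size and group sizes are arbitrary choices): draw many samples with a known true difference and count how often a t-test rejects H0.

```python
import numpy as np
from scipy import stats

def simulated_power(d, n_per_group, alpha=0.05, n_sims=2000, seed=0):
    """Estimate the power of an independent-samples t-test by simulation.

    d is the true standardised mean difference between the two populations.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(d, 1.0, n_per_group)
        _, p = stats.ttest_ind(control, treatment)
        if p < alpha:
            rejections += 1
    return rejections / n_sims

# A larger sample gives more power to detect the same medium effect (d = 0.5).
print(simulated_power(d=0.5, n_per_group=20))
print(simulated_power(d=0.5, n_per_group=100))
```

The second estimate is far higher – exactly the point of the card: small effects need large samples to be seen beyond sampling error.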

13
Q

what are effect sizes?

A

Effect size: the amount of something of interest. An effect size (ES) can be:

▪ A mean, or difference between means
▪ A percentage, or percentage change
▪ A correlation (e.g., Pearson r)
▪ Proportion of variance explained (R², ω², η²…)
▪ A standardised measure (Cohen’s d, Hedges’ g…)
▪ A regression slope (b or β)
▪ A measure of goodness of fit
▪ Many other things… (but NOT a p value!)

Kelley, K., & Preacher, K. J. (2012)

14
Q

what is cohen’s d?

A

One of the most common ways to measure effect size (how large an effect is) – for example, it can tell us that medication A has a larger effect than medication B.

d = 0.2 is considered a ‘small’ effect size, 0.5 a ‘medium’ effect size, and 0.8 a ‘large’ effect size. This means that if the difference between two groups’ means is less than 0.2 standard deviations, the difference is negligible, even if it is statistically significant.
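A minimal sketch (the data are made up) of the standard computation: the difference between the two group means divided by the pooled standard deviation.

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    # Sample variances (ddof = 1).
    var1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    var2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5: a 'medium' effect by Cohen's labels
```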

15
Q

what is a 95% confidence interval (CI)

A

A range of values, computed from your sample, that you can be 95% certain contains the true mean of the population – more precisely, if the study were repeated many times, 95% of the intervals so constructed would contain the true population mean.
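A sketch of the standard computation for a mean (assuming an approximately normal population, using the t distribution; the data are made up):

```python
import math
from scipy import stats

def mean_ci(sample, confidence=0.95):
    """Confidence interval for the mean: mean +/- t_crit * standard error."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    se = sd / math.sqrt(n)
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return mean - t_crit * se, mean + t_crit * se

lo, hi = mean_ci([2, 4, 6])
print(f"95% CI [{lo:.2f}, {hi:.2f}]")  # very wide, because n is tiny
```

Note how wide the interval is with only three observations – larger samples shrink the standard error and hence the CI.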

16
Q

how best to report confidence intervals?

A

APA Style recommends that confidence intervals be reported with brackets around the upper and lower limits: 95% CI [5.62, 8.31]

E.g., The mean difference between Conditions 1 and 2 was statistically significant at the specified .05 level, t(177) = 3.51, p < .001, d = 0.65, 95% CI [0.35, 0.95].

17
Q

advice from Stukas and Cumming (2014) on effect sizes

A

▪ Routinely report all ESs, with their 95% CIs. Report ESs in original units, and/or a standardised form.
▪ Interpret ESs and their 95% CIs in the research context, considering size, and theoretical and practical importance.
▪ Use relevant prior research to provide context for ES and CI interpretation.
▪ Where possible, seek consistency over studies of paradigm and measures, to assist interpretation and meta-analysis.
▪ If possible, use ES reference values to guide interpretation. If necessary, consider developing such values, but consider their breadth of applicability.
▪ ES and CI interpretation should consider the full research context, including participants, tasks, setting, and, in particular, the size and nature of the experimental manipulations.

18
Q

what is a one-way ANOVA?

A

Compares the means of two or more independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different.

Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or categories).

19
Q

what is an F test?

A

F = MS effect / MS error

indicates the proportion of the variance due to the IV or interaction as compared to the proportion of the variance due to error
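To make the ratio concrete, here is the F computation done by hand for three small made-up groups, checked against scipy.stats.f_oneway:

```python
from scipy import stats

groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]  # made-up data: 3 groups of n = 3
all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Between-groups ("effect") and within-groups ("error") sums of squares.
ss_effect = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)
ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_effect = len(groups) - 1
df_error = len(all_scores) - len(groups)
f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
print(f_ratio)  # F = MS effect / MS error

f_scipy, p = stats.f_oneway(*groups)
print(f_scipy, p)  # the library computation matches the hand computation
```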

20
Q

what is eta-squared?

A

eta² = SS effect / SS total

Useful as a calculation of effect size in the sample, but biased as an estimate of the population effect size.

Omega-squared (ω²) can be calculated instead (more complicated, but ‘corrected’).

SPSS provides partial eta² (ηp²), which is SS effect / (SS effect + SS error) – a different quantity.
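A small sketch of both formulas with made-up sums of squares. In a one-way design SS total = SS effect + SS error, so the two measures coincide; they differ in multi-factor designs, where SS total also includes the other effects:

```python
def eta_squared(ss_effect, ss_total):
    """Classical eta-squared: proportion of total variance due to the effect."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared, as reported by SPSS."""
    return ss_effect / (ss_effect + ss_error)

# Made-up multi-factor example: SS total includes other effects besides this one.
ss_effect, ss_error, ss_total = 20.0, 60.0, 100.0
print(eta_squared(ss_effect, ss_total))          # 0.2
print(partial_eta_squared(ss_effect, ss_error))  # 0.25 – larger, as it ignores other effects
```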

21
Q

how is effect size for f-ratios used in analysis of variance?

A

The effect size used in analysis of variance, Cohen’s f, is defined as the ratio of the standard deviation of the population means to the common within-population standard deviation.

Although Cohen’s f is defined this way, it is usually computed by taking the square root of f², where f² = eta² / (1 − eta²).
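The usual computation route can be sketched directly from the f² formula above (the eta² value is made up):

```python
import math

def cohens_f_from_eta_squared(eta_sq):
    """Cohen's f via f^2 = eta^2 / (1 - eta^2)."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))

print(cohens_f_from_eta_squared(0.20))  # 0.5 – a large effect by Cohen's ANOVA benchmarks
```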

22
Q

what is G*Power?

A

A tool to compute statistical power analyses for many different t tests, F tests, χ² tests, z tests and some exact tests.
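The same a-priori sample-size calculation G*Power performs can be sketched in Python via the noncentral t distribution (this is an illustrative implementation for a two-tailed independent-samples t-test, not G*Power's own code; the function name and defaults are assumptions):

```python
import math
from scipy import stats

def required_n_per_group(d, alpha=0.05, target_power=0.80):
    """Smallest per-group n giving at least target_power for a two-tailed
    two-sample t-test at effect size d (noncentral-t power calculation)."""
    for n in range(2, 10_000):
        df = 2 * n - 2
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        ncp = d * math.sqrt(n / 2)  # noncentrality parameter
        power = (1 - stats.nct.cdf(t_crit, df, ncp)) + stats.nct.cdf(-t_crit, df, ncp)
        if power >= target_power:
            return n
    raise ValueError("effect too small to reach target power with n < 10,000")

print(required_n_per_group(0.5))  # about 64 per group for a medium effect
print(required_n_per_group(0.8))  # large effects need far fewer participants
```

This reproduces the familiar rule of thumb that detecting a medium effect (d = 0.5) at 80% power needs roughly 64 participants per group.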

23
Q

how to understand statistical power and significance testing

A