lecture 1 - effect size and power Flashcards

Question 1

Q

what is null hypothesis significance testing (NHST)

Answer

A

A method of statistical inference in which an experimental factor is tested against a hypothesis of no effect or no relationship, based on a given observation.

NHST is a statistical method for testing whether there is enough evidence in a data sample to infer that a particular condition or effect exists in the larger population. It is a way to decide between two competing hypotheses: the null and the alternative.

Question 2

Q

what is the
-null hypothesis
-alternative hypothesis

Answer

A

Null Hypothesis (H0): This is the default assumption or claim that there is no effect, difference, or relationship in the population. For example:
“There is no difference in mean scores between the two groups.”

Alternative Hypothesis (H1): This is the competing claim that there is an effect, difference, or relationship. For example:
“The mean score of group A is greater than that of group B.”

Question 3

Q

what is the rationale for null hypothesis significance testing

Answer

A

-The researcher has a research question
-Formulates a null hypothesis (there is no effect) and an alternative hypothesis (there is an effect)
-Collects data (a sample from the population)

Question 4

Q

type 2 error

Answer

A

-there is a real difference (the null hypothesis is false), but you fail to detect it

Question 5

Q

what does the researcher conclude if the data:
-provide
-do not provide
sufficient evidence against the null hypothesis

Answer

A

If the data provide sufficient evidence against the null hypothesis:
◼ Rejects the null hypothesis
◼ Adopts the alternative hypothesis instead

If the data do not provide sufficient evidence against the null hypothesis:
◼ Fails to reject the null hypothesis (the alternative hypothesis is not adopted)
◼ But this does not necessarily mean that the null hypothesis is true.

Question 6

Q

problems with NHST
-why the null hypothesis is unrealistic in the real world: it is almost impossible for two groups to have exactly the same score
-a null hypothesis of H0: μa-μb=0 is a hypothetical construct

Answer

A

In the real world, it’s almost impossible for two groups to have exactly the same score. There will always be some tiny differences because of random chance or natural variation.

A non-significant result should never be interpreted as ‘no difference between means’ or ‘no relationship between variables’.
If the test result isn’t significant, it doesn’t mean there’s absolutely no difference between the groups. It just means that, with the data we collected, we couldn’t be sure the observed difference wasn’t just random noise.

A non-significant result only tells us that the effect is not large enough to be detected with the given sample size. If we had a bigger sample (more data), we might be able to detect even small differences. A small sample might miss these subtle effects.

Question 7

Q

problems with NHST
-not possible to demonstrate the null hypothesis

Answer

A

Not possible to demonstrate the null hypothesis

A non-significant result could be due to the null hypothesis being true OR to a failure to gather sufficient evidence
→ Researchers must set up their research so that the ‘desired’ outcome is to reject the null hypothesis

Question 8

Q

problems with NHST
-statistical significance is not practical significance

Answer

A

Statistical significance is not practical significance

-With a sufficiently large sample, very small effects can become statistically significant, although they may be unimportant for any practical purpose.
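
A rough sketch of the point above, assuming two equal-sized groups and a large-sample normal approximation (the function name and the effect size of 0.05 are made up for illustration). It holds a very small standardised effect fixed and shows the p-value shrinking as the sample grows:

```python
import math

def two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a two-group comparison.

    Uses the large-sample normal approximation z = d * sqrt(n / 2),
    where d is the standardised mean difference and n is the size of
    each group (equal group sizes are assumed).
    """
    z = d * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))

d = 0.05  # a very small effect, chosen only for illustration
for n in (100, 1000, 10000, 100000):
    # The effect size never changes; only the sample size does.
    print(n, round(two_sided_p(d, n), 4))
```

With 100 participants per group the result is far from significant, while with 10,000 per group the same tiny effect falls well below 0.05.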

Question 9

Q

practical significance : a fictitious example

Answer

A

-IQ is measured in >1000 participants
-Statistical tests indicate that one gender has a higher IQ than the other (p<0.05).
-The actual difference in group means is 0.8 IQ points
-Although the difference is statistically significant, it is practically irrelevant: it is not informative of the IQ of any individual person, because the variance within groups is much larger than the difference between groups
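
To put a number on "practically irrelevant", here is a small sketch. The within-group standard deviation of 15 IQ points and the normality of both groups are assumptions, since the example does not state them:

```python
from statistics import NormalDist

mean_diff = 0.8   # difference in group means, from the example
sd_within = 15.0  # ASSUMPTION: within-group SD of IQ scores, not given in the example

# Standardised mean difference (difference expressed in SD units).
d = mean_diff / sd_within

# Probability that a randomly chosen person from the higher-scoring group
# outscores a randomly chosen person from the other group, assuming both
# groups are normally distributed with the same SD.
prob_higher = NormalDist().cdf(d / 2 ** 0.5)

print(round(d, 3))
print(round(prob_higher, 3))
```

Under these assumptions the probability is only slightly above 0.5, so knowing someone's group says almost nothing about their individual IQ.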

Question 10

Q

problems with NHST
-all-or-nothing thinking

Answer

A

All-or-nothing thinking
-If p < .05 then an effect is significant, but if p > .05, it is not.
-One would reach completely opposite conclusions depending on whether p = .0499 or p = .0501.
-However, these p-values only differ by 0.0002.
-They would reflect basically the same-sized effect.
→ Alpha level is arbitrary (result: many published papers with values just below 0.05)
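
As a hedged illustration, assuming a two-sided test on a normally distributed test statistic (the helper name is made up), the two p-values correspond to almost identical test statistics:

```python
from statistics import NormalDist

def z_for_two_sided_p(p):
    """Absolute z statistic that yields the given two-sided p-value."""
    return NormalDist().inv_cdf(1 - p / 2)

z_a = z_for_two_sided_p(0.0499)  # "significant"
z_b = z_for_two_sided_p(0.0501)  # "not significant"

# The two statistics differ only in the third decimal place.
print(round(z_a, 4), round(z_b, 4), round(z_a - z_b, 4))
```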

Question 11

Q

what does significant mean

Answer

A

In statistics, ‘significance’ implies that something is unlikely to have occurred by chance (and may therefore have a systematic cause)

-What is considered to be ‘unlikely’ depends on an arbitrarily defined significance threshold

-Psychology: α=0.05 (= a 1 in 20 chance)
-Physics: 5σ criterion (α≈0.000000287, one-tailed), a 1 in 3.5 million chance
-A critical perspective: significance at a 5% threshold indicates limited evidence that the data is not entirely random
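
The 5σ figure can be checked with a short calculation. This sketch assumes the criterion refers to the one-tailed area of a standard normal distribution beyond 5 standard deviations, which is what matches the "1 in 3.5 million" figure:

```python
import math

def one_sided_tail(z):
    """Probability that a standard normal value exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

alpha_5_sigma = one_sided_tail(5)

print(alpha_5_sigma)      # roughly 2.9e-07
print(1 / alpha_5_sigma)  # roughly 3.5 million
print(1 / 0.05)           # the psychology threshold as "1 in N"
```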

Question 12

Q

what are the alternatives to NHST

Answer

A

-no clear replacement currently available
-proposed : effect size

Question 13

Q

effect size

Answer

A

-Provides an estimate of the size of group differences or the effect of treatment

-Ideally independent of the size of the sample

Effect size is a measure of the magnitude or strength of a difference or relationship in a study, beyond just whether it is statistically significant. While statistical significance tells us whether an effect is likely to exist, effect size tells us how big or meaningful that effect is.

Question 14

Q

what are the uses of effect size

Answer

A

-measure of how large an effect is (a p-, t- or F-value will not tell you this)

-used in estimating the sample size needed for sufficient statistical power

-used when combining data across studies (meta-analysis)
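
A minimal sketch of the sample-size use, assuming a two-sided comparison of two independent groups and the standard normal-approximation formula n = 2((z_alpha + z_power) / d)^2. The function name and the defaults (alpha = 0.05, power = 0.80) are assumptions, not values from the lecture:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect effect size d.

    Normal approximation for a two-sided, two-group comparison; an exact
    t-based calculation would give slightly larger numbers.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

The smaller the expected effect, the more participants are needed to reach the same power.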

Question 15

Q

types of effect size

Answer

A

-Group difference indices (e.g., Cohen’s d)
-Strength of association (‘variance explained’, e.g., eta squared, R squared)
-Risk estimates (e.g., relative risk)

Question 16

Q

effect size
-group differences

Answer

A

Examples:
-Males versus females
-Treatment versus control group
-Young versus older participants

Question 17

Q

difference between population mean and sample means

Answer

A

The population mean is usually unknown, so the sample mean is used as an estimate of it.

Question 18

Q

how to use sample mean to get effect size

Answer

A

use the difference between the sample means: m1 - m2
e.g. effect size = 180 - 165 = 15 (in the units of the measurement scale)

Question 19

Q

what is a disadvantage of using the difference in means as an effect size

Answer

A

Disadvantage: the measure depends on the measurement scale, so its value changes with the units used and cannot be compared across different measures
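
A small sketch of the scale problem, using made-up height data whose group means match the 180 versus 165 example. The same difference gives a different number depending on the unit:

```python
from statistics import mean

# Hypothetical heights in centimetres, invented for illustration only.
group_a_cm = [172, 180, 185, 176, 187]  # mean 180
group_b_cm = [160, 168, 165, 170, 162]  # mean 165

def raw_difference(a, b):
    """Unstandardised effect size: difference between the sample means."""
    return mean(a) - mean(b)

def to_metres(values):
    return [v / 100 for v in values]

diff_cm = raw_difference(group_a_cm, group_b_cm)
diff_m = raw_difference(to_metres(group_a_cm), to_metres(group_b_cm))

# Same people, same difference, but the number depends on the unit.
print(diff_cm, round(diff_m, 2))
```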

Question 20

Q

standardised mean difference

Answer

A

standardised mean difference = (μ1 - μ2) / σ

-we don’t know the population means, but we can use the sample means
-what about σ (sigma)? There are various methods to estimate sigma, leading to different effect size measures

Question 21

Q

group difference indices

Answer

A

-Cohen’s d
-Glass’ delta
-Hedges’ g

The measures differ in how the population variance is estimated from the data

Question 22

Q

Cohen’s d

Answer

A

-most commonly reported
-d = (m1 - m2) / SDpooled, where SDpooled is the pooled standard deviation of the two groups

SDpooled = root of ..
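
A minimal sketch of Cohen’s d. The card does not give the full SDpooled formula, so the version below is an assumption: it uses the common form that weights each group’s sample variance by its degrees of freedom. The data and the function name are made up for illustration.

```python
import math
from statistics import mean, variance

def cohens_d(group1, group2):
    """Cohen's d using a pooled standard deviation.

    ASSUMED pooled form (the card elides the formula):
    sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
    """
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)

# Hypothetical scores, invented for illustration only.
treatment = [12, 14, 15, 17, 12]
control = [10, 11, 13, 9, 12]

print(round(cohens_d(treatment, control), 2))
```

Because the difference is divided by a standard deviation in the same units, rescaling the scores (for example multiplying every score by 10) leaves d unchanged.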

Question 23

Q

Hedges’ g

Answer

A

-very similar to Cohen’s d
-the measures differ in how the population variance is estimated from the data
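
The card does not say exactly how Hedges’ g differs. One commonly used definition, taken here as an assumption, multiplies a pooled-SD standardised difference by an approximate small-sample correction factor, 1 - 3 / (4(n1 + n2) - 9). A sketch with made-up data:

```python
import math
from statistics import mean, variance

def hedges_g(group1, group2):
    """Hedges' g: pooled-SD standardised difference with a small-sample correction.

    ASSUMPTION: uses the common approximate correction factor
    1 - 3 / (4 * (n1 + n2) - 9); the card does not specify the formula.
    """
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    d = (mean(group1) - mean(group2)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

# Hypothetical scores, invented for illustration only.
treatment = [12, 14, 15, 17, 12]
control = [10, 11, 13, 9, 12]

print(round(hedges_g(treatment, control), 2))
```

The correction shrinks the estimate slightly in small samples and becomes negligible as the groups get larger, which is why the two measures are usually very close.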

Question 24

Q

Glass’ delta

Answer

A

Glass’ delta uses the standard deviation from the control group rather than the pooled standard deviation from both groups.

-Glass’ delta is often used when several treatments are compared to the control group.
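
A short sketch of Glass’ delta as described above, with hypothetical data and names. Each treatment is standardised against the same control-group standard deviation:

```python
from statistics import mean, stdev

def glass_delta(treatment, control):
    """Mean difference divided by the control group's standard deviation."""
    return (mean(treatment) - mean(control)) / stdev(control)

# Hypothetical scores, invented for illustration only.
control = [10, 11, 13, 9, 12]
treatments = {
    "treatment_a": [12, 14, 15, 17, 12],
    "treatment_b": [11, 12, 14, 10, 13],
}

# Every treatment shares one yardstick: the control group's SD.
for name, scores in treatments.items():
    print(name, round(glass_delta(scores, control), 2))
```

Using one common denominator keeps the effect sizes of the different treatments directly comparable.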

Question 25

Q

paired samples t-test

Answer

A

A paired samples t-test (also called a dependent samples t-test) is a statistical test used to compare the means of two related groups to see if there is a significant difference between them. The groups are “paired” because the same individuals or entities are measured twice under different conditions or at different times.
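
A minimal sketch of the calculation behind a paired samples t-test, using made-up before/after scores. It computes only the t statistic and degrees of freedom; the p-value would come from a t distribution table or a statistics package:

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and degrees of freedom for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Work on the per-person differences, not on the two groups separately.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores for the same five people, measured twice.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 14]

t, df = paired_t(before, after)
print(round(t, 2), df)
```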

Question 26

Q

classification of effect size - Cohen’s d

Answer

A

Classification of effect size:

-d between 0.2 and 0.49 = small
-d between 0.5 and 0.79 = medium
-d of 0.8 and higher = large
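
The classification can be written as a small helper. The function name and the label for values under 0.2 are assumptions, since the card does not name that range:

```python
def classify_cohens_d(d):
    """Label an effect size using the thresholds 0.2, 0.5 and 0.8."""
    size = abs(d)  # the direction of the difference does not affect its size
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # ASSUMPTION: the card gives no label under 0.2

for d in (0.1, 0.3, -0.6, 1.2):
    print(d, classify_cohens_d(d))
```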

Question 27

Q