L6 Ch5 Null Hypothesis Significance Testing Flashcards
Disclaimer
There are quite a few repetitions in these flashcards, and the ones from the book are not integrated but sit in a separate second part
I am very sorry about this, but I don’t have time to make them look nice, so I hope that they are at least clear enough to study from
sampling distribution
- distribution of a statistic (usually the mean)
- how that statistic would be distributed if we repeated the sampling over and over again
- relates to the null hypothesis and the alternative hypothesis (used to understand and interpret papers, in our case)
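a quick simulation sketch of this idea in R (made-up normal population with mean 100, sd 15 — my example, not from the slides):
means <- replicate(1000, mean(rnorm(25, mean = 100, sd = 15)))   # 1000 repeated samples of n = 25
hist(means)   # the sampling distribution of the mean: roughly normal, centred on 100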
Fisher
- inventor of the p-value & the null hypothesis
- “lady tasting tea” experiment: could she tell whether the milk or the tea was poured into the cup first?
Neyman-Pearson
- inventors of the alternative hypothesis
- combined the null hypothesis and the alternative hypothesis in one paradigm with the p-value
> tricky to specify what the alternative hypothesis should be
Standard Error
= variability of the sampling distribution (the variability you can expect when repeating the experiment)
- SE is high if there is a lot of variability in the variable
- SE is low if the sample size is high
> high sample size → low variability → low SE
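a small R sketch of the n–SE link (same made-up population; the sd of the simulated means is the SE):
sd(replicate(2000, mean(rnorm(25, 100, 15))))    # n = 25  → SE ≈ 15/sqrt(25)  = 3
sd(replicate(2000, mean(rnorm(100, 100, 15))))   # n = 100 → SE ≈ 15/sqrt(100) = 1.5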
Frequentist probability
- the framework behind the p-value and the sampling distribution
- computes the objective probability of an event
- relative frequency of an outcome in the long run (the same experiment repeated many times)
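a minimal long-run illustration in R (simulated fair coin; my example, not the lecture’s):
flips <- sample(c(0, 1), 100000, replace = TRUE)   # repeat the same “event” many times
mean(flips)   # relative frequency of heads settles near 0.5 in the long run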
how can confidence intervals be interpreted?
- imagine drawing 100 samples and computing a CI for each of them
- “95% CI” means that about 95 of those 100 CIs will contain the population mean
~ a single CI either contains the true value or it doesn’t
~ wider or narrower depending on how certain we are of the inference
~ much more informative than a point estimate alone
(see picture 2)
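a sketch of the 100-samples idea in R (assumed population mean 100, sd 15; intervals via mean ± 1.96 × SE):
covered <- replicate(100, {
  x <- rnorm(25, mean = 100, sd = 15)
  se <- sd(x) / sqrt(length(x))
  ci <- mean(x) + c(-1.96, 1.96) * se
  ci[1] <= 100 && 100 <= ci[2]                 # does this CI contain the true mean?
})
sum(covered)   # roughly 95 of the 100 CIs contain it (give or take — 1.96 is an approximation for small n)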
how can confidence intervals be calculated?
- lower bound: mean − 1.96 × SE
- upper bound: mean + 1.96 × SE
(picture 1)
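the same in R, for some sample x (a sketch using the formula above; the data are made up):
x <- c(4, 8, 15, 16, 23, 42)            # made-up sample
se <- sd(x) / sqrt(length(x))
mean(x) + c(-1.96, 1.96) * se           # lower and upper bound of the 95% CI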
What is the SE used for? How?
- parameter estimation (for population)
> through confidence intervals (usually 95%)
~ higher SE → higher variability → broader CI (to reach 95% confidence)
~ lower SE → lower variability → narrower CI (to reach 95% confidence)
how can SE be calculated?
standard deviation / square root of sample size
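in R (made-up sample):
x <- c(4, 8, 15, 16, 23, 42)
sd(x) / sqrt(length(x))   # standard error of the mean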
sampling distributions under Ha
- different from the distribution under H0
- e.g. skewed
“R”
- what is it?
- R vs Excel
- in the exam
- can be used as a simple calculator
- much more extensive than Excel
- open source (important, as science should be open)
> primarily used as a calculator here (no extensive programming)
> data simulation
Binomial sampling distribution under H0
- how to compute it in R
“if probability of heads is 0.5, what is the probability of getting 8/10 heads?”
- remember to run all the lines!
n <- 10                                  # sample size
k <- 0:n                                 # discrete probability space: k runs from 0 to n (10)
p <- .5                                  # probability of heads
coin <- 0:1                              # possible outcomes of a single flip
permutations <- factorial(n) / ( factorial(k) * factorial(n-k) )   # “n choose k”
probabilities <- permutations * p^k * (1-p)^(n-k)   # binomial probabilities (same as dbinom(k, n, p))
probabilities[k == 8]                    # probability of 8/10 heads ≈ 0.044
barplot(probabilities, names.arg = k)    # “barplot”: give it the probabilities and it constructs the plot
(picture 3)
! look at the WAs for examples of how R will look in the exam
Type I error
- reject null hypothesis when it is true
- “false positive”
what are the possible outcomes if we make a decision in frequentist framework?
(see picture 4)
- rows: do we reject the H0 or not?
- columns: is the H0 actually true or false?
- two cells are correct decisions, two are errors (Type I or Type II):
~ reject H0 & H0 true → Type I error; reject H0 & H0 false → correct (power)
~ not reject & H0 true → correct (true negative); not reject & H0 false → Type II error
Type II error
- fail to reject the null hypothesis when it is false
how strict do we want to be when evaluating the H0?
- decide on an alpha level (usually 0.05)
- if the p-value is below the alpha level, we reject the null hypothesis
= if H0 is true, we make a Type I error in 5% of the cases
!! alpha is type I error rate
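tying this to the coin example: base R’s binom.test gives an exact p-value to compare against alpha (a sketch, not from the slides):
binom.test(8, 10, p = 0.5)   # two-sided p ≈ 0.109 > 0.05 → do not reject H0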
effect sizes
- the size of the effect we are looking for (e.g. the size of a correlation)
- plays a role in how much power our statistical procedure has
- standardized (divided by the st.dev.)
how do we use the sampling distribution in regards to alpha?
- we mark the areas of the sampling distribution that constitute observations extreme enough to make us reject H0
- with extreme observations we reject H0 (picture 5)
“power” of the analysis
- rejecting H0 when it is in fact false (correct decision)
- power is the conditional probability of rejecting H0 when it is false
- it’s a function of sample size
- (compare: alpha is the conditional probability of rejecting H0 when it is true)
how are effect sizes and power related?
the bigger the effect size, the higher the power
what is the probability of not rejecting the H0 when it is true?
- 1-alpha
- “true negative”
(see picture 6)
what is the sum of the power and beta? (and of alpha and the true negative?)
- 1 in both cases
- each pair consists of conditional probabilities given the same state of H0 (false, resp. true), so each pair sums to 1
Beta
- complement of power (beta = 1 − power)
- incorrectly deciding not to reject the H0 when it is false
- Type II error
- “false negative”
how does changing the value of alpha affect the evaluation of the H0?
- lower alpha
→ harder to reject H0
→ fewer Type I errors
- higher alpha
→ easier to reject H0
→ fewer Type II errors
! alpha used to establish critical region
how do we calculate “power” in a sampling distribution?
- power: rejecting H0 when it is in fact false
- look at the sampling distribution under Ha
→ there are many possible versions, but in the coin example we could set the probability of heads at 0.8
(see picture 7)
- with the new distribution, what is the probability of rejecting H0 now? (what is our power?)
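a minimal R sketch of this computation for the coin example (assuming a one-tailed test at alpha = .05 and p(heads) = 0.8 under Ha):
n <- 10; k <- 0:n; alpha <- .05
crit <- k[pbinom(k - 1, n, .5, lower.tail = FALSE) <= alpha]   # rejection region under H0: k = 9, 10
sum(dbinom(crit, n, .8))   # probability of landing there under Ha → power ≈ 0.38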
summary of this procedure
- we decide when to reject the H0 based on the sampling distribution under the H0
- look at sampling distribution + alpha level = reject H0?
- then switch to the Ha distribution while keeping the “red regions” (rejection regions) the same
- now: what is the probability of rejecting H0 when Ha is true? (power)
→ power is the sum of the probabilities in the red regions
!!! to compute power, we look at the sampling distribution under the Ha
what is the interplay between alpha and power?
- if the alpha level is low, we are stricter when rejecting H0
→ power decreases as well
- balance between power and the Type I error rate
effect size vs Ha
- effect size here: the probability of heads
- increase the effect size (move it further from 0.5)
- more extreme values become more likely → more likely to reject the null hypothesis
! we are conditioning on the effect size
what determines power level?
- alpha (lower alpha → lower power)
- effect size (higher e.s. → higher power)
- sample size (larger sample → higher power)
> greater effect → more likely to reject H0 → more power
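base R’s power.t.test shows all three knobs at once (two-sample t-test, sd = 1; numbers are rough):
power.t.test(n = 20, delta = 0.5, sig.level = .05)$power   # baseline, ≈ .33
power.t.test(n = 20, delta = 0.5, sig.level = .01)$power   # lower alpha → lower power
power.t.test(n = 20, delta = 0.8, sig.level = .05)$power   # bigger effect → higher power
power.t.test(n = 80, delta = 0.5, sig.level = .05)$power   # bigger sample → higher power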
From the book
how can you distinguish a frequency plot from a histogram?
- frequency plot has small gaps between the columns
what is determined by the length of the whiskers?
- if the whiskers have the same length, the distribution is symmetrical
- if the top or bottom whisker is much longer than the opposite one, the distribution is asymmetrical
how can you compare the relative frequencies of scores across groups?
Under frequency plots:
- stack: shows the bars of each group stacked on top of each other
- identity: displays overlapping bars, with a certain level of transparency
- dodge: places the bars side by side within each bin
(see picture 8 & 9)
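these names match ggplot2’s position arguments, so a sketch (assuming the book uses ggplot2; data are made up):
library(ggplot2)
df <- data.frame(score = c(rnorm(100, 5), rnorm(100, 6)),
                 group = rep(c("A", "B"), each = 100))
ggplot(df, aes(x = score, fill = group)) +
  geom_histogram(position = "dodge", bins = 20)   # try "stack", or "identity" with alpha = .5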
Boxplot
- center: median
- box edges: interquartile range
- violin element: adds the density distribution of the data
~ using a split variable, you can visualize the group difference
(see picture 10)
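a quick base-R version (made-up data; the violin element needs an extra package, e.g. ggplot2 or vioplot):
df <- data.frame(score = c(rnorm(100, 5), rnorm(100, 6)),
                 group = rep(c("A", "B"), each = 100))
boxplot(score ~ group, data = df)   # median line, IQR box, whiskers — one box per group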
how can we summarize the relationship between two variables?
- through a regression line
- “correlation plots”
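a minimal base-R sketch of this (made-up data):
x <- rnorm(50); y <- 0.6 * x + rnorm(50, sd = 0.5)
plot(x, y)          # scatterplot of the two variables
abline(lm(y ~ x))   # add the fitted regression line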
Raincloud plots
- display individual data points, boxplots, and the distribution of the data
(see picture 11, 12 & 13)
Gigamega mastermind mind-map of plots
picture 14