Confidence Intervals and Hypothesis Testing Flashcards
What is Hypothesis Testing
Set up a hypothesis about a population parameter you want to test, gather a sample, and see whether the data are consistent with that assumption or give grounds to reject it.
Null hypothesis (H0)
Hypothesis Testing
The default claim about the population parameter, assumed true until the sample gives enough evidence against it.
Example: Assume average population IQ is 100.
Ha / H1 - Alternative hypothesis
Hypothesis Testing
The competing claim: an unexpected result that we want to check for significance.
Example: Population IQ is larger than 100.
Significance level
Hypothesis Testing
- Alpha (α)
- How unlikely a result must be under H0 before we reject H0 in favour of Ha.
- Example: with α = 0.05, if our sample mean falls in the most extreme 5% of the sampling distribution under H0, we reject H0.
Z-stat vs T-stat
Hypothesis Testing
- We use the z-distribution (z-statistics and z-tests) with proportions.
- And we use the t-distribution (t-statistics and t-tests) with means.
- With a large sample size, results from the t-distribution are very similar to results from the z-distribution. So sometimes statisticians use the z-distribution with means. However, we should avoid that practice (even for large samples).
- In general, when the population standard deviation is not known (the usual case when testing means), use the t-statistic.
P-value
Hypothesis Testing
Probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct.
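As a rough illustration of this definition, the sketch below turns an observed z-statistic into a p-value using the standard normal distribution from Python's standard library. The function name `p_value_from_z` and the `alternative` labels are illustrative choices, not part of the original cards, and the sketch assumes the z-distribution is appropriate for the test.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """P-value for an observed z-statistic, assuming H0 is true."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":   # Ha: parameter is larger than the H0 value
        return 1 - std_normal.cdf(z)
    if alternative == "less":      # Ha: parameter is smaller than the H0 value
        return std_normal.cdf(z)
    # two-sided: results at least as extreme in either direction
    return 2 * (1 - std_normal.cdf(abs(z)))
```

If the returned p-value is below the chosen significance level α, H0 is rejected.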
General formula for Hypothesis Test
Test statistic = (BestEstimate - H0_estimate) / SE_h0, where SE_h0 is the standard error of the estimate computed under H0.
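A minimal sketch of this formula for a one-sample proportion test, where the standard error under H0 is sqrt(p0 * (1 - p0) / n). The function name and the example numbers are hypothetical and only serve to show the arithmetic.

```python
import math

def z_stat_proportion(p_hat, p0, n):
    """(best estimate - H0 value) / standard error under H0, for a proportion."""
    se_h0 = math.sqrt(p0 * (1 - p0) / n)  # SE uses p0 because H0 is assumed true
    return (p_hat - p0) / se_h0

# Hypothetical example: 56 successes out of 100, H0: p = 0.5
z = z_stat_proportion(p_hat=0.56, p0=0.5, n=100)
```

Here the standard error is 0.05, so the sample proportion sits about 1.2 standard errors above the hypothesized value.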
Type I Error
Hypothesis Testing
- We reject H0 although it’s true.
- (By chance, the sample landed in the rejection region set by the significance level even though H0 is true.)
- Also called a false positive. Its probability equals the significance level α.
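The link between the significance level and Type I errors can be checked with a small simulation. The sketch below repeatedly samples from a population where H0 is actually true and counts how often a one-sided z-test rejects it. The population values (mean 100, standard deviation 15) and the function name are assumptions made for illustration, and the population standard deviation is treated as known so that a z-test applies.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Fraction of simulated samples that wrongly reject a true H0."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 15.0                  # hypothetical population; H0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided test, Ha: mean > mu0
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if z > z_crit:                        # sample fell in the rejection region
            rejections += 1
    return rejections / trials
```

With enough trials the returned rate should land close to `alpha`, which is what "P(Type I Error) = α" means in practice.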
Type II Error
Hypothesis Testing
We don’t reject H0 although it’s false.
The sample did not land in the rejection region, even though H0 is false.
Also called a false negative.
Power
Hypothesis Testing
P(rejecting H0 | H0 false)
1 - P(Type II Error)
Probability of correctly rejecting H0 (i.e. accepting Ha) when H0 is false.
How To Increase Power
Hypothesis Testing
- Increase the significance level (alpha, α). Big downside: P(Type I Error) goes up as well.
- Increase the sample size (n).
In general, power is higher when:
* The population has less variability.
* The true parameter value (under Ha) is far from the H0 value, so the two sampling distributions overlap less.
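These effects can be seen in a closed-form power calculation. The sketch below covers only a simplified case: a one-sided z-test for a mean (Ha: mean > mu0) with a known population standard deviation. The function and parameter names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """P(reject H0 | true mean is mu_true) for a one-sided z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)            # rejection threshold under H0
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))  # true effect in standard errors
    return 1 - std_normal.cdf(z_crit - shift)

# Hypothetical IQ example: H0 mean 100, true mean 105, sigma 15
base = power_one_sided_z(100, 105, 15, n=30)
more_data = power_one_sided_z(100, 105, 15, n=100)           # larger n
looser_alpha = power_one_sided_z(100, 105, 15, n=30, alpha=0.10)
less_spread = power_one_sided_z(100, 105, 10, n=30)          # smaller sigma
bigger_gap = power_one_sided_z(100, 110, 15, n=30)           # Ha further from H0
```

Each of the four variations yields higher power than `base`, matching the list above.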
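How to check the normality assumption of sample data
- Plot the data (for example, a histogram) and check the shape; this is most informative with a larger sample.
- Use a QQ plot to check for normality.
- When comparing groups, compare box plots to see whether the IQRs are roughly the same (this checks for similar spread rather than normality itself).
If unsure, the normality requirement can be relaxed when n is large enough, because of the Central Limit Theorem (CLT).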
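For the QQ plot check, the points to plot can be computed without any plotting library. The sketch below pairs each sorted observation with the matching quantile of a normal distribution fitted to the sample. The helper name `qq_points` is hypothetical, and the plotting positions (i - 0.5) / n are just one common convention.

```python
from statistics import NormalDist, fmean, stdev

def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(sample)
    observed = sorted(sample)
    fitted = NormalDist(fmean(sample), stdev(sample))  # normal with the sample's mean and sd
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))
```

If the data are roughly normal, the pairs lie close to the line y = x when plotted; strong curvature or stray points at the ends suggest skew or heavy tails.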
Welch’s t-test
Hypothesis Testing
A method for testing for a difference in the population means of two independent groups using the unpooled approach, i.e. without assuming the groups have equal variances.
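A minimal sketch of the unpooled calculation, using only the standard library: each group contributes its own variance to the standard error, and the degrees of freedom come from the Welch-Satterthwaite approximation. The function name `welch_t` is illustrative. Turning the result into a p-value needs a t-distribution CDF, which the standard library does not provide; SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` is one option if that library is available.

```python
import math
from statistics import fmean, variance

def welch_t(sample_a, sample_b):
    """Welch's t-statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # squared standard error of each group mean, from its own sample variance
    se_a2 = variance(sample_a) / n_a
    se_b2 = variance(sample_b) / n_b
    t = (fmean(sample_a) - fmean(sample_b)) / math.sqrt(se_a2 + se_b2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se_a2 + se_b2) ** 2 / (se_a2 ** 2 / (n_a - 1) + se_b2 ** 2 / (n_b - 1))
    return t, df
```

When the two groups have equal sizes and equal sample variances, `df` reduces to the pooled value n_a + n_b - 2; otherwise it is smaller.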
What is P-hacking and how to prevent it
Hypothesis Testing
- P-hacking consists of violating the assumptions of the statistical models being used, such as the requirement of independent random samples, and of committing logical errors.
- Ways to counter it include, among others, preregistration of research plans.
Why is it called Student’s t-test
Hypothesis Testing
It gets its name from William Sealy Gosset, who first published it in English in 1908 in the scientific journal Biometrika under the pseudonym “Student”. His employer (Guinness Brewery) preferred staff to use pen names when publishing scientific papers, and also didn’t want competitors to know they were using t-statistics.