Hypothesis Testing Flashcards
How are outliers detected?
A minimum and maximum cut-off are created by calculating Q1 (25th percentile) and Q3 (75th percentile) and then finding the interquartile range (IQR = Q3 - Q1).
Minimum is calculated with the following formula: Q1 - 1.5 * IQR. Anything below this number is an outlier.
Maximum is calculated with the following formula: Q3 + 1.5 * IQR. Anything above this number is an outlier.
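A minimal sketch of the rule above, using only the Python standard library. The function names and the sample values are made up for illustration, and the "inclusive" quartile method is an assumption (different software uses slightly different quartile conventions, so cut-offs can differ a little).

```python
from statistics import quantiles


def iqr_cutoffs(values, k=1.5):
    """Return the (minimum, maximum) cut-offs: Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles() sorts the data itself; n=4 gives Q1, median, Q3.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return every value below the minimum or above the maximum cut-off."""
    minimum, maximum = iqr_cutoffs(values, k)
    return [v for v in values if v < minimum or v > maximum]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # illustrative values only
print(iqr_cutoffs(data))    # cut-offs for this data
print(find_outliers(data))  # values flagged as outliers
```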
What are the steps to hypothesis testing?
Step 1: Form hypothesis
Step 2: Collect the data, calculate mean and standard error of the sample
Step 3: Compare the sample mean to the hypothesis mean (using statistics)
Step 4: Make a conclusion by comparing the p-value to a cut-off (significance level), commonly 0.05.
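The four steps can be sketched for a single sample as below. This is an illustration under stated assumptions, not a definitive method: the function name is hypothetical, the test is two-sided, and the p-value uses a normal approximation (for small samples a t-distribution would normally be used instead).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_test(sample, hypothesis_mean, cutoff=0.05):
    """Steps 2-4: summarise the sample, compare to the hypothesis, conclude."""
    # Step 2: mean and standard error of the sample.
    sample_mean = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    # Step 3: how many standard errors is the sample mean from the hypothesis mean?
    z = (sample_mean - hypothesis_mean) / standard_error
    # Two-sided p-value under a normal approximation (an assumption of this sketch).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: reject the hypothesis if the p-value is below the cut-off.
    return sample_mean, standard_error, p_value, p_value < cutoff


# Step 1: hypothesis, e.g. "the population mean is 27" (illustrative numbers).
print(one_sample_test([26, 27, 28], hypothesis_mean=27))
print(one_sample_test([30, 31, 32], hypothesis_mean=27))
```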
What should be expected if a hypothesis is true or false?
True: Sample mean close to hypothesis mean
False: Sample mean far away from hypothesis mean
What would p-value tell us about a sample mean?
Tells us the probability of getting a sample mean at least as far from the hypothesis mean as the one observed, just due to random variation in sampling from the population (i.e. assuming the hypothesis is true).
If the true population mean is 27 and the sample mean is 28, then the p-value tells us the probability of a sample mean of 28 or higher.
If the sample mean is 26, then the p-value tells us the probability of a sample mean of 26 or smaller.
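The 27 vs 28 and 27 vs 26 cases can be sketched as tail probabilities. The standard error of 0.5 is an assumed value chosen only for illustration (the card does not give one), the function name is hypothetical, and the sampling distribution of the mean is assumed to be normal.

```python
from statistics import NormalDist


def tail_probability(sample_mean, hypothesis_mean, standard_error):
    """One-sided p-value: probability of a sample mean this extreme or more."""
    # Distribution of sample means if the hypothesis mean is the true mean.
    sampling_dist = NormalDist(mu=hypothesis_mean, sigma=standard_error)
    if sample_mean >= hypothesis_mean:
        # Probability of a sample mean this high or higher.
        return 1 - sampling_dist.cdf(sample_mean)
    # Probability of a sample mean this low or lower.
    return sampling_dist.cdf(sample_mean)


# Assumed standard error of 0.5 (illustrative only).
print(tail_probability(28, 27, 0.5))  # probability of 28 or higher, about 0.023
print(tail_probability(26, 27, 0.5))  # probability of 26 or smaller, about 0.023
```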
What is a type 1 error?
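Even when something is statistically significant (e.g. p-value of 0.03), there is still a chance that the conclusion is incorrect: if the null hypothesis is true, a result this extreme would still occur by chance about 3% of the time. Errors of this kind are type 1 errors.
I.e. rejecting the null hypothesis when it is actually true. The probability of this is set by the cut-off used (5% with a cut-off of 0.05).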
What is a type 2 error?
Failing to reject the null hypothesis when it is actually false
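Both error types can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the population values (mean 27 or 29, standard deviation 5), the sample size, the number of trials, the function names, and the normal approximation for the p-value.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def rejects_null(sample, hypothesis_mean, cutoff=0.05):
    """Two-sided test with a normal approximation (assumption of this sketch)."""
    standard_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - hypothesis_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < cutoff


def rejection_rate(true_mean, hypothesis_mean, n=50, trials=1000, seed=0):
    """Fraction of simulated samples in which the null hypothesis is rejected."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 5) for _ in range(n)]
        rejections += rejects_null(sample, hypothesis_mean)
    return rejections / trials


# Null is true (27 vs 27): every rejection is a type 1 error.
type_1_rate = rejection_rate(true_mean=27, hypothesis_mean=27)
# Null is false (29 vs 27): every failure to reject is a type 2 error.
type_2_rate = 1 - rejection_rate(true_mean=29, hypothesis_mean=27)
print(type_1_rate, type_2_rate)
```

The type 1 rate should land near the 0.05 cut-off, while the type 2 rate depends on how far the true mean is from the hypothesis mean and on the sample size.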
What are the types of tests used for hypothesis testing?
Independent samples t-test
Paired t-test
Mann-Whitney U test
Wilcoxon signed-rank test
Chi-square test (and Fisher’s exact test)
Which tests are used for normally distributed outcomes and which for non-normally distributed outcomes?
Normal (parametric):
Independent and paired t-tests
Non-normal (non-parametric):
Mann-Whitney U test
Wilcoxon signed-rank test
Chi-square test
What is an independent samples t-test used for?
2 independent categorical groups.
A continuous outcome.
Observations are independent (no overlap in groups, e.g. male vs female groups)
Normally distributed outcome (approximately)
No massive outliers.
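A sketch of the test statistic for two independent groups. It assumes equal variances in the two groups (the pooled, Student's version of the test), which is an extra assumption not stated in the card; the function name and data are illustrative. The p-value would then come from a t-distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def independent_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Illustrative values for two non-overlapping groups.
print(independent_t_statistic([1, 2, 3], [4, 5, 6]))
```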
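What is a paired samples t-test used for?
2 related groups (e.g. twins, or the same subjects before/after)
A continuous outcome (age, Hb, etc.)
Observations are independent within groups (pairs are independent of each other)
Outcome is normally distributed (strictly, the differences between pairs)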
No massive outliers
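A sketch of the paired test statistic, which works on the differences within each pair rather than on the two groups separately. The function name and the before/after numbers are made up for illustration; the p-value would come from a t-distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) computed from within-pair differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Made-up measurements for four subjects, before and after.
before = [120, 130, 125, 140]
after = [115, 128, 120, 135]
print(paired_t_statistic(before, after))
```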
When is a Mann-Whitney U test used?
It is a non-parametric test
2 independent categorical groups (Male vs female for example)
A continuous outcome
Observations are independent
Normal distribution is not assumed.
Very similar to independent t-test except we don’t assume a normal distribution
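A sketch of how the U statistic can be computed by direct counting, which shows why no normal distribution is needed: only the ordering of values matters. The function name and data are illustrative, and ties are counted as half (a common convention, assumed here). Turning U into a p-value is left out of this sketch.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): how often a value in one group beats one in the other."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0  # a "wins" this pair
            elif a == b:
                u_a += 0.5  # ties count as half (assumed convention)
    # Every pair is counted once, so the two statistics sum to n_a * n_b.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # complete separation of the groups
print(mann_whitney_u([1, 4], [2, 3]))        # overlapping groups
```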
When is a Wilcoxon signed-rank test used?
2 related groups
A continuous outcome
Observations are independent within groups
Does not assume normal distribution.
Same as paired t-test but does not assume normal distribution
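A sketch of the rank sums behind the signed-rank test. The function name and data are illustrative; dropping zero differences and giving tied values their average rank are common conventions assumed here. Turning the rank sums into a p-value is left out of this sketch.

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Within-pair differences, ignoring pairs with no change (assumed convention).
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(abs(d) for d in diffs)
    # Average rank for each distinct absolute difference (ranks start at 1).
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1
        count = ordered.count(value)
        rank_of[value] = first + (count - 1) / 2
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus


# Made-up paired measurements; the last pair does not change and is dropped.
print(signed_rank_sums([10, 12, 9, 11], [12, 11, 13, 11]))
```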
What is a chi-squared test used for?
To assess distribution of a categorical variable between 2 or more groups.
When is a chi-squared test appropriate?
It is non-parametric
Assumes the expected frequency is at least 5 in each cell (if this assumption is not valid, we can use Fisher’s exact test)
Null hypothesis is that the distribution of observations between columns is independent of the rows.
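A sketch of the chi-squared statistic for a table of counts, including the "expected frequency at least 5" check. The function name and the counts are made up for illustration. Expected counts are what each cell would hold if rows and columns were independent (the null hypothesis).

```python
def chi_square_statistic(table):
    """Return (statistic, degrees of freedom, expected-count check passed)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count per cell under independence: row total * column total / total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    # If any expected count is below 5, Fisher's exact test is the alternative.
    expected_ok = all(exp >= 5 for row in expected for exp in row)
    return statistic, df, expected_ok


# Made-up 2x2 table of counts (rows = groups, columns = categories).
print(chi_square_statistic([[10, 20], [20, 10]]))
```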
A study tests patients’ BP before and after taking a magnesium supplement. Patients are independent of each other. Outcome is found to be normally distributed. We wish to test if magnesium has a significant effect on BP. Which test should be used?
Paired t-test
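The reasoning behind this answer can be sketched as a small lookup that encodes the rules from the cards above. The function and parameter names are hypothetical, and real test selection involves more judgement than this.

```python
def choose_test(outcome, paired=False, normal=False, min_expected_count=None):
    """Pick a test name from the outcome type, pairing and normality."""
    if outcome == "categorical":
        # Chi-square needs expected counts of at least 5 in each cell.
        if min_expected_count is not None and min_expected_count < 5:
            return "Fisher's exact test"
        return "Chi-square test"
    if outcome != "continuous":
        raise ValueError("outcome must be 'continuous' or 'categorical'")
    if paired:
        return "Paired t-test" if normal else "Wilcoxon signed-rank test"
    return "Independent samples t-test" if normal else "Mann-Whitney U test"


# BP before/after in the same patients: continuous, related groups, normal.
print(choose_test("continuous", paired=True, normal=True))
```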