Hypothesis Testing Flashcards
To have a clear understanding of:
- Sample and population
- Hypothesis testing
- P-values and confidence intervals
- Testing a hypothesis referring to a single mean and a single proportion
- Interpretation of statistical output for each test
Who is included in a population?
Everyone who meets the definition of the population: both current members and future members.
Why is a sample of a population used?
Because it is not feasible to collect data on the entire population, a sample is used to tell us about the theoretical population.
What is needed before collecting data from a sample?
The theoretical population must be defined in order to understand how generalisable the inferences from the sample are. The sample must be representative of the population.
What are the key considerations for a random sample?
- Each individual from the population must have an equal chance of being included.
- The inclusion of one individual must not affect the inclusion of another.
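As a minimal sketch of the two considerations above, a simple random sample can be drawn with Python's standard library. The population of patient IDs and the function name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every individual has the same chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: 100 patient IDs.
patient_ids = list(range(1, 101))
sample = simple_random_sample(patient_ids, 10, seed=42)
print(sample)
```

No individual is favoured by the selection, and no one is chosen because of who else has already been chosen.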
What is stratified random sampling?
A method in which the population is divided into smaller subgroups (strata) and a random sample is taken from each, in order to ensure the sample is representative of the entire population.
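A rough sketch of proportional stratified sampling, assuming each record carries a label for its subgroup. The records, the age-group labels, and the helper `stratified_sample` are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, seed=None):
    """Take the same fraction at random from every subgroup (stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)   # group records by stratum
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 80 adults and 20 children.
records = [("adult", i) for i in range(80)] + [("child", i) for i in range(20)]
sample = stratified_sample(records, lambda r: r[0], 0.1, seed=1)
adults = sum(1 for r in sample if r[0] == "adult")
children = sum(1 for r in sample if r[0] == "child")
print(adults, children)
```

Sampling the same fraction from each subgroup keeps the 80:20 split of the population in the sample, which a simple random sample of ten would not guarantee.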
Why is the normal distribution important in medical research?
- To understand the central tendency and variability of the data to allow for summarising and interpretation of large datasets.
- For use of parametric statistical tests such as t-tests and ANOVA.
- For predictive modelling and risk assessment.
- For quality control in order to set limits and identify outliers.
What is the standard error?
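A measure of the variability between sample means, that is, how much the sample mean would vary from one sample to another.
It estimates how far a sample mean is likely to be from the true population mean, and is calculated as the sample standard deviation divided by the square root of the sample size.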
What is the standard deviation?
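A measure of how far individual data points typically lie from the mean.
It describes the variability of the individual observations in a population or sample, not the variability of the mean (that is the standard error).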
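To make the difference between the standard deviation and the standard error concrete, here is a small sketch on hypothetical measurements. It uses the usual relationship that the standard error is the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

# Hypothetical measurements from a sample of five individuals.
values = [4.0, 6.0, 8.0, 10.0, 12.0]

mean = statistics.mean(values)
sd = statistics.stdev(values)        # sample SD (n - 1 denominator): spread of individual values
se = sd / math.sqrt(len(values))     # standard error: uncertainty in the sample mean

print(f"mean={mean:.2f}  sd={sd:.2f}  se={se:.2f}")
```

The standard error shrinks as the sample size grows, whereas the standard deviation settles on the spread of the underlying population.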
What does the null hypothesis state?
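That there is no difference or effect, for example no difference between populations, or between a population mean (or proportion) and a hypothesised value.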
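What does the alternative hypothesis state?
That there is a difference or effect, for example a difference between populations, or between a population mean (or proportion) and a hypothesised value.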
What do you have to assume about a sample in order to test a hypothesis?
- The sample is representative of the population.
- The observations are independent of one another.
- The variances are homogeneous (relevant when groups are compared).
- The data come from a normally distributed population.
What can you assume if the sample size is > 30?
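That the distribution of the sample mean is approximately normal, no matter the distribution of the underlying data (the central limit theorem). The threshold of 30 is a rule of thumb; heavily skewed data may need a larger sample.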
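The behaviour of the sample mean can be checked with a small simulation. This sketch assumes an exponential distribution with mean 1 as a stand-in for strongly skewed data; the sample size of 40 and the 2000 repetitions are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is reproducible

def sample_mean(n):
    """Mean of n draws from an exponential distribution (mean 1, right-skewed)."""
    return statistics.mean(rng.expovariate(1.0) for _ in range(n))

# Repeat the experiment many times and look at how the sample means behave.
means = [sample_mean(40) for _ in range(2000)]
centre = statistics.mean(means)     # should sit close to the population mean of 1
spread = statistics.stdev(means)    # should sit close to 1 / sqrt(40)

print(f"centre={centre:.3f}  spread={spread:.3f}  expected spread={1 / math.sqrt(40):.3f}")
```

Although the individual values are far from normal, the sample means cluster symmetrically around the population mean, with a spread close to the theoretical standard error.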
When should you use the normal distribution?
If the sample size is large (> 30), or the population standard deviation is known, and the assumptions are met.
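As one case where the normal distribution is used, the following sketch tests a single proportion against a hypothesised value. The counts (60 successes out of 100) and the null value of 0.5 are hypothetical, and `one_proportion_z_test` is an illustrative helper rather than a library function.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: population proportion equals p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 of 100 patients respond; H0 says the true proportion is 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(f"z={z:.2f}  p={p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would be taken as evidence against the null hypothesis.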
When should you use the t-distribution?
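If the sample size is small (< 30) and the population standard deviation is unknown (so it has to be estimated from the sample), and the assumptions are met.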
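A minimal sketch of a single-mean t-test on a small sample. The ten measurements and the hypothesised mean of 5.0 are hypothetical. Because the standard library has no t-distribution, the two-sided 5% critical value for 9 degrees of freedom (2.262) is taken from a t-table instead of being computed.

```python
import math
import statistics

# Two-sided 5% critical value for 9 degrees of freedom, read from a t-table.
T_CRIT_DF9 = 2.262

def one_sample_t(values, mu0):
    """t statistic for H0: population mean equals mu0."""
    se = statistics.stdev(values) / math.sqrt(len(values))  # estimated standard error
    return (statistics.mean(values) - mu0) / se

# Hypothetical sample of 10 measurements; H0 says the population mean is 5.0.
values = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7, 5.3, 5.5]
t_stat = one_sample_t(values, 5.0)
reject_null = abs(t_stat) > T_CRIT_DF9

print(f"t={t_stat:.3f}  reject H0 at 5%: {reject_null}")
```

With a different sample size, the critical value has to be looked up again for n - 1 degrees of freedom.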
What should you do if assumptions are not met?
First, try to transform the data (for example, with a log transformation).
If the transformed data still does not meet the assumptions, use a non-parametric test.
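As a rough illustration of the transformation step, this sketch applies a natural-log transformation to hypothetical right-skewed values and compares a crude skewness indicator before and after. The indicator, (mean - median) / SD, is an informal check chosen for simplicity, not a formal test of normality.

```python
import math
import statistics

def crude_skew(values):
    """(mean - median) / SD: near 0 for symmetric data, positive for right skew."""
    return (statistics.mean(values) - statistics.median(values)) / statistics.stdev(values)

# Hypothetical right-skewed measurements (all positive, so the log is defined).
raw = [1, 2, 2, 3, 4, 5, 8, 13, 40, 100]
logged = [math.log(v) for v in raw]

skew_raw = crude_skew(raw)
skew_log = crude_skew(logged)
print(f"raw skew={skew_raw:.2f}  log skew={skew_log:.2f}")
```

If the transformed values still look clearly non-normal, a non-parametric test is the fallback described above.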