RESS I: Data Analysis #2 Flashcards
What is a sample used for?
To make statistical inferences about the population from which it was drawn.
What is statistical inference?
Statistical inference is the process of using the value of a sample statistic to make an informed guess about the value of a population parameter.
What is a parameter?
A particular characteristic of the population that we are interested in, e.g. the population mean or proportion, a mean difference, or a difference in proportions.
What is estimation in statistical inference?
The process of using summary statistics from collected sample data to estimate the corresponding population parameters.
What is hypothesis testing?
Making a hypothesis about a population and then collecting sample data to see whether they give evidence against the hypothesis.
What is standard error?
In order to estimate the precision of the sample mean, the most obvious option available to us would be to repeatedly measure fresh samples from that population.
We could then calculate the spread of means generated by repeated sets of measurements and calculate the standard deviation of these means as a measure of their spread.
The standard deviation of these different estimates from repeated samples is known as the standard error.
The standard error of an estimate represents the typical distance between the estimate and its population parameter.
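The repeated-sampling idea can be sketched as a small simulation. The population values, sample size and number of repeats below are made up for illustration; the point is only that the standard deviation of the sample means comes out close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: normally distributed, mean 50, standard deviation 10.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE = 25
N_REPEATS = 2000

# Draw many fresh samples and record the mean of each one.
sample_means = []
for _ in range(N_REPEATS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

# The standard deviation of those means is the (empirical) standard error.
empirical_se = statistics.stdev(sample_means)
theoretical_se = POP_SD / SAMPLE_SIZE ** 0.5  # 10 / 5 = 2.0

print(f"SD of sample means: {empirical_se:.2f}")
print(f"SD / sqrt(n):       {theoretical_se:.2f}")
```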
How do you calculate standard error?
SE = standard deviation / square root of n
Where n is the number of observations in the sample (the sample size).
From this we can deduce that the larger the sample size, the smaller the standard error.
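A minimal sketch of the formula, using made-up measurements. The function name standard_error is hypothetical, and the larger sample is just the smaller one repeated four times, purely to keep the spread similar while n grows.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(sample)
    return statistics.stdev(sample) / math.sqrt(n)

# Made-up measurements for illustration.
small_sample = [4.0, 6.0, 5.0, 7.0, 3.0]   # n = 5
large_sample = small_sample * 4            # n = 20, similar spread

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)

print(f"n = {len(small_sample):2d}: SE = {se_small:.3f}")
print(f"n = {len(large_sample):2d}: SE = {se_large:.3f}")
```

The second standard error is smaller than the first, in line with the larger sample size.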
What is the difference between precision and accuracy?
Measurements that are close to the known value are said to be accurate, whereas measurements that are close to each other are said to be precise.
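One way to put numbers on the distinction, as a rough sketch with made-up measurements: here accuracy is summarised by how far the mean sits from the known value (the bias) and precision by the standard deviation of the measurements. Both summaries, and the name describe, are assumptions chosen for illustration.

```python
import statistics

TRUE_VALUE = 10.0  # hypothetical known value

# Made-up measurement sets for illustration.
accurate_not_precise = [8.0, 12.0, 9.0, 11.0, 10.0]     # centred on 10, widely spread
precise_not_accurate = [12.1, 12.0, 11.9, 12.0, 12.0]   # tightly grouped, off target

def describe(measurements):
    """Return (bias, spread) for a list of measurements."""
    bias = statistics.fmean(measurements) - TRUE_VALUE  # accuracy: distance from known value
    spread = statistics.stdev(measurements)             # precision: closeness to each other
    return bias, spread

bias_a, spread_a = describe(accurate_not_precise)
bias_p, spread_p = describe(precise_not_accurate)

print(f"accurate, not precise: bias = {bias_a:+.2f}, spread = {spread_a:.2f}")
print(f"precise, not accurate: bias = {bias_p:+.2f}, spread = {spread_p:.2f}")
```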
What can standard error also tell us?
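The standard error can also be used to estimate the range of estimates (in general) that are most likely to occur. This is because approximately 95% of all sample means will fall within +/– (1.96 x standard error of the mean) of the population mean, so our sample mean value +/– (1.96 x standard error of the mean) gives a range that is likely to contain the population mean.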
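The 95% figure can be checked with a short simulation. This is a sketch under stated assumptions: the population is normal with made-up mean and standard deviation, and the range is centred on the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population values, chosen only for illustration.
POP_MEAN, POP_SD = 100.0, 15.0
SAMPLE_SIZE = 30
N_REPEATS = 5000

# Range of population mean +/- 1.96 standard errors.
se = POP_SD / SAMPLE_SIZE ** 0.5
lower, upper = POP_MEAN - 1.96 * se, POP_MEAN + 1.96 * se

# Count how many sample means land inside that range.
inside = 0
for _ in range(N_REPEATS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    if lower <= statistics.fmean(sample) <= upper:
        inside += 1

proportion_inside = inside / N_REPEATS
print(f"Sample means within 1.96 SE of the population mean: {proportion_inside:.1%}")
```

The printed proportion should come out close to 95%.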
Where does 99% of the data lie in a normal distribution?
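Within about 2.58 standard deviations of the mean. About 99.7% of the data lie within 3 standard deviations of the mean (and about 99.7% of sample means lie within 3 standard errors of the population mean).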
Where does 95% of the data lie in a normal distribution?
Within about 2 standard deviations of the mean, more precisely 1.96 (and about 95% of sample means lie within 2 standard errors of the population mean).
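These coverage figures can be checked against the standard normal distribution. The sketch below uses NormalDist from Python's statistics module; the helper names coverage and multiplier are hypothetical.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

def multiplier(level):
    """Number of standard deviations either side of the mean containing `level` of the data."""
    return z.inv_cdf(0.5 + level / 2)

print(f"within 2 SD: {coverage(2):.4f}")         # 0.9545
print(f"within 3 SD: {coverage(3):.4f}")         # 0.9973
print(f"95% multiplier: {multiplier(0.95):.2f}")  # 1.96
print(f"99% multiplier: {multiplier(0.99):.2f}")  # 2.58
```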
What is the 95% Confidence interval?
A 95% confidence interval is a range of values that you can be 95% confident contains the true mean of the population: if the sampling were repeated many times, about 95% of the intervals calculated in this way would contain the true mean. This is not the same as a range that contains 95% of the values. It is calculated as the sample mean + or – 1.96 x standard error of the mean (SEM).
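A minimal sketch of the calculation, using made-up measurements and a hypothetical function name. The 1.96 multiplier assumes a reasonably large sample; for a sample as small as this one, a multiplier from the t-distribution would normally be used instead.

```python
import math
import statistics

def confidence_interval_95(sample):
    """Return (low, high): sample mean +/- 1.96 x standard error of the mean."""
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = 1.96 * sem
    return mean - margin, mean + margin

# Made-up measurements for illustration.
sample = [4.0, 6.0, 5.0, 7.0, 3.0]
low, high = confidence_interval_95(sample)

print(f"mean = {statistics.fmean(sample):.2f}")
print(f"95% CI = ({low:.2f}, {high:.2f})")
```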
What is the framework for a hypothesis test?
- State the null and alternative hypotheses
- Decide on a level of significance (p-value cut-off)
- Define and evaluate a test statistic
- Calculate the p-value
- Interpret the results
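The five steps can be sketched on a two-group comparison. Everything here is an assumption made for illustration: the data are made up, and the test is a large-sample z-approximation for a difference in means, chosen because it needs only the standard library. With groups this small a t-test would be the more usual choice.

```python
import math
from statistics import NormalDist, fmean, variance

# Made-up data for two hypothetical groups.
control = [5.1, 4.9, 5.6, 5.0, 5.3, 4.8, 5.2, 5.4, 5.0, 5.1]
treated = [5.9, 6.1, 5.7, 6.3, 5.8, 6.0, 6.2, 5.6, 6.1, 5.9]

# 1. Hypotheses. Null: no difference in population means.
#    Alternative: there is a difference.

# 2. Level of significance, decided before looking at the data.
ALPHA = 0.05

# 3. Test statistic: difference in means divided by its standard error.
diff = fmean(treated) - fmean(control)
se_diff = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
z_stat = diff / se_diff

# 4. Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# 5. Interpretation: compare the p-value with the cut-off.
reject_null = p_value < ALPHA
print(f"difference = {diff:.2f}, z = {z_stat:.2f}, p = {p_value:.4g}")
print("Evidence against the null hypothesis" if reject_null
      else "No evidence against the null hypothesis")
```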
What form does the null hypothesis usually take?
The null hypothesis is usually of the form:
- there is no difference between the two groups
- or there is no association between treatment and outcome
What form does the alternative hypothesis usually take?
The alternative hypothesis is usually of the form:
- there is a difference between the two groups
- or there is an association between treatment and outcome