EAB - Estimation and Significance Tests and P Values Flashcards
What is sampling error?
Samples provide an incomplete picture of the population.
Different samples will give different estimates, which is called ‘sampling error’.
What is sampling distribution?
Sample estimates (e.g. means) are calculated from multiple samples from the same population.
These estimates will then have a distribution of differing values, which is known as the ‘sampling distribution’.
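A minimal simulation sketch of a sampling distribution, using only the Python standard library. The population (10,000 values centred on 120 with SD 15), the sample size of 50 and the function name `sample_means` are all hypothetical choices for illustration.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population, e.g. systolic blood pressure readings.
pop_rng = random.Random(42)
population = [pop_rng.gauss(120, 15) for _ in range(10_000)]

# Each sample gives a slightly different estimate (sampling error);
# together the estimates form the sampling distribution.
means = sample_means(population, sample_size=50, n_samples=1_000)
print(f"population mean:        {statistics.mean(population):.1f}")
print(f"mean of sample means:   {statistics.mean(means):.1f}")
print(f"spread of sample means: {statistics.stdev(means):.2f}")
```

The spread of the sample means is much smaller than the spread of the individual values in the population.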
What are two measures we can introduce to deal with uncertainty in drawing conclusions?
- Confidence interval:
If we are estimating some quantity from our data, for example, the proportion of patients who have a particular attribute, then we can quantify the imprecision in the estimate using a confidence interval.
- Statistical significance test:
If we are testing a hypothesis, for example, comparing blood pressure in two groups, then we can do a statistical significance test which helps us to weigh the evidence that the sample difference we have observed is in fact a real difference.
What is the relationship between sample size and how close the estimate is to the true mean?
The bigger the sample size, the closer the estimate tends to be to the true mean.
What is the relationship between the spread of the data and how close the estimate is to the true mean?
The smaller the spread of the data (standard deviation), the closer the estimate tends to be to the true mean.
What is a standard error?
A standard error (SE) is an indication of the extent of the sampling error.
Standard error tells us how much a sample mean tends to vary from the population mean (true mean). It provides an estimate of the precision of the sample mean.
How do you calculate standard error?
For a sample mean, it can be calculated from the standard deviation divided by the square root of the sample size.
(SE = SD / √n, where n is the sample size)
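The formula can be sketched in Python as follows. The function name `standard_error` and the readings are hypothetical. The sketch assumes SD means the sample standard deviation (n − 1 denominator), which is what `statistics.stdev` computes.

```python
import math
import statistics

def standard_error(values):
    """SE of the mean: sample SD divided by the square root of n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(values) / math.sqrt(n)

# Hypothetical blood pressure readings, for illustration only.
readings = [118, 125, 130, 122, 115, 128, 121, 119, 124]
print(f"SE = {standard_error(readings):.2f}")
```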
How can standard error be used to calculate a confidence interval?
The true (population) mean can be expected to lie in the range: (sample mean – 1.96 standard errors) to (sample mean + 1.96 standard errors) in 95% of calculations.
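A short sketch of that range in Python. The function name `mean_ci95` and the data are hypothetical, and the 1.96 multiplier assumes normal data or a large sample.

```python
import math
import statistics

def mean_ci95(values):
    """95% CI for a mean: sample mean plus or minus 1.96 standard errors."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean - 1.96 * se, mean + 1.96 * se

# Hypothetical readings; a sample this small is only reasonable
# if the data are roughly normal.
readings = [118, 125, 130, 122, 115, 128, 121, 119, 124, 127, 120, 123]
lower, upper = mean_ci95(readings)
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```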
What are our assumptions when calculating a 95% confidence interval for a population mean?
- this is normal data or a large sample (at least 60)
- the sample is chosen at random from the population
- the observations are independent of each other
What are our assumptions when calculating a 95% confidence interval for a population proportion?
- the sample is chosen at random from the population
- the observations are independent of each other
- the proportion with the characteristic is not close to 0 or 1
- np and n(1-p) are each greater than 5 (large sample)
How do you calculate the standard error for proportion?
Multiply the proportion with the characteristic by the proportion without the characteristic:
p(1-p)
Divide by the sample size:
p(1-p)/n
Take the square root to deduce the SE:
SE = √[p(1 − p)/n]
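The three steps above can be sketched in Python. The function name `proportion_se` and the counts in the example are hypothetical.

```python
import math

def proportion_se(with_characteristic, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    if n <= 0 or not 0 <= with_characteristic <= n:
        raise ValueError("need 0 <= with_characteristic <= n and n > 0")
    p = with_characteristic / n      # proportion with the characteristic
    product = p * (1 - p)            # step 1: p(1 - p)
    per_observation = product / n    # step 2: divide by the sample size
    return math.sqrt(per_observation)  # step 3: take the square root

# Hypothetical example: 50 of 100 patients have the characteristic.
print(f"SE = {proportion_se(50, 100):.3f}")  # prints "SE = 0.050"
```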
What is a significance test (and its benefit)?
A significance test uses data from a sample to weigh the evidence for or against a hypothesis about a population. There are always two mutually exclusive hypotheses since, if the hypothesis being tested is not true, then the opposite hypothesis must be true.
A measure of the evidence for or against the hypothesis is provided by a P value.
What is the null hypothesis?
The null hypothesis is the baseline hypothesis which is usually of the form ‘there is no difference’ or ‘there is no association’.
The corresponding alternative hypothesis is ‘there is a difference’ or ‘there is an association’.
What is a two-sided test (two-tailed test)?
It is known as a two-sided or two-tailed test when the alternative hypothesis is general and allows the difference to be in either direction.
What is a one-sided test (one-tailed test)?
It is known as a one-sided or one-tailed test when the alternative hypothesis is specific and allows the difference to be in only one direction.
Two-sided tests should always be used unless there is clear justification at the outset to use a one-sided test.
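To illustrate the difference, here is a minimal sketch that turns the same test statistic into a two-sided and a one-sided P value. It assumes the statistic is a z value that follows a standard normal distribution under the null hypothesis, and that the one-sided alternative is "greater than"; the function name `p_values_from_z` is hypothetical.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (two-sided P, one-sided upper-tail P) for a z statistic."""
    std_normal = NormalDist()  # mean 0, SD 1
    # Two-sided: extreme values in either direction count as evidence.
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    # One-sided: only values above zero count as evidence.
    upper_tail = 1 - std_normal.cdf(z)
    return two_sided, upper_tail

two_sided, one_sided = p_values_from_z(1.96)
print(f"two-sided P = {two_sided:.3f}, one-sided P = {one_sided:.3f}")
```

For a positive z, the one-sided P value is half the two-sided one, which is why a one-sided test needs to be justified before the data are seen.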
What are the steps in doing a significance test?
- Specify the hypothesis of interest as a null and alternative hypothesis.
- Decide what statistical test is appropriate.
- Use the test to calculate the P value.
- Weigh the evidence from the P value in favour of the null or alternative hypothesis.
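The steps above can be sketched for one hypothetical case: comparing mean blood pressure in two groups. The choice of a large-sample z test is an assumption made for illustration (other tests may be more appropriate for real data), and the simulated data and the function name `two_sample_z_test` are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def two_sample_z_test(group_a, group_b):
    """Two-sided large-sample z test for a difference in two means."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se_diff = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p

# Step 1: null hypothesis "no difference in mean blood pressure";
#         alternative "there is a difference" (two-sided).
# Step 2: large-sample z test (an assumption of this sketch).
rng = random.Random(7)
group_a = [rng.gauss(120, 15) for _ in range(100)]  # hypothetical data
group_b = [rng.gauss(126, 15) for _ in range(100)]  # hypothetical data

# Step 3: calculate the P value.
diff, z, p = two_sample_z_test(group_a, group_b)

# Step 4: weigh the evidence.
print(f"difference = {diff:.1f}, z = {z:.2f}, P = {p:.4f}")
```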
Describe the types of errors in significance testing.
Since a significance test uses sample data to make inferences about populations, using the results from a sample may lead to a wrong conclusion:
TYPE 1 ERROR:
this is getting a significant result in a sample when the null hypothesis is in fact true in the underlying population (‘false significant’ result).
We usually set a limit of 0.05 (5%) for the probability of a type 1 error which is equivalent to a 0.05 cut-off for statistical significance.
TYPE 2 ERROR:
this is getting a non-significant result in a sample when the null hypothesis is in fact false in the underlying population (‘false non-significant’ result).
It is widely accepted that the probability of a type 2 error should be no more than 0.20 (20%).
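Both error rates can be estimated by simulation. The sketch below is illustrative only: it assumes a large-sample z test, two groups of 50, an SD of 1, and a true difference of 0.5 when the null hypothesis is false. All of these values and the function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def z_test_p(a, b):
    """Two-sided P value from a large-sample z test on two group means."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def significant_fraction(true_diff, n=50, reps=1000, alpha=0.05, seed=1):
    """Fraction of simulated studies with P < alpha for a given true difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(true_diff, 1) for _ in range(n)]
        if z_test_p(a, b) < alpha:
            hits += 1
    return hits / reps

# Null hypothesis true: every significant result is a type 1 error.
type1_rate = significant_fraction(true_diff=0.0)
# Null hypothesis false: every non-significant result is a type 2 error.
type2_rate = 1 - significant_fraction(true_diff=0.5)

print(f"estimated type 1 error rate: {type1_rate:.3f}")
print(f"estimated type 2 error rate: {type2_rate:.3f}")
```

With a 0.05 cut-off the estimated type 1 error rate should come out close to 5%. The type 2 error rate depends on the assumed true difference and the sample size.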
Describe what a P value is.
A P value is a probability, and therefore lies between 0 and 1. It comes from a statistical test that is testing a particular null hypothesis.
It expresses the weight of evidence in favour of or against the stated null hypothesis.
Precise definition: the P value is the probability, given that the null hypothesis is true, of obtaining data as extreme as, or more extreme than, that observed.
What is the cut off point for a p value, and what does that indicate?
0.05 or 5% is commonly used as a cut-off, such that if the observed P is less than this (P<0.05) we consider that there is good evidence that the null hypothesis is not true. This is directly related to the type 1 error rate.
If 0.05 is the cut-off then P< 0.05 is commonly described as statistically significant and P≥0.05 is described as not statistically significant.
List some factors that affect the size of the P value.
- the size of the real effect in the population sampled
- the sample size
- the variability of the measure involved
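A rough sketch of how these three factors move the P value. It assumes a two-sided z test comparing two equal-sized groups with a common SD, and that the observed difference equals the real effect. The function name `two_sided_p` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """Two-sided P value for a difference in means between two equal groups."""
    se = sd * math.sqrt(2 / n_per_group)  # SE of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

baseline = two_sided_p(diff=5, sd=15, n_per_group=20)
bigger_effect = two_sided_p(diff=10, sd=15, n_per_group=20)
bigger_sample = two_sided_p(diff=5, sd=15, n_per_group=200)
more_variable = two_sided_p(diff=5, sd=30, n_per_group=20)

print(f"baseline:         P = {baseline:.3f}")
print(f"larger effect:    P = {bigger_effect:.3f}")
print(f"larger sample:    P = {bigger_sample:.4f}")
print(f"more variability: P = {more_variable:.3f}")
```

A larger effect or a larger sample gives a smaller P value, while more variability gives a larger one.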
What does clinical significance indicate?
This indicates that the difference observed is large enough to be clinically meaningful. It is not necessarily related to statistical significance as it is a clinical judgement and not a mathematical quantity.