EAB - Estimation And Significance Tests And P-Values Flashcards
what is a pro of using a sample?
Can use a sample to obtain a confidence interval
what is a con of using a sample
Cannot deduce exact population value from a sample
what is a confidence interval
A range within which the population value is likely to lie
what is sampling error
The variation that arises because different samples from the same population give different estimates
what is sampling distribution
Sample estimates (e.g. means) calculated from multiple samples from the same population take differing values; the distribution of those values is known as the ‘sampling distribution’.
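A minimal simulation sketch of a sampling distribution. The population values (mean 100, SD 15), the sample size and the number of samples are hypothetical, chosen only for illustration, and the function name is made up for this example.

```python
import random
import statistics

def simulate_sample_means(pop_mean, pop_sd, n, n_samples, seed=0):
    """Draw n_samples random samples of size n and return the mean of each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.fmean(rng.gauss(pop_mean, pop_sd) for _ in range(n))
        for _ in range(n_samples)
    ]

# Hypothetical population: mean 100, SD 15 (illustrative values only)
sample_means = simulate_sample_means(pop_mean=100, pop_sd=15, n=25, n_samples=2000)

# Centre and spread of the simulated sampling distribution
print(round(statistics.fmean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```

The sample means cluster around the population mean, and their spread should be close to 15 / sqrt(25) = 3.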
Which sample mean gives the most precise estimate of population mean?
A: Random sample of 50 men, standard deviation 10
B: Random sample of 1000 men, standard deviation 10
B
THE MORE DATA = MORE PRECISE THE ESTIMATE
Which sample mean gives the most precise estimate of population mean?
A: Random sample of 200 men, standard deviation 20
B: Random sample of 200 men, standard deviation 5
B
LOWER STANDARD DEVIATION = MORE PRECISE ESTIMATE
What is standard error?
A standard error (SE) is an indication of the extent of the sampling error
How can Standard Error be calculated
standard deviation divided by the square root of the sample size
SE = SD / sqrt(n)
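A short sketch of this formula, applied to the two A/B comparison cards earlier in the set. The function name is an assumption made for this example.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Same SD (10), different sample sizes
se_n50 = standard_error(10, 50)
se_n1000 = standard_error(10, 1000)
print(round(se_n50, 2), round(se_n1000, 2))   # 1.41 0.32

# Same sample size (200), different SDs
se_sd20 = standard_error(20, 200)
se_sd5 = standard_error(5, 200)
print(round(se_sd20, 2), round(se_sd5, 2))    # 1.41 0.35
```

In both comparisons option B has the smaller SE, so it gives the more precise estimate.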
what does standard error tell us
Standard error tells us how much a sample mean tends to vary from the population mean (true mean).
It provides an estimate of the precision of the sample mean.
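what does a smaller SE mean
the sample mean is a more precise estimate of the population mean
what does a larger SE mean
the sample mean is a less precise estimate of the population mean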
What range can the true population value be expected to lie in?
- The true (population) value can be expected to lie in the range:
- sample mean - 1.96 standard errors to
- sample mean + 1.96 standard errors in 95% of calculations
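A minimal sketch of this range for a mean. The input figures (100 men, mean 120, SD 10) are hypothetical and not from the flashcards, and the function name is an assumption.

```python
import math

def mean_ci_95(sample_mean, sd, n):
    """95% confidence interval for a mean: sample mean -/+ 1.96 standard errors."""
    se = sd / math.sqrt(n)
    margin = 1.96 * se
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: n = 100, mean = 120, SD = 10, so SE = 1
low, high = mean_ci_95(sample_mean=120, sd=10, n=100)
print(round(low, 2), round(high, 2))  # 118.04 121.96
```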
What are the 4 assumptions in calculating a confidence interval for a mean?
- Normal data or large sample
- The sample is chosen at random from the population
- The observations are independent of each other
- The sample is not small (at least 60)
How do you calculate the 95% confidence interval for a proportion?
It has the same form as for a mean: sample proportion -/+ 1.96 standard errors.
What are the 4 assumptions for the confidence interval of a proportion?
- the sample is chosen at random from the population
- the observations are independent of each other
- the proportion with the characteristic is not close to 0 or 1
- np and n(1-p) are each greater than 5 (large sample)
What is the standard error for proportion?
- Multiply the proportion with the characteristic by the proportion without the characteristic
- p(1-p)
- Divide by the sample size
- p(1-p)/n
- Take the square root to deduce the SE
- SE = sqrt(p(1-p)/n)
- Worked through example.
- Out of 7074 men, 1981 smoked cigarettes.
- Proportion smoking = 0.28 (28%)
- Standard error of proportion = 0.0053
- 95% confident that true proportion is in the range:
0.28 - 1.96 x 0.0053 to 0.28 + 1.96 x 0.0053 - i.e. in the range 27% to 29%
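The worked example can be reproduced with a short sketch. Only the counts 1981 and 7074 come from the card; the function name is an assumption.

```python
import math

def proportion_ci_95(count, n):
    """Return the proportion, its standard error, and the 95% CI limits."""
    p = count / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - 1.96 * se, p + 1.96 * se

p, se, low, high = proportion_ci_95(count=1981, n=7074)
print(round(p, 2))                      # 0.28
print(round(se, 4))                     # 0.0053
print(round(low, 2), round(high, 2))    # 0.27 0.29
```

The assumptions listed above hold comfortably here: np and n(1-p) are both far greater than 5.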
what is the difference in the 95% confidence interval between a mean from a large sample and a proportion
- the interval has the same form; only the estimate and its standard error differ
- for the mean from a large sample the 95% confidence interval is:
- sample mean - 1.96 standard errors to sample mean + 1.96 standard errors
- for a proportion the 95% confidence interval is:
- sample proportion - 1.96 standard errors to sample proportion + 1.96 standard errors
what decreases standard error?
as sample size increases, standard error decreases
what is the null hypothesis
The null hypothesis (NH) states that “No relationship exists between the variables and outcomes of a study”
we then ask… does the sample data provide sufficient evidence to REJECT the null hypothesis
what is a p-value
provides a way of weighing evidence against the null hypothesis
p value is a probability that lies between 0 and 1
The smaller the p-value, the stronger the evidence against the Null Hypothesis
what does it mean if the p value is LESS THAN 0.05 (p<0.05)
provides good evidence to REJECT the null hypothesis,
suggesting that a real difference or association DOES exist
and the result is STATISTICALLY SIGNIFICANT
what does it mean if the p value is MORE THAN or EQUAL TO 0.05 (p≥0.05)
provides INSUFFICIENT EVIDENCE to reject the null hypothesis,
so a real difference or association HAS NOT BEEN SHOWN (this does not prove that none exists),
and the result is NOT STATISTICALLY SIGNIFICANT
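A minimal sketch of the 0.05 rule, assuming the test statistic is a z value that follows a standard normal distribution under the null hypothesis. The z values and the function name are hypothetical, chosen only for illustration.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the area in both tails
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical test statistics (not from the flashcards)
for z in (2.5, 1.0):
    p_value = two_sided_p_value(z)
    verdict = "statistically significant" if p_value < 0.05 else "not statistically significant"
    print(z, round(p_value, 4), verdict)
```

With these inputs, z = 2.5 gives a p-value of roughly 0.012 (significant) and z = 1.0 gives roughly 0.32 (not significant).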
What is clinical significance
the difference observed is large enough to be clinically meaningful
in general, why are 2 sided tests used and not 1 sided tests
1 sided tests do not distinguish between ‘no effect’ and ‘a harmful effect’
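A rough sketch of why this matters, assuming a z statistic with a standard normal null distribution and a one-sided test that looks only for benefit (positive z). The z values and function names are hypothetical.

```python
import math

def upper_tail_p(z):
    """One-sided p-value: chance of a z at least this large under the null."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_sided_p(z):
    """Two-sided p-value: chance of a z at least this far from 0 in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))

z_no_effect = 0.0   # hypothetical: no difference observed
z_harmful = -2.5    # hypothetical: treatment group did clearly worse

# One-sided test: both cases are non-significant, so they look the same
print(round(upper_tail_p(z_no_effect), 3), round(upper_tail_p(z_harmful), 3))

# Two-sided test: the harmful result is flagged, the null result is not
print(round(two_sided_p(z_no_effect), 3), round(two_sided_p(z_harmful), 3))
```

The one-sided p-values are about 0.5 and 0.994, both above 0.05, whereas the two-sided p-value for the harmful result is about 0.012.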
does an increase or decrease in the SAMPLE SIZE bring the sample estimate closer to the population mean?
increase sample size = estimate tends to be closer to the population mean
does an increase or decrease in the SPREAD OF DATA bring the sample estimate closer to the population mean?
decrease spread of data = estimate tends to be closer to the population mean
decreasing the SD does what?
lower spread of data
increasing the SD does what?
higher spread of data