05 Confidence Intervals Flashcards
What is a point estimate?
A single value used to estimate a population parameter, such as a sample mean x̄ used to estimate μ, or a sample proportion p̂ used to estimate p
What are the best estimates of μ and p?
x̄ is the best estimate of μ, and p̂ is the best estimate of p
What is a confidence interval?
A range of values that the true population parameter (μ or p) is likely to lie in
What is a confidence level, C?
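The long-run proportion of confidence intervals, each built the same way from a fresh random sample, that contain the population parameter. Informally, it is how confident we are that our interval contains the parameter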
What is a margin of error, ME?
The largest likely distance between the statistic and the parameter at confidence level C, so that statistic ± ME = the confidence interval for the parameter
How do we calculate ME?
- ME = (critical value)⋅(standard deviation of statistic)
- Critical values are found from the t-distribution table
- Formulæ for the standard deviations of the statistics are on the formula sheet
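The ME calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming a confidence interval for a mean from a large sample; the summary values (n, x̄, s) are hypothetical, and the critical value 1.96 is the usual table value for C = 0.95.

```python
from math import sqrt

def margin_of_error(critical_value, sd_of_statistic):
    """ME = (critical value) * (standard deviation of the statistic)."""
    return critical_value * sd_of_statistic

# Hypothetical summary values, invented for illustration only.
n = 64          # sample size (large, so a z critical value is used)
x_bar = 50.0    # sample mean
s = 8.0         # sample standard deviation, used to estimate sigma
z_c = 1.96      # critical value read from the table for C = 0.95

sd_mean = s / sqrt(n)                  # standard deviation of x-bar: 8 / 8 = 1
me = margin_of_error(z_c, sd_mean)     # 1.96 * 1 = 1.96
interval = (x_bar - me, x_bar + me)    # statistic ± ME

print(f"ME = {me:.2f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```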
All the methods for calculating confidence intervals are only valid if…
the samples are random
How large is a large sample?
n ≥ 30
When can we use the normal distribution z-scores as critical values in confidence intervals for means?
If the population is normal or the sample is large
We don’t usually know σ, so…
We usually use s to estimate it
What are the steps to calculating a confidence interval?
- Write down the relevant values from the question
- Verify that the relevant conditions and assumptions are met
- Find the critical value from the tables
- Calculate the ME and confidence interval
- Write down what it means in the context of the question you are answering
How do you find the zc value if C is not on the bottom of the t-distribution table?
Sketch the normal distribution curve, and use the tables to find zc such that C of the data lies between ±zc
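The same lookup can be done in software instead of with the tables. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is made up for this example. If C of the data lies between ±zc, each tail holds (1 − C)/2, so the area to the left of +zc is (1 + C)/2.

```python
from statistics import NormalDist

def z_critical(C):
    """Return z_c such that a proportion C of the standard normal lies in [-z_c, z_c]."""
    # The central area is C, so each tail holds (1 - C) / 2
    # and the area to the left of +z_c is (1 + C) / 2.
    return NormalDist().inv_cdf((1 + C) / 2)

# 0.92 stands in for a confidence level that is not printed on the table.
for C in (0.90, 0.95, 0.99, 0.92):
    print(f"C = {C:.2f}  ->  z_c = {z_critical(C):.3f}")
```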
How do you find critical values for a small sample?
Use the t-distribution
How do you look up values on the t-distribution table?
By looking for degrees of freedom, df
What are degrees of freedom, df?
- The number of independent observations
- df = n-1
What happens to the t-distribution as df increases?
- the variability of the curve falls (it gets narrower in shape)
- its shape approaches the shape of the normal distribution
- when df = 29, the difference between the two is very small, so when n ≥ 30 we can use the normal distribution
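To see the convergence described above, the critical values of the two distributions can be compared numerically. This is a rough sketch using only the standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The helper names are invented for this example, and a statistics package would normally be used instead. It is only intended for modest df (the gamma function overflows for df above roughly 340).

```python
from math import gamma, sqrt, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x], plus 0.5 for the left half."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(C, df):
    """Return t_c such that a proportion C of the t-distribution lies in [-t_c, t_c]."""
    target = (1 + C) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):                 # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_c = NormalDist().inv_cdf(0.975)       # normal critical value for C = 0.95
for df in (2, 5, 10, 29, 100):
    print(f"df = {df:3d}: t_c = {t_critical(0.95, df):.3f}   (z_c = {z_c:.3f})")
```

The printed t critical values shrink towards the z value as df grows, which is why the table's bottom row can be used for large samples.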
What are the conditions/assumptions to verify to calculate confidence intervals for proportions?
- the sample is random
- N > 10n
- np > 10, nq > 10
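The numerical conditions above are easy to check in code before computing the interval. The sketch below is illustrative only: the survey figures are hypothetical, p̂ and q̂ stand in for the unknown p and q, and the standard deviation of p̂ is taken to be √(p̂q̂/n), which is assumed to match the formula sheet. Randomness of the sample cannot be checked from the numbers and has to be judged from how the data were collected.

```python
from math import sqrt

def proportion_conditions(n, p_hat, N):
    """Check the numerical conditions for a one-proportion confidence interval."""
    q_hat = 1 - p_hat
    return {
        "N > 10n": N > 10 * n,
        "np > 10": n * p_hat > 10,
        "nq > 10": n * q_hat > 10,
    }

# Hypothetical survey, invented for illustration only.
N = 50_000        # population size
n = 200           # sample size
successes = 84    # number in the sample with the characteristic
z_c = 1.96        # critical value for C = 0.95

p_hat = successes / n
checks = proportion_conditions(n, p_hat, N)
print(checks)

if all(checks.values()):
    me = z_c * sqrt(p_hat * (1 - p_hat) / n)   # ME = z_c * sqrt(p̂q̂/n)
    print(f"p_hat = {p_hat:.3f}, interval = ({p_hat - me:.3f}, {p_hat + me:.3f})")
```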
What are the conditions/assumptions to be verified to work out confidence intervals for the difference between two means?
- the samples must be random
- the samples must be independent
- the populations must be normal or n1 + n2 ≥ 40
How should you decide whether to use the t-distribution or normal distribution to calculate a confidence interval for the difference between two means?
- df is the smaller of n1-1 and n2-1
- if df ≤ 29, use the t-distribution
- if df > 29, use the normal distribution
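The decision rule above translates directly into a small function. This is only a sketch of the rule as stated on this card; the function name is made up for the example.

```python
def choose_distribution(n1, n2):
    """Pick the distribution for a two-sample interval for a difference of means.

    df is the smaller of n1 - 1 and n2 - 1; use t when df <= 29, otherwise normal.
    """
    df = min(n1 - 1, n2 - 1)
    return ("t", df) if df <= 29 else ("normal", df)

print(choose_distribution(12, 45))   # small first sample, so df = 11
print(choose_distribution(40, 35))   # both large, so df = 34
```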
What are the conditions/assumptions to be verified to calculate a confidence interval for the difference between two proportions?
- the samples are random
- the samples are independent
- N > 10n for both samples
- np̂ > 10 and nq̂ > 10 for both samples
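Once the conditions above hold for both samples, the interval itself can be sketched as follows. The counts are hypothetical, and the standard deviation of p̂1 − p̂2 is taken to be √(p̂1q̂1/n1 + p̂2q̂2/n2), which is assumed to match the formula sheet.

```python
from math import sqrt

def two_proportion_interval(x1, n1, x2, n2, z_c):
    """Confidence interval for p1 - p2 from two independent random samples."""
    p1, p2 = x1 / n1, x2 / n2
    # The counts of successes and failures must exceed 10 in both samples.
    for n, p in ((n1, p1), (n2, p2)):
        if not (n * p > 10 and n * (1 - p) > 10):
            raise ValueError("np > 10 and nq > 10 must hold for both samples")
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    me = z_c * se
    diff = p1 - p2
    return diff - me, diff + me

# Hypothetical counts, invented for illustration only.
lower, upper = two_proportion_interval(x1=120, n1=300, x2=90, n2=300, z_c=1.96)
print(f"95% interval for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

If the whole interval lies above (or below) zero, that suggests the two population proportions really do differ.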
Why is there a minimum sample size n necessary to get a usable confidence interval for any given confidence level C?
Confidence intervals which are too wide are useless, and the size of the interval narrows as n increases. The maximum useful confidence interval width therefore corresponds to a minimum necessary sample size n.
How can you calculate the minimum necessary sample size?
Rearrange the expression given for ME on the formula sheet to give an expression for n in terms of zc, ME, and σ, and then solve for n.
When calculating the minimum sample size n necessary, we do not know σ. What are the three common ways to estimate it?
- use values from previous studies
- for proportions, use the conservative value p = 0.5 (so q = 0.5) in √(pq/n), since this makes pq as large as possible
- use the rule of thumb σ = R/4, where R = the range of values we expect to find in the population
What must be remembered when calculating the minimum sample size?
Always round a decimal answer up. It is a minimum, so rounding down would give a sample that is too small.
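The sample-size calculation can be sketched in Python as below. It assumes the usual forms ME = zc·σ/√n for a mean and ME = zc·√(pq/n) for a proportion, rearranged to n = (zc·σ/ME)² and n = zc²·pq/ME². The function names and the example figures are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_mean(C, ME, sigma):
    """Smallest n with z_c * sigma / sqrt(n) <= ME, i.e. n >= (z_c * sigma / ME)^2."""
    z_c = NormalDist().inv_cdf((1 + C) / 2)
    return ceil((z_c * sigma / ME) ** 2)      # always round up

def min_sample_size_proportion(C, ME, p=0.5):
    """Smallest n with z_c * sqrt(pq/n) <= ME; p = 0.5 is the conservative choice."""
    z_c = NormalDist().inv_cdf((1 + C) / 2)
    return ceil(z_c ** 2 * p * (1 - p) / ME ** 2)

# Hypothetical example: expected range R = 40, so sigma is estimated as R / 4 = 10.
print(min_sample_size_mean(C=0.95, ME=2, sigma=40 / 4))   # 96.04 rounds up to 97
# Hypothetical example: proportion within 0.03 at 95% confidence.
print(min_sample_size_proportion(C=0.95, ME=0.03))        # 1067.07 rounds up to 1068
```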