Topic 5- Estimation And Confidence Intervals Flashcards
Estimator properties
Unbiased and efficient
Efficiency
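An estimator is more efficient when its sampling distribution is less dispersed. Less variance = more efficient.
Estimates clustered more tightly around the parameter are better.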
How to improve efficiency
Increase the sample size, as shown by the formula for the standard deviation of the sample mean (topic 4, last flashcard):
σ/√n.
As we increase n (sample size), standard deviation falls.
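A minimal Python sketch of the σ/√n relationship. The function name standard_error and the population SD of 10 are made up for illustration.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical population SD of 10: quadrupling n halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))  # 2.0, then 1.0, then 0.5
```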
Point estimate
The statistic, computed from sample information, that estimates a population parameter.
E.g. the max temperature tomorrow will be 15°C.
Confidence intervals
A range of values, built from sample data, within which the population parameter is expected to lie at a specified probability (the confidence level).
E.g. the max temperature will be between 13 and 17°C.
CI formula
Point estimate ± margin of error.
1. Confidence interval for a population mean with the population SD (σ) known.
2. Confidence interval for a population mean with the population SD (σ) unknown.
1. Sample mean (x bar) ± z-score × (σ/√n)
(σ/√n is the SD of the sample mean, remember!)
2. The population SD is unknown, so we use the sample standard deviation s instead, and the T STATISTIC, NOT the z-score: x bar ± t × (s/√n)
Steps to calculate a confidence interval
Decide on confidence level (level of risk)
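Find the z-score for the confidence level. (Split the confidence level equally either side of the mean: e.g. 0.95/2 = 0.475 on each side, which a table of areas from 0 to z gives as z = 1.96.) Recall z = (x bar - μ)/(σ/√n).
Calculate: substitute the values into the CI formula (shown earlier).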
Note: trade-off between the width of the interval and the confidence level
Increasing the confidence level makes the interval wider, so less informative.
E.g. 95% confidence (alpha = 0.05): x bar ± 1.96 × σ/√n
99% confidence (alpha = 0.01) has z-score 2.576: x bar ± 2.576 × σ/√n
± 2.576 gives a wider interval, but higher confidence that the population mean lies within it.
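The trade-off can be checked numerically. This is a rough sketch using only the Python standard library; the function name and the example figures (x bar = 50, σ = 10, n = 100) are invented for illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """CI for a population mean when the population SD (sigma) is known."""
    alpha = 1.0 - confidence
    # Two-sided interval: put alpha/2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: x bar = 50, sigma = 10, n = 100.
ci_95 = z_confidence_interval(50.0, 10.0, 100, 0.95)
ci_99 = z_confidence_interval(50.0, 10.0, 100, 0.99)
print(ci_95)
print(ci_99)
```

With these numbers the 95% interval is about 48.04 to 51.96 and the 99% interval about 47.42 to 52.58: the same data, but a wider range for the extra confidence.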
What is alpha
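Alpha (α) is the probability of making an error, i.e. the probability that the interval does not contain the population parameter. α = 1 - confidence level.
E.g. if alpha = 0.05, then for 5% of samples the population mean will not lie in the interval. That is a 95% confidence level.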
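One way to see what alpha means is a small simulation. The sketch below is illustrative only: the population (mean 100, SD 15), the sample size and the number of trials are all assumptions. It draws many samples and counts how often the 95% interval contains the true mean.

```python
import math
import random
from statistics import NormalDist

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mean 100, SD 15, samples of size 30.
print(coverage(100.0, 15.0, 30, 0.95, 2000))
```

The printed fraction should be close to 0.95, with the remaining share of roughly alpha = 0.05 being the intervals that miss.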
What happens when sigma is unknown
Use s (the sample standard deviation) instead, and USE THE T STATISTIC, not z!
T statistic
t = (sample mean - μ) / (s/√n)
CI for t distribution
Sample mean ± t × s/√n, where t is read from the t table using alpha and n - 1 degrees of freedom
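A short sketch of the t interval in Python. The data are made up, and the critical value is passed in by hand because the standard library has no t distribution: 2.262 is the usual table value for 95% confidence with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

def t_confidence_interval(sample, t_critical):
    """CI for a population mean when sigma is unknown (uses sample SD s)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation, n - 1 divisor
    margin = t_critical * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of n = 10, so d.f. = 9; t = 2.262 from a t table (95%).
sample = [12, 15, 14, 10, 13, 16, 11, 14, 13, 12]
print(t_confidence_interval(sample, 2.262))
```

For this sample x bar = 13 and s is about 1.83, giving an interval of roughly 11.69 to 14.31.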
T distribution features (4)
Mean=0
Standard deviation differs depending on the sample size n (there is a different t distribution for each n).
n - 1 = degrees of freedom (d.f.)
Flatter and more spread out (less clustered) than the standard normal.
T distribution table: the left column has the degrees of freedom. E.g. for a sample size of 10, we use 9 degrees of freedom (n - 1).
The top row has the level of significance.
1. What would the t value be for infinite d.f. with 95% confidence?
2. What would the t value be for infinite d.f. with 99% confidence?
1. 1.96 (same as the z value when alpha is 0.05 / 95% confidence)
2. 2.576 (same as the z value with 99% confidence)
What happens as DF (degrees of freedom) increases
What happens if n increases
The t distribution gets narrower and moves closer to the standard normal.
What is a high medium and low confidence level
High=99%
Med=95%
Low=90%
Next part: Confidence intervals for a population proportion
A population proportion is the fraction of a population with some attribute, e.g. 60% of McDonald's revenue comes through the drive-through.
Notations for population proportion vs sample proportion
Population proportion = π
Sample proportion = p
Sample proportion formula
p = x/n
x is the number of successes, n is the number sampled.
Confidence interval for a population proportion (π)
CI will be
p ± z × √(p(1 - p)/n)
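As a rough illustration of the proportion interval, here is a Python sketch. The function name and the survey figures (60 successes out of 100) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """CI for a population proportion: p +/- z * sqrt(p(1 - p)/n)."""
    p = successes / n  # sample proportion p = x/n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical survey: 60 successes out of 100 sampled, 95% confidence.
print(proportion_confidence_interval(60, 100, 0.95))
```

Here p = 0.6 and the interval comes out at about 0.504 to 0.696.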
How to work out the required sample size to estimate a population mean
n = (zσ/E)²
z is the standard normal value corresponding to the confidence level, e.g. 1.96 for 95%
E is the maximum allowable error (the margin of error, i.e. half the width of the confidence interval, not the full range)
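A small sketch of the sample size formula for a mean. The values σ = 20 and E = 5 are invented, and the result is rounded up because a fractional observation cannot be sampled.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Required n to estimate a mean: n = (z * sigma / E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / max_error) ** 2)

# Hypothetical: sigma = 20, allowable error E = 5, 95% confidence.
print(sample_size_for_mean(20.0, 5.0, 0.95))  # 61.46... rounds up to 62
```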
How to work out the required sample size to estimate a population proportion
n = π(1 - π) × (z/E)²
If no estimate of the population proportion is available, use π = 0.5 (this gives the largest, safest sample size).
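The same idea for a proportion, sketched in Python. The allowable error of 0.03 is an assumed figure. The second call shows why 0.5 is the cautious choice when π is unknown.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(pi, max_error, confidence):
    """Required n to estimate a proportion: n = pi(1 - pi)(z/E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(pi * (1 - pi) * (z / max_error) ** 2)

# Hypothetical: allowable error E = 0.03 at 95% confidence.
print(sample_size_for_proportion(0.5, 0.03, 0.95))  # no prior estimate of pi
print(sample_size_for_proportion(0.3, 0.03, 0.95))  # prior estimate of 0.3
```

With π = 0.5 the required sample is 1068. Any other value of π makes π(1 - π) smaller, so it needs fewer observations.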
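When do we use the finite population correction (FPC) factor
Use when the population is finite and not large, so the sample is a sizeable fraction of it (a common rule of thumb is n/N greater than 0.05).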
FPC factor formula, and how do we apply it to confidence intervals to reduce margin of error?
√((N - n)/(N - 1))
N is total population
n is sample size
Apply at the end of the CI formula, as a multiplier on the standard error:
x bar ± t × (s/√n) × √((N - n)/(N - 1)) (CI for unknown sigma, so we used s instead of it!)
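A brief sketch of how the correction shrinks the margin of error. All figures here are hypothetical: a population of 250, a sample of 40 with s = 12, and an approximate table value of t = 2.023 for 39 degrees of freedom at 95%.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor: sqrt((N - n)/(N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def margin_of_error(t_critical, s, n, population_size=None):
    """t * s / sqrt(n), multiplied by the FPC when a population size is given."""
    margin = t_critical * s / math.sqrt(n)
    if population_size is not None:
        margin *= fpc(population_size, n)
    return margin

# Hypothetical: N = 250, n = 40, s = 12, t = 2.023 (approx. table value, 39 d.f.).
print(fpc(250, 40))
print(margin_of_error(2.023, 12.0, 40))
print(margin_of_error(2.023, 12.0, 40, population_size=250))
```

The factor here is about 0.918, so the corrected margin is roughly 8% smaller than the uncorrected one.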
When to use Z and T scores
Z when σ is known
T when σ is unknown
This differs slightly in hypothesis testing, where you still use Z if the sample is large but σ is unknown!