Week 5 Flashcards
Point estimate
- Sample mean is a point estimate
- Represents a very precise statement (a single value)
- Its accuracy is unknown because of sampling error
Confidence intervals
- Sample means vary in a predictable way, so we can estimate the likelihood of the population mean being within a certain range
- To work this out, we need to have an idea of
- > the centre of the distribution (the population mean, estimated by the sample mean)
- > the spread of the distribution (which for confidence intervals is the standard error)
Standard error
the standard deviation divided by the square root of the number of observations: σ(x̄) = σ / √n
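A minimal sketch of the standard error calculation. The `scores` values and the function name `standard_error` are made up for illustration, and the sample SD is assumed to stand in for the population SD.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: SD divided by the square root of n."""
    sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    return sd / math.sqrt(len(sample))

# Hypothetical scores, for illustration only.
scores = [4, 8, 6, 5, 7, 6, 9, 3, 6, 6]
se = standard_error(scores)
print(f"n = {len(scores)}, SD = {statistics.stdev(scores):.3f}, SE = {se:.3f}")
```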
Representative samples
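- In a normally distributed population, 95% of scores are within about 2 SD of the mean (1.96 to be exact)
- In a sampling distribution of the mean, 95% of sample means are within about 2 standard errors of the population mean
- 68 / 95 / 99.7 rule: about 68%, 95% and 99.7% of values fall within 1, 2 and 3 SD of the mean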
- The sample must be representative
- The sample must contain at least 30 observations
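The 68 / 95 / 99.7 figures can be checked against the standard normal curve. This is a small sketch using Python's built-in `statistics.NormalDist`; the helper name `area_within` is an illustrative choice, not something from the notes.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The area within 2 SD comes out slightly above 95%, which is why 1.96 rather than 2 is used when an exact 95% is needed.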
Single sample logic
- What is the range of likely values of the population mean?
- 95% of sample means are within 2 standard errors of the population mean
- Therefore there is a 95% chance that my sample mean is within 2 standard errors of the actual population mean
- This means there is a 95% probability that the population mean is between two points
- > x̄ − 2σ(x̄) and x̄ + 2σ(x̄)
- Of all the potential sample means we could get from a population, 95% fall within 2 standard errors of the population mean, so 95% of the ranges built this way capture it. We call this range the 95% confidence interval.
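The step from "the sample mean is close to the population mean" to "the population mean is close to the sample mean" is a rearrangement of the same inequality. A sketch, writing μ for the population mean and σ_x̄ for the standard error σ(x̄), with the probability taken over repeated samples:

```latex
\begin{align*}
P\bigl(\mu - 2\sigma_{\bar{x}} \le \bar{x} \le \mu + 2\sigma_{\bar{x}}\bigr) &\approx 0.95 \\
\Longleftrightarrow\quad
P\bigl(\bar{x} - 2\sigma_{\bar{x}} \le \mu \le \bar{x} + 2\sigma_{\bar{x}}\bigr) &\approx 0.95
\end{align*}
```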
95% CI
- The 95% Confidence Interval (CI95) is a range of scores, centred on a sample mean, that contains the population mean 95 times out of 100 across repeated samples.
- On average, the 95% CI does not include the population mean 5 times out of 100
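The "95 times out of 100" claim can be illustrated with a simulation. This is only a sketch under stated assumptions: a normally distributed population with a made-up mean of 100 and SD of 15, samples of 30, and a known population SD used to compute the standard error.

```python
import math
import random

random.seed(5)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100, 15   # hypothetical population values
N, TRIALS = 30, 10_000

se = POP_SD / math.sqrt(N)   # standard error of the mean
hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    x_bar = sum(sample) / N
    # Does the interval x̄ ± 2·SE contain the population mean?
    if x_bar - 2 * se <= POP_MEAN <= x_bar + 2 * se:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of intervals contained the population mean")
```

The proportion should land close to 95%, with the remaining intervals missing the population mean.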
CI formulas
• 95% Confidence Interval = CI95 = x̄ ± 2σ(x̄) (2 is a rounding of z = 1.96)
• By the same logic:
• 68% Confidence Interval = CI68 = x̄ ± 1σ(x̄)
• 99.7% Confidence Interval = CI99.7 = x̄ ± 3σ(x̄)
• CI(p) = x̄ ± zσ(x̄)
• Where:
• p = probability you will include the population mean
• z = “z critical” = z score that borders the middle p % of scores in the standard normal distribution
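A minimal sketch of the general formula CI(p) = x̄ ± zσ(x̄), using `statistics.NormalDist` to look up z critical. The function names and the sample mean and standard error passed in are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def z_critical(p):
    """z score that borders the middle p (as a proportion) of the standard normal."""
    # The middle p leaves (1 - p) / 2 in each tail.
    return NormalDist().inv_cdf(1 - (1 - p) / 2)

def confidence_interval(x_bar, se, p):
    """CI(p) = x̄ ± z·σ(x̄), returned as (lower, upper)."""
    z = z_critical(p)
    return x_bar - z * se, x_bar + z * se

for p in (0.68, 0.95, 0.997):
    print(f"p = {p}: z = {z_critical(p):.3f}")

# Hypothetical sample mean and standard error.
lower, upper = confidence_interval(x_bar=50.0, se=2.0, p=0.95)
print(f"CI95 = [{lower:.2f}, {upper:.2f}]")
```

The z values come out close to 1, 2 and 3, which is where the rounded multipliers in the formulas come from.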
Margin of error
the critical z score multiplied by the standard error: z × σ(x̄)
Upper bound
- mean plus the margin of error
- the highest value of the range
Lower bound
- mean minus the margin of error
- the lowest value of the range
Reporting CIs
- as the mean ± the margin of error
- or as the lower boundary and upper boundary
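A short worked example tying together the margin of error, the two bounds and the two reporting styles. The mean, SD and n are invented summary statistics, and z = 1.96 is assumed for a 95% interval.

```python
import math

# Hypothetical summary statistics.
mean, sd, n = 72.0, 12.0, 36
z = 1.96  # z critical for 95% confidence

se = sd / math.sqrt(n)      # standard error = 12 / 6 = 2
margin = z * se             # margin of error
lower = mean - margin       # lower bound
upper = mean + margin       # upper bound

print(f"{mean:.2f} ± {margin:.2f}")            # mean ± margin of error
print(f"95% CI [{lower:.2f}, {upper:.2f}]")    # lower and upper boundary
```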
CI influence on z scores
- As the confidence level (C) increases, the precision of the estimate decreases (interval gets wider)
- As C increases, the value of z* increases
- > the probability of capturing the population mean increases as the range increases
CI influence on standard deviation
- As variation in the population goes down, precision increases (interval gets narrower)
- The more similar people are, the better we can predict the range of the mean
CI influence on n
- As sample size increases, precision increases (interval gets narrower)
- More people allow more precise intervals
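The three influences on interval width (confidence level, standard deviation and n) can be compared side by side. A sketch with invented numbers; `ci_width` is an illustrative helper that assumes a z-based interval.

```python
import math
from statistics import NormalDist

def ci_width(sd, n, confidence):
    """Full width of the interval: 2 × z × SD / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / math.sqrt(n)

baseline = ci_width(sd=10, n=25, confidence=0.95)
higher_confidence = ci_width(sd=10, n=25, confidence=0.99)  # wider
smaller_sd = ci_width(sd=5, n=25, confidence=0.95)          # narrower
larger_n = ci_width(sd=10, n=100, confidence=0.95)          # narrower

print(f"baseline (SD 10, n 25, 95%): {baseline:.2f}")
print(f"confidence raised to 99%:    {higher_confidence:.2f}")
print(f"SD halved to 5:              {smaller_sd:.2f}")
print(f"n quadrupled to 100:         {larger_n:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the width, the same effect as halving the SD.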
T scores
- If you have an infinite number of scores, the t-distribution is the z-distribution
- As sample size decreases, the degrees of freedom decrease and the distribution gets flatter and wider
- To calculate t, you need to know what your degrees of freedom are
- How extreme a t-value of 2.5 is depends on the degrees of freedom. With an n of 60, a value of 2.5 is quite extreme: there is only a small area under the curve to the right of the value. With an n of 5, we can be less confident of capturing the mean, so a value of 2.5 is less extreme, as there's a larger area left over under the curve.
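The area to the right of t = 2.5 can be compared for n = 5 and n = 60. This is a rough sketch that integrates the t-distribution's density numerically with the standard library rather than using a statistics package. It assumes degrees of freedom of n − 1 (the one-sample case), and the function names are illustrative.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))

def upper_tail_area(t, df, steps=2000):
    """Area to the right of t (t >= 0), via Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the curve is symmetric about 0

for n in (5, 60):
    df = n - 1  # assumed: one-sample degrees of freedom
    print(f"n = {n:>2} (df = {df:>2}): area right of 2.5 = {upper_tail_area(2.5, df):.4f}")
```

The tail area for n = 5 is several times larger than for n = 60, and with very large degrees of freedom it approaches the z-distribution's tail.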