10.2: Confidence Intervals and t-distribution Flashcards
What is a point estimate? How is it computed? Provide an example.
Point estimates are single sample values used to estimate population parameters.
Computation:
sample mean = sum of the sample values / sample size
The value generated is called the point estimate of the mean.
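Example: a minimal Python sketch of the computation above. The five sample values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical sample of five observations (illustrative values only).
sample = [4.0, 6.5, 5.0, 7.5, 2.0]

# Point estimate of the population mean: sum of the sample values / sample size.
point_estimate = sum(sample) / len(sample)

# statistics.mean performs the same calculation.
assert point_estimate == mean(sample)

print(point_estimate)  # 5.0
```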
What is student’s t-distribution and when is it used? How does it compare to the normal distribution?
Student’s t-distribution is a bell-shaped probability distribution that is symmetrical about its mean. It is used when constructing confidence intervals based on small samples (where n < 30) from normally distributed populations with unknown variance.
Compared to the normal distribution, the t-distribution is flatter with fatter tails.
What are the properties of student’s t-distribution?
- Symmetrical
- Defined by a single parameter, the degrees of freedom, which equals n - 1
- Has fatter tails than the normal distribution.
- As the degrees of freedom (and so the sample size) get larger, the shape of the t-distribution approaches a normal distribution.
What happens to the t-distribution when the degrees of freedom increase? What happens when the degrees of freedom increase without bound?
When the degrees of freedom increase, the centre becomes more peaked and the tails become thinner.
When the degrees of freedom increase without bound, the t-distribution converges to the standard normal distribution (z-distribution).
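The convergence can be checked numerically. The sketch below is one possible approach and not a standard library routine: the helper names t_pdf, t_cdf and t_critical are assumptions made for this example. It integrates the t density with Simpson's rule and finds the two-tailed 95% critical value by bisection, then compares it with the z value from statistics.NormalDist.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """CDF for x >= 0: 0.5 plus Simpson's-rule integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-tailed critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for df in (5, 10, 30, 120, 1000):
    print(f"df={df:>4}: t = {t_critical(0.95, df):.3f}   z = {z:.3f}")
```

The printed t values shrink toward the z value of about 1.960 as the degrees of freedom grow, which is the convergence described above. In practice a statistics library or a printed t-table would normally supply these values.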
What is degrees of freedom?
Degrees of freedom is the number of observations that are free to vary once the sample mean has been fixed. For a sample of size n it is calculated as n - 1.
What are fat tails an indication of?
Fat tails mean that more of the probability lies far from the centre of the distribution, so outliers (extreme observations) are more likely than under a normal distribution.
How are confidence intervals for a random variable that follows a t-distribution related to degrees of freedom?
For a given significance level, confidence intervals for a random variable that follows a t-distribution must be wider when there are fewer degrees of freedom (fatter tails) and narrower when there are more degrees of freedom (thinner tails).
What is a confidence interval?
A confidence interval is a range of values within which the actual value of a parameter is expected to lie with probability 1 - alpha. The probability 1 - alpha is referred to as the degree of confidence.
What is alpha?
Alpha is the level of significance for a confidence interval: the probability that the interval does not contain the actual value of the parameter.
How are confidence intervals constructed?
CIs are constructed by adding an appropriate value to, and subtracting it from, the point estimate.
Point estimate ± (reliability factor × standard error)
How is the confidence interval for the population mean calculated, given that the population has a normal distribution with a known variance?
With known variance and normal distribution, CI is calculated as:
Point estimate of the population mean ± z-reliability factor × (population standard deviation / square root of the sample size)
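Example: a short Python sketch of the known-variance case, assuming a hypothetical sample mean of 80, a known population standard deviation of 15 and a sample size of 36. The function name z_confidence_interval is made up for this example. The z-reliability factor comes from statistics.NormalDist in the standard library.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """CI for the population mean when the population variance is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor
    standard_error = sigma / math.sqrt(n)     # sigma / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: sample mean 80, known sigma 15, n = 36.
low, high = z_confidence_interval(sample_mean=80.0, sigma=15.0, n=36)
print(round(low, 2), round(high, 2))  # 75.1 84.9
```

Here the standard error is 15 / 6 = 2.5, so the 95% interval is roughly 80 ± 1.96 × 2.5.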
What is the reliability factor for 90% CI?
What is the reliability factor for 95% CI?
What is the reliability factor for 99% CI?
Reliability factor for 90% CI = 1.645 (significance level is 10%, 5% in each tail)
Reliability factor for 95% CI = 1.960 (significance level is 5%, 2.5% in each tail)
Reliability factor for 99% CI = 2.575 (significance level is 1%, 0.5% in each tail)
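These factors can be reproduced from the standard normal distribution. The sketch below uses Python's statistics.NormalDist, and the function name z_reliability_factor is an assumption for this example. The 99% factor computes to about 2.576. Tables commonly show it rounded to 2.575 or 2.58.

```python
from statistics import NormalDist

def z_reliability_factor(confidence):
    """z value that leaves alpha / 2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: {z_reliability_factor(confidence):.3f}")
# 90%: 1.645
# 95%: 1.960
# 99%: 2.576
```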
How is the confidence interval for the population mean calculated, given that the population has a normal distribution with an unknown variance?
With unknown variance and normal distribution, CI is calculated as:
Point estimate of the population mean ± t-reliability factor (for n - 1 degrees of freedom) × (sample standard deviation / square root of the sample size)
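Example: a minimal Python sketch of the unknown-variance case. The ten sample values are hypothetical, and the function name t_confidence_interval is made up for this example. The t-reliability factor is passed in as an argument, on the assumption that it has been read from a t-table: for 9 degrees of freedom and 95% confidence the tabulated value is 2.262.

```python
import math
from statistics import mean, stdev

def t_confidence_interval(sample, t_factor):
    """CI for the population mean using the sample standard deviation."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                      # sample std dev (divides by n - 1)
    margin = t_factor * s / math.sqrt(n)   # reliability factor x standard error
    return x_bar - margin, x_bar + margin

# Hypothetical sample, n = 10, so degrees of freedom = 9.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0, 12.0, 14.0]
T_95_DF9 = 2.262  # t-table value for 95% confidence, 9 degrees of freedom

low, high = t_confidence_interval(sample, T_95_DF9)
print(round(low, 3), round(high, 3))
```

With these values the sample mean is 13 and the interval is roughly 13 ± 1.306.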
How is the confidence interval created for a non-normal distribution?
If the sample size is less than 30 (n < 30), a reliable confidence interval cannot be constructed.
If the sample size is 30 or greater (n >= 30):
- variance is known, use z-statistic
- variance is unknown, use t-statistic
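The rules on these cards can be summarised as a small decision function. This Python sketch is only an illustration of the rules as stated here. The function name choose_statistic and its parameters are made up for this example.

```python
from typing import Optional

def choose_statistic(normal_population: bool, variance_known: bool, n: int) -> Optional[str]:
    """Return 'z', 't', or None when no reliable interval can be constructed."""
    # Non-normal population with a small sample: no statistic is available.
    if not normal_population and n < 30:
        return None
    # Otherwise a known variance points to z, an unknown variance to t.
    return "z" if variance_known else "t"

print(choose_statistic(normal_population=True, variance_known=False, n=12))   # t
print(choose_statistic(normal_population=False, variance_known=True, n=50))   # z
print(choose_statistic(normal_population=False, variance_known=False, n=20))  # None
```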
What are the two limitations to using a larger sample size?
- Larger sample sizes may contain observations from a different population, which can reduce the precision of population parameter estimates.
- Cost of using a larger sample should be weighed against the value of the increase in precision from the increase in sample size.