Khan Academy: Confidence Intervals Flashcards
How can we calculate confidence interval and margin of error for sample proportion problems?
Basically, we assume the sample mean approximates the population mean (for a proportion, that the sample proportion approximates the population proportion),
then we assume the sample SD approximates the population SD,
then we use this info to build the sampling distribution of the sample mean (for binary data, the sampling distribution of the sample proportion).
Finally we calculate the confidence interval from the confidence level. For a 90% level, for example, we find the range centered on the mean that covers an area of 0.9 under the sampling distribution, i.e. the range within which 90% of sample means fall (we use a z-table for this).
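Formula: Psample ± z* × SDsample/√n, where SDsample = √(Psample(1 − Psample)) and z* is the critical value for the confidence level. The term after the ± is the margin of error.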
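Note: if we know the population proportion (or population SD), we use it instead of the sample estimate to build the sampling distribution of the sample proportion. It is centered at Ppopulation with SD = SDpopulation/√n, where SDpopulation = √(Ppopulation(1 − Ppopulation)).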
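As a minimal sketch of the calculation on this card, the Python below computes a z interval and margin of error for a sample proportion. The function name `proportion_interval` and the poll result (280 supporters out of 500) are made up for illustration; the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (low, high, margin) for a z interval for a proportion."""
    p_hat = successes / n
    # z* cuts off the middle `confidence` area of the standard normal curve
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    # standard error: estimated SD of the sampling distribution of p_hat
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin, margin


# Hypothetical data: 280 of 500 sampled voters support the candidate
low, high, margin = proportion_interval(successes=280, n=500, confidence=0.90)
print(f"90% CI: ({low:.3f}, {high:.3f}), margin of error {margin:.3f}")
```

Raising the confidence level raises z* and so widens the interval; raising n shrinks the standard error and narrows it.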
What is this statement equivalent to:
There's a 95% probability that the sample proportion is within 2 SDs (of the sampling distribution) of the population proportion
There's a 95% probability that the population proportion is within 2 SDs of the sample proportion
What's the standard error of the sample proportion?
Since we usually don't know the population proportion, we plug a single sample proportion into the formula for the SD of the sampling distribution. The result, √(Psample(1 − Psample)/n), is an estimate of that SD and is called the standard error (SE).
A political pollster plans to ask a random sample of 500 voters whether or not they support the incumbent candidate. The pollster will take the results of the sample and construct a 90% confidence interval for the true proportion of all voters who support the candidate.
Which of the following is a correct interpretation of the 90% confidence level? Choose all answers that apply:
A If the pollster repeats this process and constructs 20 intervals from separate independent samples, we can expect about 18 of those intervals to contain the true proportion of voters who support the candidate.
B About 90% of people who support the candidate will respond to the poll.
C If the pollster repeats this process many times, then about 90% of the intervals produced will capture the true proportion of voters who support the candidate.
A and C
A The stated confidence level means that we can expect ≈90% of these intervals to contain the parameter of interest, and 18 of 20 is 90%.
B Confidence levels don’t tell us the response rate of a poll.
C Confidence levels tell us the long-term rate at which a certain type of confidence interval will successfully capture the parameter of interest.
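The long-run interpretation in answers A and C can be checked with a small simulation. This is only a sketch: the true proportion of 0.55, the number of repetitions, and the function name `coverage` are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(true_p, n, confidence, trials, seed=0):
    """Fraction of simulated z intervals that contain true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        # draw one random sample of n voters and count supporters
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


# Hypothetical setup: true support is 55%, each poll asks 500 voters
rate = coverage(true_p=0.55, n=500, confidence=0.90, trials=2000)
print(f"Share of intervals capturing the true proportion: {rate:.3f}")
```

The printed share should land close to 0.90, which is what the confidence level promises about the method, not about any single interval.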
A baseball coach was curious about the true mean speed of fastball pitches in his league. The coach recorded the speed in kilometers per hour of each fastball in a random sample of 100 pitches and constructed a 95% confidence interval for the mean speed. The resulting interval was (110, 120). Is the statement below correct?
If the coach took another sample of 100 pitches, there's a 95% chance the sample mean would be between 110 and 120 km/h
No. Confidence intervals give us plausible estimates for population parameters; they don't make estimates about upcoming values of sample statistics.
How could we narrow the confidence interval but keep the confidence level?
The width of a confidence interval depends on the SD of the sampling distribution of the mean, which is SDpopulation/√n. So by increasing the sample size we can narrow the interval while keeping the confidence level the same.
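To see the effect of n on the interval width, here is a short sketch that holds the confidence level at 95% and varies only the sample size. The SD of 25 and the helper name `margin_of_error` are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence=0.95):
    """Half-width of a z interval for a mean: z* times SD / sqrt(n)."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z_star * sd / sqrt(n)


# Hypothetical SD of 25; only the sample size changes
for n in (100, 400, 1600):
    print(n, round(margin_of_error(sd=25.0, n=n), 2))
```

Because of the √n in the denominator, quadrupling the sample size only halves the margin of error.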
What are the steps of finding the confidence interval of population mean, using a sample of size n?
1) calculate sample mean
2) calculate sample SD
3) assuming the sample is representative of the population, estimate:
mean of population ≈ sample mean
SD of population ≈ sample SD
4) use the estimated mean and SD of the population to build the sampling distribution of the mean, which has:
Mean of sampling distribution of mean ≈ sample mean
SD of sampling distribution of mean ≈ sample SD/√n
5) find the critical value for the chosen confidence level and report sample mean ± critical value × sample SD/√n
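The steps above can be sketched in Python as follows. The pitch speeds are made-up numbers, `mean_interval_z` is a hypothetical helper, and a z critical value is used, which is a reasonable simplification only for a fairly large sample such as n = 100.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_interval_z(sample, confidence=0.95):
    """z interval for a population mean, estimated from one sample."""
    n = len(sample)
    x_bar = mean(sample)      # step 1: sample mean
    s = stdev(sample)         # step 2: sample SD (n - 1 in the denominator)
    se = s / sqrt(n)          # steps 3-4: SD of the sampling distribution
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return x_bar - z_star * se, x_bar + z_star * se


# Hypothetical data: 100 made-up fastball speeds in km/h
speeds = [110 + (i * 7) % 11 for i in range(100)]
low, high = mean_interval_z(speeds, confidence=0.95)
print(f"95% CI for the mean speed: ({low:.2f}, {high:.2f})")
```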
What are the conditions of a sample used for calculating the confidence interval? (For proportion)
1) Random sampling
2) Normal condition: np and n(1-p) must both be at least 10, OR the sample must contain at least 10 successes and at least 10 failures
3) Independence condition: either we sample with replacement OR we use the 10% rule (if the sample size is less than 10% of the population, then it's OK to sample without replacement)
Breaking any of these rules creates an inaccurate confidence interval
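A rough checklist of these three conditions could be coded like this. The function `check_proportion_conditions` and its parameters are hypothetical, and random sampling cannot be verified from the numbers alone, so it is passed in as a flag.

```python
def check_proportion_conditions(successes, n, population_size=None,
                                random_sample=True, with_replacement=False):
    """Report which conditions for a proportion interval appear to hold."""
    failures = n - successes
    # Normal condition: at least 10 successes and at least 10 failures
    normal_ok = successes >= 10 and failures >= 10
    if with_replacement:
        independent_ok = True
    elif population_size is None:
        independent_ok = False  # 10% rule cannot be checked without N
    else:
        # 10% rule: sample is less than 10% of the population
        independent_ok = n < 0.10 * population_size
    return {"random": random_sample,
            "normal": normal_ok,
            "independence": independent_ok}


# Hypothetical poll: 280 supporters among 500 voters from a city of 100,000
print(check_proportion_conditions(280, 500, population_size=100_000))
```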
What is the critical value of z and how can we calculate it for a given confidence level?
A confidence interval covers an area in the middle of the sampling distribution, with cut-offs that are symmetric about the center. For 95% confidence, for example, that is the area within about μ ± 2 SD. In z-score terms (SD = 1) the cut-offs are about ±2 (more precisely ±1.96), so the critical value z* is about 2. To calculate it for a confidence level C, look up in the z-table the z-score that leaves an area of (1 − C)/2 in each tail.
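Instead of a z-table, the critical value can be computed from the inverse of the standard normal CDF. A small sketch using the standard library (`z_critical` is a name chosen here for illustration):

```python
from statistics import NormalDist


def z_critical(confidence):
    """z* such that the area between -z* and +z* equals `confidence`."""
    # the upper cut-off has area 0.5 + confidence / 2 to its left
    return NormalDist().inv_cdf(0.5 + confidence / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```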
What does a 99% confidence interval mean?
If we repeatedly take samples of size n
and
repeatedly apply the confidence interval technique (population mean ≈ sample mean, population SD ≈ sample SD, then build the sampling distribution of the statistic; for example, the sampling distribution of the mean has mean = population mean ≈ sample mean and SD = population SD/√n ≈ sample SD/√n),
then roughly 99% of the intervals will contain the actual population parameter
How can we calculate the minimum sample size, using the confidence level and the margin of error?
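For a proportion, one common approach is to solve the margin of error formula for n, which gives n ≥ (z*/ME)² × p(1 − p), and to use p = 0.5 when no earlier estimate is available because that maximizes p(1 − p). The sketch below assumes that approach; `min_sample_size` and its defaults are illustrative names and values.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error for a proportion is <= margin."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    # rearranged from margin = z* * sqrt(p(1 - p) / n)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)  # always round up to a whole respondent


# Hypothetical target: margin of error of 3 percentage points at 95%
print(min_sample_size(margin=0.03, confidence=0.95))
```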
What are the conditions for a valid t-interval?
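Why is using a t-table preferred over a z-table when calculating a confidence interval for a mean?
When the population SD is unknown and we estimate it with the sample SD, using a z-table gives an interval that is too narrow, so it captures the true mean less often than the stated confidence level. The effect is strongest for small samples.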
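A small simulation can illustrate the problem with z-based intervals. It is a sketch under assumed conditions: a normal population with a made-up mean of 50 and SD of 10, and intervals built from the sample SD with a z critical value.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_interval_coverage(n, confidence=0.95, trials=5000, seed=1):
    """Share of z intervals (built with the sample SD) that contain mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    mu, sigma = 50.0, 10.0  # hypothetical population parameters
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        margin = z_star * stdev(sample) / sqrt(n)
        if abs(mean(sample) - mu) <= margin:
            hits += 1
    return hits / trials


small = z_interval_coverage(n=5)
large = z_interval_coverage(n=50)
print(f"n=5: {small:.3f}   n=50: {large:.3f}   (nominal 0.950)")
```

With n = 5 the coverage should fall clearly short of 95%, while with n = 50 it comes much closer, which is why the t critical value matters most for small samples.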
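What is the relationship between degrees of freedom and the t distribution?
The shape of the t distribution depends on the degrees of freedom (DF), so you get a different t distribution for each sample size: DF = sample size − 1. As DF grows, the t distribution gets closer to the normal (z) distribution.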