Khan Academy: Confidence Intervals Flashcards

Question 1

Q

How can we calculate confidence interval and margin of error for sample proportion problems?

Answer

A

The idea, step by step:
1) Assume the sample proportion p̂ approximates the population proportion p (for a mean: the sample mean approximates the population mean).
2) Estimate the SD of the sampling distribution with the standard error: SE = √(p̂(1−p̂)/n) (for a mean: sample SD/√n).
3) Use these to build the sampling distribution of the sample proportion (or of the sample mean), centered on the sample statistic.
4) Use the confidence level to find the critical value z* from a z-table. For 90%, for example, we want the central range of the sampling distribution where the area under the curve = 0.9.
________________________________________

Margin of error: z* × SE
Formula: p̂ ± z* × √(p̂(1−p̂)/n)

Note: if we know the population proportion p, the SD of the sampling distribution of the sample proportion is exactly √(p(1−p)/n). For a mean with known population SD σ, it is σ/√n.
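
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original cards: the function name and the poll numbers (270 of 500) are made up, and it assumes the conditions for a z interval for a proportion are met.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (low, high, margin) for a one-sample z interval for a proportion."""
    p_hat = successes / n
    # Standard error: estimated SD of the sampling distribution of p_hat.
    se = sqrt(p_hat * (1 - p_hat) / n)
    # Critical value z*: leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * se
    return p_hat - margin, p_hat + margin, margin


# Hypothetical poll: 270 of 500 sampled voters support the candidate.
low, high, margin = proportion_confidence_interval(270, 500, confidence=0.90)
print(f"90% CI: ({low:.3f}, {high:.3f}), margin of error {margin:.3f}")
```

The interval is centered on the sample proportion, and the margin of error is the critical value times the standard error.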

Question 2

Q

What is this statement equivalent to?
There’s a 95% probability that the sample proportion is within 2 SDs (of the sampling distribution) of the population proportion.

Answer

A

There’s a 95% probability that the population proportion is within 2 SDs (of the sampling distribution) of the sample proportion.

Question 3

Q

What’s the standard error of sample proportion?

Answer

A

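The standard error (SE) is our estimate of the SD of the sampling distribution of the sample proportion. The true SD is √(p(1−p)/n), but since we usually don’t know the population proportion p, we plug in the sample proportion p̂ from a single sample: SE = √(p̂(1−p̂)/n).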

Question 4

Q

A political pollster plans to ask a random sample of 500 voters whether or not they support the incumbent candidate. The pollster will take the results of the sample and construct a 90% confidence interval for the true proportion of all voters who support the candidate.
Which of the following is a correct interpretation of the 90% confidence level? Choose all answers that apply:

A If the pollster repeats this process and constructs 20 intervals from separate independent samples, we can expect about 18 of those intervals to contain the true proportion of voters who support the candidate.

B About 90% of people who support the candidate will respond to the poll.

C If the pollster repeats this process many times, then about 90% of the intervals produced will capture the true proportion of voters who support the candidate.

Answer

A

A and C
A The stated confidence level means that we can expect ≈90% of these intervals to contain the parameter of interest, and 18 of 20 is 90%.

B Confidence levels don’t tell us the response rate of a poll.

C Confidence levels tell us the long-term rate at which a certain type of confidence interval will successfully capture the parameter of interest.
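
The long-run interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the true proportion (0.55) is an invented value that a real pollster would not know, and the function name and seed are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(true_p, n, confidence, repetitions, seed=0):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    hits = 0
    for _ in range(repetitions):
        # Draw one random sample of n voters and count supporters.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / repetitions


# true_p = 0.55 is an assumed value; in a real poll it is unknown.
rate = coverage_rate(true_p=0.55, n=500, confidence=0.90, repetitions=2000)
print(f"Share of intervals capturing the true proportion: {rate:.3f}")
```

Each repetition produces a different interval; the confidence level describes how often the procedure succeeds, not any single interval.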

Question 5

Q

A baseball coach was curious about the true mean speed of fastball pitches in his league. The coach recorded the speed in kilometers per hour of each fastball in a random sample of 100 pitches and constructed a 95% confidence interval for the mean speed. The resulting interval was (110, 120). Is the statement below correct?
If the coach took another sample of 100 pitches, there’s a 95% chance the sample mean would be between 110 and 120 km/h.

Answer

A

No. Confidence intervals give us plausible estimates for population parameters; they don’t make estimates about upcoming values of sample statistics.

Question 6

Q

How could we narrow the confidence interval but keep the confidence level?

Answer

A

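The width of the confidence interval depends on the SD of the sampling distribution, and the SD of the sampling distribution of the mean is population SD/√n. So by increasing the sample size n, we can narrow the confidence interval while keeping the confidence level the same. Because of the √n, halving the width takes about four times the sample size.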
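
A quick numeric check of the effect of sample size on the margin of error, as a sketch: the SD of 25 is an arbitrary illustrative value, and the SD is treated as known so that a z critical value applies.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence=0.95):
    """Margin of error for a mean when the SD is treated as known (z interval)."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sd / sqrt(n)


# sd = 25 is an arbitrary illustrative value; the confidence level stays at 95%.
for n in (100, 400, 1600):
    print(n, round(margin_of_error(sd=25, n=n), 2))
```

Each fourfold increase in n cuts the margin of error in half, while the confidence level is unchanged.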

Question 7

Q

What are the steps of finding the confidence interval of population mean, using a sample of size n?

Answer

A

1) Calculate the sample mean
2) Calculate the sample SD
3) Assuming the sample is representative of the population, estimate:
Mean of population ≈ sample mean
SD of population ≈ sample SD
4) Use the estimated mean and SD of the population to build the sampling distribution of the mean. It has:
Mean of sampling distribution of mean ≈ sample mean
SD of sampling distribution of mean ≈ sample SD/√n
5) Find the critical value for the confidence level (t* with n−1 degrees of freedom, since we are using the sample SD) and calculate: sample mean ± t* × sample SD/√n
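
These steps, finished off by applying a critical value to get the interval, might look like this in Python. It is a sketch with made-up data: the pitch speeds are invented, and the critical value is passed in by hand because it is assumed to be looked up in a t-table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(sample, t_star):
    """t interval for a population mean; t_star comes from a t-table (df = n - 1)."""
    n = len(sample)
    x_bar = mean(sample)   # sample mean
    s = stdev(sample)      # sample SD (divides by n - 1)
    se = s / sqrt(n)       # estimated SD of the sampling distribution of the mean
    margin = t_star * se   # critical value times standard error
    return x_bar - margin, x_bar + margin


# Made-up pitch speeds in km/h; n = 10, so df = 9.
speeds = [112, 118, 115, 121, 109, 117, 114, 120, 116, 113]
# 2.262 is the t-table value for 95% confidence with df = 9.
low, high = mean_confidence_interval(speeds, t_star=2.262)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```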

Question 8

Q

What are the conditions of a sample used for calculating the confidence interval? (For proportion)

Answer

A

1) Random sampling
2) Normal condition: np and n(1−p) must both be at least 10. Since p is usually unknown, in practice we check that the sample has at least 10 successes and at least 10 failures
3) Independence condition: either sample with replacement OR use the 10% rule (if the sample size is less than 10% of the population, then it’s OK to sample without replacement)

Breaking any of these conditions can make the confidence interval inaccurate
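
The two numeric conditions can be checked mechanically. The helper below is a hypothetical sketch (its name, arguments, and the example numbers are invented for illustration); random sampling cannot be verified from the counts alone, so it is left out.

```python
def check_proportion_conditions(successes, n, population_size=None,
                                sampled_with_replacement=False):
    """Check the normal and independence conditions for a proportion interval.

    Random sampling is not checked here; it depends on how the data were collected.
    """
    failures = n - successes
    # Normal condition: at least 10 successes and at least 10 failures.
    normal_ok = successes >= 10 and failures >= 10
    if sampled_with_replacement:
        independence_ok = True
    elif population_size is not None:
        # 10% rule: the sample is less than 10% of the population.
        independence_ok = n < 0.10 * population_size
    else:
        independence_ok = False  # cannot confirm without the population size
    return {"normal": normal_ok, "independence": independence_ok}


# Hypothetical: 270 supporters among 500 voters sampled from 80,000 voters.
print(check_proportion_conditions(270, 500, population_size=80_000))
```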

Question 9

Q

What is the critical value of z, and how can we calculate it for a given confidence level?

Answer

A

A confidence interval covers a central area under the sampling distribution curve, with symmetrical cut-offs on both sides. For example, for 95% confidence we take the area under the curve for roughly μ ± 2 SD. On the standard normal curve (mean 0, SD 1) these cut-offs are the z-scores ±z*, and the positive one, z*, is the critical value: about 2 (more precisely 1.96) for 95%. To find z* for a confidence level C, look up the z-score with area (1 + C)/2 to its left in the z-table.
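
Instead of a printed z-table, the critical value can be computed with Python’s standard library. This is a small sketch; the function name is illustrative.

```python
from statistics import NormalDist


def critical_z(confidence):
    """z* such that the central `confidence` area of the standard normal lies within ±z*."""
    # Area to the left of z* is the central area plus one tail.
    return NormalDist().inv_cdf((1 + confidence) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_z(level):.3f}")
```

The results match the familiar table values of about 1.645, 1.960, and 2.576.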

Question 10

Q

What does a 99% confidence interval mean?

Answer

A

If we repeatedly take samples of size n
And
Repeatedly use the confidence interval technique (population mean ≈ sample mean, population SD ≈ sample SD, then build the sampling distribution of the statistic; for example, the sampling distribution of the mean has mean = population mean ≈ sample mean and SD = population SD/√n ≈ sample SD/√n)
Then roughly 99% of the intervals contain the actual population parameter

Question 11

Q

Questions to test confidence interval knowledge

Question 12

Q

How can we calculate the minimum sample size, using the confidence level and margin of error?
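
One standard approach for a proportion is to solve z* × √(p(1−p)/n) ≤ margin of error for n. The sketch below assumes that approach; the function name is illustrative, and p = 0.5 is used as the conservative guess when no prior estimate of the proportion is available.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(confidence, margin_of_error, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin_of_error."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # p_guess = 0.5 maximizes p(1-p), giving the most conservative (largest) n.
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)  # round up: rounding down would exceed the margin of error


print(min_sample_size(confidence=0.95, margin_of_error=0.03))
```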

Question 13

Q

What are the conditions for a valid t-interval?

Answer

A

1) Random sampling
2) Normal condition: the sampling distribution of the sample mean is approximately normal, which holds if any one of these is true:
a) sample size of at least 30
OR
b) the original (population) distribution is normal
OR
c) the sample is roughly symmetrical, with no outliers
3) Independence of individual observations OR 10% rule

Question 14

Q

Why is using a t-table preferred over a z-table for calculating a confidence interval?

Answer

A

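When we estimate the population SD with the sample SD, using the z-table underestimates the width of the confidence interval, so fewer intervals than the stated confidence level capture the true mean. The t-table gives larger critical values to compensate, especially for small samples.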
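
To see the size of the gap, compare z* with a few t* values. This is a sketch: the t* values are copied from a standard two-sided 95% t-table rather than computed, and the dictionary name is illustrative.

```python
from statistics import NormalDist

# Two-sided 95% critical t values copied from a standard t-table, keyed by df.
T_STAR_95 = {4: 2.776, 9: 2.262, 29: 2.045}

z_star = NormalDist().inv_cdf(0.975)  # about 1.96

for df, t_star in T_STAR_95.items():
    # Interval width is proportional to the critical value, so this is the
    # fraction by which the z interval is narrower than the t interval.
    shrink = 1 - z_star / t_star
    print(f"n = {df + 1}: t* = {t_star}, z* = {z_star:.3f}, "
          f"z interval is {shrink:.0%} narrower")
```

The gap is largest for small samples and shrinks as the sample size grows.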

Question 15

Q

What is the relationship between degrees of freedom and the t distribution?

Answer

A

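Degrees of freedom (DF) determine which t distribution to use: there is a different t distribution for each sample size, with DF = sample size − 1. As DF grows, the t distribution gets closer to the normal (z) distribution.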

Question 16

Q

How can we calculate the critical t value?

Answer

A

If we have the confidence level and the sample size, we can calculate the degrees of freedom (sample size − 1) and use the t-table to find the critical t value for the desired confidence level.

Question 17

Q

To find a confidence interval for a mean:
If we have the sample mean and population SD, we use ____
If we have the sample mean and sample SD, we use ____

Answer

A

Z-statistics
T-statistics

Question 18

Q

What is a matched pairs study design?

Answer

A

Each member of the sample receives both the control AND the treatment, so each member provides a pair of observations

Question 19

Q

How is paired data obtained? 2 methods

Answer

A

1) 2 observations of the same person, e.g. pre-course/post-course test score

2) Making an observation on each of two similar individuals, e.g. pairing similar subjects and giving one the medicine and the other a placebo

Question 20

Q

In paired data we typically want to calculate the ____

Answer

A

Difference
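
A minimal sketch of working with paired data, using invented pre-course and post-course scores: each pair is reduced to one difference, and the differences are then treated as a single sample.

```python
from statistics import mean, stdev

# Made-up pre-course and post-course scores for the same five students.
pre = [62, 70, 55, 81, 74]
post = [68, 75, 54, 88, 79]

# One difference per pair; the analysis then treats these as a single sample.
differences = [after - before for before, after in zip(pre, post)]
mean_diff = mean(differences)
sd_diff = stdev(differences)
print(differences, mean_diff, round(sd_diff, 2))
```

A one-sample t interval built from the mean and SD of these differences then gives a confidence interval for the mean difference.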

Question 21

Q

When do we use t-statistics?

Answer

A

If the population variance (σ²) is unknown and we estimate it from the sample

Question 22

Q

When do we use z-statistics?

Answer

A

For a mean, z-statistics can only be used if the population standard deviation is known. (For proportions, we use z-statistics once the normal condition is met.)