Chapter 18: Sampling Distribution Models Flashcards

Question 1

Q

Define ‘Sampling distribution model’.

Answer

A

Different random samples give different values for a statistic. The sampling distribution model shows the behaviour of the statistic over all the possible samples of the same size n.

Question 2

Q

Define ‘Sampling variability (sampling error)’.

Answer

A

The variability we expect to see from one random sample to another. It is sometimes called sampling error, but sampling variability is the better term.

Question 3

Q

Define ‘Sampling distribution model for a proportion’.

Answer

A

If assumptions of independence and random sampling are met, and we expect at least 10 successes and 10 failures, then the sampling distribution of the sample proportion, p̂, is modelled by a Normal model with a mean equal to the true proportion value, p, and a standard deviation equal to
sqrt( pq/n ), where q = 1 - p.
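As a rough illustration of the formula above, the following sketch computes the mean and standard deviation of the Normal model for a sample proportion. The function name `proportion_sampling_model` and the example values p = 0.5 and n = 100 are assumptions chosen for illustration, not part of the original card.

```python
import math

def proportion_sampling_model(p, n):
    """Return (mean, sd) of the Normal model for the sample proportion.

    Hypothetical helper: mean is p, sd is sqrt(p * q / n) with q = 1 - p.
    """
    q = 1 - p
    return p, math.sqrt(p * q / n)

# Illustrative values only: p = 0.5, n = 100.
mean, sd = proportion_sampling_model(0.5, 100)
print(f"mean = {mean}, sd = {sd:.4f}")
```

With these illustrative values the standard deviation comes out at about 0.05, so sample proportions would mostly fall within roughly 0.10 of the true proportion.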

Question 4

Q

Define ‘Sampling distribution model for a mean’.

Answer

A

If assumptions of independence and random sampling are met, and the sample size is large enough, the sampling distribution of the sample mean is modelled by a Normal model with a mean equal to the population mean, μ, and a standard deviation equal to
σ / sqrt(n).
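A minimal sketch of the same calculation for a sample mean, assuming the population standard deviation σ is known. The function name `mean_sampling_model` and the values μ = 100, σ = 15, n = 25 are hypothetical.

```python
import math

def mean_sampling_model(mu, sigma, n):
    """Return (mean, sd) of the Normal model for the sample mean.

    Hypothetical helper: mean is mu, sd is sigma / sqrt(n).
    """
    return mu, sigma / math.sqrt(n)

# Illustrative values only: mu = 100, sigma = 15, n = 25.
mean, sd = mean_sampling_model(mu=100.0, sigma=15.0, n=25)
print(f"mean = {mean}, sd = {sd}")
```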

Question 5

Q

Define ‘Central Limit Theorem (CLT)’.

Answer

A

States that the sampling distribution model of the sample mean (and proportion) from a random sample is approximately Normal for large n, regardless of the distribution of the population, as long as the observations are independent.

Question 6

Q

Usually the mean of a sampling distribution is the value of…?

Answer

A

The parameter estimated. I.e., for the sampling distribution of p̂ the mean is p, and for the sampling distribution of ȳ the mean is μ.

Question 7

Q

The sampling distribution of the mean is Normal, no matter what the underlying distribution of the data is, however…?

Answer

A

The CLT says that this happens in the limit, as the sample size grows. The Normal model applies sooner when sampling from a unimodal, symmetric population and more gradually when the population is very non-Normal.
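One way to see this gradual approach to Normality is a small simulation. The sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution, chosen here purely as an assumption) and measures the skewness of the resulting sample means. The function name, the number of repetitions, and the seed are all illustrative.

```python
import random
import statistics

def sample_mean_skewness(n, reps=4000, seed=0):
    """Estimate the skewness of the sampling distribution of the mean.

    Draws `reps` samples of size n from an exponential population
    (an assumed, very non-Normal example) and returns the skewness
    of the simulated sample means. Values near 0 suggest symmetry.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    centre = statistics.fmean(means)
    spread = statistics.pstdev(means)
    return statistics.fmean(((m - centre) / spread) ** 3 for m in means)

# Compare a tiny sample size with a moderate one.
for n in (2, 50):
    print(n, round(sample_mean_skewness(n), 2))
```

The skewness should be noticeably smaller for the larger sample size, which is what the CLT leads us to expect.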

Question 8

Q

Populations with a true proportion, p, close to 0 or 1 can be a problem. What happens in these cases?

Answer

A

When p is close to 0 the distribution is skewed to the right and when it is close to 1, it is skewed to the left.

Question 9

Q

What are the two assumptions needed to use the Normal sampling distribution model for a sample proportion?

Answer

A

The Independence Assumption (Randomization Condition)
The Sample Size Assumption (10% Condition - sample size must be no larger than 10% of population, and Success/Failure Condition - at least 10 successes and 10 failures)
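The two numeric conditions above can be checked mechanically, as in this sketch. The function name `proportion_conditions` and its arguments are hypothetical. The Randomization Condition cannot be verified by arithmetic and has to be judged from how the data were collected.

```python
def proportion_conditions(n, p, population_size):
    """Check the 10% Condition and the Success/Failure Condition.

    Hypothetical helper. Returns a dict of booleans; randomization
    is not checked here because it depends on the sampling design.
    """
    q = 1 - p
    return {
        # Sample is no larger than 10% of the population.
        "ten_percent": 10 * n <= population_size,
        # At least 10 expected successes and 10 expected failures.
        "success_failure": n * p >= 10 and n * q >= 10,
    }

# Illustrative values only.
print(proportion_conditions(n=200, p=0.5, population_size=10_000))
print(proportion_conditions(n=50, p=0.02, population_size=10_000))
```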

Question 10

Q

The Success/Failure Condition wants sufficient data. How much depends on p. Explain.

Answer

A

If p is near 0.5, we need a sample of only 20 or so. If p is only 0.01, however, we’d need a sample of 1000. How about if p was 0.99?
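The arithmetic behind these numbers can be sketched as follows: the smaller of p and 1 - p must still produce at least 10 expected cases. The function name `min_sample_size` is hypothetical, and exact fractions are used only to avoid floating-point rounding in the division.

```python
import math
from fractions import Fraction

def min_sample_size(p, threshold=10):
    """Smallest n with n*p >= threshold and n*(1 - p) >= threshold.

    Hypothetical helper. p is converted to an exact fraction via its
    decimal string so that values like 0.01 are handled exactly.
    """
    p_exact = Fraction(str(p))
    rarer = min(p_exact, 1 - p_exact)
    return math.ceil(threshold / rarer)

for p in (0.5, 0.01, 0.99):
    print(p, min_sample_size(p))
```

Because the condition is symmetric in successes and failures, p = 0.99 requires the same sample size as p = 0.01.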

Question 11

Q

What are the two assumptions needed to use the Normal sampling distribution model for a sample mean (CLT)?

Answer

A

The Independence Assumption (Randomization Condition)
The Sample Size Assumption (Large Enough Sample Condition - no fixed threshold is given; how large is enough likely depends mostly on the shape of the data distribution, i.e. skewed vs. Normal)

Question 12

Q

When we have categorical data, do we utilize sample mean or proportion? For quantitative data?

Answer

A

Categorical = Proportion
Quantitative = Mean

Question 13

Q

Sampling distributions arise because …?

Answer

A

Samples vary.

Question 14

Q

The CLT quantifies…?

Answer

A

Sampling error.

Question 15

Q

The denominator in the variability of sample means shows that it decreases as the sample size increases. What’s the catch?

Answer

A

The standard deviation of the sampling distribution declines only with the square root of the sample size, that is, in proportion to 1/sqrt(n) and not, for example, 1/n. This limits how much we can make a sample tell about the population.
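To make the square-root effect concrete, this sketch tabulates σ / sqrt(n) for a few sample sizes. The value σ = 10 and the sample sizes are assumptions chosen so the numbers are easy to read.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Illustrative values only: sigma = 10, each n is 4 times the last.
for n in (25, 100, 400):
    print(n, sd_of_mean(10.0, n))
```

Each fourfold increase in the sample size only halves the standard deviation, so cutting it by a factor of 10 would take 100 times as much data.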

Question 16

Q

What happens when we sample more than 10% of the population for sampling distributions of the mean?

Answer

A

Mean = The SD formula overestimates the true SD.

For the Proportion?
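The size of the overestimate for the mean can be sketched with the standard finite population correction, sqrt((N - n) / (N - 1)), which applies when sampling without replacement from a population of size N. This correction is not mentioned in the cards and is brought in here as an assumption for illustration; the function names and values are hypothetical. In standard theory the same factor also multiplies sqrt( pq/n ) for a proportion.

```python
import math

def fpc(population_size, n):
    """Finite population correction factor, sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - n) / (population_size - 1))

def sd_mean_without_replacement(sigma, n, population_size):
    """SD of the sample mean when sampling without replacement."""
    return sigma / math.sqrt(n) * fpc(population_size, n)

# Illustrative values only: N = 1000, sigma = 10.
for n in (10, 100, 500):
    uncorrected = 10.0 / math.sqrt(n)
    corrected = sd_mean_without_replacement(10.0, n, 1000)
    print(n, round(uncorrected, 3), round(corrected, 3))
```

When the sample is a small fraction of the population the factor is close to 1, which is why the 10% Condition lets us ignore it.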