Week 4 Conclusion: Estimating a Population p Flashcards
Statistical Inference: ____
NOTE: sometimes statistical inference isn’t appropriate. For example ____.
Sampling variability estimates ____
Sampling distribution describes ____
Sampling distribution in practice: instead of ____, we ____ and ____.
Note: ____ must be ____ ()
Even though ____.
Statistical Inference
An uncertain conclusion about a population parameter based on statistics.
Sampling variability estimates how close the statistic is likely to be to the true population parameter, i.e. how much the statistic varies from sample to sample.
Sampling distribution describes sampling variability.
Sampling distribution in practice
In real life, instead of taking many SRSs from the population, we take measurements on one sample of size n and bootstrap (resample from that sample).
Note: the original sample must be drawn in a representative way (SRS).
Even if we sample from the population in a representative way (SRS), there is still a chance that the sample is not representative of the population (chance variation between samples, or measurement error).
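Example: the idealized version of a sampling distribution (many SRSs from the population) can be simulated when the population is known. The sketch below is only an illustration: the population of 10,000 with p = 0.30, the sample size n = 100, and the function name are all made-up assumptions, not part of the course material.

```python
import random
import statistics


def empirical_sampling_distribution(population, n, num_samples, seed=0):
    """Draw many SRSs of size n and return the sample proportion of each one."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # An SRS is drawn WITHOUT replacement from the population.
        srs = rng.sample(population, n)
        proportions.append(sum(srs) / n)
    return proportions


# Hypothetical population: 10,000 individuals, 30% coded 1 ("success").
population = [1] * 3000 + [0] * 7000

sample_props = empirical_sampling_distribution(population, n=100, num_samples=2000)
center = statistics.mean(sample_props)
spread = statistics.stdev(sample_props)

print(f"mean of the sample proportions: {center:.3f}")
print(f"spread of the sample proportions (sampling variability): {spread:.3f}")
```

In practice the population is not available like this, which is why one sample plus resampling is used instead.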
Bootstrapping
A strategy that ____.
STEP 1: Draw many ____ of size ____ with ____ from the original sample.
→ Sampling with ____ to ensure ____.
STEP 2: ____
STEP 3: ____ to get bootstrap sampling distribution
Note: ____.
Empirical sampling distribution vs bootstrap sampling distribution
A strategy that estimates the sampling distribution using only the information in one sample from the population.
STEP 1: Draw many bootstrap samples of size n (same size as the original sample) with replacement from the original sample.
→ Sampling with replacement ensures that the bootstrap samples differ from one another, so the statistic varies across them.
Bootstrap sample: sample resampled from the original sample.
Sampling with replacement: selecting items from a dataset where each selected item is returned before the next selection, allowing duplicates.
STEP 2: Calculate the statistic for each bootstrap sample.
STEP 3: Summarize the values of the statistic for all bootstrap samples to get the bootstrap sampling distribution.
Note: Bootstrapping neither creates new data nor increases your sample size, so it does not give a closer estimate of the population parameter than the point estimate from the original sample (i.e. it does not reveal any bias or measurement error in the original sample).
But it can reflect the sampling variability in the statistic.
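Example: the three steps can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the original sample (32 successes out of n = 100), the 2,000 resamples, and the function name are hypothetical choices for illustration only.

```python
import random
import statistics


def bootstrap_proportions(sample, num_resamples=2000, seed=0):
    """Return the sample proportion of each bootstrap sample."""
    rng = random.Random(seed)
    n = len(sample)
    boot_props = []
    for _ in range(num_resamples):
        # STEP 1: resample n values WITH replacement from the original sample.
        resample = rng.choices(sample, k=n)
        # STEP 2: calculate the statistic (here, the proportion of 1s).
        boot_props.append(sum(resample) / n)
    return boot_props


# Hypothetical original sample: 32 successes (1) and 68 failures (0).
original_sample = [1] * 32 + [0] * 68

boot_props = bootstrap_proportions(original_sample)

# STEP 3: summarize the bootstrap statistics.
boot_center = statistics.mean(boot_props)
boot_spread = statistics.stdev(boot_props)

print(f"center of the bootstrap distribution: {boot_center:.3f}")
print(f"spread of the bootstrap distribution: {boot_spread:.3f}")
```

The bootstrap distribution is centered near the original sample's proportion (0.32 here), not near the unknown population p, which is the point of the note above: only the spread is informative.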
Empirical sampling distribution vs bootstrap sampling distribution
Estimating population p with a bootstrap confidence interval (CI)
Confidence level: ____
Interpreting the confidence level
A. We are __% confident that the population parameter is between __ and __.
B. There is a __% chance that the population parameter is between __ and __.
C. __% of samples will give a statistic between __ and __.
C:
A vs B: ____
()
Confidence interval: the middle __% of values of a sampling distribution (e.g. 95%)
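Example: taking "the middle 95% of values" of a bootstrap sampling distribution can be sketched as below. The data (32 successes out of 100), the number of resamples, and the simple rank-based cutoff rule are assumptions made for illustration; statistical software may use a slightly different percentile rule.

```python
import random


def percentile_interval(values, level=0.95):
    """Return the endpoints of the middle `level` share of the values."""
    ordered = sorted(values)
    m = len(ordered)
    # Cut off (1 - level) / 2 of the values in each tail.
    lower_index = round((1 - level) / 2 * m)
    upper_index = round((1 + level) / 2 * m) - 1
    return ordered[lower_index], ordered[upper_index]


# Hypothetical original sample: 32 successes (1) and 68 failures (0).
original_sample = [1] * 32 + [0] * 68
n = len(original_sample)

# Build a bootstrap sampling distribution of the sample proportion.
rng = random.Random(0)
boot_props = [sum(rng.choices(original_sample, k=n)) / n for _ in range(2000)]

lower, upper = percentile_interval(boot_props, level=0.95)
print(f"95% bootstrap CI for p: ({lower:.2f}, {upper:.2f})")
```

With these endpoints, statement A would read: we are 95% confident that the population p is between the lower and upper values printed.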
Interpreting the confidence interval for p
C is only true when the empirical sampling distribution is used.
A (confident) vs B (chance)
“confident” because we are confident about the method we used to get this interval.
not “chance” because the population parameter is a fixed value: the interval either includes it or it does not, and we don’t know which.
→ bootstrapping does not reflect how representative the original sample is (sampling bias or measurement error)