Reading Quiz 9 Flashcards
parameter
a number that describes the population
in statistical practice, value of a parameter
is not known bc can’t examine entire population
statistic
a number that can be computed from the sample data without making use of any unknown parameters
in practice, use statistic to
estimate unknown parameter
mean of a population
fixed parameter that is unknown when use sample for inference
μ
mean of a sample
average of the observations in the sample
x bar (x̅)
sample mean is an estimate of the mean μ of the underlying population
sampling variability
the value of a statistic varies in repeated sampling
not fatal!
population proportion
p
sample proportion
p̂
p hat
used to estimate unknown parameter p
sampling distribution of a statistic
the distribution of values taken by the statistic in all possible samples of the same size from the same population
statistic produced from a probability sample or randomized experiment
has sampling distribution that describes how statistic varies in repeated data production
sampling distribution answers question
what would happen if we repeated sample or experiment many times?
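The "repeat the sample many times" idea can be tried out in a short simulation. This is only a sketch under made-up assumptions: the population proportion 0.3, the sample size 100, the 2000 repeats, and the function name `simulate_p_hat` are all illustrative choices, not values from the reading.

```python
import random
import statistics

def simulate_p_hat(p, n, repeats, seed=0):
    """Draw `repeats` samples of size n and return the sample proportion of each."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    p_hats = []
    for _ in range(repeats):
        # Each individual has the characteristic with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Assumed values for illustration only.
p_hats = simulate_p_hat(p=0.3, n=100, repeats=2000)
print("center of the simulated p-hat values:", round(statistics.mean(p_hats), 3))
print("spread of the simulated p-hat values:", round(statistics.pstdev(p_hats), 3))
```

A histogram of `p_hats` would approximate the sampling distribution; it is not the exact distribution, because a simulation covers many samples but not all possible ones.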
formal statistical inference
based on sampling distributions of statistics
bias
means that the center of the sampling distribution is not equal to the true value of the parameter
sampling distributions allow us
to describe bias more precisely by speaking of the bias of a statistic rather than bias in sampling method
bias again
high variability
statistic as an estimator of a parameter may suffer from bias
variability of a statistic
described by the spread of its sampling distribution
spread is determined by sampling design and size of sample
larger samples give
smaller spread
as long as population is much larger than sample
10 times the size
spread of sampling distribution approximately same for any population size
sampling distribution of a sample proportion
choose SRS of size n from larger population with population proportion p having some characteristic of interest
let p̂ be the proportion of the sample having that characteristic
mean of sampling distribution of p̂
exactly p
also written as μ_p̂ = p
unbiased
a statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated
unbiased estimator
sample proportion p̂ is an unbiased estimator of p
standard deviation of sampling distribution of p̂
sqrt(pq/n), where q = 1 - p
standard deviation of p̂ gets smaller as the sample size n increases bc n appears in the denominator of the formula
p̂ is less variable in larger samples
p̂ less variable in
larger samples
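To see how the standard deviation of p̂ shrinks as n grows, the formula can be evaluated for a few sample sizes. A minimal sketch: the value p = 0.5, the sample sizes, and the helper name `sd_p_hat` are assumptions chosen for illustration.

```python
import math

def sd_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p*q/n), q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

# Assumed p = 0.5; each sample size is four times the previous one.
for n in (25, 100, 400, 1600):
    print(n, round(sd_p_hat(0.5, n), 4))
```

Because n sits under a square root, quadrupling the sample size cuts the standard deviation in half rather than to a quarter.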
important rule of thumb #1
use formula for standard deviation of p̂ only when population is at least 10 times as large as sample
when N greater than or equal to 10n
interested in sampling only when population is large enough to make taking a census impractical
important rule of thumb #2
use normal approximation to sampling distribution of p̂ for values of n and p that satisfy np greater than or equal to 10 and nq greater than or equal to 10
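Both rules of thumb are simple inequalities, so they can be checked mechanically before using the formulas. A rough sketch; the function name `check_conditions` and the example numbers are hypothetical.

```python
def check_conditions(n, p, N):
    """Check the two rules of thumb for a sample of size n from a population of size N."""
    q = 1 - p
    return {
        # Rule of thumb #1: population at least 10 times the sample.
        "sd_formula_ok": N >= 10 * n,
        # Rule of thumb #2: np >= 10 and nq >= 10.
        "normal_approx_ok": n * p >= 10 and n * q >= 10,
    }

# Assumed example: n = 100, p = 0.2, N = 10,000.
print(check_conditions(n=100, p=0.2, N=10_000))
```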
sampling distribution of a sample mean
draw an SRS of size n from a population that has the normal distribution with mean μ and standard deviation σ
mean of sampling distribution of xbar
exactly μ
also written as μ_xbar = μ
sample mean xbar
is an unbiased estimator of μ
standard deviation of sampling distribution of xbar
σ/(sqrt(n))
just like with proportions, should only use this equation when N greater than or equal to 10n
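The mean and standard deviation of xbar can be compared against a simulation. This is a sketch under assumed values: a normal population with μ = 100 and σ = 15, samples of size 25, and 4000 repeats are illustrative numbers only, and `simulate_x_bar` is a hypothetical helper.

```python
import math
import random
import statistics

def simulate_x_bar(mu, sigma, n, repeats, seed=2):
    """Return the sample mean of each of `repeats` samples of size n from a normal population."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

x_bars = simulate_x_bar(mu=100, sigma=15, n=25, repeats=4000)
theory_sd = 15 / math.sqrt(25)  # sigma / sqrt(n)

print("simulated mean of xbar:", round(statistics.fmean(x_bars), 2))
print("simulated sd of xbar:", round(statistics.pstdev(x_bars), 2))
print("sigma / sqrt(n):", theory_sd)
```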
three situations to consider when discussing shape of a sampling distribution of xbar
(sample means specifically, not sample proportions)
first situation
if the population has a normal distribution then the shape of the sampling distribution of xbar is also normal, regardless of sample size
second situation
if the population shape is non normal (or we aren’t told its shape) and there is a small n then the shape of the sampling distribution of xbar is similar to shape of the population
third situation
if the population shape is non normal (or we aren’t told its shape) and there is a large n with a finite standard deviation then the shape of the sampling distribution is approximately normal (definition of central limit theorem)
central limit theorem
if the population shape is non normal or we aren’t told its shape and there is a large n with a finite standard deviation then the shape of the sampling distribution is approximately normal
aka fundamental theorem of statistics
rule of thumb for large sample size
n greater than or equal to 30
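The second and third situations can be illustrated by sampling from a strongly right-skewed population and watching the skew of the sample means fade as n grows. A sketch only: the exponential population, the sample sizes 2 and 30, the 5000 repeats, and the helper names are all assumptions for illustration.

```python
import random
import statistics

def sample_means(n, repeats, seed=1):
    """Sample means from a right-skewed (exponential, mean 1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(repeats)
    ]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric, normal-looking shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

small_n = sample_means(n=2, repeats=5000)
large_n = sample_means(n=30, repeats=5000)

print("skewness of xbar with n = 2:", round(skewness(small_n), 2))
print("skewness of xbar with n = 30:", round(skewness(large_n), 2))
```

With small n the means still look like the skewed population; with n = 30 the skewness is much closer to 0, which is what the rule of thumb anticipates.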
A parameter is a number describing a ____; a statistic is a number describing a ____.
A. population, sample (Notice: parameter → population, statistic → sample)
What symbols are used in our book’s notation to represent a population mean, sample mean, sample proportion, and population proportion, respectively?
A. μ, x̄, p̂, and p.
Suppose you were to take a large number of samples (all the same size) from a population, compute the mean of each, and plot a histogram of the sample means that you obtain. This histogram would approximate the shape of the ______ ________ of x̄.
sampling distribution
The sampling distribution for a proportion or mean changes as the number in the sample increases: the mean of that sampling distribution (choose one: increases, stays the same, or decreases) and the variance of the sampling distribution (choose one: increases, stays the same, or decreases).
stays the same, decreases
If the mean of a sampling distribution is the true value of the parameter being estimated, we refer to the statistic used to estimate the parameter as being _____.
unbiased
True or False: if a statistic is unbiased, the value of the statistic computed from the sample equals the population parameter.
A. False. Samples vary. It’s only the mean of all possible samples that equals the population parameter for an unbiased statistic.
True or False: the variability of a statistic is very sensitive to the size of the population from which the samples are drawn.
A. False. The sample size is much more important than the population size.
An organization wants to sample with equal accuracy from each state of the USA. Would it make more sense to sample 2000 from each state, or 1% of each state?
A. 2000 from each state, because the sample size determines the accuracy, and you don’t need a greater sample with a higher population.
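The state-sampling answer can be checked numerically with the standard deviation formula for p̂. In this sketch the two state populations are hypothetical round numbers, p = 0.5 is an assumed proportion, and `sd_p_hat` is an illustrative helper.

```python
import math

def sd_p_hat(p, n):
    """Standard deviation of p-hat: sqrt(p*q/n), with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical round-number populations, for illustration only.
populations = {"small state": 600_000, "large state": 39_000_000}
p = 0.5  # assumed proportion

# Plan A: 2000 people from every state.
fixed_n = {name: sd_p_hat(p, 2000) for name in populations}
# Plan B: 1% of every state's population.
one_percent = {name: sd_p_hat(p, N // 100) for name, N in populations.items()}

print("2000 per state:", fixed_n)
print("1% per state:", one_percent)
```

Under plan A the spread is identical in every state; under plan B the large state gets a far more precise estimate than the small one, so accuracy is not equal.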
What are the mean and standard deviation of sample proportion?
A. The mean is p, and the standard deviation is sqrt(pq/n).
If the sample is a substantial fraction of the population, then the assumption of independence that leads to the binomial distribution is violated. How many times bigger should the population be than the sample, so that we don’t worry about this?
at least 10 times bigger
True or False: The standard deviation of the sampling distribution of a proportion is only approximately sqrt(pq/n); this approximation is most accurate when np ≥ 10 and nq ≥ 10.
A. False. The standard deviation of the sampling distribution of a proportion is always exactly sqrt(pq/n). But that distribution is approximately NORMAL when np and nq are both at least 10.
If you know the population proportion, how do you use the normal approximation to figure out the probability that the proportion obtained from a random sample of size n will be between two given values?
A. You use p and sqrt(pq/n) as the mean and standard deviation, and with these compute a z-score for each of the two values, then find the proportion of the normal curve between those two z-scores using the Standard Normal Probabilities Table. This is the probability that the sample proportion will fall between those values.
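The same z-score calculation can be done in code, with the standard library's `statistics.NormalDist` standing in for the printed table. A sketch with assumed numbers: p = 0.5, n = 100, and the bounds 0.45 and 0.55 are illustrative, and `prob_p_hat_between` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def prob_p_hat_between(p, n, low, high):
    """Normal-approximation probability that p-hat falls between low and high."""
    sd = math.sqrt(p * (1 - p) / n)   # standard deviation of p-hat
    z_low = (low - p) / sd            # z-score of the lower value
    z_high = (high - p) / sd          # z-score of the upper value
    std_normal = NormalDist()         # mean 0, standard deviation 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Assumed example: the bounds sit one standard deviation either side of p.
print(round(prob_p_hat_between(p=0.5, n=100, low=0.45, high=0.55), 4))
```

Here np = nq = 50, so the normal approximation rule of thumb is satisfied, and the result matches the familiar "about 68% within one standard deviation" figure.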
How do the sampling distributions of means compare with the distributions of individual observations? They are less _____ and more _____.
variable, normal
Suppose you have a population with mean μ and standard deviation σ. What are the mean and standard deviation of the sampling distribution for means with sample size n?
A. The mean of the sampling distribution is μ and the standard deviation is σ/sqrt(n).
Under what conditions will the sampling distribution of the mean have an exact normal distribution, no matter what the sample size is?
when the population is normally distributed
What does the central limit theorem tell us?
A. That as the sample size gets larger, the sampling distribution of the mean approaches the normal, regardless of the distribution of the population from which the observations are drawn.
True or False: suppose that income in a large country is not normally distributed, but is very skewed. The central limit theorem tells us that if we were to collect several very large samples and compute the mean income for each sample, those means would be approximately normally distributed, even though the incomes in the population are not normally distributed.
true
Why do you think the central limit theorem is so “central” to statistics?
A. Because it enables us to use normal probability calculations to answer questions about sample means even when population distributions are not normal. Those questions include the big idea of confidence intervals: how likely is the right answer to be between these two bounds. Thus the central limit theorem helps us say, “There’s x probability that the true mean of the population is between a and b.”
calculator for finding sample
MATH > PRB > 5: randInt(lower bound, upper bound, number of values); ask for one or two more values than the sample size bc randInt can repeat a number and duplicates can’t be allowed
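Off the calculator, the same label-picking step can be sketched in Python. This is an assumption-laden illustration: the labels 1 to 50, the sample size 5, and the helper name `draw_srs_labels` are made up. Unlike the calculator's random-integer command, `random.sample` draws without replacement, so no extra values are needed to cover duplicates.

```python
import random

def draw_srs_labels(lower, upper, k, seed=None):
    """Pick k distinct labels from lower..upper (inclusive), as for an SRS."""
    rng = random.Random(seed)
    # sample() never repeats an element, so every label is unique.
    return rng.sample(range(lower, upper + 1), k)

# Assumed example: population labeled 1 to 50, sample of size 5.
labels = draw_srs_labels(1, 50, 5, seed=3)
print(sorted(labels))
```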
population vs sample
population: μ, σ, p
sample: xbar, s, p̂
increase sample size
mean of distribution stays the same
spread of distribution decreases
simulation
not all possible combinations
need to include every possible combination in sample