Ch 13 - Binomial & Poisson distribution, Sampling distribution Flashcards
sample vs population
Population: the entire group of individuals in which we are interested but usually can’t assess directly.
Sample: the part of the population we actually examine and for which we do have data.
randomness comes from
picking the sample (the way the population is sampled)
parameter vs statistic
A parameter is a number summarizing the population. Parameters are usually unknown.
A statistic is a number summarizing a sample. We often use a statistic to estimate an unknown population parameter.
Law of large numbers
As the number of randomly drawn observations (n) in a sample increases,
- the mean of the sample (x̅) gets closer and closer to the population mean μ (quantitative variable).
- the sample proportion (p̂) gets closer and closer to the population proportion p (categorical variable).
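The law of large numbers can be watched in a small simulation. This is only a sketch: the population (a fair six-sided die with mean 3.5), the seed, and the helper name running_means are illustrative assumptions, not part of the original cards.

```python
import random

def running_means(draws):
    """Return the sample mean after each successive observation."""
    means, total = [], 0.0
    for i, x in enumerate(draws, start=1):
        total += x
        means.append(total / i)
    return means

rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed population: a fair six-sided die, so the population mean is 3.5.
rolls = [rng.randint(1, 6) for _ in range(100_000)]
means = running_means(rolls)

# The sample mean should drift toward 3.5 as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(means[n - 1], 3))
```

The early means can sit well away from 3.5; the later ones should settle close to it.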
The sampling distribution of a statistic is
the probability distribution of that statistic for samples of a given size n taken from a given population.
The law of large numbers describes _____
A sampling distribution describes _____
what would happen if we took samples of increasing size n.
what would happen if we took all possible random samples of a fixed size n
The mean of the sampling distribution of x̅ is
μ.
There is no tendency for a sample average to fall systematically above or below μ, even if the population distribution is skewed.
x̅ is an unbiased estimate of the population mean μ.
The standard deviation of the sampling distribution of x̅ is
σ/√n.
measures how much the sample statistic x̅ varies from sample to sample.
Averages are less variable than individual observations.
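A quick simulation can check both facts about the sampling distribution of x̅ (mean μ, standard deviation σ/√n). The population values μ = 50, σ = 10, the sample size n = 25, and the Normal population are assumed for illustration only.

```python
import math
import random
import statistics

rng = random.Random(1)            # fixed seed for repeatability
MU, SIGMA, N = 50.0, 10.0, 25     # assumed population mean, sd, and sample size

def sample_mean(n):
    """Mean of one random sample of size n from the assumed population."""
    return statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])

# Draw many samples of the same size and keep each sample mean.
means = [sample_mean(N) for _ in range(20_000)]

theory_sd = SIGMA / math.sqrt(N)          # sigma / sqrt(n)
observed_sd = statistics.stdev(means)     # spread of the simulated sample means
observed_mean = statistics.fmean(means)   # should be close to MU (unbiased)
print(theory_sd, round(observed_sd, 3), round(observed_mean, 3))
```

The simulated means vary far less than individual observations (σ = 10), which is the sense in which averages are less variable.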
When a variable in a population is Normally distributed ___
the sampling distribution of the sample mean x̅ is also Normally distributed.
When the sampling distribution is Normal, we can standardize the value of a sample mean x̅ to obtain a ____.
This ___ can then be used to find ____
z-score
z-score
areas under the sampling distribution from Table B.
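As a worked sketch of standardizing a sample mean, the z-score is z = (x̅ − μ) / (σ/√n). The numbers below (x̅ = 53, μ = 50, σ = 10, n = 25) are made up for illustration, and Python's statistics.NormalDist stands in for the printed table.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(xbar, mu, sigma, n):
    """Standardize a sample mean using the sd of its sampling distribution."""
    return (xbar - mu) / (sigma / math.sqrt(n))

# Assumed example values, not taken from the cards.
z = z_for_sample_mean(xbar=53, mu=50, sigma=10, n=25)

# Area under the standard Normal curve to the left of z.
area_below = NormalDist().cdf(z)
print(z, round(area_below, 4))  # 1.5 0.9332
```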
For the sampling distribution of x̅, σ/√n is its standard deviation (indicative of _____).
spread
Central limit theorem: When
randomly sampling from any population with mean μ and standard deviation σ, if n is large enough, the sampling distribution of x̅ is approximately Normal: N(μ, σ/√n).
How large a sample size for CLT
It depends on the population distribution. More observations are required if the population distribution is far from Normal.
- A sample size of 25 or more is generally enough to obtain an approximately Normal sampling distribution from a skewed population, even with mild outliers in the sample.
- A sample size of 40 or more will typically be good enough to overcome an extremely skewed population and mild (but not extreme) outliers in the sample.
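The effect of sample size can be sketched by simulation. Here the population is assumed to be exponential (strongly right-skewed), and skewness of the simulated sample means is used as a rough gauge of how far the sampling distribution is from Normal; both choices, and the helper names, are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed for repeatability

def skewness(values):
    """Sample skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps=5_000):
    """Means of `reps` random samples of size n from an exponential population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

# Skewness near 0 suggests a roughly symmetric, more Normal-looking shape.
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 40)}
for n, sk in skew_by_n.items():
    print(n, round(sk, 2))
```

With n = 1 the sample means are just individual observations and stay heavily skewed; by n = 40 the skewness should be much closer to 0.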
How do we know if the population is Normal or not?
If the population is much larger than the sample, the count X of successes in an SRS of size n has approximately the
binomial distribution B(n, p) with mean μ = np and standard deviation σ = √(np(1 − p)).
If n is large, and p is not too close to 0 or 1, this binomial distribution can be approximated by
the Normal distribution N(np, √(np(1 − p))).
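A small comparison of the exact binomial probability with its Normal approximation, using the standard binomial mean np and standard deviation √(np(1 − p)). The values n = 100, p = 0.5, k = 55 and the helper binomial_cdf are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, k = 100, 0.5, 55                 # assumed example values
mu = n * p                             # binomial mean np
sigma = math.sqrt(n * p * (1 - p))     # binomial sd sqrt(np(1 - p))

exact = binomial_cdf(k, n, p)          # exact binomial probability
approx = NormalDist(mu, sigma).cdf(k)  # Normal approximation, no continuity correction
print(mu, sigma, round(exact, 4), round(approx, 4))
```

The two probabilities should agree to within a few hundredths here; the gap mostly reflects approximating a discrete count with a continuous curve.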
When randomly sampling from a population with proportion p of successes, the sampling distribution of the sample proportion p̂ [“p hat”] has mean and standard deviation: μ = p and σ = √(p(1 − p)/n).
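The standard results for p̂ = X/n with X binomial are mean p and standard deviation √(p(1 − p)/n). The sketch below checks them against the exact distribution of p̂; the values n = 40 and p = 0.3 and the helper phat_distribution are illustrative assumptions.

```python
import math

def phat_distribution(n, p):
    """Exact distribution of p-hat = X/n as (value, probability) pairs."""
    return [
        (x / n, math.comb(n, x) * p**x * (1 - p) ** (n - x))
        for x in range(n + 1)
    ]

n, p = 40, 0.3  # assumed example values
dist = phat_distribution(n, p)

# Mean and sd computed directly from the exact distribution.
mean = sum(v * pr for v, pr in dist)
sd = math.sqrt(sum((v - mean) ** 2 * pr for v, pr in dist))

formula_sd = math.sqrt(p * (1 - p) / n)
print(round(mean, 6), round(sd, 6), round(formula_sd, 6))
```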
p̂ is an unbiased estimator of the population proportion p if
expected value = true value of the parameter
The sampling distribution of p̂ is never exactly Normal. But as
the sample size increases, the sampling distribution of p̂ becomes approximately Normal.
The Normal approximation is most accurate for any fixed n when
p is close to 0.5, and least accurate when p is near 0 or near 1.
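One way to see this is to measure, for a fixed n, the largest gap between the exact binomial probabilities and the Normal approximation at several values of p. This is a sketch under assumptions: n = 50, the chosen p values, the helper max_normal_error, and the continuity correction (evaluating the Normal curve at k + 0.5) are all illustrative choices, not from the cards.

```python
import math
from statistics import NormalDist

def max_normal_error(n, p):
    """Largest |P(X <= k) - Normal cdf at k + 0.5| over k = 0..n for X ~ B(n, p)."""
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))
    normal = NormalDist(mu, sigma)
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)  # running exact cdf
        worst = max(worst, abs(cdf - normal.cdf(k + 0.5)))
    return worst

n = 50  # assumed fixed sample size
errors = {p: max_normal_error(n, p) for p in (0.05, 0.2, 0.5)}
for p, err in errors.items():
    print(p, round(err, 4))
```

The error should be smallest at p = 0.5 and grow as p moves toward 0; by symmetry the same holds as p moves toward 1.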
Normal Approximation
Binomial Vs Normal