stats 4 Flashcards
Population
every possible relevant case
-Every person or example of the thing you want to study.
Census
an accounting of a population
Even when we go all out to get information about every U.S. citizen in the Census, we still have
lots of nonrespondents.
Instead, what political scientists do is to get a
sample that is representative of the
population they want to study.
Sample
a subset of cases that is drawn from
an underlying population
You cannot observe every case, so instead researchers
sample a portion of those
cases.
To get a representative sample, researchers use
random sampling
Random sample
a sample such that each member of the underlying population has an equal probability of being selected.
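A minimal sketch of drawing a random sample, assuming a made-up population of 1,000 ID numbers (the population, seed, and sample size are illustrative choices, not from the course material):

```python
import random

# Hypothetical population: ID numbers for 1,000 people (illustrative only).
population = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# random.sample draws without replacement, and every member of the
# population has the same chance of ending up in the sample.
sample = rng.sample(population, k=50)

print(len(sample), sorted(sample)[:5])
```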
Convenience sample
a sample such that each member of the underlying population does NOT necessarily have an equal probability of being selected.
If we only have a sample, how can we make inferences about the entire population?
We can use statistical inference
Statistical inference
the process of using what we
know about a sample to make probabilistic statements about the broader population.
A sample will not tell us with
perfect precision what
the statistical data (e.g., mean and standard deviation) will be for a larger population.
It will, however, not only give us a good
estimate of those metrics, but also tell us how likely it is that those estimates accurately represent the population.
The more cases we sample from the population, the closer we will get to accurately estimating the true population
parameters
Parameters
Parameters are numerical values that
describe certain characteristics or features of an entire population, such as the mean, variance, or proportion. (The corresponding values calculated from a sample are called statistics.)
Central limit theorem
a fundamental result from
statistics indicating that if one were to collect an infinite number of random samples and plot the resulting sample means, those sample means would be distributed normally around the true population mean
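The idea can be checked with a small simulation. This is only a sketch under stated assumptions: the population is a made-up, non-normal (uniform) set of values, and 2,000 samples of size 50 stand in for the "infinite number" of samples in the definition.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Hypothetical, clearly non-normal population: uniform values from 0 to 100.
population = [rng.uniform(0, 100) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def sample_mean(n):
    """Mean of one random sample of size n drawn from the population."""
    return statistics.fmean(rng.sample(population, n))

# A finite stand-in for "an infinite number of random samples".
sample_means = [sample_mean(50) for _ in range(2_000)]

mean_of_means = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

# If the sample means are roughly normal, about 68% of them should fall
# within one standard deviation of their center.
share_within_one_sd = sum(
    abs(m - mean_of_means) <= spread for m in sample_means
) / len(sample_means)

print(round(population_mean, 2), round(mean_of_means, 2), round(share_within_one_sd, 3))
```

Even though the individual values are uniform rather than bell-shaped, the sample means cluster symmetrically around the population mean.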
Distribution
a mathematical function that describes the
probabilities of different outcomes in a random variable
or set of data
A statistical distribution shows how often
different outcomes or values occur in a set of data
Different data generating processes create different
types of distributions.
Data generating process
the underlying mechanism or
model that describes how data is produced and collected
Outcomes are considered independent if
the occurrence of one outcome does not affect the probability of the occurrence of another outcome.
Independent outcomes
an outcome whose occurrence is not influenced by the outcome of another event.
Normal distribution
a bell-shaped statistical distribution that can be entirely characterized by its mean and standard deviation.
Normal distribution
= bell curve = Gaussian distribution
the normal distribution has two mathematical
properties that make it extra-special
- The normal distribution is symmetrical around its mean, so the mean, median, and mode are all equal to one another.
- The normal distribution has a predictable area under the curve within specified distances of the mean.
standard deviation numbers
- One standard deviation in each direction captures 68.3% of the area under the curve.
- Two standard deviations in each direction capture 95.4% of the area under the curve.
- Three standard deviations in each direction capture 99.7% of the area under the curve.
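These areas can be computed directly from the normal curve. A short sketch using Python's standard-library `statistics.NormalDist` (the helper name `area_within` is made up for illustration):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Areas for 1, 2, and 3 standard deviations: roughly 0.683, 0.954, 0.997.
areas = {k: area_within(k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(k, round(area, 3))
```

Because any normal distribution is just a shifted and rescaled standard normal, the same areas hold for every mean and standard deviation.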
As long as a distribution is normal, it follows the...
68-95-99.7 rule
This rule does not apply to other distributions like the
Bernoulli or the uniform distributions
The mean of all the sample means will be the
mean of the entire population!
Sampling distribution
a hypothetical distribution of
sample means
Standard error (of the mean)
the standard deviation of the sampling distribution of sample means.
-It is the measure of the variability or dispersion of
sample means around the population mean
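In practice the standard error is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. A minimal sketch, using made-up measurements purely for illustration:

```python
import math
import statistics

# Hypothetical sample of 8 measurements (made-up numbers).
sample = [4.0, 7.0, 5.0, 6.0, 8.0, 5.0, 6.0, 7.0]

n = len(sample)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)  # estimated standard error of the mean

print(sample_mean, round(standard_error, 3))
```

Because the sample size sits under a square root in the denominator, quadrupling the sample size only halves the standard error.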
The normal distribution’s 68-95-99.7 rule allows us to
calculate
confidence intervals.
Confidence intervals
a probabilistic statement about the likely value of a population characteristic based on the observations in a sample.
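A rough sketch of a 95% confidence interval for a mean, assuming the sampling distribution is approximately normal. The summary numbers (mean 52.0, standard deviation 10.0, 400 cases) are hypothetical, and small samples would normally call for a t distribution instead of the normal one used here.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics from one sample (illustrative values).
sample_mean = 52.0
sample_sd = 10.0
n = 400

standard_error = sample_sd / math.sqrt(n)

# Number of standard errors that leaves 2.5% in each tail (about 1.96,
# which is where the "two standard deviations, roughly 95%" rule comes from).
z = NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(round(lower, 2), round(upper, 2))
```

The interval is the sample mean plus or minus about two standard errors, so a larger sample (smaller standard error) gives a narrower interval.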