Chapter 6 Flashcards
population
A group that includes all the cases (individuals, objects, or groups) in which the researcher is
interested.
Through the process of sampling—selecting a subset of
observations from the population—we attempt to…
generalize the characteristics of the larger
group (population) based on what we learn from the smaller group (the sample). This is the
basis of inferential statistics—making predictions or inferences about a population from
observations based on a sample. Thus, it is important how we select our sample.
parameter (definition and example)
The term parameter, associated with the population, refers to measures used to describe the
population we are interested in. For instance, the average commuting time for the 15,000
commuter students on your campus is a population parameter because it refers to a
population characteristic.
In previous chapters, we have learned the many ways of
describing a distribution, such as a proportion, a mean, or a standard deviation. When used
to describe the population distribution, these measures are referred to as…
parameters. Thus,
a population mean, a population proportion, and a population standard deviation are all
parameters.
We use the term statistic when referring to…
a corresponding characteristic calculated for the
sample. For example, the average commuting time for a sample of commuter students is a
sample statistic. Similarly, a sample mean, a sample proportion, and a sample standard
deviation are all statistics.
Thus, the major objective of sampling theory and
statistical inference is to provide
estimates of unknown parameters from sample statistics
that can be easily obtained and calculated.
probability
A quantitative measure of the likelihood that a particular event will occur.
Probability sampling is a method that enables the researcher to…
specify for each case in the
population the probability of its inclusion in the sample
The purpose of probability
sampling is to
select a sample that is as representative as possible of the population.
In probability sampling, the sample is selected in such a way as to allow…
(What does a probability sampling design let the researcher do?)
the use of the principles of probability to
evaluate the generalizations made from the sample to the population. A probability sample
design enables the researcher to estimate the extent to which the findings based on one
sample are likely to differ from what would be found by studying the entire population
Although accurate estimates of sampling error can be made only from probability samples,
social scientists often use nonprobability samples because
they are more convenient and
cheaper to collect. Nonprobability samples are useful under many circumstances for a
variety of research purposes
limitation of nonprobability samples
Their main limitation is that they do not allow the use of the
method of inferential statistics to generalize from the sample to the population. Because
through the rest of this text we deal only with inferential statistics, we will not review
nonprobability sampling
three sampling
designs that follow the principles of probability sampling:
(1) the simple random sample,
(2) the systematic random sample, and (3) the stratified random sample.
Simple random sample
A sample designed in such a way as to ensure that (a) every member of the
population has an equal chance of being chosen and (b) every combination of N members has an equal
chance of being chosen.
The sample is a
simple random sample because (in the hospital example)…
(1) every hospital had the same chance of being selected as a
member of our sample of two and (2) every combination of (N = 2) hospitals was equally
likely to be chosen.
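A minimal Python sketch of the two conditions, for checking the idea by hand. The four hospital labels are made up for illustration (these notes do not list the hospitals); only the sample size of two comes from the card.

```python
import random
from itertools import combinations

# Hypothetical population of four hospitals (labels are placeholders).
hospitals = ["A", "B", "C", "D"]
sample_size = 2  # N = 2, as in the hospital example

# Every possible sample of two hospitals.
all_samples = list(combinations(hospitals, sample_size))

# Condition (b): under simple random sampling each combination is equally likely.
prob_each_sample = 1 / len(all_samples)

# Condition (a): each hospital appears in the same share of the possible samples.
inclusion_prob = {
    h: sum(h in s for s in all_samples) / len(all_samples) for h in hospitals
}

# Draw one simple random sample (without replacement).
rng = random.Random(42)  # fixed seed so the draw is repeatable
sample = rng.sample(hospitals, sample_size)

print(all_samples)
print(inclusion_prob)
print(sample)
```

With four hospitals there are six possible pairs, and each hospital sits in three of them, so every hospital has the same chance of ending up in the sample.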
Systematic random sampling
A method of sampling in which every Kth member (K is a ratio obtained by
dividing the population size by the desired sample size) in the total population is chosen for inclusion in the
sample after the first member of the sample is selected at random from among the first K members in the
population.
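A short Python sketch of the definition above. The population of 100 numbered cases and the helper name `systematic_sample` are assumptions for illustration, and K is rounded down with integer division, which the notes do not specify.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Pick every Kth member after a random start among the first K members."""
    k = len(population) // sample_size   # K = population size / sample size (rounded down)
    start = rng.randrange(k)             # random position within the first K members
    return population[start::k][:sample_size]

# Hypothetical population: cases numbered 1..100.
population = list(range(1, 101))
rng = random.Random(1)  # fixed seed so the draw is repeatable

sample = systematic_sample(population, 10, rng)
print(sample)
```

Here K = 100 / 10 = 10, so the sample is one randomly chosen case from the first ten plus every tenth case after it.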
For a stratified random sample, the choice of subgroups
is based on
what variables are known and what variables are of interest to us.
Stratified random sample
A method of sampling obtained by (a) dividing the population into subgroups
based on one or more variables central to our analysis and (b) then drawing a simple random sample from
each of the subgroups.
Proportionate stratified sample
The size of the sample selected from each subgroup is proportional to the
size of that subgroup in the entire population.
Disproportionate stratified sample
The size of the sample selected from each subgroup is disproportional
to the size of the subgroup in the population.
Proportionate sampling can result in
the sample having too
few members from a small subgroup to yield reliable information about them.
In a disproportionate stratified sample, the size of the sample selected from each subgroup
is deliberately made disproportional to the size of that subgroup in the population. For
instance, for our example, we could select a sample (N = 180) consisting of 90 whites
(50%), 45 blacks (25%), and 45 Latinos (25%). In such a sampling design, although the
sampling probabilities for each population member are not equal (they vary between
groups), they are known, and therefore, we can make accurate estimates of error in the
inference process.
Disproportionate stratified sampling is especially useful when we want
to compare subgroups with each other, and when the size of some of the subgroups in the
population is relatively small
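A rough Python sketch comparing the two allocations. The subgroup sizes (700, 200, 100) are hypothetical, since these notes do not give the population breakdown; the sample size of 180 and the 90/45/45 split are from the card above. The helper `stratified_sample` is also just an illustrative name.

```python
import random

# Hypothetical subgroup sizes in the population (not given in these notes).
strata_sizes = {"white": 700, "black": 200, "latino": 100}
sample_size = 180  # N = 180, as in the example above

# Proportionate allocation: each subgroup's share of the sample matches
# its share of the population.
total = sum(strata_sizes.values())
proportionate = {
    group: round(sample_size * size / total) for group, size in strata_sizes.items()
}

# Disproportionate allocation from the card: 90 / 45 / 45.
disproportionate = {"white": 90, "black": 45, "latino": 45}

# Sampling probabilities differ between groups, but they are known.
inclusion_prob = {
    group: disproportionate[group] / strata_sizes[group] for group in strata_sizes
}

def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample of the allocated size from each subgroup."""
    return {group: rng.sample(members, allocation[group])
            for group, members in strata.items()}

# Build placeholder members for each subgroup and draw the sample.
strata = {g: [f"{g}-{i}" for i in range(size)] for g, size in strata_sizes.items()}
sample = stratified_sample(strata, disproportionate, random.Random(0))

print(proportionate)
print(inclusion_prob)
```

Under these assumed sizes, the proportionate design would leave the smallest subgroup with far fewer cases than the 45 it gets under the disproportionate design, which is the point of the card on small subgroups.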
The sampling distribution helps estimate the
likelihood of our sample statistics and, therefore, enables us to generalize from the sample
to the population.
Sampling error
The discrepancy between a sample estimate of a population parameter and the real
population parameter.
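A small Python sketch of the definition, using made-up income figures (the population values are randomly generated placeholders, not data from the text). It draws a few samples and records how far each sample mean lands from the population mean.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population of 20 incomes (placeholder values).
population = [rng.randint(20_000, 80_000) for _ in range(20)]
mu = statistics.mean(population)  # the population parameter

# Sampling error = sample estimate minus the real population parameter.
errors = []
for _ in range(5):
    sample = rng.sample(population, 3)     # one random sample of N = 3
    y_bar = statistics.mean(sample)        # the sample statistic
    errors.append(y_bar - mu)

print(mu)
print(errors)
```

Each sample gives a different error, and in real research the value of `mu` is usually unknown, so the error cannot be computed directly like this.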
Although comparing the sample estimates of the average income with the actual population
average is a perfect way to evaluate the accuracy of our estimate, in practice, we rarely have
information about
the actual population parameter. If we did, we would not need to
conduct a study
This, then, is our dilemma: If sample estimates vary and if most
estimates result in some sort of sampling error,
how much confidence can we place in the
estimate? On what basis can we infer from the sample to the population?
Because it includes all possible sample values, the sampling
distribution enables us to
compare our sample result with other sample values and
determine the likelihood associated with that result.
Sampling distribution
The sampling distribution is a theoretical probability distribution of all possible
sample values for the statistics in which we are interested.
Sampling distributions are theoretical distributions, which means that
they are never really
observed.
Constructing an actual sampling distribution would involve
taking all possible
random samples of a fixed size from the population. This process would be very tedious
because it would involve a very large number of samples. However, to help grasp the
concept of the sampling distribution, let’s illustrate how one could be generated from a
limited number of samples.
Sampling distribution of the mean
A theoretical probability distribution of sample means that would be
obtained by drawing from the population all possible samples of the same size.
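For a very small population, "all possible samples of the same size" can actually be listed. The sketch below does that in Python for a hypothetical population of five incomes (placeholder values, in thousands) and samples of size 2 drawn without replacement.

```python
import math
import statistics
from itertools import combinations

# Hypothetical population of five incomes, in thousands of dollars.
population = [10, 20, 30, 40, 50]
sample_size = 2

mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

# Every possible sample of size 2, and the mean of each one.
sample_means = [statistics.mean(s) for s in combinations(population, sample_size)]

# Describe the resulting distribution of sample means.
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(sorted(sample_means))
print(mu, mean_of_means)
print(sigma, sd_of_means)
```

The average of all the sample means equals the population mean, and the sample means are less spread out than the individual incomes. Because this toy population is tiny and sampled without replacement, the spread will not match the σ/√N formula exactly.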
The Population:
We began with the population distribution of 20 individuals. This
distribution actually exists. It is an empirical distribution that is usually unknown to us. We
are interested in estimating the mean income for this population.
The Sample
We drew a sample from that population. The sample distribution is an
empirical distribution that is known to us and is used to help us estimate the mean of the
population. We selected 50 samples of N = 3 and calculated the mean income. We
generally use the sample mean (Ȳ) as an estimate of the population mean (μ).
The Sampling Distribution of the Mean
For illustration, we generated an approximation of
the sampling distribution of the mean, consisting of 50 samples of N = 3. The sampling
distribution of the mean does not really exist. It is a theoretical distribution.
Like the population and sample distributions, the sampling distribution can be described in
terms of its…
mean and standard deviation. We use the symbol μȲ
to represent the mean of
the sampling distribution. The subscript indicates the specific variable of this sampling
distribution
To obtain the mean of the sampling distribution,
add all the individual
sample means and divide by the number of samples (M =
50). Thus, the mean of the sampling distribution of the mean is actually the mean of
means: μȲ = ΣȲ / M
Standard error of the mean
The standard deviation of the sampling distribution of the mean. It describes
how much dispersion there is in the sampling distribution of the mean.
the standard error of the mean formula tells us that
the standard error of the mean is equal to the standard deviation
of the population σ divided by the square root of the sample size (N).
Third, the variability of the sampling distribution is considerably smaller than the
variability of the population distribution. Note that the standard deviation for the sampling
distribution (σȲ = 8,480) is almost half that for the population (σ = 14,687).
Note that as the sample size
increased, the sampling distribution became more compact. This decrease in the variability
of the sampling distribution is reflected in a smaller
standard deviation:
With an increase in
sample size from N = 3 to N = 6, the standard deviation of the sampling distribution
decreased from 8,480 to 5,995. Furthermore, with a larger sample size, the sampling
distribution of the mean is an even better approximation of the normal curve.
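The figures quoted on these cards can be checked against the standard error formula (σ divided by the square root of N). A quick Python sketch, using only the σ = 14,687 value from the notes; the function name `standard_error` is just an illustrative choice.

```python
import math

sigma = 14_687  # population standard deviation quoted in the notes

def standard_error(sigma, sample_size):
    """Standard error of the mean: sigma divided by the square root of N."""
    return sigma / math.sqrt(sample_size)

se_3 = standard_error(sigma, 3)   # N = 3
se_6 = standard_error(sigma, 6)   # N = 6

print(round(se_3), round(se_6))
```

For N = 3 the formula gives roughly 8,480, matching the card. For N = 6 it gives a value within about a dollar of the 5,995 quoted. Doubling the sample size shrinks the standard error by a factor of √2, not by half.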
It is called the central limit theorem, and it states that…
if all possible random
samples of size N are drawn from a population with a mean μ and a standard deviation σ,
then as N becomes larger, the sampling distribution of sample means becomes
approximately normal, with mean μȲ
equal to the population mean and a standard
deviation equal to σ/√N (the formula is on p. 299). Thus, the larger the sample,
the more closely the
sample statistic clusters around the population parameter
Through the process of sampling, researchers attempt to
generalize the characteristics of a large
group (the population) from a subset (sample) selected from that group. The term parameter,
associated with the population, refers to the information we are interested in finding out. Statistic
refers to a corresponding calculated sample statistic
A probability sample design allows us to estimate the extent to which the findings based on one
sample are likely to
differ from what we would find by studying the entire population.
A simple random sample is chosen in such a way as to ensure that
every member of the population
and every combination of N members have an equal chance of being chosen.
In systematic sampling
every Kth member in the total population is chosen for inclusion in the
sample after the first member of the sample is selected at random from the first K members in the
population.
A stratified random sample is obtained by… (2 steps)
(a) dividing the population into subgroups based on one
or more variables central to our analysis and (b) then drawing a simple random sample from each of
the subgroups
The sampling distribution is…
a theoretical probability distribution of all possible sample values for
the statistic in which we are interested. The sampling distribution of the mean is a frequency
distribution of all possible sample means of the same size that can be drawn from the population of
interest
According to the central limit theorem,…
if all possible random samples of size N are drawn from a
population with a mean μ and a standard deviation σ, then as N becomes larger, the sampling
distribution of sample means becomes approximately normal, with mean μ and standard deviation σ/√N.
The central limit theorem tells us that with sufficient sample size, the sampling distribution of the
mean will be…
normal regardless of the shape of the population distribution. Therefore, even when
the population distribution is skewed, we can still assume that the sampling distribution of the
mean is normal, given a large enough randomly selected sample size
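A simulation sketch of the central limit theorem in Python. Everything here is an assumption chosen for illustration: the skewed population is an exponential distribution with mean 1 and standard deviation 1, the sample size is 50, and 2,000 repeated samples stand in for "all possible samples". The `skewness` helper is a simple hand-rolled measure, not a library function.

```python
import math
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Skewed hypothetical population: exponential with mean 1 and sd 1.
mu, sigma = 1.0, 1.0
sample_size = 50     # N
num_samples = 2000   # repeated samples, approximating "all possible samples"

# Individual draws, to show how skewed the population itself is.
population_draws = [rng.expovariate(1 / mu) for _ in range(5000)]

# Mean of each repeated sample of size N.
sample_means = [
    statistics.fmean([rng.expovariate(1 / mu) for _ in range(sample_size)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(mean_of_means, mu)
print(sd_of_means, sigma / math.sqrt(sample_size))
print(skewness(population_draws), skewness(sample_means))
```

The sample means should center close to μ, their standard deviation should be close to σ/√N, and their skewness should be much smaller than that of the population draws, even though the population is strongly skewed.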