1.5: sampling methods Flashcards
Sampling
used to get information about a parameter of a population
–>A statistic from a sample is used because measuring the parameter from the population is either not possible or cost-prohibitive
The two types of sampling methods
Probability sampling
Non-probability sampling
Probability sampling
Every member of the population has the same chance of being selected
The sample created is usually representative of the population.
Non-probability sampling
Non-probability considerations (such as the convenience of accessing data or the sampler’s judgment) are used in sample selection
The sample created may not be representative of the population
sampling methods based on probability sampling
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster Sampling
sampling methods based on non-probability sampling
Convenience Sampling
Judgmental Sampling
simple random sampling
each population element has an equal probability of being selected
also often called just a random sample
requires randomness
–> could be done by assigning each member of the population a random number and using a computer program or table of random digits to choose the members
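A minimal sketch of the computer-program approach, assuming the population fits in a Python list. The bond identifiers and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with an equal probability of selection."""
    rng = random.Random(seed)          # seeded generator so the draw is reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: 100 bond identifiers (assumed for illustration)
population = [f"BOND{i:03d}" for i in range(100)]
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Fixing the seed only makes the example repeatable; in practice the seed would normally be left unset.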
when is simple random sampling useful
when the data is homogeneous
systematic sampling
chooses every kth member until the desired sample size is reached
useful if the analyst cannot identify all members of a population
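A rough sketch of the every-kth-member rule. It assumes the members arrive as a stream that can be iterated once, so the full population never has to be listed up front. The random starting point, the transaction numbers and the function name are illustrative assumptions, not part of the definition above.

```python
import random
from itertools import islice

def systematic_sample(members, k, n, seed=None):
    """Take every k-th member, from a random start within the first k, until n are chosen."""
    start = random.Random(seed).randrange(k)      # random start in [0, k)
    every_kth = islice(members, start, None, k)   # start, start + k, start + 2k, ...
    return list(islice(every_kth, n))             # stop once the sample size is reached

# Hypothetical stream of 1,000 transaction numbers (assumed for illustration)
transactions = range(1, 1001)
sample = systematic_sample(transactions, k=50, n=20, seed=1)
print(sample)
```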
The sampling error
the difference between the sample statistic and the population parameter (e.g., the sample mean and the population mean)
The sampling distribution of a statistic
the distribution of the values a statistic takes across all possible samples of the same size drawn from the same population
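The two definitions above can be illustrated with a small simulation. This is only a sketch: the population is synthetic (10,000 draws from an assumed normal distribution), and the sampling distribution is approximated with 2,000 repeated samples rather than every possible sample.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic population (an assumption for illustration, not real data)
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)              # population parameter

def sample_mean(n):
    """Mean of one simple random sample of size n."""
    return statistics.fmean(rng.sample(population, n))

# Sampling error: sample statistic minus population parameter
one_mean = sample_mean(25)
sampling_error = one_mean - mu

# Approximate sampling distribution of the mean for samples of size 25
means = [sample_mean(25) for _ in range(2_000)]

print("sampling error of one sample:", sampling_error)
print("mean of the sample means:", statistics.fmean(means))
print("std dev of the sample means:", statistics.stdev(means))
```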
stratified random sampling
the population is first divided into subgroups (strata) based on some criteria
Simple random samples are drawn from each subgroup in proportion to the subgroup’s relative size to the entire population
Estimates from this method have less variance than estimates derived from simple random sampling
It makes sure that the population subdivisions of interest are captured in the sample set
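A minimal sketch of proportional allocation, assuming each member can be mapped to a stratum label by a function. The sector labels, the stratum sizes and the name `stratified_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, n, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)   # divide the population into strata
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding means the total may differ slightly from n
        size = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical bond universe: 60 government, 30 corporate, 10 municipal
population = (
    [("GOV", i) for i in range(60)]
    + [("CORP", i) for i in range(30)]
    + [("MUNI", i) for i in range(10)]
)
sample = stratified_sample(population, stratum_of=lambda bond: bond[0], n=10, seed=7)
counts = Counter(sector for sector, _ in sample)
print(counts)
```

With these assumed stratum sizes the allocation is exact: 6 government, 3 corporate and 1 municipal bond.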
what is stratified random sampling used for?
commonly used to create portfolios that are meant to track a bond index
–> First, the entire population of bonds in the index is divided into subgroups based on factors such as maturity, sector, credit quality, etc.
–> The manager then selects a sampling of bonds from within each subgroup
cluster sampling
divides the population into subgroups known as clusters
difference between stratified random sampling and cluster sampling
unlike stratified random sampling, which defines subgroups based on certain criteria, cluster sampling divides the entire population into mini-representations of the population
–> In other words, each cluster will consist of members with different characteristics
cluster sampling steps
- Certain clusters are selected using simple random sampling.
- One-stage cluster sampling selects all the members of the sampled clusters, while two-stage cluster sampling randomly selects a subsample from each sampled cluster.
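The steps above can be sketched as follows, assuming the clusters are already given as a dictionary from cluster label to member list. The branch and account names and the `per_cluster` parameter are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters by simple random sampling, then take members from them.

    per_cluster=None     -> one-stage: keep every member of each chosen cluster
    per_cluster=<int>    -> two-stage: random subsample from each chosen cluster
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # step 1: pick the clusters
    sample = []
    for label in chosen:
        members = clusters[label]
        if per_cluster is None:
            sample.extend(members)                      # one-stage
        else:
            size = min(per_cluster, len(members))
            sample.extend(rng.sample(members, size))    # two-stage
    return sample

# Hypothetical population: 10 branches with 20 accounts each
clusters = {
    f"branch{b}": [f"branch{b}-acct{i}" for i in range(20)] for b in range(10)
}
one_stage = cluster_sample(clusters, n_clusters=3, seed=1)
two_stage = cluster_sample(clusters, n_clusters=3, per_cluster=5, seed=1)
print(len(one_stage), len(two_stage))
```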
The major advantage of cluster sampling and its main drawback
Advantage: it is a time-efficient, cost-effective method of probability sampling a large population
Drawback: the accuracy of the results can be skewed if the chosen clusters are not representative of the overall population
Convenience sampling
selects samples based on how accessible they are for the researcher
This method allows samples to be collected quickly at a low cost
However, since data are selected conveniently, they may not be representative of the population
Judgmental sampling
the researcher handpicks samples based on their knowledge and professional judgment
This sampling method is beneficial when there is a time constraint because the researcher can quickly select a more representative sample using their expertise
However, the samples selected may be subject to the researcher’s bias, resulting in skewed results
Which of the following methods is most appropriate to use when only some members of a finite population can be identified?
A. Systematic sampling
B. Simple random sampling
C. Stratified random sampling
Answer: A. Systematic sampling
Perkiomen Kinzua, a seasoned auditor, is auditing last year’s transactions for Conemaugh Corporation. Unfortunately, Conemaugh had a very large number of transactions last year, and Kinzua is under a time constraint to finish the audit. He decides to audit only the small subset of the transaction population that is of interest and to use sampling to create that subset.
The most appropriate sampling method for Kinzua to use is:
A. judgmental sampling.
B. systematic sampling.
C. convenience sampling.
Answer: A. judgmental sampling.
The central limit theorem states the following:
The sampling distribution of a sample mean X̄ will be approximately normal with a mean of μ (the population mean) and variance of (σ^2)/n, provided the sample size n is large (usually greater than 30)
This is true for a population with any probability distribution, provided it has a finite variance σ^2.
According to the central limit theorem, when the sample size increases, what will happen to the distribution of the sample mean?
it will converge to a normal distribution
This is true regardless of the actual distribution of the population
Notice that the variance of the sample mean is (σ^2)/n.
Assuming a constant population variance σ^2, when n increases, the variance of the sample mean decreases.
–> Said differently, with a large enough sample, the sample mean X̄ will be very close to the population mean μ
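A small simulation can make the (σ^2)/n relationship concrete. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (mean 1, variance 1, strongly skewed), and 5,000 samples are drawn for each sample size.

```python
import random
import statistics

rng = random.Random(123)

def sample_means(n, trials=5_000):
    """Means of `trials` samples of size n from an exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

results = {}
for n in (5, 30, 200):
    means = sample_means(n)
    # Store (mean of sample means, variance of sample means)
    results[n] = (statistics.fmean(means), statistics.variance(means))
    print(f"n={n}: mean={results[n][0]:.3f}, "
          f"variance={results[n][1]:.4f}, sigma^2/n={1 / n:.4f}")
```

The simulated variance of the sample mean should sit close to σ^2/n for each n and shrink as n grows, while the mean of the sample means stays near μ = 1.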
How is the standard error of the sample mean X̄ defined?
σ_X̄ = σ/√n or s_X̄ = s/√n
–> The first is used if the population standard deviation σ is known, which is rare
–> The second is used when σ is unknown and must be estimated by the sample standard deviation s
how do we calculate s when the standard error has to be estimated?
s^2 = Σ (X_i − X̄)^2 / (n − 1), summed over i = 1, ..., n, and s = √(s^2)
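As a worked illustration of the formulas above, the following sketch computes s and the estimated standard error s/√n. The observations are made up for the example, and `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the n - 1 divisor
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return math.sqrt(s2) / math.sqrt(n)

# Made-up observations (assumed for illustration)
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(returns)
print(se)
```

Here the sample mean is 5 and the squared deviations sum to 32, so s^2 = 32/7 and the standard error is √(32/7)/√8.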
the difference between the standard error and standard deviation
the standard error describes the accuracy of an estimate (from sampled data) relative to its true value
the standard deviation describes the dispersion of data around its mean
A population has a non-normal distribution with mean μ and variance σ^2
The sampling distribution of the sample mean computed from samples of large size from that population will most likely have:
A. the same distribution as the population distribution.
B. its mean approximately equal to the population mean.
C. its variance approximately equal to the population variance.
Answer: B. its mean approximately equal to the population mean.