Lecture 7 (Sampling Notes) Flashcards
the process of selecting a representative portion of a population
Sampling
representative means __
- reproduces the important characteristics of the population
- is selected using EPSEM (Equal Probability of Selection Method)
uncertainty that arises from working with a sample rather than the entire population
Sampling error
A large sample is more likely to include a true cross-section of the population (t or f)
t
When the procedures used to select the sample tend to favor the inclusion of individuals with certain characteristics
Sampling bias
why sample?
- It is not necessary to take a complete census
- Sampling requires less cost, time, and effort
- The population may be infinite
- The population may be empirically definable, but not practically available
- Sampling also allows for a wider scope and more in-depth study
sampling process
Define the population
Construct the sampling frame
Select the sampling design
specify the info to be collected
collect data
difference between descriptive and inferential
descriptive: summarizes the data actually collected
inferential: draws conclusions about, or generalizes to, a larger population
the population to whom a researcher wishes to generalize the results of a study
Target population
the smaller portion of the target population to whom the researcher actually has access
Study population
the group of people selected to be part of the study
sample
ordered list of sampling units representing the population from which the sample will be drawn
- it must include all individuals in the population (it must be exhaustive)
- each individual element of the population must appear once and only once
Sampling frame
procedure used to select individuals from the sampling frame
Sampling design
extended test of data collection procedures to be used in a study in advance of the main data collection
- to check instruments, data loggers and all other logistics
- sometimes reveals deficiencies
pilot or pre-test
the probability of any individual member of the population being picked for the sample can be specified
probability sampling
advantage of probability sampling
sampling error can be calculated
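As an illustration of "can be calculated", here is a minimal sketch of one common measure of sampling error, the estimated standard error of the sample mean (s / sqrt(n)). It assumes a simple random sample and ignores any finite population correction; the function name and the data are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical measurements, for illustration only.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
se = standard_error_of_mean(heights_cm)
print(round(se, 3))
```

A larger n shrinks this value, which is one way to read the earlier card about large samples.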
each element in the pop. has an equal probability of being selected for the sample
SRS/Simple Random Sampling
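A minimal sketch of SRS without replacement, assuming the sampling frame is available as a Python list. The frame of unit IDs 1 to 100 and the function name are made up for illustration; the seed is only there to make the draw repeatable.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: unit IDs 1..100.
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10, seed=42)
print(sorted(srs))
```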
adv and disadv of SRS
ADVANTAGE: simple and easy to apply when the pop. is small
DISADVANTAGE: cumbersome for large pop., not the most statistically efficient method
- possible to get a poor representation due to the “luck” of the draw
- randomness does not guarantee representativeness
every kth element is selected, starting from a randomly chosen point
Systematic Random Sampling
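A minimal sketch of the "every kth element" rule, assuming the frame is an ordered list and taking the interval as k = N // n (one common choice, not the only one). The function name and the example frame are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then take every kth unit."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    k = N // n                    # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start among the first k positions
    return [frame[start + i * k] for i in range(n)]


# Hypothetical ordered frame of 100 unit IDs.
frame_sys = list(range(100))
sys_sample = systematic_sample(frame_sys, 10, seed=1)
print(sys_sample)
```

If the frame has a periodic pattern whose period matches k, every selected unit falls at the same point of the cycle, which is the risk noted for this design.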
subset of the population that has at least one common attribute
Stratum
Sample is obtained by forming classes, or strata, in the population and then selecting a simple random sample from each
Stratified Random Sampling
adv and disadv of Stratified Random Sampling
ADVANTAGES: reduces sampling error, decreasing the likelihood of obtaining an unrepresentative sample
- Assures representation of key subgroups of the pop.
DISADVANTAGE: may require additional prior info about the population and strata
proportionate allocation vs optimum allocation
PA = Size of sample selected from each stratum is proportional to the size of the stratum in the entire population
- ADVANTAGE: the sample's composition mirrors the population's, so representativeness is built in by design
OA = the sample size taken from each stratum is not kept proportional to the stratum's size in the population
- LIMITATION: cannot generalize directly to the population without applying weights
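Proportionate allocation comes down to n_h = n * N_h / N for each stratum h. A minimal sketch follows, assuming each stratum is a list of units and an SRS is drawn within each; the stratum names, sizes, and function names are hypothetical, and rounding means the allocations may not always sum exactly to n.

```python
import random


def proportionate_allocation(stratum_sizes, n):
    """Allocate n across strata in proportion to stratum size: n_h = n * N_h / N."""
    N = sum(stratum_sizes.values())
    # Rounded to whole units; the total can differ slightly from n in general.
    return {h: round(n * N_h / N) for h, N_h in stratum_sizes.items()}


def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from every stratum using proportionate allocation."""
    rng = random.Random(seed)
    sizes = {h: len(units) for h, units in strata.items()}
    alloc = proportionate_allocation(sizes, n)
    return {h: rng.sample(units, alloc[h]) for h, units in strata.items()}


# Hypothetical population of 1000 units split into three strata.
strata = {
    "urban": list(range(0, 600)),
    "suburban": list(range(600, 900)),
    "rural": list(range(900, 1000)),
}
alloc = proportionate_allocation({h: len(u) for h, u in strata.items()}, 50)
strat_sample = stratified_sample(strata, 50, seed=7)
print(alloc)
```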
- The population is first divided into mutually exclusive and exhaustive clusters, then a simple random sample of clusters is selected.
- Each cluster should be a small-scale representation of the population
- Ideally, each cluster should be internally heterogeneous
Cluster Sampling
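A minimal sketch of one-stage cluster sampling, assuming clusters are chosen by SRS and every unit inside a chosen cluster is kept. The village names, unit labels, and function name are hypothetical.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Randomly choose m clusters and keep all units inside the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)   # sample cluster labels, not units
    units = [u for c in chosen for u in clusters[c]]
    return chosen, units


# Hypothetical population grouped into four villages (the clusters).
clusters = {
    "village_A": ["a1", "a2", "a3"],
    "village_B": ["b1", "b2"],
    "village_C": ["c1", "c2", "c3", "c4"],
    "village_D": ["d1", "d2", "d3"],
}
chosen, cluster_units = cluster_sample(clusters, 2, seed=3)
print(chosen, cluster_units)
```

Unlike stratified sampling, only some groups contribute units here, so field work can be concentrated in a few places.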
adv and disadv of cluster sampling
ADVANTAGES: efficient when the population to be sampled is geographically dispersed over a large area
- Restricting the sampling to a small number of clusters
DISADVANTAGES:
Often gives poor results; sampling error is higher
sampling is done in a hierarchy of stages with the other four techniques applied in various combinations
Multi-stage Sampling
adv and disadv of Multi-stage Sampling
ADVANTAGES: convenient and efficient
DISADVANTAGES: higher sampling error
- Entails much planning before sampling is done
sample members are selected in some non-random manner
non-probability / non random sampling
adv and disadv of non-probability / non random sampling
ADVANTAGE: useful when time is limited, a sampling frame is not available, or the budget is tight
DISADVANTAGE: cannot generalize to the general population
only convenient or accessible members of the population are selected
- Useful for pilot or pre-testing
Convenience sampling
Personal judgment is used to decide which individuals of a population are to be included in the sample
- Useful when there is a limited number of people that have expertise in the area being studied
- likely to over-represent individuals who are more readily accessible
Purposeful or Judgemental sampling
an attempt to obtain a representative sample by filling quotas from given subgroups of the population
Quota sampling
Sample consists of individuals who self-select from the population, rarely representative
- Individuals are usually more motivated or have a higher interest in the topic
Voluntary sampling
respondents are accumulated by using each individual as an informant or source of further sample members
snowball sampling
The probability distribution of a sample statistic
Sampling distribution
Taking all possible samples of size n from a population, calculating the statistic for each sample, and plotting the distribution of those values
Sampling distribution (how it is constructed)
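The enumeration described above can be sketched for a tiny population. This is a minimal illustration, assuming samples are drawn without replacement and order does not matter, with the sample mean as the statistic; the population values and function name are made up.

```python
from collections import Counter
from itertools import combinations


def sampling_distribution_of_mean(population, n):
    """Enumerate every sample of size n and tabulate the relative frequency of each mean."""
    means = [sum(s) / n for s in combinations(population, n)]
    total = len(means)
    counts = Counter(means)
    return {m: c / total for m, c in sorted(counts.items())}


# Hypothetical population of four values.
population = [1, 2, 3, 4]
dist = sampling_distribution_of_mean(population, 2)
for value, prob in dist.items():
    print(value, round(prob, 3))
```

For real populations the number of possible samples is far too large to enumerate, so the distribution is usually approximated by theory or by repeated random draws.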
adv and disadv of systematic RS
ADVANTAGES: simpler than SRS, gives a good spread across the pop.
DISADVANTAGES: still cannot guarantee representativeness if the sampling interval coincides with a periodic pattern in the list