Sampling Flashcards
What is sampling? Describe the process
Sampling is choosing a subset of units from the target population. There are several different procedures (= sampling methods) for choosing units from the target population
Define sampling population
- Target population = defined target population of your study
- Sampling frame = set of units* from which your sample is drawn (with the chosen sampling method) *In consumer research, the units are typically ‘people’
- Ideally, the sampling frame includes the entire target population = ideal coverage (in reality, this is often difficult to achieve)
Define over- and undercoverage and coverage error
Undercoverage = Incomplete sample frame
Some units of the target population are not included in the sample frame
Overcoverage = Erroneous inclusion of units
• Ineligible units: Wrongful inclusion of frame units that are not part of the survey population
• Duplicated units: Units are part of target population, but appear more than once in the frame (e.g. through combination of lists)
Coverage error = Deviation from ideal coverage
Not every unit in the target population has a known, nonzero chance of being selected into the sample (Biemer and Lyberg 2003: p. 64, Groves et al. 2004: p. 70)
• Coverage (error) is often unknown
• The survey mode has a great influence on coverage (error) (Lohr 2008: p. 102)
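The coverage definitions above can be illustrated with set operations. This is only a minimal sketch under stated assumptions: the unit IDs are made up, and the function name `coverage_report` is hypothetical, not taken from the cited sources.

```python
from collections import Counter

def coverage_report(target_population, sampling_frame):
    """Compare a sampling frame (a list that may contain duplicates)
    with the target population and report the coverage problems."""
    target = set(target_population)
    frame_counts = Counter(sampling_frame)
    frame_units = set(frame_counts)
    return {
        # Target units missing from the frame
        "undercoverage": sorted(target - frame_units),
        # Frame units that are not part of the target population
        "ineligible": sorted(frame_units - target),
        # Target units listed more than once in the frame
        "duplicates": sorted(
            u for u, c in frame_counts.items() if c > 1 and u in target
        ),
    }

# Hypothetical IDs: units 1 and 5 are missing, 9 is ineligible, 3 is duplicated
target = {1, 2, 3, 4, 5}
frame = [2, 3, 3, 4, 9]
report = coverage_report(target, frame)
print(report)
```

In practice the target population is rarely fully listed, which is why coverage error is often unknown; the sketch only shows what the terms mean when both lists are available.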
What is a representative sample
• A sample is representative if it reflects the target population accurately in all relevant variables*
*in consumer surveys, e.g. age, gender, income, education, urban-rural, etc.
• For representative results, the units included in the sample should be chosen randomly
(i.e. with probability sampling methods)
• Representativeness of the sample is a precondition for generalising the results to the target population
What are the two types of sampling units?
Full sampling unit: Target population sampled in its entirety (e.g. population census)
• Usually impossible or impractical with consumers or households
• Lots of time and high costs involved
• Applicable for surveys with a small number of units, e.g. students of a study programme, companies of a specialised industry
Partial sampling unit: Only part of target population is sampled
• Inference from sampled units to total population
• Aim: the most exact (representative) picture of the population
• Less time and lower costs involved
Define probability sampling
Characteristics
• Every unit of target population has a known and nonzero chance of being included in sample
• Sampling procedure involves an element of randomization
• Sample selection is objective → Sampling error can be estimated
• Probability samples have been found to be more accurate in representing the population characteristics than nonprobability samples
Probability sampling and survey mode
• Probability sampling methods are regularly used for internet/email, face-to-face, mail, telephone surveys
• Outside Denmark, internet surveys face great problems because the population without internet access is excluded
What are the types of probability sampling?
simple random sampling, systematic sampling, stratified sampling, cluster sampling
Describe simple random sampling
• Urn model = lottery sampling
• The elements to be included in the sample are taken randomly from the population
• Each unit of the sampling frame needs to be assigned a number
• Generation of series of random numbers (usually by random number generator)
Simplified example (not realistic):
• Target population N = 30 people, sample size n = 4
• Random selection of 4 units
• Units included in sample: e.g. 8, 15, 19, 27
Characteristics
• Precondition: the target population must be completely listed somewhere (e.g. address register, students register)
• Every unit in the sample frame has the same chance to be included in sample
• Good choice if little is known about characteristics of population
• Theoretically, the drawn sample can have very unusual properties
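The simplified example (N = 30, n = 4) can be sketched with Python's standard `random` module. This is an illustrative sketch only: the function name `simple_random_sample` and the `seed` parameter are assumptions, and the units actually drawn depend on the random number generator.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit in the frame
    has the same chance of being selected."""
    rng = random.Random(seed)  # a seed is used only to make the draw repeatable
    return sorted(rng.sample(frame, n))

# Each unit of the sampling frame is assigned a number (1..30)
frame = list(range(1, 31))
sample = simple_random_sample(frame, n=4, seed=1)
print(sample)
```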
Describe systematic sampling
Characteristics
• Every ith unit of the sampling frame is included in the sample (systematic because i is a fixed value, e.g. 3, 10, 100)
• Random choice of starting point
• Selection procedure (value of i) depends on target population size N and sample size n
• Precondition (one of the following):
• A list of the target population exists, OR
• People can be selected by order of events, e.g. every ith person visiting a website or every ith person entering through a door (e.g. mall / supermarket)
• Faster and less costly to conduct than simple random sampling
• Often used in customer satisfaction surveys with website visitors and mall intercepts
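The selection procedure for a listed population can be sketched as follows. The rule i = N // n (integer division) is a common convention assumed here, not something the notes prescribe, and the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every i-th unit of the frame after a random starting point."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    i = N // n                 # assumed rule for the sampling interval
    rng = random.Random(seed)
    start = rng.randrange(i)   # random starting point among the first i units
    return frame[start::i][:n]

frame = list(range(1, 31))     # N = 30
sample = systematic_sample(frame, n=4, seed=1)  # i = 30 // 4 = 7
print(sample)
```

Once the starting point is drawn, the rest of the sample is fully determined, which is what makes the method fast to carry out.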
Describe stratified sampling
Characteristics
• Division of target population into subgroups (= strata) e.g. division by age, income, or shopping frequency
• Example: A retail chain offering a loyalty programme wants to know how satisfied the members are with the loyalty programme, and their reasons for using / not using the app
• Stratum A: Frequent / loyal customers
• Stratum B: Infrequent customers
• Preconditions:
• List of units in sampling frame (same as in simple random sampling)
• Additional information on units in sampling frame (e.g. age, gender, shopping frequency)
• Stratification ensures that sample contains representation from population subgroups of interest
• Stratified random sampling allows for more accurate sample results than simple random sampling (reduced sampling error)
• Rather high costs and time involved
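A minimal sketch of stratified random sampling for the loyalty-programme example follows. It assumes proportional allocation (the same sampling fraction in every stratum), which the notes do not specify; the customer IDs, stratum sizes, and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # At least one unit per non-empty stratum, never more than it contains
        k = min(len(units), max(1, round(fraction * len(units))))
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical frame with shopping frequency known for every customer
strata = {
    "A_frequent": [f"A{i}" for i in range(100)],
    "B_infrequent": [f"B{i}" for i in range(400)],
}
sample = stratified_sample(strata, fraction=0.1, seed=1)
print({name: len(units) for name, units in sample.items()})
```

Because every stratum is sampled separately, both customer groups are guaranteed to appear in the sample, which simple random sampling cannot promise.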
Define cluster sampling
Characteristics
• Division of population into clusters (subgroups), e.g. by geographic areas (“area sampling”)
• The clusters are internally heterogeneous, i.e. each cluster ideally resembles the population (unlike the strata in stratified sampling, which are internally homogeneous)
• Not all clusters are sampled, but each cluster has equal chance to be included in sample
• Stage 1: Selection of some clusters by simple random or systematic sampling
• Stage 2: Selection of units from selected clusters by simple random or systematic sampling
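The two stages can be sketched as below, using simple random sampling at both stages. The area names, household IDs, and the function name `two_stage_cluster_sample` are hypothetical and serve only as an illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select some clusters.
    Stage 2: randomly select units within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)        # stage 1
    return {
        name: rng.sample(
            clusters[name], min(n_per_cluster, len(clusters[name]))
        )                                                    # stage 2
        for name in chosen
    }

# Hypothetical geographic areas, five households each
clusters = {
    area: [f"{area}-{i}" for i in range(1, 6)]
    for area in ["North", "South", "East", "West"]
}
sample = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=3, seed=1)
print(sample)
```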
Define non-probability sampling
Characteristics
• Sample selection is subjective (and not objective like in probability sampling)
• Probability of units being chosen is unknown
• Strictly speaking, findings cannot be generalized to the whole target population
• Usually easier and cheaper to conduct than probability sampling methods, especially with internet surveys
What are the types of non-probability sampling?
convenience sampling, judgement sampling, quota sampling, snowball sampling
Describe convenience sampling
Characteristics
• Units are included in the sample based on easy accessibility
• ‘Extreme’ form: friends, social media network, students from class, etc.
• ‘Mild’ form: strangers are approached, e.g. people on the street, Amazon Mechanical Turk
• Useful for testing survey questions or survey procedures
• No generalisation of results possible, not representative
Describe judgement sampling
(=purposive sampling)
Characteristics
• Sampling units are selected based on (subjective) ‘expert’ judgment
• Units are believed to be typical for target population (e.g. certain supermarkets chosen by market researcher)
• Useful for survey pre-tests
• No generalisation of results possible, not representative
• Note: With a very small sample size (< 10), judgment sampling may provide more accurate information than random sampling
Describe quota sampling
- Aim: include certain percentage of target population in the sample with particular characteristics of interest (e.g. age, income, gender)
- Percentage of individuals with certain characteristics in sample should be identical with percentage in total population
- Pre-condition: Knowledge about distribution of the characteristics in the population
- Strictly speaking no generalisation of results possible, not ‘representative’
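Quota sampling can be sketched in two steps: derive the quotas from the known population distribution, then accept respondents in the order they happen to arrive until each quota is full. The age groups, shares, and function names below are hypothetical. Note that there is no random selection, which is why the method remains a non-probability method.

```python
def quota_targets(population_shares, n):
    """Translate population percentages into target counts for a sample of size n.
    Rounding may make the targets sum to slightly more or less than n."""
    return {group: round(share * n) for group, share in population_shares.items()}

def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until every quota is filled."""
    counts = {group: 0 for group in targets}
    accepted = []
    for person_id, group in respondents:
        if group in counts and counts[group] < targets[group]:
            counts[group] += 1
            accepted.append(person_id)
    return accepted, counts

# Hypothetical age distribution in the target population
shares = {"18-39": 0.4, "40-64": 0.4, "65+": 0.2}
targets = quota_targets(shares, n=10)

# Hypothetical stream of 30 people approached by the interviewer
pattern = ["18-39", "18-39", "40-64", "65+", "18-39", "40-64"]
respondents = [(i, pattern[i % len(pattern)]) for i in range(30)]

accepted, counts = fill_quotas(respondents, targets)
print(targets, counts, accepted)
```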
Describe snowball sampling
Characteristics
• Initial respondents provide names of additional respondents to be included in the sample
• Useful for so-called ‘rare populations’, i.e. target populations that form a very small part of the total population
• Examples: vegan consumers, people with a special hobby
• Cost reduction due to low search effort
• Threat of bias: People are likely to recommend other people who are similar to them
Strengths and weaknesses of nonprobability sampling
What are the pros and cons of probability and non-probability sampling?
Probability sampling
• Sample units are chosen randomly, inclusion of units is objective
• Used for official statistics and if the aim is to have representative results
• Reduced coverage error
• Usually very expensive
Nonprobability sampling
• Sample relies on personal judgment / subjectivity
• Not ‘representative’
• Comparatively low cost
• Often used by market research institutes or for pre-tests
• With very small sample size (< 10 units) often more accurate information than through random sampling
What would a bigger or more diverse sample do, in terms of how the results might look different?
• It would increase the chance of obtaining significant results, e.g. for altruistic values or in the exploratory results: if an effect exists, we would be more likely to detect it.
Convenience sampling: Pros and cons
- Pros: Inexpensive, convenient
- Cons: Subjective, no generalizability
Simple random sampling: Pros and cons
- Pros: Objective measure, generalizable results
- Cons: Expensive, time-consuming
What examples of overcoverage or undercoverage could there be in your study?
- Overcoverage: Wrongful inclusion of units in the sample population, e.g. foreign (non-American) users on MTurk
- Undercoverage: Some people are not included (some states are not represented, income groups)