Unit 3 - sampling Flashcards
What is the population
The entire group of individuals under study, about which we want information
What is a sample
A subset of the population from which data are actually collected
Important characteristics of a sample so that you can generalize and estimate
Sample size
Use of randomization
When is a sampling method considered biased
When it consistently results in samples that differ from the population in some critical way, so they do not represent the population
What is voluntary response bias and where does it happen
In surveys where people choose for themselves whether to respond.
The sample tends to be made up of strongly opinionated people, especially those with negative opinions on the subject.
Convenience survey bias
The sample is made up of individuals who are easy to reach, so results are hard to generalize to the entire population
What is undercoverage bias
Inadequate representation
Groups in the population are left out of the process of choosing the sample
What is response bias
The question itself can lead to misleading answers
People don’t want to be perceived as having unpopular opinions
People don’t want to admit to having committed crimes
What is nonresponse bias
Low response rates: occurs when individuals chosen for the sample can’t be contacted or refuse to participate
What is quota sampling bias
Interviewers are given free choice in picking people and attempt to pick a representative sample without randomization
What is wording bias
When non-neutral or poorly worded questions lead to very unrepresentative responses
OR when the order in which questions are asked influences the responses.
Does increasing sample size reduce bias
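No. A larger sample reduces variability, but a biased sampling method stays biased at any sample size.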
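The answer above can be checked with a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 10,000 in which 50% support a proposal, and a biased method that can only reach a subgroup where 70% support it (undercoverage). The names `reachable`, `unreachable`, and `simulate` are invented for illustration.

```python
import random
import statistics

def simulate(frame, n, reps, rng):
    """Draw `reps` random samples of size n from `frame`; return the sample proportions."""
    return [sum(rng.sample(frame, n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = supports the proposal, 0 = does not.
reachable = [1] * 3500 + [0] * 1500    # 70% support, the only group the method can reach
unreachable = [1] * 1500 + [0] * 3500  # 30% support, left out of the sampling process
population = reachable + unreachable
true_p = sum(population) / len(population)  # 0.5

results = {}
for n in (25, 400):
    estimates = simulate(reachable, n, 500, rng)
    bias = statistics.mean(estimates) - true_p  # how far off the estimates are on average
    spread = statistics.stdev(estimates)        # how much the estimates vary
    results[n] = (bias, spread)
    print(f"n={n}: bias={bias:.3f}, spread={spread:.3f}")
```

The larger sample size should shrink the spread of the estimates, while the estimates keep centering near 0.70 instead of the true 0.50.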
What is a Simple Random Sample
one in which every possible sample of the desired size has an equal chance of being selected
How to construct an SRS
Assign a distinct number to each individual in the population.
Use a computer random number generator to generate distinct numbers in the range.
Link selected numbers with corresponding individuals.
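The three steps above can be sketched in Python. This is a minimal illustration, assuming a made-up roster of names; `simple_random_sample` is a hypothetical helper, and the standard library's `random.sample` stands in for the computer random number generator.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return a simple random sample of size n from a list of individuals."""
    rng = random.Random(seed)  # the seed is only there to make the example repeatable
    # Step 1: assign numbers 0..N-1 to the individuals (their list positions).
    numbers = range(len(population))
    # Step 2: generate n distinct random numbers in that range.
    chosen = rng.sample(numbers, n)
    # Step 3: link the selected numbers back to the corresponding individuals.
    return [population[i] for i in chosen]

# Hypothetical roster, for illustration only.
roster = ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay", "Gus", "Hal"]
sample = simple_random_sample(roster, 3, seed=1)
print(sample)
```

Because `random.sample` draws without replacement, no individual can be picked twice.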
Adv and disadv of SRS
Adv:
Simple - makes it easy to interpret data
Requires minimal knowledge about the population.
Unbiased - accurate
Disadv:
May not be as precise as other methods
Time consuming
Difficult to execute (esp. if the population is large)
Could leave groups out that you want to be represented.
What is cluster random sampling
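Population is divided into clusters, usually naturally occurring groups such as individuals located near one another; ideally each cluster resembles the population as a whole. An SRS of the clusters is taken, and then every individual in the selected clusters is included in the sample.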
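A possible sketch of this procedure in Python, assuming hypothetical homerooms as the clusters; `cluster_sample` and the room names are invented for illustration.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """clusters maps a cluster name to its list of individuals; k clusters are chosen."""
    rng = random.Random(seed)
    # Take an SRS of the cluster names, not of the individuals.
    chosen = rng.sample(sorted(clusters), k)
    # Then include every individual from each selected cluster.
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical clusters: students grouped by homeroom.
homerooms = {
    "Room A": ["a1", "a2", "a3"],
    "Room B": ["b1", "b2"],
    "Room C": ["c1", "c2", "c3", "c4"],
    "Room D": ["d1", "d2"],
}
chosen_rooms, sample = cluster_sample(homerooms, 2, seed=0)
print(chosen_rooms, sample)
```

Note that the final sample size depends on which clusters happen to be chosen, since whole clusters are taken.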
Adv and disadv of cluster
Adv
Unbiased
Easy to perform
Disadv
Can have very high variability, esp. if the individuals within each cluster are homogeneous (alike), which can make the estimates too imprecise to be useful
What is stratified random sampling
Divide the population into strata based on similar characteristics.
Take an SRS within each stratum.
All selected individuals make up one larger sample.
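The steps above might look like this in Python. It is a sketch under assumptions: the strata are hypothetical grade levels, and the number drawn from each stratum (`sizes`) is chosen arbitrarily for the example; `stratified_sample` is not a library function.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """strata maps a stratum name to its members; sizes maps it to how many to draw."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # Take an SRS within each stratum.
        sample.extend(rng.sample(members, sizes[name]))
    # All selected individuals together make up one larger sample.
    return sample

# Hypothetical strata: students grouped by grade level.
grades = {
    "grade 9": [f"g9-{i}" for i in range(40)],
    "grade 10": [f"g10-{i}" for i in range(30)],
    "grade 11": [f"g11-{i}" for i in range(30)],
}
sizes = {"grade 9": 4, "grade 10": 3, "grade 11": 3}
sample = stratified_sample(grades, sizes, seed=2)
print(sample)
```

Unlike an SRS of the whole school, this guarantees that every grade is represented in the sample.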
Adv and disadv of stratified random sampling
Adv
Unbiased
Very precise
Disadv
Can be very hard to execute
What is systematic random sampling
Start at a random point in the population, then sample at a fixed interval
eg: every 5th person, every 20th person, etc.
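As a rough sketch in Python, assuming the population is an ordered list (here a hypothetical line of 100 people numbered 1 to 100) and the random start is chosen among the first k individuals; `systematic_sample` is an invented helper name.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k individuals, then take every k-th one."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position from 0 to k-1
    return population[start::k]

# Hypothetical line of 100 people, numbered 1 to 100.
line = list(range(1, 101))
sample = systematic_sample(line, 5, seed=3)
print(sample)
```

With 100 people and an interval of 5, every possible starting point yields a sample of 20.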
What is bias
What is variability
4 types
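Bias relates to accuracy: are the estimates centered on the true value?
Variability relates to precision: how much do the estimates vary from sample to sample?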
Biased and high var = inaccurate and imprecise
Unbiased and high var = accurate but imprecise
Biased and low var = inaccurate but precise
Unbiased and low var = accurate and precise.
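The four combinations above can be illustrated with made-up numbers. In this sketch the true value is assumed to be 50, each list holds hypothetical estimates from repeated samples under one method, and the cutoffs `bias_tol` and `spread_tol` are arbitrary choices for the example, not standard thresholds.

```python
import statistics

def label(estimates, true_value, bias_tol=1.0, spread_tol=2.0):
    """Classify a set of repeated estimates by bias (accuracy) and variability (precision)."""
    bias = statistics.mean(estimates) - true_value  # center of estimates vs. truth
    spread = statistics.stdev(estimates)            # sample-to-sample variation
    accuracy = "unbiased" if abs(bias) <= bias_tol else "biased"
    precision = "low var" if spread <= spread_tol else "high var"
    return f"{accuracy}, {precision}"

TRUE_VALUE = 50  # assumed true population value

# Hypothetical estimates from four different sampling methods.
methods = {
    "A": [49, 50, 51, 50, 50],  # centered on 50, tightly grouped
    "B": [40, 60, 45, 55, 50],  # centered on 50, widely scattered
    "C": [59, 60, 61, 60, 60],  # centered on 60, tightly grouped
    "D": [50, 70, 55, 65, 60],  # centered on 60, widely scattered
}
labels = {name: label(values, TRUE_VALUE) for name, values in methods.items()}
for name, text in labels.items():
    print(name, "->", text)
```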
How to write about bias (undercoverage, nonresp, etc)
Identify the population and the sample
Explain how the sampled individuals might differ from the general population
Explain how this leads to an overestimate or underestimate.