Lecture 6 & 7: Sampling Flashcards
What is the difference between census and sample?
Census (most reliable way)
-Every individual in a population is evaluated
-EX: Statistics Canada population census
Sample
-Only a subset of individuals in a population are evaluated
-EX 5% of Canadians selected for a study
Why do we take a sample?
-Descriptive study: To describe characteristics of a population (who, what, when, etc.) EX: more than 1 billion adults are overweight and 300 million of them are obese
-Analytical study: To assess specific associations between risk factors (exposures) and disease (or other outcomes), i.e. to compare 2 groups. EX: children with pets vs. children without, to see the difference
Why don’t we just take a census if it’s more accurate?
Keep in mind a census involves every individual in the country, so resources are a huge issue. This is why Statistics Canada only does one every 5 years
-Time
-Expense $$
-Logistics (need a list of everyone, need to get a hold of everyone)
-Need everyone to volunteer/participate
-Data quantity vs data quality
-but poor sampling can result in bias
What are the main stages to sampling?
- Determine WHO/WHAT to sample
- Determine HOW you’re going to choose these subjects
- Determine HOW MANY you’ll need to be confident in your findings
Why is it important to determine WHO you are sampling?
-How you choose your subjects will have an impact on the validity of your results (how accurate the results are for the population)
-If your subjects are not truly representative of the population of interest then your conclusions may be biased
What should you consider when choosing your subjects?
Want to make sure you are getting a good representation of the population of interest
-Establish criteria for participating BEFORE you start sampling
-INCLUSION CRITERIA= characteristics needed to be eligible for the study
-EXCLUSION CRITERIA= characteristics that would exclude or prevent someone from participating
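A minimal sketch of applying inclusion and exclusion criteria before sampling starts. The specific criteria (adult, lives in Canada, already enrolled in another study) and the field names are hypothetical, chosen only for illustration:

```python
# Hypothetical criteria for illustration only; a real study defines its own.
inclusion_criteria = [
    lambda p: p["age"] >= 18,            # must be an adult
    lambda p: p["country"] == "Canada",  # must live in Canada
]
exclusion_criteria = [
    lambda p: p["in_other_study"],       # already enrolled elsewhere
]

def is_eligible(person):
    """Eligible = meets every inclusion criterion and no exclusion criterion."""
    meets_all = all(rule(person) for rule in inclusion_criteria)
    excluded = any(rule(person) for rule in exclusion_criteria)
    return meets_all and not excluded

candidates = [
    {"name": "A", "age": 34, "country": "Canada", "in_other_study": False},
    {"name": "B", "age": 16, "country": "Canada", "in_other_study": False},
    {"name": "C", "age": 52, "country": "Canada", "in_other_study": True},
]
eligible = [p["name"] for p in candidates if is_eligible(p)]
print(eligible)  # ['A']
```

Writing the rules down as fixed lists before looking at any candidate mirrors the point that criteria should be set BEFORE sampling.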
What are the 3 areas/layers of a population when defining your populations?
outermost: Target population ex all Canadians
-Population to which it might be possible to extrapolate results
-May not always be clearly defined in write-ups
Middle: Source population: LIST (subset of target pop)
-Population from which the study subjects are drawn
-We should be able to list all members (sampling units) of this population (=sampling frame)
Innermost: Study Population:
-The individuals included in your study
What are the 2 types of validity?
External validity: How well can the study results be extrapolated to the target population? (from study population to target population)
Internal validity: How well do the study results reflect the source population? (from study population to source population)
What should you consider when determining HOW to sample?
-Your sample strategy will determine the nature of any extrapolations you might make from the sample to the population
-From which groups should you choose subjects?
-How should you sample them?
What are the 3 different sample strategies?
- Non-probability sampling
-Convenience sampling
-Judgment sampling
-Purposive sampling
- Probability sampling
-Simple random sampling
-Systematic random sampling
-Stratified random sampling
- Others (either non- or probability)
-Cluster sampling
-Multi-stage sampling
What are the 3 types of non-probability sampling?
In non-probability sampling, the probability of any individual being selected is unknown
Convenience sampling: sampling units are chosen bc they are easy to get (ex animals in traps, farms close to UoG)
Judgment sampling: the investigator chooses what they deem to be units that are representative of the population (ex a PhD student limited her survey to ppl who were from a farm and knew the concepts she was asking about)
Purposive sampling: Sampling units are chosen on purpose bc of their exposure or disease status (in an analytic study)
What is the problem with convenience sampling?
-Ex in class: estimating how many dogs ppl have by asking only the ppl sitting in the first row
-Problem: not truly representative of the distribution of dog ownership bc service dogs and owners sit closer to the front etc
When do you use non-probability sampling?
Often used in analytical studies
Pros:
-Relatively cheap and easy
-Good for a homogeneous population
Cons:
-Can produce biased results if the subjects you select are not representative of the target population
-Can limit how far you can extrapolate your results
What is probability sampling?
-Uses some form of random selection process
-All individuals in a population have some non-zero probability of being selected for the study AND that probability can be calculated
What is simple random sampling? (the first type of random sampling we talked about)
- Simple random sampling
-A fixed % of the source pop is chosen using a formal random process (flip a coin, random # draw)
-All individuals have an equal chance of being chosen
-If done properly, the sample chosen should be representative of the population under investigation
-You need to know the sampling frame (and therefore the total # of individuals in your population) to use this method, ie you need that list
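A rough sketch of simple random sampling, assuming the sampling frame is available as a Python list. The function name `simple_random_sample`, the 200-person frame, and the 5% fraction are all made up for illustration:

```python
import random

def simple_random_sample(sampling_frame, fraction, seed=None):
    """Draw a fixed fraction of the frame without replacement.

    Every unit in the frame has the same chance of selection (n / N).
    """
    rng = random.Random(seed)  # seed only makes the draw repeatable
    n = round(len(sampling_frame) * fraction)
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: a complete list of the source population.
frame = [f"person_{i}" for i in range(1, 201)]

sample = simple_random_sample(frame, fraction=0.05, seed=1)
print(len(sample))               # 10
print(len(sample) / len(frame))  # 0.05, each person's selection probability
```

Note that the function cannot run without the full list, which is the same requirement as needing to know the sampling frame.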