Feb 24th Flashcards
Sampling error:
Some sort of variation: the degree to which the sample frame fails to account for all of the population (often called sample frame error)
sampling and sample size
The larger the sample, the more accurate the results (around 100 people can already be quite accurate)
The results can differ depending on which sampling technique is used
Sample
A subset of the members of your population → the group of units you select to be in your study
Ask only a subset of the population
Accuracy depends on representativeness and sample size; if you ask only students, the sample is not representative
Practical
sample frame
Listing of the population from which you will draw your sample
A sample frame is a complete, accurate, and up-to-date list of the population from which a sample is drawn
Census:
asking everyone in the population
Accurate
Impossible
Impractical
Sampling process
Define the population of interest
Identify the sampling frame → very difficult, pretty much impossible in practice
Select the sampling method → which sampling technique (random, snowball, etc.)
Determine sample size
Probability
Probability: every member of the population has a known, non-zero probability of being selected (in simple random sampling, the same probability for everyone)
More likely to be representative subset of the population
Allows you to calculate margin of error
More difficult and expensive
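To make the margin-of-error point concrete, here is a minimal Python sketch. It assumes a simple random sample, a yes/no (proportion) question, a 95% confidence level (z = 1.96), and the worst case p = 0.5. The function name margin_of_error and these defaults are illustrative assumptions, not from the lecture.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumptions (not from the notes): simple random sample, large
    population, 95% confidence (z = 1.96), worst case p = 0.5.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)

# Margin of error shrinks as the sample grows.
for n in (100, 400, 1000):
    print(n, round(margin_of_error(n), 3))
```

Under these assumptions, 100 respondents give roughly plus or minus 10 percentage points and 400 give roughly plus or minus 5, so quadrupling the sample only halves the margin.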
Non probability
Chances of selection for the various elements in the population are unknown
Unlikely to be representative subset of the population
Cannot calculate margin of error
Usually chosen due to ease and cost
This is what researchers usually do in practice
Probability sampling: different types
simple random
systematic sampling
stratified
cluster sampling
simple random
Simple random (just randomly select people from the population) → probability of selection is the same for all population members
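A minimal Python sketch of simple random sampling, assuming the population is available as a list of member IDs. The function name simple_random_sample, the seed parameter, and the 1,000-member population are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # seed only makes the draw repeatable
    return rng.sample(population, n)

population = list(range(1, 1001))  # hypothetical member IDs 1..1000
sample = simple_random_sample(population, 50, seed=1)
print(len(sample), len(set(sample)))  # 50 picks, no repeats
```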
systematic sampling
Order the population randomly, then choose every 3rd member, or every 5th member, etc.; it is random as well because you are randomly selecting members
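A minimal Python sketch of systematic sampling as described above: shuffle the list, pick a random starting point, then take every k-th member. The name systematic_sample and the parameter k (the skip interval) are assumptions made for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Take every k-th member after a random start in a randomly ordered list."""
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    ordered = list(population)
    rng.shuffle(ordered)      # random ordering of the population
    start = rng.randrange(k)  # random starting point among the first k members
    return ordered[start::k]

members = list(range(1, 31))  # hypothetical population of 30 members
picked = systematic_sample(members, k=3, seed=7)
print(len(picked))  # every 3rd member of 30 gives 10 people
```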
stratified
Create mutually exclusive and collectively exhaustive subsets, then sample randomly from each. You create segments of consumers where the people within each segment share certain traits, and you select people from within each segment
Creating groups, then randomly selecting from those groups
Divide the Population → Identify groups (strata) based on shared characteristics (e.g., age, income, education).
Ensure Coverage → Make sure all individuals belong to only one group and that all groups together represent the entire population.
Randomly Select Samples → Pick people randomly from each group to ensure fairness.
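The three steps above can be sketched in Python as follows. This is a minimal sketch that assumes each person is a dict with an "age_group" key and that the same number of people is drawn from every stratum. The function name stratified_sample, the data layout, and the equal allocation are all assumptions (proportional allocation is another common choice).

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, n_per_stratum, seed=None):
    """Group people into strata, then draw randomly within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        # Each person lands in exactly one stratum (mutually exclusive),
        # and nobody is left out (collectively exhaustive).
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        take = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, take))
    return sample

groups = ["18-24", "25-44", "45+"]  # hypothetical strata
people = [{"id": i, "age_group": groups[i % 3]} for i in range(90)]
chosen = stratified_sample(people, lambda p: p["age_group"], 5, seed=0)
print(len(chosen))  # 5 people from each of the 3 strata
```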
cluster sampling
Create mutually exclusive and collectively exhaustive subsets, then randomly select subsets. Ex: looking at neighbourhoods or schools and then randomly selecting schools → you create groups where the people within each group are not necessarily similar, and then you randomly select the clusters
creating groups and then randomly selecting whole groups
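A minimal Python sketch of cluster sampling, assuming the clusters are given as a dict from cluster name to member list. The function name cluster_sample and the school names are hypothetical. Unlike stratified sampling, the random draw happens at the group level and every member of a chosen group is kept.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep everyone inside them."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen_names}

schools = {  # hypothetical clusters
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1"],
}
selected = cluster_sample(schools, 2, seed=3)
print(sorted(selected))  # names of the two schools that were drawn
```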
Ex: looking at the McGill population
26765 undergraduate students
Simple random: probability of selection is the same for all population members. Assign a number to each member of your population. Use random (find the rest of sentence)
Systematic: select every xth member of the population list; I can choose the starting point (e.g., start at the 2nd person, then choose every 3rd member after that)
Stratified: randomly select people from each segment; can be better than simple random sampling (we get everyone we want). Ex: randomly selecting people from each neighbourhood
Cluster: looking at locations like neighbourhoods; randomly select neighbourhoods and survey every person in the selected neighbourhoods
non probability sampling
convenience
purposive
quota
snowball
convenience:
Select participants who are easily accessible (family, friends, etc.). The issue is that the sample might not be representative
purposive
Select participants using the researcher's best judgment about representativeness (you decide who can participate and who cannot; there are usually some inclusion criteria → e.g., only want people who were born in MTL, criteria that I think are important for representativeness)
quota
select participants based on variables of interest
Unlike stratified sample:
Selecting from groups is not done randomly
Why would someone engage in quota sampling?
Representativeness → trying to find people who are representative of your population
Trying to find a representative sample for your research question
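A minimal Python sketch of quota sampling, to show how it differs from stratified sampling: people are accepted in the order they are encountered until each group's quota is full, with no random draw. The function name quota_sample, the (name, group) tuples, and the quota numbers are hypothetical.

```python
def quota_sample(people, group_of, quotas):
    """Fill each group's quota in encounter order (no random selection)."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in people:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
        if counts == quotas:  # every quota is filled, stop recruiting
            break
    return sample

# Hypothetical passers-by in the order they are met.
passersby = [("a", "student"), ("b", "student"), ("c", "student"),
             ("d", "staff"), ("e", "staff")]
quota_result = quota_sample(passersby, lambda p: p[1], {"student": 2, "staff": 1})
print(quota_result)
```

Here "c" is turned away because the student quota is already full, and "e" is never reached because all quotas are filled once "d" is accepted.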
snowball
Select initial participants in some manner; ask initial participants for referrals to reach additional participants → used when certain populations are hard to access (ex: senior citizens)
The idea is to access hard-to-reach populations: once you get one hard-to-reach consumer, they can refer other hard-to-reach consumers
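A minimal Python sketch of the referral idea, assuming we already know who would refer whom. The function name snowball_sample, the referrals dict, and the participant names are hypothetical; in a real study the referrals are only discovered as each participant is interviewed.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Recruit the initial participants first, then the people they refer."""
    unique_seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(unique_seeds)
    queue = deque(unique_seeds)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:  # do not recruit the same person twice
                seen.add(referred)
                queue.append(referred)
    return sample

referrals = {"Ana": ["Ben", "Cy"], "Ben": ["Dee"], "Cy": ["Ana"], "Dee": []}
recruited = snowball_sample(["Ana"], referrals, max_size=3)
print(recruited)
```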
determining sample size
Statistical methods→ applies only to probability samples
How to calculate real-world significance → effect size. You need to predict how significant your results would be in the real world. Ex: you have a control and a treatment group; you take the difference between the treatment-group mean and the control-group mean and do the rest of the formula
Predictions can be a problem: maybe you don't know how many people you need
Easier to use rules of thumb
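The notes do not say which effect-size formula is meant. One common choice is Cohen's d: the difference between the two group means divided by the pooled standard deviation. The sketch below assumes that formula and uses made-up scores; the function name cohens_d and the treatment-minus-control sign convention are also assumptions.

```python
import math
import statistics

def cohens_d(treatment, control):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Made-up scores, for illustration only.
treatment_scores = [6, 7, 8, 9, 10]
control_scores = [4, 5, 6, 7, 8]
d = cohens_d(treatment_scores, control_scores)
print(round(d, 2))
```

The predicted effect size is then one of the inputs to a statistical sample-size (power) calculation, which is why a poor prediction makes the required sample size hard to pin down.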
Ad Hoc Methods
Rules of thumb
100 per major subgroup
20-50 per other subgroup
30 for qualitative data
Budget constraints
Comparable studies
Number of subgroups
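The rules of thumb above can be turned into quick arithmetic. This sketch assumes the per-subgroup numbers simply add up, which is one reading of the rule rather than something stated in the notes. The function name rule_of_thumb_sample_size is hypothetical.

```python
def rule_of_thumb_sample_size(major_subgroups, other_subgroups=0):
    """Return a (low, high) total: 100 per major subgroup, 20-50 per other subgroup."""
    base = 100 * major_subgroups
    return base + 20 * other_subgroups, base + 50 * other_subgroups

# Hypothetical study with 2 major subgroups and 3 other subgroups.
low, high = rule_of_thumb_sample_size(2, 3)
print(low, high)
```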