Topic 14: Sampling Flashcards
Learning outcomes:
- Identify the similarities between non-spatial and spatial sampling
- Recognize the different elements considered when selecting the number of samples
- Apply different methods for sampling and recognize their relative merits and drawbacks
Population
The total set of individuals or potential observations in a defined group
- e.g., all the residents in Calgary
Sample
A subset of individuals or observations in the population
- Hopefully the sample represents the population
The Role of Sampling: Sampling helps us answer several difficult questions
- How large should the sample be?
- How/where should the samples be chosen?
- How much reliability will we have in results based on this sample?
(All of these questions arise because we usually can’t conduct a census of the entire population)
Sampling Units
The individual items in a sample, and the basic entity upon which observations are made
- May be discrete entities (e.g., people, households, cities), points, or areas (e.g., quadrats, strips, plots, pixels)
- Must be explicitly defined!
Sampling units must be selected to match the scale of the information desired
- e.g., household income: households
- e.g., personal income: individual people
Steps for Sampling
Step 1: Conceptually define target population and target (
Step 2: Designate sampled population and sampled area from sampling frame
Step 3: Select sampling design
Step 4: Design research and operational plan
Step 5: Conduct pretest
Step 6: Collect sample data
One important question to be addressed in a proper sample design is how large should the sample be to be representative?
- Less certainty with small samples
- More certainty with larger samples
- Larger samples = more cost
What are the two commonly used strategies for sample-size determination?
Rules of thumb and formulas
*Be careful with rules of thumb: you need to know why the people who set them made those decisions
Formulas
The precision of the estimate of a population parameter is a function of the variance of the population, the sample size, and the allowable error
For determining the sample size necessary to estimate the population mean: n = (Z*s/E)^2
n = number of samples
Z = Z-value for the desired level of confidence (e.g., 1.96 for 95%)
s = standard deviation of a pilot sample
E = tolerable error
Tolerable error is inversely related to sample size: a smaller tolerable error requires more samples
Confidence level is directly related to sample size: a higher confidence level requires more samples
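A minimal Python sketch of the formula above, assuming a two-tailed Z value taken from the standard normal distribution. The function name and the pilot standard deviation of 12 are hypothetical, chosen only for illustration. n is rounded up because a fraction of a sample cannot be collected.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(confidence, s, E):
    """Return n = (Z * s / E)^2, rounded up to a whole number of samples."""
    # Two-tailed Z value for the confidence level (0.95 gives about 1.96)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / E) ** 2)


# Hypothetical pilot sample with a standard deviation of 12 units
n_base = sample_size_for_mean(0.95, s=12, E=2)
n_tighter_error = sample_size_for_mean(0.95, s=12, E=1)
n_higher_confidence = sample_size_for_mean(0.99, s=12, E=2)

print(n_base, n_tighter_error, n_higher_confidence)
```

Halving E from 2 to 1 roughly quadruples n, and raising the confidence level from 95% to 99% also raises n.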
Method #2: n = (t^2*CV^2)/(E^2)
n = sample size
t = Student’s t value for the specified probability
CV = coefficient of variation
E = tolerable error, expressed as % of the mean
Student’s t value: threshold used in statistical tests when comparing small numbers of things (small samples)
If you want statistical validity, you need 200+ observations
What is the sample size determination procedure?
- Make a reasonable guess at the value of n
  - How much time do you have? What resources?
  - The guess may come from previous studies
- Look up the critical Student’s t-value
  - Two-tailed probability of obtaining a larger value
- Select a value for E (allowable error)
  - 10-20% is a reasonable place to start
  - How much error will you allow?
- Select a value for CV (coefficient of variation)
  - Needs a prior estimate of variation, e.g., from a preliminary (pilot) sample
  - Most of the CV estimate comes from the pretest
- Calculate n
- Proceed iteratively until n is reasonable
  - Things to change: n and E
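The procedure above can be sketched in Python using method #2. This is only one possible reading of "proceed iteratively", assumed here: recompute t with n - 1 degrees of freedom from the current guess until n stops changing. The small t table (two-tailed, 0.05 probability) holds standard table values. The function names, the starting guess of 10, and CV = 30% are hypothetical.

```python
import math

# Two-tailed critical t values at the 0.05 probability level, keyed by
# degrees of freedom (standard table values; inf is the normal limit).
T_TABLE_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    15: 2.131, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980,
    float("inf"): 1.960,
}


def critical_t(df):
    """Use the largest tabulated df not exceeding `df` (a conservative choice)."""
    usable = [k for k in T_TABLE_05 if k <= df]
    return T_TABLE_05[max(usable)]


def sample_size(cv, e, n_guess=10, max_iter=50):
    """Iterate n = t^2 * CV^2 / E^2; `cv` and `e` are both % of the mean."""
    n = max(n_guess, 2)  # at least 2 samples are needed for 1 degree of freedom
    # max_iter caps the loop in case the guess flips between two values
    for _ in range(max_iter):
        t = critical_t(n - 1)  # assumed: df = n - 1 for the current guess
        new_n = max(math.ceil(t**2 * cv**2 / e**2), 2)
        if new_n == n:
            break
        n = new_n
    return n


# Hypothetical pilot estimate: CV = 30% of the mean
n_strict = sample_size(cv=30, e=10)   # allowable error of 10%
n_relaxed = sample_size(cv=30, e=20)  # allowable error of 20%
print(n_strict, n_relaxed)
```

If the n for E = 10% is more than time and resources allow, relaxing E to 20% gives a much smaller n, which is the kind of trade-off the last step describes.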
Where/how to choose samples?
- Once we know that we need n samples, where or how do we choose them?
- There are many techniques designed to help achieve a sample that is ‘representative’ of the population
- The major issue to avoid is bias
- Under-representing or over-representing elements of the population because of inappropriate sample design
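A small Python sketch of what over- and under-representation look like in numbers. The population, the group labels, and the sample below are all hypothetical, and the helper names are made up for this illustration. The idea is simply to compare each group's share of the sample with its share of the population.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the given list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


def representation_gap(population, sample):
    """Sample share minus population share for each population group.

    Positive means over-represented, negative means under-represented.
    """
    pop = group_shares(population)
    samp = group_shares(sample)
    return {group: samp.get(group, 0.0) - share for group, share in pop.items()}


# Hypothetical population: 60% renters, 40% owners
population = ["renter"] * 60 + ["owner"] * 40
# Hypothetical sample from a design that mostly reaches owners
sample = ["renter"] * 2 + ["owner"] * 8

gaps = representation_gap(population, sample)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap near zero for every group suggests the design is not obviously biased on that attribute. It says nothing about attributes that were not checked.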
Sampling methods/designs:
Non-probability: Judgemental
- Personal judgement
- Personal knowledge or knowledge of other people who have done similar studies
Quota
Based on economics of a sample