SOC200 - Logic of Sampling (Chapter 6) Flashcards
Sample
selection of observations from a pop
Population
all possible data values that could be observed
Sampling
process of selecting observations from target pop
NONPROBABILITY SAMPLING
Selecting observations in a way that doesn’t ensure results are generalizable to target pop
•if members of pop don’t have an equal chance to be selected - can’t be sure that it represents whole pop
•complex social phenomena
NONPROBABILITY SAMPLING
Used in situations where it is virtually impossible/unnecessary to ensure generalizability of results
hard to find people or groups with unique conditions
Homeless; Deviant cases: unusual cases that depart from the norm (gays in the military); Complex social phenomena (initiation into a frat, cult)
APPROACHES to NON-PROBABILITY SAMPLING:
1. Relying on Available Subjects
Sample limited to available subjects.
Good for pretesting questionnaire/providing info for pilot study
APPROACHES to NON-PROBABILITY SAMPLING:
1. Relying on Available Subjects
Undergraduate students common source of data for this sampling approach
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 2. Purposive Sampling
Selecting sample based on own knowledge of pop + purposes of study.
widely used for studying deviant cases to improve understanding of general pattern (Gays in the military, Male midwives)
zeroing in on deviation, selecting samples based on what you want to study
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 3. Snowball Sampling
network based - researcher interviews a few members of target pop + asks to be referred to other members they know
Good for locating members of unique, possibly marginal, pop
•might be easier to do this than having a blanket survey
•Used primarily for exploratory purposes
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 4. Quota Sampling
methodical - knowing the demographic breakdown of target pop (52% female)
select people fitting this combo of characteristics (quota frame)
•stratify pop based on demographic breakdown
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 4. Quota Sampling
E.g. If you decide to interview 100 people in Toronto, then you should have 52 females in your sample
•problem when demographic breakdown is not accurate (not up to date)
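The quota arithmetic in the Toronto example can be sketched as below. This is only an illustration: the function name `quota_counts` is hypothetical, and the 48% male share is an assumption added so the two cells sum to 100%.

```python
def quota_counts(sample_size, proportions):
    """Turn population shares into the number of people to interview per cell."""
    return {cell: round(sample_size * share) for cell, share in proportions.items()}

# 52% female comes from the card; 48% male is an assumed remainder.
quotas = quota_counts(100, {"female": 0.52, "male": 0.48})
print(quotas)
```

If the demographic breakdown fed into the function is out of date, the quotas will be wrong no matter how carefully they are filled.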
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 4. Quota Sampling
Good when time + sampling budgets are limited or a high level of accuracy is not needed
•Selection of sample elements within given cell may be biased even though proportion of population is accurately estimated
FIVE MAIN APPROACHES to NON-PROBABILITY SAMPLING: 5. Informers
Collaborating with a member “inside” the group you want to study
Beware of reliability + validity of the info: can’t randomly select, often don’t have a choice, only someone who is willing to give you that info
•there might be something about that informer that makes them unreliable - filter everything they tell you: often marginalized by group, biased view, maybe dishonest, censored view
Major Limitation of Non-Probability Samples
- assuming relationship beyond what sample can support
- not necessarily giving everyone a chance to get selected
- while some may be necessary, reliability + generalizability issues - can’t claim they represent characteristics of pop
- exploratory
- purposive + snowball would be valid because they focus on target pop
PROBABILITY SAMPLING
•Aim: provide reliable + valid description of pop
equal chance – better probability of generalizable results
•by collecting observations with a method that ensures sample data have the same variations + characteristics as pop data
PROBABILITY SAMPLING
- EPSEM (equal probability of selection method): based on probability theory - every member of pop has an equal chance of being selected
- minimizes sampling bias, ensuring variations in sample reflect variations in pop within a narrow margin of error
Benefits of Randomly Selecting Observations
- minimizes sampling bias
•If you don’t return ball, then you are changing probability of being selected
•Constant 1/10 chance if you put it back
•Anything that increases/decreases chance of being selected – bias
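A small sketch of the ball example, assuming a hypothetical bag of 10 balls. It compares the chance that any one ball still in the bag is picked on the next draw, with and without putting drawn balls back. The function name is illustrative only.

```python
from fractions import Fraction

def chance_on_next_draw(n_balls, draws_so_far, replace):
    """Chance that a given ball still in the bag is picked on the next draw."""
    remaining = n_balls if replace else n_balls - draws_so_far
    return Fraction(1, remaining)

# Chances on the 1st, 2nd and 3rd draw from a bag of 10 balls.
with_replacement = [chance_on_next_draw(10, k, True) for k in range(3)]
without_replacement = [chance_on_next_draw(10, k, False) for k in range(3)]
print(with_replacement)     # stays at 1/10
print(without_replacement)  # 1/10, then 1/9, then 1/8
```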
Benefits of Randomly Selecting Observations
probability theory to estimate:
a) characteristics of pop based on much smaller sample
b) how closely characteristics in random sample represent the pop characteristics
Estimating Sample Representativeness using Probability Theory
•using observations in sample to estimate summary value of characteristic in whole population (the parameter)
Estimating Sample Representativeness using Probability Theory
•sample more representative if summary of values of characteristic in sample is closer to summary value of characteristic in pop
Estimating Sample Representativeness using Probability Theory
Increasing Samples
•With samples of 2, more combinations of elements produce estimates close to the true mean
•Most frequent estimate would be the true mean
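The point about samples of 2 can be checked by listing every possible sample. The sketch below assumes a hypothetical pop of ten values, 0 through 9; the card does not give the actual values used in the course.

```python
from collections import Counter
from itertools import combinations

population = list(range(10))                      # hypothetical values 0..9
true_mean = sum(population) / len(population)     # the parameter

# Mean of every possible sample of 2 distinct elements (45 samples).
sample_means = [sum(pair) / 2 for pair in combinations(population, 2)]

most_common_estimate, frequency = Counter(sample_means).most_common(1)[0]
mean_of_estimates = sum(sample_means) / len(sample_means)

print(true_mean, most_common_estimate, frequency, mean_of_estimates)
```

In this toy pop, the most frequent sample mean and the average of all sample means both equal the true mean, while extreme estimates are produced by very few samples.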
Estimating Sample Representativeness via Sampling Error
- Problem: don’t know exact pop value of characteristic
- Probability Theory: as more and/or larger samples are obtained from the pop, sample statistics begin to cluster around the unknown population parameter in a predictable way (The Central Limit Theorem)
Estimating Sample Representativeness via Sampling Error
- allows researchers to estimate how close sample value is to pop value through concept of Standard Errors
- Standard Error: how closely sample estimates cluster around the pop parameter
STANDARD ERROR INCREMENTS
certain proportions of sample estimates will fall within certain distance from pop parameter
•34% will fall within 1 standard error above the parameter + 34% within 1 below
•68% of sample estimates will fall within 1 standard error of the parameter
•2 standard errors – 47.5% on each side
•95% of sample estimates will fall within 2 standard errors
•99.7% will fall within 3 standard errors
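These proportions come from the normal curve and can be computed directly. A minimal sketch, assuming the sampling distribution is normal; `share_within` is a hypothetical helper name.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard errors of its centre."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The exact shares are roughly 68.3%, 95.4% and 99.7%, so the round figures of 68% and 95% on the card are close approximations.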
STANDARD ERROR
- Natural tendency that most cluster around central value – normal distribution
- standard error distance will change based on value of mean + range of data, but % will stay the same
STANDARD ERROR SIZE
size of the SE changes with:
a) estimated pop parameter
b) sample size
•as sample size increases, standard error size decreases
•understanding parameter + how much precision you want, then can calculate how big sample should be
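One common way to make this concrete is the standard error of a proportion, SE = sqrt(p(1 - p) / n). The card does not state a formula, so treat this as an assumed illustration; the function names are hypothetical.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p with sample size n."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_for(p, target_se):
    """Smallest n giving a standard error no larger than target_se."""
    # Small tolerance so floating-point noise does not push the result up by one.
    return math.ceil(p * (1 - p) / target_se ** 2 - 1e-9)

print(standard_error(0.5, 100))    # about 0.05
print(standard_error(0.5, 400))    # about 0.025: 4x the sample halves the SE
print(sample_size_for(0.5, 0.025))
```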
Confidence Levels and Confidence Intervals
- inference about how confident researchers are that sample statistics fall within a specified interval (a number of standard errors) from the parameter
- diff way of understanding standard error
- based on probability theory – as long as sample is random + representative – you can say with 68% confidence that the sample statistic falls within 1 standard error of the parameter (95% within 2)
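A confidence interval is the estimate plus or minus a chosen number of standard errors. The sketch below assumes a sample proportion and the usual proportion formula for the standard error; the numbers (50% from a sample of 400) are made up for illustration.

```python
import math

def confidence_interval(p_hat, n, n_standard_errors=2):
    """Interval of +/- n_standard_errors around a sample proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = n_standard_errors * se
    return p_hat - margin, p_hat + margin

# 2 standard errors corresponds to roughly the 95% confidence level.
low, high = confidence_interval(0.5, 400)
print(low, high)
```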
Real World of Sampling: sampling frame
list of units representing all potential observations that could be selected from pop
▫Lists of members or employees in an organization, Phone books
Probability Sampling Methods 1. Simple Random Sampling
seldom used in practice
- Assign diff number to each element in sampling frame
- After deciding on sample size, select the units randomly using table of random numbers/random number generator
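A minimal sketch of these two steps, using Python's random number generator in place of a table of random numbers. The frame of 100 numbered units and the sample size of 10 are hypothetical.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size distinct units from the frame, each equally likely."""
    rng = random.Random(seed)  # seed only makes the draw repeatable
    return rng.sample(frame, sample_size)

frame = [f"unit_{i}" for i in range(1, 101)]  # each element gets its own number
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```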
Probability Sampling Methods 2. Systematic Sampling
Every nth element chosen from frame, but with random start
1.Calculate sampling interval: pop. size/sample size
2.Select random start using number table
3.Select every nth unit based on calculated sampling interval
•if list is ordered in a cycle that matches the sampling interval (periodicity) – biased sample
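The three steps can be sketched as follows, assuming the pop size divides evenly by the sample size. The frame of 100 numbered units is hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Every nth unit from the frame, starting at a random point."""
    interval = len(frame) // sample_size              # 1. sampling interval
    start = random.Random(seed).randrange(interval)   # 2. random start
    return frame[start::interval][:sample_size]       # 3. every nth unit

frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=1)
print(sample)
```

If the frame repeats in a pattern with the same length as the interval, every selected unit lands on the same position in that pattern, which is the source of the bias noted above.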
Probability Sampling Methods 3. Stratified Sampling
Significantly reduces sampling error
a) grouping population on key variables
b) randomly sampling from each group
c) ensuring random sample of each group proportional to size of group
•Grouping on key variables makes units more homogeneous, sampling error lower
Probability Sampling Methods 3. Stratified Sampling
-choosing sample proportionate to diff parts of pop
•make sure sample matches pop split (e.g., 30% male, 70% female)
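Proportionate stratified sampling can be sketched as a random draw inside each group, sized by the group's share of the pop. The strata below (30 units in one group, 70 in the other) are a hypothetical example.

```python
import random

def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the overall total differ slightly from sample_size.
        n = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, n)
    return sample

strata = {
    "male": [f"m{i}" for i in range(30)],
    "female": [f"f{i}" for i in range(70)],
}
sample = proportionate_stratified_sample(strata, 10, seed=7)
print({name: len(units) for name, units in sample.items()})
```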
Probability Sampling Methods 4. Cluster Sampling
Preferable when entire list of pop is impractical/impossible to compile, but is already grouped into subpopulations (students in schools; officers in police stations; individuals on city blocks)
Involves random sampling at each cluster
Probability Sampling Methods 4. Cluster Sampling
must sample in manner that makes the cases selected in each cluster proportionate to total number of cases found in cluster
stratifying sample by each level
natural groups sampled initially, members of each selected group being subsampled afterward
Probability proportionate to size (PPS) sampling
•Type of multistage cluster sample in which clusters selected, not with equal probabilities, but with probabilities proportionate to their sizes – as measured by number of units to be subsampled
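A rough sketch of why PPS is paired with a fixed-size subsample from each selected cluster: a unit's overall chance is (chance its cluster is selected) x (chance it is picked within the cluster), and the cluster size cancels out. The cluster sizes and counts below are hypothetical, and the first-stage chance is approximated as clusters drawn x cluster size / total units.

```python
from fractions import Fraction

def overall_selection_probability(cluster_size, total_units,
                                  clusters_drawn, units_per_cluster):
    """Chance that one unit ends up in a two-stage PPS sample."""
    p_cluster = Fraction(clusters_drawn * cluster_size, total_units)
    p_within = Fraction(units_per_cluster, cluster_size)
    return p_cluster * p_within

# Pop of 2000 units; draw 2 clusters, then 10 units in each selected cluster.
small = overall_selection_probability(100, 2000, 2, 10)
large = overall_selection_probability(400, 2000, 2, 10)
print(small, large)
```

Units in the small and the large cluster end up with the same overall chance, which is what keeps the final sample EPSEM.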
Weighting
units selected with unequal probabilities are assigned weights in such a manner as to make the sample representative of the population from which it was selected
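One common weighting scheme, assumed here for illustration, gives each unit a weight equal to 1 divided by its probability of selection. The values and probabilities below are made up.

```python
def weighted_mean(values, selection_probs):
    """Mean where each unit counts for 1 / (its probability of selection)."""
    weights = [1 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

values = [1, 1, 0]                   # e.g. yes/no answers coded 1/0
selection_probs = [0.5, 0.5, 0.25]   # third unit was half as likely to be picked

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, selection_probs)
print(unweighted, weighted)
```

The under-sampled unit counts double, which pulls the estimate from about 0.67 down to 0.5.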