sampling and sampling distributions Flashcards
Step 1 in Sampling Plan: What is a sampling frame?
frames are data sources eg. population lists, directories or maps
Samples are selected from frames
What is the second step in a sampling plan?
Determining whether a nonprobability or probability sampling should be used
What is step 3 of sampling plan?
Choose type of sampling plan by first determining what sampling technique is to be used (non-probability or probability)
describe non-probability sampling
select items and individuals without knowing probabilities of selection
eg. convenience, judgmental sampling
Why may a sample be preferred over census?
Smaller budget
Shorter time to gather
Sample can be representative of the population at large
less troublesome and more feasible
Lower cost of sampling errors
High cost of non sampling errors
Nature of measurement is destructive
Provides attention to individual cases
what is a census?
A study of every element in the population (a complete enumeration)
Advantages of non-probability sampling
Convenience, speed, low cost
What are disadvantages of nonprobability sampling?
selection bias
cannot conduct statistical inference as a result
What are conditions favouring use of non probability sampling?
Nature of research is inconclusive and exploratory
Homogeneous (low variability in population)
Operational conditions
What is probability sampling?
select items based on known probabilities of selection
What is the advantage of probability sampling?
- It is conclusive
- Make statistical inferences
- Inferences can be made about the population of interest
- They should be used whenever possible
What are conditions not favouring use of non probability sampling?
statistical conditions
sampling errors are larger than non sampling errors
What are conditions favouring use of probability sampling?
Nature of research is conclusive
Heterogeneous which means there is high variability in population
Statistical conditions
What are conditions that do not favour use of probability sampling?
Non sampling errors are larger than sampling errors
Operational conditions
Step 3 Sampling Plan: What are examples of non-probability sampling?
Convenience sampling
Judgemental sampling
Snowball sampling
Quota sampling
Sampling techniques
Non-probability sampling technique: Convenience sampling
Examples
right place right time: mall intercept, students/church group members, tear-out questionnaires in magazines; taking 20 products off production line
Sampling techniques non probability sampling
Convenience sampling:
Cons
selection bias (self-selection bias when volunteer to participate)
cannot conduct statistical inference
Sample not representative of population
Sampling techniques non probability sampling
Convenience sampling:
advantages
cheap and convenient, fast
Sampling techniques non probability sampling
Convenience sampling:
Define
It is a non-probability sampling method where items/people are chosen because they are accessible and convenient, eg. people volunteering, asking people on the street, online questionnaires, volunteer calls
Sampling techniques non probability sampling
Judgmental sampling
Pros
low cost
convenient
not time consuming
Sampling techniques non probability sampling
Convenience sampling:
Pros
Time and cost efficient, easy to conduct
Sampling techniques non probability sampling
Judgmental sampling
cons
The disadvantage is that because they are preselected there is selection bias, which means their opinions, or results cannot be used to generalize to the wider population.
Sampling techniques non probability sampling
Judgmental sampling
advantages
useful when only a limited number of people have the expertise required to be in the sample
Sampling techniques non probability sampling
Judgmental sampling
Define
Preselected experts who are relevant to the subject matter, and who are assumed to be representative of the population of interest, are chosen by the researcher.
**Sample is based on the researcher's judgment of who would be best for the sample.**
- Form of convenience sampling
Step 3 Sampling Plan: what are examples of probability sampling?
simple random sampling
systematic sampling
cluster sampling
stratified sampling
Non-probability sampling
Snowball Sampling
Pro
- best when sample members need to meet a certain criterion
- Find one person who qualifies to participate, ask him or her to recommend several other people who have the knowledge/traits you are looking for, and participant list can grow from there.
Non-probability sampling
Snowball Samples
Con
Takes a lot of time. Need to wait for initial people to complete survey then wait for them to recommend and for the new recommended sample members to complete survey
Non-probability sampling
Snowball Samples
Define
preselected sample members asked to recommend or recruit new sample members they believe belong to target population of interest
In turn, new sample members are added by referral and the ‘snowball’ grows. This can be thought of as a type of convenience sampling because it is easier for existing sample members to recommend potential sample members.
Probability sampling
Simple random sampling
Cons
- when samples are spread over a large geographic area, can be expensive and time consuming
- Lower precision and larger standard errors than other probability sampling methods
- Cannot ascertain representativeness, especially if sample is small
- Not often used in market research
Probability sampling
Simple random sampling
Pros
- easy to understand
- Fast
- sample results are reflective of target population; most inference approaches assume SRS has been used
- Unbiased and representative
- When a good sampling frame exists (List of elements (people/ items) of population of interest)
- Practical in concentrated geographic regions
Probability sampling
Simple random sampling
Define
Each item/individual from population has a known and equal chance of selection.
Give each item/individual in popn a unique ID number; then generate random numbers to determine which elements to include in sample; equivalent to drawing names out of a jar
Forms the basis for other techniques. On the first selection, the probability that a particular member of the population is chosen is 1/N.
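The ID-number procedure above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample`, the frame of 800 ID numbers and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame, each equally likely to be chosen."""
    rng = random.Random(seed)    # seeded generator so the draw can be repeated
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sampling frame: unique ID numbers 1..800
frame = list(range(1, 801))
sample = simple_random_sample(frame, 40, seed=1)
print(len(sample), len(set(sample)))  # 40 distinct IDs
```

`random.sample` draws without replacement, which matches drawing names out of a jar.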
Probability sampling techniques
Systematic Sampling
what does this sampling method assume? what are the implications?
Assumes some sort of ordering of popn elements – ordering can be unrelated to characteristic of interest (names in telephone book) or directly related (outstanding balance on credit card);
- if unrelated, systematic very similar result to SRS;
- if related, increases “representativeness”
Probability sampling techniques
Systematic Sampling
compare with simple random sampling
Similar to SRS: each popn element has a known and equal prob of selection
What is systematic sampling?
Group the N items in the frame into n groups of i items. Choose a random starting item from the first group, then pick every ith item in succession from the sampling frame
N= population size; n= sample size
i= N/n
i is rounded to the nearest integer
Taking a systematic sample from N=800 and n=40
i= 800/40 = 20
That means each group contains 20 employees. Select a random number between 001 and 020 as the starting point, eg. 008, then select every 20th individual after that first selection (008, 028, 048, ..., 788)
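The N=800, n=40 example can be sketched in Python as follows. The function name `systematic_sample` and the `start` parameter are assumptions made for illustration, and the sketch assumes n is no larger than N.

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Pick every i-th item from the frame, where i = N / n rounded to an integer."""
    N = len(frame)
    i = round(N / n)  # sampling interval (Python rounds exact halves to the even integer)
    if start is None:
        # Random 1-based starting position within the first group of i items
        start = random.Random(seed).randrange(1, i + 1)
    # Take the starting item, then every i-th item after it, capped at n items
    return frame[start - 1::i][:n]

# Hypothetical frame of 800 employee ID numbers
employees = list(range(1, 801))
sample = systematic_sample(employees, 40, start=8)
print(sample[:3], sample[-1], len(sample))  # [8, 28, 48] 788 40
```

When N/n is not a whole number, rounding the interval can yield slightly fewer than n items, so the result is worth checking in that case.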
disadvantages of systematic sampling
This requires a larger sample size in order to separate the frame into groups
there may be extreme selection bias in cases where there is a pattern in the frame (Certain items/ individuals may be chosen more or less often)
Advantages of systematic sampling
- It is faster and easier than using a simple random sample
- Good approximation of random sample
- if ordering related to characteristic of interest, lowers sampling error
probability sampling
stratified sampling
The strata are?
mutually exclusive, collectively exhaustive;
use stratification variables;
want homogeneous within stratum
heterogeneous across strata;
stratification variables should be as closely related to characteristic of interest as possible; often use demographic characteristics
probability sampling
stratified sampling
number of strata?
# of strata is judgement call; experience suggests no more than 6.
probability sampling
stratified sampling
objective
– increase precision without increasing cost
probability sampling
stratified sampling
what does it mean for the sample from each stratum to be proportionate?
Proportionate: size of sample from each stratum is proportionate to size of stratum in population
probability sampling
stratified sampling
- Subdivide the population into strata
(subpopulations that are defined by a common characteristic eg. Gender, school year)
- A simple random sample is selected from each stratum, and the results of the separate simple random samples are combined
- May be proportionate
Eg. 25% of overall sample from first stratum and 75% of overall sample from second stratum
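Proportionate allocation can be sketched in Python. This is a rough illustration: the function name, the two hypothetical strata (250 and 750 elements) and the overall sample size of 40 are assumptions chosen to reproduce the 25%/75% split.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """strata maps a stratum name to the list of its elements.

    Each stratum contributes a simple random sample whose size is
    proportionate to the stratum's share of the population.
    """
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n_h = round(n * len(members) / N)         # stratum sample size
        sample[name] = rng.sample(members, n_h)   # SRS within the stratum
    return sample

# Hypothetical population: first stratum is 25% of it, second is 75%
strata = {"first": list(range(0, 250)), "second": list(range(250, 1000))}
sample = proportionate_stratified_sample(strata, 40, seed=1)
print({name: len(items) for name, items in sample.items()})  # {'first': 10, 'second': 30}
```

Rounding each stratum size separately can make the total differ slightly from n when the shares do not divide evenly.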
disadvantages of stratified sampling
Difficult to select relevant stratification variables, reliant on sampling frame to contain comprehensive information about population
Not feasible to stratify on many variables
Expensive
advantages of stratified sampling
Homogeneity of items in each stratum gives more certainty in population parameter estimations
includes all important sub-populations
- since a random sample is taken from each stratum, more likely to be representative of population
advantages of cluster sampling
- cheaper than simple random sampling especially when population is spread out in large geographic area
- cheaper than stratified sampling
- easy to implement (often population is arranged as clusters)
disadvantages of cluster sampling
requires larger sample size to achieve same precision from simple random sampling and stratified sampling otherwise it would be imprecise
if clusters are different to each other (more objects in one cluster), then it can lead to bias and not be representative of population of interest
Examples of cluster sampling
Common example – area sampling (county, block)
what is a cluster sample?
1. Population is divided into clusters, which often occur naturally eg. Countries, city blocks, sale territories
2. Take SRS of one or more clusters (in contrast with stratified, where you select a sample from each stratum)
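The two steps can be sketched in Python as a one-stage cluster sample, in which every element of each selected cluster is included. The function name `cluster_sample` and the city-block data are hypothetical.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """clusters maps a cluster name to the list of its elements.

    Takes a simple random sample of k clusters and keeps every
    element inside each chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)  # SRS of cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: households grouped by city block
blocks = {
    "block A": ["a1", "a2", "a3"],
    "block B": ["b1", "b2"],
    "block C": ["c1", "c2", "c3", "c4"],
    "block D": ["d1", "d2", "d3"],
}
sample = cluster_sample(blocks, 2, seed=1)
print(sorted(sample))  # names of the two selected blocks
```

Unlike the stratified approach, the blocks that are not chosen contribute nothing to the sample.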
Non-probability sampling
What are the two stages of quota sampling?
Stage 1: setting quotas. Quotas correspond to the demographics of the population and can be set for several demographic categories eg. age, race, income level. The researcher then looks for people who fit them. The more categories the researcher specifies quotas for, the more complex, time consuming and costly the process becomes
Stage 2: select sample elements by convenience/judgment; once quotas assigned, freedom of choice of elements as long as they fit control characteristics
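The two stages can be sketched in Python. This is an illustration under stated assumptions: the function name `quota_sample`, the age-group quotas and the list of passers-by are hypothetical, and "convenience" is modelled simply as taking people in the order they are encountered.

```python
def quota_sample(respondents, quotas, key):
    """Fill each quota with the first people encountered who fit it.

    respondents: people in the order they are encountered (convenience order)
    quotas: maps a control category to the number of people wanted (stage 1)
    key: function returning the control category of a person
    """
    remaining = dict(quotas)
    sample = []
    for person in respondents:          # stage 2: convenience selection
        group = key(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # every quota is filled
            break
    return sample

# Hypothetical passers-by as (name, age group) pairs
passers_by = [("Ann", "18-34"), ("Bob", "18-34"), ("Cy", "18-34"),
              ("Dee", "35+"), ("Eli", "35+")]
sample = quota_sample(passers_by, {"18-34": 2, "35+": 1}, key=lambda p: p[1])
print(sample)  # [('Ann', '18-34'), ('Bob', '18-34'), ('Dee', '35+')]
```

Because selection within each quota is not random, the selection bias listed among the disadvantages still applies.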
disadvantages of quota sampling
subjective views/ selection bias
sample error unable to be assessed
Cannot ensure how representative it is of population of interest
advantages of quota sampling
cost effective
no sampling frame required
Sample can be controlled for certain characteristics
What is a sampling error
variation (chance differences) between samples due to sampling selection
the margin of error is a measure of the sampling error
What is a way to reduce sampling error?
Having larger samples.
What are non sampling errors?
Coverage error, non response error, measurement error
Cannot be reduced by larger samples
Types of Error
What is coverage error?
An error when the frame excludes certain items so that they have no chance of being selected in the sample.
Leads to selection bias where random sample provides estimates of characteristics of frame rather than population
How can coverage error be reduced?
Adequate sampling frame i.e up to date list of all items to be included in the sample
What is non response error?
when sample members do not respond.
Upper and lower class respond less frequently to surveys than those in middle class
Cannot assume non responders will respond the exact same as responders
What does non-response error cause?
When there is non-response error, there is non-response bias because it cannot be assumed non-responders would respond the same
Types of Error
How to reduce occurrence of sampling errors?
Taking larger samples. However, this can have a greater cost
Types of Errors
What are three types of measurement errors?
Ambiguous wording of question
Hawthorne Effect
Respondent Error
Types of Errors
Sources of Measurement Error
How can ambiguous wording of questions be a measurement error?
Can cause misunderstanding and incorrect responses may be provided leading to poor inferences from results
Types of Errors
Sources of Measurement Error
What is the hawthorne effect?
Respondents give answers that they believe are socially desirable or expected by the survey, because they know that they are part of the survey
How to reduce Hawthorne Effect?
Train interviewers
Types of Errors
Sources of Measurement Error
What is respondent error?
When respondents give unusual or odd responses
Types of Errors
Sources of Measurement Error
What are ways to minimize respondent errors?
- Screening responses and contacting respondents who provided unusual responses
- Re-contacting randomly chosen respondents to ascertain reliability of responses
How can a coverage error be an ethical issue?
Purposely excluding certain groups/individuals from the frame so that survey results are more favourable
How can Mortality rate/ non response error be an ethical issue?
Survey designed in a way that certain individuals/ groups are less likely to respond than others i.e an implicit way of exclusion
How can sampling error be an ethical issue?
Findings shown without reference to survey size and margin of error
the sponsor promotes from their perspective
How can a measurement error become an ethical issue?
Interviewer intentionally creates Hawthorne effect or gives hints to respondent on the way they should respond
interviewer creates questions in a way that encourages favourable response
Respondent purposely provides false responses
Why may a sampling plan be required?
To ensure sample is representative of full population
What is a way to ensure through the sampling plan, the sample is representative of full population?
Avoiding self-selected samples as those participants may show more interest than participants randomly selected from the population at large. This can cause sample bias
What does each column in minitab represent?
Each column represents a variable
how do you handle non response errors?
persuade individuals to complete survey.
Follow up on non responders after a period of time. Follow up responses are compared with initial responses for the purposes of analysing and interpreting survey responses
what are measurement errors?
errors that arise as a result of the survey design
what is the purpose of statistical inference?
to make conclusions about population from findings conducted on sample
what is used to estimate the population mean?
the sample mean
what is the sampling distribution of the mean?
distribution of all possible sample means if you select all possible samples of a given size
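what is the population mean?
$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$, the sum of all N values in the population divided by the population size N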
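what is the population standard deviation?
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$, where $\mu$ is the population mean and N is the population size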
what is the standard error of the mean?
how sample means vary from sample to sample
when sample size increases, standard error of the mean decreases
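sample means are less variable than individual values in the population (a sample mean is closer to the population mean than an individual value, as the averaging process dilutes the importance of any individual value)
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and n is the sample size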
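describe how sampling from normally distributed populations will affect the sample mean.
the sample mean is an unbiased estimate of the population mean, and the sampling distribution of the mean is itself normal and centred on the population mean for any sample size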
how do larger samples affect the standard error of the mean?
reduce the standard error of the mean. that is, there is less variability in the sample means from sample to sample
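A short Python sketch shows the effect, using the standard formula for the standard error of the mean (population standard deviation divided by the square root of the sample size). The population standard deviation of 10 and the sample sizes are assumed values chosen for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Assumed population standard deviation of 10; quadruple n at each step
for n in (4, 16, 64, 256):
    print(n, standard_error(10, n))  # 5.0, 2.5, 1.25, 0.625
```

Each fourfold increase in sample size only halves the standard error, which is why extra precision becomes progressively more expensive.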
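z for the sampling distribution of the mean
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, the difference between the sample mean and the population mean divided by the standard error of the mean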
what is the central limit theorem?
as the sample size gets large enough (at least 30), **the distribution of the sample mean becomes approximately normal**
if the population is normally distributed, the sampling distribution of the mean will be normally distributed regardless of sample size
for most population distributions, regardless of shape, the sampling distribution of the mean will be approximately normally distributed if the sample size is at least 30
if the population distribution is fairly symmetrical, the sampling distribution of the mean will be approximately normal for samples as small as 5
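The theorem can be explored with a small simulation in Python. This is a sketch under assumptions: the right-skewed exponential population (mean 1, standard deviation 1), the 5000 repetitions and the function names are all chosen for illustration.

```python
import math
import random
import statistics

def draw_skewed(rng):
    """One value from a right-skewed population (exponential, mean 1, sd 1)."""
    return rng.expovariate(1.0)

def sample_means(draw, n, reps, seed=0):
    """Means of `reps` independent samples of size n."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

means = sample_means(draw_skewed, n=30, reps=5000)
print(statistics.fmean(means))  # close to the population mean of 1
print(statistics.stdev(means))  # close to 1 / sqrt(30)
print(1 / math.sqrt(30))        # theoretical standard error of the mean
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.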
what is the size of the sampling error based on?
sampling error is the variation that occurs due to selecting single sample from population
the amount of variation in the population
sample size
what does sampling distribution of the proportion show?
the proportion of items belonging to one of the categories
eg. proportion of customers that prefer your brand
what is the proportion of items in the sample with the characteristic of interest represented by?
p
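For reference, the sample proportion is usually written as below. The symbols X (number of items in the sample with the characteristic of interest) and n (sample size) are assumed notation, since the card itself only gives p.

```latex
p = \frac{X}{n}
  = \frac{\text{number of items in the sample with the characteristic of interest}}{\text{sample size}}
```

For example, if 30 of 120 sampled customers prefer your brand, then p = 30/120 = 0.25.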
what does selecting sample depend on?
nature of population
resources available: time and cost
what does a sample need to be?
unbiased and representative.