Sampling Flashcards
Secondary data
Data that already exists. Statistics, registers and databases produced and maintained by public organisations and authorities are available for research.
Primary data
Data collected by the researcher, usually through surveys, interviews or observations
Survey research
Self-administered questionnaires
Interviews (structured)
The respondent and the interviewer talk face-to-face, by telephone or online
Observation
Records behaviour as it occurs by monitoring people, actions or situations
Experimentation
The researcher manipulates selected independent variables and measures the effects of these manipulations on the dependent variables
Sampling methods
Probability and non-probability
Probability samples
are samples in which the elements being included have a known chance of being selected
- Simple random sampling
- Systematic random sampling
- Stratified sampling
- Cluster sampling
- Sequence sampling
Non-probability samples
are ones in which participants are selected in a purposeful way
- Judgement sampling
- Quota sampling
Simple random sampling
Every element of the population has the same probability of being selected for the sample
- The elements of the whole population are numbered and then selected by using random numbers
- Suitable for homogeneous populations
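The numbering-and-drawing procedure above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample`, the seed and the sizes (N = 200, n = 10) are assumptions made for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Number the elements 1..N and draw n of them without replacement."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    element_numbers = range(1, population_size + 1)
    # rng.sample gives every element the same chance of being selected
    return sorted(rng.sample(element_numbers, sample_size))

sample = simple_random_sample(population_size=200, sample_size=10, seed=1)
print(sample)
```

Each printed number identifies one selected element of the numbered population.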
N
Population size
n
Sample size
Systematic random sampling
Sampling frame: a list of the population
- The sampling units are chosen from the sampling frame at a uniform interval
- Sampling interval k = N/n (N = size of the population, n = sample size)
- The starting point is selected at random from the first interval, and every kth element after it is selected
- For example: N = 200, n = 10 → k = 200/10 = 20
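The steps above can be sketched in Python using the card's own numbers (N = 200, n = 10, k = 20). The function name `systematic_sample` and the seed are assumptions for illustration, and the sketch assumes that N is an exact multiple of n.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start in the first interval, then every kth element."""
    rng = random.Random(seed)
    k = population_size // sample_size   # sampling interval k = N/n
    start = rng.randint(1, k)            # random start within 1..k
    return [start + i * k for i in range(sample_size)]

selected = systematic_sample(population_size=200, sample_size=10, seed=1)
print(selected)
```

If the random start happened to be 7, for instance, the selected elements would be 7, 27, 47 and so on up to 187.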
Stratified sampling
The population is divided into mutually exclusive strata/groups (based on nationality, profession, gender, ...)
- Each element can be included in only one stratum
- A sample is drawn randomly from each stratum/group
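A minimal Python sketch of the idea: group the elements by stratum, then draw a random sample inside each group. The names `stratified_sample`, `stratum_of` and `per_stratum`, as well as the made-up list of people, are assumptions for illustration only.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, per_stratum, seed=None):
    """Split elements into strata and draw a random sample from each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        # each element lands in exactly one stratum
        strata[stratum_of(element)].append(element)
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: 10 nurses and 20 teachers
people = [("p%d" % i, "nurse" if i % 3 == 0 else "teacher") for i in range(30)]
result = stratified_sample(people, stratum_of=lambda p: p[1], per_stratum=4, seed=1)
print(result)
```

Here the same number of elements is taken from each stratum; drawing in proportion to stratum size would be an equally valid design choice.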
Cluster sampling
The population consists of mutually exclusive groups called clusters (e.g. municipalities, towns, postal code areas, ...)
- Ideally, each cluster represents the whole population on a small scale
- Clusters are selected at random for the sample
- The selected clusters are included in full, or random samples are drawn from them
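Both variants described above (taking the selected clusters in full, or sampling within them) can be sketched as follows. The function name `cluster_sample`, its parameters and the five made-up towns are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters at random; keep them whole or subsample each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None or per_cluster >= len(members):
            sample[name] = list(members)               # include cluster in full
        else:
            sample[name] = rng.sample(members, per_cluster)
    return sample

# Hypothetical clusters: five towns with eight residents each
towns = {"town_" + c: [c + str(i) for i in range(8)] for c in "ABCDE"}
print(cluster_sample(towns, n_clusters=2, seed=1))
print(cluster_sample(towns, n_clusters=2, per_cluster=3, seed=1))
```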
Sequence sampling
Elements are picked sequentially until the results no longer change
- Used in quality control
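One possible reading of "until the results no longer change" is sketched below: keep adding measurements until the running mean moves by less than a tolerance. The stopping rule, the function name `sequential_sample` and the parameters `tolerance` and `min_size` are assumptions for illustration; real quality-control schemes use more careful criteria.

```python
def sequential_sample(measurements, tolerance=0.01, min_size=5):
    """Add elements one by one until the running mean stabilises."""
    total = 0.0
    count = 0
    mean = None
    previous_mean = None
    for value in measurements:
        count += 1
        total += value
        mean = total / count
        stable = previous_mean is not None and abs(mean - previous_mean) < tolerance
        if count >= min_size and stable:
            break                        # result no longer changes: stop sampling
        previous_mean = mean
    if count == 0:
        raise ValueError("no measurements supplied")
    return count, mean

# Hypothetical measurements from a production line
used, estimate = sequential_sample([10.2, 9.8, 10.1, 9.9, 10.0, 10.0, 10.0, 10.0])
print(used, estimate)
```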
Judgement sampling
Relies on sound judgement or expertise.
- It depends on selecting elements that are believed to be typical or representative of the population
- Requires knowledge of the topic and the population
- Results should be interpreted with careful consideration
- Typically used in preliminary investigations, in questionnaire testing, or to generate ideas and points of view or to develop hypotheses
Quota sampling
The first step is to estimate the sizes of the various subclasses or strata in the population.
- The strata relevant to the study have to be specified
- E.g. based on demographics such as age, gender, family status, socioeconomic group, ...
- Sampling continues until each quota is full
- Types: even quota, proportional quota, optimal quota
Even quota:
The same number of elements is picked from each stratum (e.g. 100 male and 100 female)
Proportional quota:
Each stratum is sampled in the same proportion as it has in the population (e.g. if the population is 60% male and 40% female, a sample of 100 consists of 60 male + 40 female)
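The proportional quota calculation can be sketched in Python with the card's own figures. The function name `proportional_quotas` is an assumption for illustration, and the rounding step means that with less convenient shares the quotas may not add up exactly to the sample size.

```python
def proportional_quotas(population_shares, sample_size):
    """Give each stratum a quota in proportion to its population share."""
    return {
        group: round(share * sample_size)
        for group, share in population_shares.items()
    }

quotas = proportional_quotas({"male": 0.6, "female": 0.4}, sample_size=100)
print(quotas)  # {'male': 60, 'female': 40}
```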
Optimal quota:
- A large stratum or large variation within a stratum -> more elements are picked from it
- High sampling costs in a stratum -> fewer elements are picked from it
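One formula that captures both rules is the optimum allocation from stratified sampling theory, shown here as an assumption since the card does not give a formula. The symbols are not from the card: n is the total sample size, N_h the size of stratum h, sigma_h its standard deviation and c_h the cost of sampling one element from it.

```latex
n_h = n \cdot \frac{N_h \sigma_h / \sqrt{c_h}}{\sum_i N_i \sigma_i / \sqrt{c_i}}
```

The quota n_h grows with the size and the variation of the stratum and shrinks as the cost per element rises.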
Convenience sampling
- There is no sample design
- The sample consists of whatever is convenient or easy from the researcher's point of view
- The researcher does not draw the sample; the participants are self-selecting
- For example, in student research:
o meeting students on campus
o leaving questionnaires in the lobby
o posting a questionnaire link on the student website
- Risk: a biased sample that does not represent the whole population
Sample size
- Sample size is affected by the desired accuracy of the results, time and money
- Sometimes the sample size is increased until the results are accurate enough: sequence sampling
- Populations of fewer than 300 are usually investigated in full
- National studies often use sample sizes of 1000-2000, local studies 150-300
- Every group to be analyzed should include at least 30 elements
- Outliers or extreme cases distort the results if the sample size is small
- Accuracy of the results increases in proportion to the square root of the sample size
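The square-root relationship in the last point can be illustrated with the standard error of the mean, sigma / sqrt(n). This is a small sketch; the function name `standard_error` and the value sigma = 10 are assumptions for the example.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for standard deviation sigma, sample size n."""
    return sigma / math.sqrt(n)

print(standard_error(10, 100))  # 1.0
print(standard_error(10, 400))  # 0.5
```

Quadrupling the sample size from 100 to 400 only halves the standard error, which is why extra accuracy becomes increasingly expensive.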
Confidence interval of the mean
The formula for the confidence interval of the mean can be used to calculate how large a sample is needed to reach the desired confidence level.
The critical value (Zα/2) is given; it depends on how confident we want to be in the accuracy of the result.
The more confident we want to be, the larger the sample should be.
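The card does not reproduce the formula itself. A commonly used form, assumed here, is n = (Zα/2 * sigma / E)^2, where sigma is the population standard deviation and E the acceptable margin of error. In the sketch below, the function name `required_sample_size` and the example values (sigma = 15, E = 2) are assumptions for illustration.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) <= margin_of_error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_alpha/2
    return math.ceil((z * sigma / margin_of_error) ** 2)

print(required_sample_size(sigma=15, margin_of_error=2, confidence=0.95))  # 217
print(required_sample_size(sigma=15, margin_of_error=2, confidence=0.99))
```

Raising the confidence level from 95% to 99% increases the critical value and therefore the required sample size.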
Independent Variable
Can be measured without relying on other variables.
Example: A person’s weight.
Dependent Variable:
Requires information about independent variables.
Example: BMI, which depends on both weight and height.
Constant Variables
Constant: A characteristic that does not change across individuals in the study.
Example: In a study on students, their status as “students” is a constant.
Primary Data:
Collected for a specific research project.
Example: Using surveys to gather student feedback.
Sampling techniques
Used to select representative groups from a population for research purposes.
Secondary Data:
Pre-existing data collected for other purposes.
Example: Data from Statistics Finland.