DATA COLLECTION TERMINOLOGY Flashcards
POPULATION
The entire set of items (sampling units) in the group being studied.
CENSUS
Measuring every member of a population.
+ No uncertainty due to natural variation
∴ Accurate
- Time consuming
- Expensive
- Some testing destroys the item
SAMPLING FRAME
List of all sampling units.
SIMPLE RANDOM SAMPLING
Every sampling unit has an equal chance of being selected. Use a random number generator alongside the sampling frame.
+ Bias free
- Sampling frame required ∴ administration required
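A minimal sketch of simple random sampling, assuming the sampling frame is held as a Python list. The function name `simple_random_sample`, the 100-unit frame and the `seed` parameter are illustrative and not part of the flashcards.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct units from the frame, each equally likely to be chosen."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    return rng.sample(sampling_frame, n)  # sampling without replacement

# Hypothetical sampling frame: a numbered list of 100 units
frame = [f"unit_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Because the draw is made from the full list, the sketch also shows why a sampling frame is needed before any unit can be selected.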
SYSTEMATIC SAMPLING
Take every kth unit from the sampling frame, where k = population size ÷ sample size. Pick a random number between 1 and k as the start point.
+ Easy to use
- Sampling frame required ∴ administration required
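A minimal sketch of systematic sampling, assuming the interval k is the population size divided by the sample size (rounded down) and that positions in the frame are counted from 1. The function name `systematic_sample` and the example frame are illustrative.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Take every kth unit, starting from a random position between 1 and k."""
    if not 0 < n <= len(sampling_frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    k = len(sampling_frame) // n   # sampling interval (assumed: population // sample)
    start = rng.randint(1, k)      # random start point, 1 to k inclusive
    # Positions are 1-based: start, start + k, start + 2k, ...
    return [sampling_frame[pos - 1] for pos in range(start, start + n * k, k)]

# Hypothetical frame where each unit's value equals its position
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=1)
print(picked)
```

Only the start point is random; once it is chosen, every other selected unit is fixed by the interval.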
STRATIFIED SAMPLING
The proportion of each stratum in the sample should accurately reflect the population structure. Use either simple random or systematic sampling to fill the groups.
+ Reflects the population structure
- Population must divide into distinct strata
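A minimal sketch of stratified sampling, assuming each stratum contributes (stratum size ÷ population size) × sample size units, filled here by simple random sampling. The strata names, their sizes and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    chosen = {}
    for name, units in strata.items():
        # Proportional allocation; rounding can make the overall total differ slightly from n
        size = round(n * len(units) / total)
        chosen[name] = rng.sample(units, size)  # simple random sample within the stratum
    return chosen

# Hypothetical population of 200 split into two strata
strata = {
    "Year 12": [f"Y12_{i}" for i in range(120)],
    "Year 13": [f"Y13_{i}" for i in range(80)],
}
chosen = stratified_sample(strata, 50, seed=1)
print({name: len(units) for name, units in chosen.items()})
```

With these assumed sizes, a sample of 50 takes 30 units from the stratum of 120 and 20 from the stratum of 80, matching the 60:40 split in the population.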
OPPORTUNITY SAMPLING
Non-random sample based on who/what is available.
+ Easy
+ Cheap
- Unlikely to be representative
QUOTA SAMPLING
Non-random sample that starts with quotas to be filled. Groups are filled using opportunity sampling. Ignore any items of a type where the quota is full. If replacing the items, tag each one so it is not re-recorded.
+ No sampling frame needed ∴ administration relatively easy
- Not random, potential interviewer bias
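A minimal sketch of how quotas are filled, assuming units are met one at a time in whatever order they happen to be available. The group labels, the quotas and the function name `quota_sample` are hypothetical.

```python
def quota_sample(arrivals, quotas):
    """Record units in the order they are met until every quota is full."""
    counts = {group: 0 for group in quotas}
    sample = []
    for unit, group in arrivals:
        # Ignore units whose group has no quota or whose quota is already full
        if group in quotas and counts[group] < quotas[group]:
            sample.append((unit, group))
            counts[group] += 1
        if counts == quotas:  # stop once all quotas are filled
            break
    return sample

# Hypothetical stream of available units, in the order encountered
arrivals = [("A", "adult"), ("B", "child"), ("C", "adult"),
            ("D", "adult"), ("E", "child"), ("F", "child")]
result = quota_sample(arrivals, {"adult": 2, "child": 2})
print(result)
```

Unit D is skipped because the adult quota is already full, and F is never reached because all quotas are filled once E is recorded. Nothing in the procedure is random, which is where the potential for bias comes from.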
QUALITATIVE
Non-numerical data
QUANTITATIVE
Numerical data
DISCRETE
Can only take certain values
CONTINUOUS
Can take any value in a range; usually recorded in grouped classes
MEDIAN
The middle value when the data is in order of size. If more data is added above or below the current median, the median increases or decreases accordingly. It is not affected by extreme values.
MEAN
The sum of all the values divided by the number of values in the data set. If the sum of the values increases or decreases, then the mean increases or decreases accordingly. It uses all the data values, so can be affected by extreme values.
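A short illustration of how an extreme value affects the mean but barely moves the median, using Python's standard `statistics` module. The data values are made up for the example.

```python
from statistics import mean, median

data = [2, 3, 5, 7, 8]
with_extreme = data + [95]  # same data plus one extreme value

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_extreme), median(with_extreme)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

Here the mean rises from 5 to 20, while the median only moves from 5 to 6 (the average of the two middle values, 5 and 7).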
SAMPLING UNIT
Individual member or element of the population.
SAMPLE
A selection of observations taken from a subset of the population, used to find out information about the population as a whole.
+ Quick
+ Easy
+ Quality of info about each sampling unit is often better
- Uncertainty due to natural variation
∴ Less accurate