Data Collection Key Terms (C1-3) [Stats] Flashcards
Population
Entire set of items (sampling units) in the group being studied.
Census
Measuring every member of a population
Evaluation - Census
+ Accurate
- Expensive
- Some testing destroys the item
Sampling frame
List of sampling units
(It is not always possible to create one, which is a disadvantage of any technique that needs it.)
Simple Random Sampling
Every sampling unit (and every possible sample of size n) has an equal chance of being selected. Done using a random number generator alongside a sampling frame.
Type of RANDOM Sampling
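As a rough illustration of simple random sampling, the sketch below numbers a made-up sampling frame and lets a random number generator pick the units. The function name `simple_random_sample`, the `seed` parameter and the frame of 100 units are assumptions for illustration only.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct units; every possible sample of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, only for repeatable runs
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: a numbered list of 100 sampling units
frame = [f"unit_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Without a complete sampling frame there is nothing to pass in as `sampling_frame`, which is the disadvantage noted on the evaluation card.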
Evaluation - Simple Random Sampling
+ Bias-free
- Sampling frame required
Systematic Sampling
Take every k^th unit from the sampling frame, where k = population size / sample size. Pick a random number between 1 and k for the start point.
Type of RANDOM Sampling
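A minimal sketch of systematic sampling, assuming the frame size is at least the sample size and taking k as the whole-number part of population size / sample size. The function name and the frame of 100 numbered units are made up for illustration.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Take every k-th unit, starting from a random position between 1 and k."""
    rng = random.Random(seed)
    k = len(sampling_frame) // n   # sampling interval
    start = rng.randint(1, k)      # random start point, 1..k inclusive
    # Positions start, start + k, start + 2k, ... (converted to 0-based indices)
    return [sampling_frame[start - 1 + i * k] for i in range(n)]

# Hypothetical frame of 100 numbered units, sample of 10, so k = 10
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```

Only the start point is random; once it is chosen, every other selected unit is fixed by the interval k.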
Evaluation - Systematic Sampling
+ Quick to use
- Sampling frame required
Stratified Sampling
The sample contains each stratum (group) in the same proportion as it appears in the population, so that the sample reflects the population.
(Use either simple random or systematic sampling to fill each group.)
Type of RANDOM Sampling
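The proportional split can be worked out as (stratum size / population size) × sample size. The sketch below does that arithmetic; the function name `stratum_sizes` and the year-group numbers are invented for illustration, and rounding is an assumption (rounded counts may not always add up exactly to the sample size).

```python
def stratum_sizes(strata_counts, sample_size):
    """Number to sample from each stratum, in proportion to its size."""
    total = sum(strata_counts.values())
    return {
        name: round(count * sample_size / total)
        for name, count in strata_counts.items()
    }

# Hypothetical population: 120 students in Year 12 and 80 in Year 13
sizes = stratum_sizes({"Year 12": 120, "Year 13": 80}, sample_size=50)
print(sizes)  # {'Year 12': 30, 'Year 13': 20}
```

Each stratum would then be filled with its own simple random or systematic sample.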
Evaluation - Stratified Sampling
+ Reflects Population
- Need clear strata (groups) for population
Opportunity Sampling
Sample based on who/what is available.
Type of NON-RANDOM Sampling
Evaluation - Opportunity Sampling
+ Easy, cheap
- Unlikely to be representative
Quota Sampling
Starts with quotas (groups) to be filled, which are not necessarily representative of the population. Quotas (groups) filled using opportunity sampling.
Similar in structure to stratified sampling, but really a variation of opportunity sampling.
Type of NON-RANDOM Sampling
Evaluation - Quota Sampling
+ No sampling frame needed
- Not random, potential bias
Data Types
Qualitative: Non-numerical
Quantitative: Numerical
Discrete: Can only take certain values (often integers) => e.g. shoe size
Continuous: Can take any value in a range, so is usually recorded in grouped classes => e.g. foot length
Median (Location)
LQ: (n/4)th term
Median: (n/2)th term
UQ: (3n/4)th term
xth percentile: (x/100 × n)th term
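A quick sketch of these positions, treating the quartiles and median as the 25th, 50th and 75th percentiles. The function name and n = 40 are assumptions for illustration; the sketch only computes the position of the term and does no rounding.

```python
def percentile_position(n, x):
    """Position of the x-th percentile among n data values: (x/100) * n."""
    return x * n / 100

n = 40  # hypothetical total frequency
print(percentile_position(n, 25))  # LQ position: 10.0
print(percentile_position(n, 50))  # median position: 20.0
print(percentile_position(n, 75))  # UQ position: 30.0
print(percentile_position(n, 90))  # 90th percentile position: 36.0
```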
Mean (Location)
Symbol: x̄ (x with a line over it)
Sum of (frequency × no. of __)
/ Sum of frequencies
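For example, the mean from a frequency table can be computed as below. The table (number of siblings against frequency) is made up, and the function name is an assumption.

```python
def mean_from_frequencies(values, freqs):
    """Mean = sum of (frequency x value) / sum of frequencies."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / sum(freqs)

# Hypothetical table: number of siblings and how many people gave each answer
values = [0, 1, 2, 3]
freqs = [2, 5, 2, 1]
mean = mean_from_frequencies(values, freqs)
print(mean)  # (0 + 5 + 4 + 3) / 10 = 1.2
```

For grouped data the same calculation would use class midpoints as the values, which gives an estimate of the mean.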
Variance (Spread) σ^2
(Sum of (frequency × x^2) / sum of frequencies) - mean^2
MSMSM
Mean of the Squares Minus Square of the Mean
(Also = Sxx / n)
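The sketch below checks on a made-up frequency table that "mean of the squares minus square of the mean" gives the same answer as Sxx / n, taking Sxx as the sum of frequency × (x - mean)^2. Function names and data are assumptions for illustration.

```python
def variance_msmsm(values, freqs):
    """Mean of the squares minus square of the mean."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_of_squares = sum(f * x ** 2 for x, f in zip(values, freqs)) / n
    return mean_of_squares - mean ** 2

def variance_sxx(values, freqs):
    """Sxx / n, where Sxx = sum of f * (x - mean)^2."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    sxx = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return sxx / n

# Hypothetical frequency table
values = [0, 1, 2, 3]
freqs = [2, 5, 2, 1]
v1 = variance_msmsm(values, freqs)  # 22/10 - 1.2^2, approximately 0.76
v2 = variance_sxx(values, freqs)    # 7.6/10, approximately 0.76
print(v1, v2)
```

The standard deviation σ is the square root of either result.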
Coding
If y = ax + b…
then mean of y
= a(mean of x) + b
AND
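σ of y = |a| × (σ of x)
(Adding b shifts every value equally, so it does not change the spread; σ cannot be negative, so use the size of a.)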
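A small numerical check of the coding rules, using Python's `statistics` module (`mean` and `pstdev`, the population standard deviation). The data and the values a = -3, b = 5 are invented; a negative a is chosen to show that the standard deviation scales by the size of a, not its sign.

```python
import math
import statistics

a, b = -3, 5            # hypothetical coding y = ax + b
x = [2, 4, 6, 8]        # hypothetical data
y = [a * xi + b for xi in x]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)

# mean of y = a * (mean of x) + b
print(mean_y, a * mean_x + b)
# sigma of y = |a| * (sigma of x); b has no effect on spread
print(sd_y, abs(a) * sd_x)
```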
Linear Interpolation
Assume the data values are evenly spread throughout each class, then use proportion to find how far through the class the required value lies.
Remember to add this distance on to the lower class boundary to get the final value.
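A minimal sketch of linear interpolation on a grouped frequency table, assuming values are evenly spread within each class. The function name, the class boundaries and the frequencies are made up for illustration.

```python
def interpolated_value(boundaries, freqs, position):
    """Estimate the data value at a given position in grouped data.

    boundaries: list of (lower, upper) class boundaries
    freqs: frequency of each class
    position: which term is wanted, e.g. n/2 for the median
    """
    cumulative = 0  # frequency in all earlier classes
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and position <= cumulative + f:
            # Proportion of the way through this class, scaled by class width,
            # then added on to the lower class boundary
            return lower + (position - cumulative) * (upper - lower) / f
        cumulative += f
    raise ValueError("position is beyond the total frequency")

# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 4, 10, 6 (n = 20)
boundaries = [(0, 10), (10, 20), (20, 30)]
freqs = [4, 10, 6]
median = interpolated_value(boundaries, freqs, position=20 / 2)
print(median)  # 10 + (10 - 4) * 10 / 10 = 16.0
```

Here the 10th term is the 6th of 10 values in the 10-20 class, so it sits 6/10 of the way through a class of width 10.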