Statistics Theory L5 = Statistical Sampling Flashcards
Goals of statistical sampling? (3)
- Gather information via observational study (could also be used in experiments by sampling of experimental units).
- Collect representative data, which allows us to make inferences about the intended statistical population (target population).
- Make reliable inferences (i.e., to avoid bias & get adequate precision).
Which diagram illustrates the relationship between the sample and the population?
The Population-Sample-Direction-of-Inference diagram.
Egs of parameters of interest we might want to estimate? (5)
- Animal density in a nature reserve.
- Average height of students at Wits.
- Average circumference of trees in a plantation.
- The slope between two variables X and Y.
- A measure of uncertainty (SE and 95% CI).
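As a rough illustration of the last item, the standard error and an approximate 95% CI for a sample mean could be computed as below. The height values are made up, and the sketch assumes a normal approximation (z = 1.96); for small samples a t quantile would usually be preferred.

```python
import math
import statistics

def mean_se_ci(values, z=1.96):
    """Return (mean, standard error, (lower, upper)) for a sample.

    Assumes a normal approximation; z = 1.96 gives roughly 95% coverage.
    """
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # sample SD uses the n - 1 divisor
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical heights (cm) of 8 sampled students.
heights = [162.0, 175.5, 168.2, 181.0, 158.4, 170.3, 166.8, 173.1]
mean, se, ci = mean_se_ci(heights)
print(f"mean={mean:.1f} cm, SE={se:.2f}, 95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```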
For most work in the environmental sciences, a census is generally not possible, so what do we need to get reliable inferences and avoid bias? (2)
- Probabilistic sample.
- Sampling frame.
Probabilistic sample?
= selection of a sample of units based on some random mechanism.
Probabilistic sample attribute?
It guards against the bias that haphazard, opportunistic, or judgement sampling can introduce.
Goal of a Probabilistic sample?
To avoid biased selection of units (which leads to biased parameter estimates).
Eg of a Probabilistic sample?
The economic status of Wits students.
- Rather than sampling haphazardly, get a list of students from the university registrar and select from it at random; this list is the sampling frame.
Sampling frame?
= a list of all sample units in a statistical population.
Sampling frame attributes? (2)
- In spatial sampling, one could randomly choose x and y coordinates.
- Every sampling unit has some chance of being selected.
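A minimal sketch of both ideas, assuming a list-based frame of 500 student IDs and a rectangular 1000 m x 600 m study area. Both are hypothetical, as are all names in the code.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical sampling frame: one ID per registered student.
frame = [f"student_{i:04d}" for i in range(1, 501)]

# List-based frame: every unit can be selected, and none is selected twice.
sample_ids = rng.sample(frame, k=20)

def random_points(n, x_max, y_max, rng):
    """Draw n random (x, y) locations inside a rectangle with a corner at (0, 0)."""
    return [(rng.uniform(0, x_max), rng.uniform(0, y_max)) for _ in range(n)]

# Spatial frame: random coordinates inside the hypothetical study area.
points = random_points(10, 1000.0, 600.0, rng)
```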
Types of sampling designs? (5)
- Simple random sampling.
- Stratified random sampling.
- Systematic sampling.
- Cluster sampling.
- Double sampling.
Simple random sampling attributes? (5)
- We select n units from a population of N.
- Each unit has the same probability of being selected.
- Selection of each unit is independent.
- Sampling without replacement (SWOR) produces more precise estimates.
- Good to use when the attribute of interest is homogeneous.
Details of Simple random sampling? (5)
- N is assumed to be finite.
- Possible to locate & identify each sampling unit & measure variables of interest (measurement error must be much smaller than the sampling error).
- Sampling frame consists of distinct, non-overlapping sample units (has to do with the fact that each sampling unit is independent).
- Sampling units (eg, plots) can be different sizes, but this adds variability & complexity to the analysis.
- If possible, sample without replacement, as it produces more precise estimates.
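The precision claim can be checked with the textbook variance formulas for the sample mean. The sketch below assumes a made-up population of N = 200 values; without replacement the variance carries the finite population correction (1 - n/N), which is what makes it smaller.

```python
import random
import statistics

def var_of_mean_swor(population, n):
    """Variance of the sample mean under SRS without replacement."""
    N = len(population)
    S2 = statistics.variance(population)   # population variance, N - 1 divisor
    return (1 - n / N) * S2 / n            # (1 - n/N) is the finite population correction

def var_of_mean_swr(population, n):
    """Variance of the sample mean under SRS with replacement."""
    return statistics.pvariance(population) / n   # N divisor

rng = random.Random(1)                                 # fixed seed for repeatability
population = [rng.gauss(50, 10) for _ in range(200)]   # hypothetical N = 200
n = 40

swor = rng.sample(population, k=n)    # no unit can appear twice
swr = rng.choices(population, k=n)    # a unit may be drawn more than once

v_swor = var_of_mean_swor(population, n)
v_swr = var_of_mean_swr(population, n)
print(f"SWOR: {v_swor:.3f}  SWR: {v_swr:.3f}")
```

The ratio of the two variances is (N - n) / (N - 1), so the gain from SWOR grows as the sampling fraction n/N increases.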
N?
= total number of units in a population.
Thing to note about random sampling?
Can sometimes produce a clumped or patchy distribution of sampling units.
Why use Simple random sampling?
Stratified random sampling attributes? (5)
- We designate homogeneous strata from the sampling frame.
- Then we spread the sampling effort between the strata.
- We can treat the strata as domains of study (eg, to compare between them).
- Sample & generate estimates by stratum, but then combine estimates with an overall measure of uncertainty/precision.
- Good option if the variability within strata < the variability between strata (provides a more precise estimate).
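One common way to combine per-stratum estimates is the stratum-weighted mean, with weights W_h = N_h / N and a standard error built from the per-stratum variances. The sketch below assumes simple random sampling without replacement within each stratum; the tree data and stratum sizes are made up.

```python
import math
import statistics

def stratified_mean(strata):
    """Combine per-stratum samples into an overall mean and standard error.

    strata: list of (N_h, sample_values) pairs, one per stratum.
    """
    N = sum(N_h for N_h, _ in strata)
    mean = 0.0
    var = 0.0
    for N_h, y in strata:
        W_h = N_h / N                      # stratum weight
        n_h = len(y)
        mean += W_h * statistics.fmean(y)
        # Per-stratum variance of the mean, with finite population correction.
        var += W_h**2 * (1 - n_h / N_h) * statistics.variance(y) / n_h
    return mean, math.sqrt(var)

# Hypothetical tree circumferences (cm) in two plantation strata.
strata = [
    (300, [61.0, 64.5, 59.8, 63.2, 62.1]),   # young stand, N_h = 300
    (100, [95.4, 102.3, 98.7, 91.0]),        # old stand,   N_h = 100
]
est, se = stratified_mean(strata)
print(f"stratified mean={est:.2f} cm, SE={se:.2f}")
```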
Stratified random sampling design uses a number of ways to allocate sample units among strata, what are they? (3)
- Proportional to size.
- Proportional to variability.
- Based on economic or logistical considerations.
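The first two options can be sketched as follows. "Proportional to variability" is illustrated here with Neyman allocation, which weights each stratum by its size times its standard deviation; that choice is an assumption, not necessarily the rule the course intends. Stratum sizes and SDs are hypothetical, and results are left as fractions to be rounded as needed.

```python
def proportional_allocation(n, sizes):
    """Split n sample units across strata in proportion to stratum size N_h."""
    N = sum(sizes)
    return [n * N_h / N for N_h in sizes]

def neyman_allocation(n, sizes, sds):
    """Split n sample units in proportion to N_h * S_h (size times variability)."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [300, 100]   # hypothetical stratum sizes N_h
sds = [2.0, 6.0]     # hypothetical within-stratum SDs (e.g. from a pilot study)

prop = proportional_allocation(40, sizes)   # [30.0, 10.0]
ney = neyman_allocation(40, sizes, sds)     # [20.0, 20.0]
```

Here the smaller stratum is three times as variable, so Neyman allocation shifts effort towards it compared with the size-only split.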
Goal of Stratified random sampling design?
To improve precision through optimal allocation of sampling effort.
Why use Stratified random sampling?
Systematic sampling?
= we select sample units at regular intervals after a random start.
Systematic sampling attributes? (4)
- Each transect/plot is a sampling unit.
- Done to reduce bias.
- The mathematics is more complicated, but usually precision is better.
- A potential problem arises if the arrangement of sample units coincides with an unknown cyclic pattern.
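The cyclic-pattern problem can be shown with a toy population, assumed here to repeat the values 0 to 9. When the sampling interval equals the cycle length, each systematic sample sees only one phase of the cycle.

```python
import statistics

# Hypothetical population with a cycle of length 10 (values 0..9 repeating).
population = [i % 10 for i in range(1000)]
true_mean = statistics.fmean(population)   # 4.5

# Interval equal to the cycle length: every possible start returns a single
# phase of the cycle, so the sample mean equals the start value.
k_bad = 10
means_bad = [statistics.fmean(population[start::k_bad]) for start in range(k_bad)]

# Interval that does not line up with the cycle: each sample covers all phases.
k_ok = 7
means_ok = [statistics.fmean(population[start::k_ok]) for start in range(k_ok)]

print("k=10:", means_bad)
print("k=7: ", [round(m, 2) for m in means_ok])
```

With k = 10 the estimate ranges from 0 to 9 depending on the random start, while with k = 7 every start gives a mean close to 4.5.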
Egs of Systematic sampling? (2)
- A plot or transect every 50m.
- We sample every kth person in a list.
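Both examples could be sketched as below, assuming a list of 100 people and a 1 km transect (both hypothetical). In each case the start is random and the rest of the sample follows at a fixed interval.

```python
import random

def systematic_sample(frame, k, rng=random):
    """Every k-th unit of `frame`, starting at a random position in [0, k)."""
    start = rng.randrange(k)
    return frame[start::k]

def plot_positions(length_m, spacing_m, rng=random):
    """Plot locations along a transect: random start, then fixed spacing."""
    positions = []
    x = rng.uniform(0, spacing_m)
    while x <= length_m:
        positions.append(x)
        x += spacing_m
    return positions

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

people = [f"person_{i}" for i in range(1, 101)]   # hypothetical list of 100
chosen = systematic_sample(people, k=10, rng=rng)

positions = plot_positions(1000.0, 50.0, rng=rng)  # a plot every 50 m
```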
Why use Systematic sampling design?
To obtain a sample that is well spread out.