Statistics Theory L5 = Statistical Sampling Flashcards

1
Q

Goals of statistical sampling? (3)

A
  • Gather information via observational study (could also be used in experiments by sampling of experimental units).
  • Collect representative data, which allows us to make inferences about the intended statistical population (target population).
  • Make reliable inferences (i.e., to avoid bias & get adequate precision).
2
Q

Which diagram illustrates the sample vs the population?

A

The Population-Sample-Direction-of-Inference diagram.

3
Q

Egs of parameters of interest we might want to estimate? (5)

A
  • Animal density in a nature reserve.
  • Average height of students at Wits.
  • Average circumference of trees in a plantation.
  • The slope between two variables X and Y.
  • A measure of uncertainty (SE and 95% CI).
4
Q

For most of the work we do in the environmental sciences a census is generally not possible, so what do we need to get reliable inferences, avoid bias, etc? (2)

A
  • Probabilistic sample.
  • Sampling frame.
5
Q

Probabilistic sample?

A

= selection of a sample of units based on some random mechanism.

6
Q

Probabilistic sample attribute?

A

It guards against the bias of the alternatives: haphazard, opportunistic, and judgement sampling can be highly biased.

7
Q

Goal of a Probabilistic sample?

A

To avoid biased selection of the units (which leads to biased estimates of parameters).

8
Q

Eg of a Probabilistic sample?

A

Estimating the economic status of Wits students.

  • To avoid a haphazard sample, get a list of students from the university registrar and select students from it at random. This list is the sampling frame.
9
Q

Sampling frame?

A

= a list of all sample units in a statistical population.

10
Q

Sampling frame attributes? (2)

A
  • In spatial sampling, one could randomly choose x and y coordinates.
  • Every sampling unit has some chance of being selected.
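
A minimal sketch of the spatial case above, assuming a rectangular study area with one corner at the origin. The function name `random_points` and the 1000 m x 500 m extent are made up for illustration.

```python
import random

def random_points(n, x_max, y_max, seed=None):
    """Draw n random (x, y) locations inside the rectangle [0, x_max] x [0, y_max]."""
    rng = random.Random(seed)  # seeded so the draw can be repeated
    return [(rng.uniform(0, x_max), rng.uniform(0, y_max)) for _ in range(n)]

# Hypothetical study area: 1000 m by 500 m, 5 plot locations.
points = random_points(5, x_max=1000.0, y_max=500.0, seed=1)
for x, y in points:
    print(f"x = {x:7.1f} m, y = {y:6.1f} m")
```

Every location in the rectangle has the same chance of being chosen, so no physical list of units is needed.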
11
Q

Types of sampling designs? (5)

A
  • Simple random sampling.
  • Stratified random sampling.
  • Systematic sampling.
  • Cluster sampling.
  • Double sampling.
12
Q

Simple random sampling attributes? (5)

A
  • We select n units from a population of N.
  • Each unit has the same probability of being selected.
  • Selection of each unit is independent.
  • Sampling without replacement (SWOR) produces more precise estimates.
  • Good to use when the attribute of interest is homogeneous.
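
A rough sketch of simple random sampling without replacement, with an estimate of the mean, its SE and an approximate 95% CI. The population values are invented, the function name `srs_estimate` is hypothetical, and the CI uses the normal multiplier 1.96, which is an assumption that suits larger samples. The factor (1 - n/N) is the standard finite population correction for SWOR.

```python
import math
import random
import statistics

def srs_estimate(population, n, seed=None):
    """Draw n of N units without replacement; return (mean, SE, 95% CI)."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)      # SWOR: no unit is drawn twice
    N = len(population)
    mean = statistics.mean(sample)
    s2 = statistics.variance(sample)        # sample variance (n - 1 divisor)
    se = math.sqrt((1 - n / N) * s2 / n)    # finite population correction
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    return mean, se, ci

# Invented population of N = 200 heights (cm), purely for illustration.
heights = [150 + (i * 7) % 40 for i in range(200)]
mean, se, ci = srs_estimate(heights, n=30, seed=42)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

As n approaches N the correction shrinks the SE towards zero, which is one way to see why SWOR gives more precise estimates.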
13
Q

Details of Simple random sampling? (5)

A
  • N is assumed to be finite.
  • Possible to locate & identify each sampling unit & measure variables of interest (measurement error must be much smaller than the sampling error).
  • Sampling frame consists of distinct, non-overlapping sample units (has to do with the fact that each sampling unit is independent).
  • Sampling units (eg, plots) can be different sizes, but this adds variability & complexity to the analysis.
  • If possible, sample without replacement, as it produces more precise estimates.
14
Q

N?

A

= total number of units in a population.

15
Q

Thing to note about random sampling?

A

Can sometimes produce a clumped or patchy distribution of sampling units.

16
Q

Why use Simple random sampling?

A
17
Q

Stratified random sampling attributes? (5)

A
  • We designate homogeneous strata from the sampling frame.
  • Then we spread the sampling effort between the strata.
  • We can treat the strata as domains of study (eg, to compare between them).
  • Sample & generate estimates by stratum, but then combine estimates with an overall measure of uncertainty/precision.
  • Good option if the variability within the strata < the variability between the strata (provides a more precise estimate).
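
A sketch of how the per-stratum estimates might be combined, assuming simple random sampling without replacement inside each stratum and using the usual weights W_h = N_h / N. The stratum sizes and measurements are made up, and `stratified_mean` is a hypothetical name.

```python
import math
import statistics

def stratified_mean(strata):
    """strata: list of (N_h, sample_values) pairs. Returns (overall mean, overall SE)."""
    N = sum(N_h for N_h, _ in strata)
    mean = 0.0
    var = 0.0
    for N_h, values in strata:
        n_h = len(values)
        W_h = N_h / N                                   # stratum weight
        mean += W_h * statistics.mean(values)           # weighted stratum mean
        # Weighted stratum variance, with finite population correction.
        var += W_h ** 2 * (1 - n_h / N_h) * statistics.variance(values) / n_h
    return mean, math.sqrt(var)

# Invented example: two strata of 100 and 300 units, 4 measurements each.
strata = [(100, [10, 12, 11, 13]), (300, [20, 22, 21, 23])]
overall_mean, overall_se = stratified_mean(strata)
print(f"stratified mean = {overall_mean:.2f}, SE = {overall_se:.3f}")
```

Only the within-stratum variances enter the SE, which is why homogeneous strata give a more precise overall estimate.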
18
Q

Stratified random sampling design uses a number of ways to allocate sample units among strata, what are they? (3)

A
  • Proportional to size.
  • Proportional to variability.
  • Based on economic or logistical considerations.
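
A sketch of the first two allocation rules. "Proportional to variability" is read here as Neyman allocation (n_h proportional to N_h x S_h), which is an assumption about what the card intends. The stratum sizes and standard deviations are invented, and the function names are hypothetical.

```python
def proportional_allocation(n, sizes):
    """Allocate n sample units in proportion to stratum size N_h."""
    N = sum(sizes)
    return [n * N_h / N for N_h in sizes]

def neyman_allocation(n, sizes, sds):
    """Allocate n sample units in proportion to N_h * S_h (size times SD)."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Invented strata: sizes 500, 300, 200 with standard deviations 2, 4, 10.
by_size = proportional_allocation(100, [500, 300, 200])
by_variability = neyman_allocation(42, [500, 300, 200], [2, 4, 10])
print(by_size)
print(by_variability)
```

Both functions return fractional allocations; in practice these would be rounded, usually with at least two units per stratum so a variance can be estimated.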
19
Q

Goal of Stratified random sampling design?

A

To improve precision through optimal allocation of sampling effort.

20
Q

Why use Stratified random sampling?

21
Q

Systematic sampling?

A

= we select sample units at regular intervals after a random start.

22
Q

Systematic sampling attributes? (4)

A
  • Each transect/plot is a sampling unit.
  • Done to reduce bias.
  • The mathematics is more complicated, but usually precision is better.
  • A potential problem is if the arrangement of sample units coincides with an unknown cyclic pattern.
23
Q

Egs of Systematic sampling? (2)

A
  • A plot or transect every 50m.
  • We sample every kth person in a list.
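
A minimal sketch of the "every kth unit after a random start" idea, assuming the sampling frame is an ordered list. The function name `systematic_sample` and the list of 100 unit IDs are hypothetical.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start in 0 .. k-1
    return units[start::k]     # regular interval of k from the start

# Hypothetical frame of 100 unit IDs, sampling every 10th.
frame = list(range(100))
chosen = systematic_sample(frame, k=10, seed=3)
print(chosen)
```

The random start is the only random element, so the whole sample is fixed once it is drawn.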
24
Q

Why use Systematic sampling design?

A

To get a sample that is well spread out over the population.

25
Cluster sampling attributes? (4)
  • No sampling frame is available for individuals, but there is one for groups of individuals (eg, vegetation patches rather than individual plants).
  • Each sample unit is a collection/cluster of individual elements.
  • Good if variation within patches > variation between patches (provides more precise estimates).
  • Consists of various stages.
26
Stages of Cluster sampling? (3)
  • 1-stage sampling.
  • 2-stage sampling.
  • Multi-stage cluster sampling.
27
1-stage sampling?
= we randomly select clusters & then measure all elements within them.
28
2-stage sampling?
= involves the random selection of clusters & random sample of elements within the clusters.
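
A sketch contrasting 1-stage and 2-stage sampling, assuming the clusters are stored as a dictionary that maps a cluster name to the list of its elements. The patch names, plant IDs and function names are invented for illustration.

```python
import random

def one_stage(clusters, m, seed=None):
    """Randomly select m clusters and keep ALL elements within them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    return {c: list(clusters[c]) for c in chosen}

def two_stage(clusters, m, n_per_cluster, seed=None):
    """Randomly select m clusters, then a random sample of elements within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

# Invented vegetation patches, each holding the IDs of its plants.
patches = {
    "patch_A": [1, 2, 3, 4, 5],
    "patch_B": [6, 7, 8],
    "patch_C": [9, 10, 11, 12],
    "patch_D": [13, 14],
}
stage1 = one_stage(patches, m=2, seed=5)
stage2 = two_stage(patches, m=2, n_per_cluster=2, seed=5)
print(stage1)
print(stage2)
```

Only a frame of patches is needed up front; elements are listed only inside the patches that were selected.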
29
Why use Cluster sampling?
I think we use it (1) if the variation within the patches is greater than the variation between the patches & (2) if there is no sampling frame for individuals but there is one for groups.
30
Double sampling?
= where we work with two measurements that are correlated, one that is easy to measure & a second that is harder to measure but is of more interest to the scientist.
31
Double sampling attributes? (4)
  • One set of sample units consists of easy-to-conduct measurements & we can do many of them.
  • A second, smaller set consists of sample units where we do the easy-to-conduct measurement & the more difficult measurement that we're actually interested in.
  • We estimate a relationship between the 2 measurements using something like regression.
  • We use the regression & the measurements from the 1st set of sample units to estimate the 2nd quantity of interest.
32
Eg of Double sampling?
We're interested in grass biomass & how it varies across the veld, but it's difficult to measure accurately. Grass height is easier to measure & it's correlated with biomass.
33
So, explain the Double sampling eg? (4)
  • Step 1: Get the 1st set of samples, which is grass height (n = lots).
  • Step 2: Get the 2nd set of samples, which are grass height & clipped dry biomass (n = few).
  • Step 3: Estimate the relationship in the 2nd set of samples (regression).
  • Step 4: Calculate biomass for samples in 1st set of samples.
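
The four steps can be sketched as below, assuming a straight-line relationship between grass height and biomass fitted by ordinary least squares. All numbers are made up for illustration (the small set is constructed so that biomass = 5 + 2 x height), and `fit_line` is a hypothetical helper.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Step 1: large set, grass height only (invented values, cm).
heights_large = [12.0, 18.0, 25.0, 33.0, 38.0, 15.0]
# Step 2: small set, grass height plus clipped dry biomass (invented values).
heights_small = [10.0, 20.0, 30.0, 40.0]
biomass_small = [25.0, 45.0, 65.0, 85.0]

# Step 3: estimate the height-biomass relationship from the small set.
slope, intercept = fit_line(heights_small, biomass_small)

# Step 4: predict biomass for every unit in the large set.
predicted = [intercept + slope * h for h in heights_large]
mean_biomass = statistics.mean(predicted)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, mean biomass = {mean_biomass:.1f}")
```

Real data would scatter around the line, and a proper SE for the mean would have to account for the regression uncertainty, which this sketch leaves out.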
34
Why use Double sampling?
I think we can use it when our attribute of interest is difficult to measure but is correlated to a variable that is easy to measure.
35
2 Elements to consider when planning field logistics?
  • Plot size.
  • Plot shape.
36
What does the plot size & plot shape depend on? (2)
  • The objectives of the study.
  • The nature of the data/study system.
37
Thing to note when deciding which plot size or plot shape to use?
It might require experimenting with different plot sizes & plot shapes to find an optimal size or shape.
38
Criteria for deciding on plot size & shape? (3)
  • Statistical: the procedure that gives the best precision given the cost & area of sampling.
  • Ecological: the best efficiency to achieve the objectives of the study.
  • Logistical: the greatest ease of implementation in the field.
39
4 Factors that influence plot shape?
  • Detection of individuals.
  • Distribution of individuals.
  • Edge effects (i.e., knowing where the subject is with respect to the plot boundary).
  • Data collection methods.
40
Types of plot shapes? (3)
  • Long, narrow plots.
  • Square or circular plots.
  • Narrow, rectangular plots.
41
Long, narrow plots attributes? (4)
  • Easy to lay out.
  • Have a lot of perimeter with respect to plot area.
  • More edge effects.
  • Problems with identifying whether animals are inside or outside plots.
42
Square or circular plots attribute?
Have fewer problems with edge effects.
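
A quick sketch of why shape matters for edge effects: for the same plot area, the perimeter-to-area ratio is lowest for a circle, slightly higher for a square, and much higher for a long, narrow strip. The 100 m² area and the 25:1 strip are arbitrary illustrative choices, and `perimeter_to_area` is a hypothetical helper.

```python
import math

def perimeter_to_area(shape, area, aspect=1.0):
    """Perimeter divided by area for a circle, or a rectangle with length = aspect * width."""
    if shape == "circle":
        radius = math.sqrt(area / math.pi)
        return 2 * math.pi * radius / area
    if shape == "rectangle":
        width = math.sqrt(area / aspect)
        length = aspect * width
        return 2 * (length + width) / area
    raise ValueError(f"unknown shape: {shape}")

area = 100.0  # m^2, the same for every plot
circle = perimeter_to_area("circle", area)
square = perimeter_to_area("rectangle", area, aspect=1.0)
strip = perimeter_to_area("rectangle", area, aspect=25.0)  # 50 m x 2 m
print(f"circle {circle:.3f}, square {square:.3f}, strip {strip:.3f} (m of edge per m^2)")
```

More edge per unit area means more subjects near a boundary, where deciding whether they are inside or outside the plot is hardest.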
43
Narrow, rectangular plots attributes? (3)
  • Increased detection of study subjects.
  • Increased chance of intersecting clumps/clusters.
  • Best for clusters (better precision in vegetation studies).
44
Thing to note when you have a fixed budget?
There is a trade-off between the number of plots & plot size.
45
For instance, what to do if you have a homogeneous population?
Use a few large plots, as they will capture the variability in the population.
46
What to do if you have a high degree of heterogeneity?
A large number of smaller plots might be necessary.
  • If plots are too small, there will be too many zeros, leading to poor precision (high SE).