Statistics Theory L5 = Statistical Sampling Flashcards

1
Q

Goals of statistical sampling? (3)

A
  • Gather information via observational study (could also be used in experiments by sampling of experimental units).
  • Collect representative data, which allows us to make inferences about the intended statistical population (target population).
  • Make reliable inferences (i.e., to avoid bias & get adequate precision).
2
Q

Which diagram, referred to again here, illustrates the sample vs the population?

A

The Population-Sample-Direction-of-Inference diagram.

3
Q

Egs of parameters of interest we might want to estimate? (5)

A
  • Animal density in a nature reserve.
  • Average height of students at Wits.
  • Average circumference of trees in a plantation.
  • The slope between two variables X and Y.
  • A measure of uncertainty (SE and 95% CI).
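
A minimal sketch of the last bullet, estimating a mean with its SE and an approximate 95% CI. The heights are made-up illustrative values, the function name `mean_with_ci` is hypothetical, and the multiplier 1.96 assumes a normal approximation (for a sample this small a t-multiplier would normally be used instead).

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Return (mean, standard error, (lower, upper)) for a sample.

    Assumes a normal approximation for the interval (z = 1.96 for ~95%).
    """
    n = len(values)
    mean = statistics.mean(values)
    # SE of the mean = sample standard deviation / sqrt(n)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical student heights in cm (not real data)
heights_cm = [160.0, 170.0, 165.0, 175.0, 180.0]
mean, se, ci = mean_with_ci(heights_cm)
print(f"mean = {mean:.1f} cm, SE = {se:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```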
4
Q

For most of the work we do in the environmental sciences a census is generally not possible, so what do we need to get reliable inferences, avoid bias, etc? (2)

A
  • Probabilistic sample.
  • Sampling frame.
5
Q

Probabilistic sample?

A

= selection of a sample of units based on some random mechanism.

6
Q

Probabilistic sample attribute?

A

It guards against the bias of the alternatives: haphazard, opportunistic, or judgement sampling can be highly biased.

7
Q

Goal of a Probabilistic sample?

A

To avoid biased selection of the units (as it leads to biased estimates of parameters).

8
Q

Eg of a Probabilistic sample?

A

Wits students' economic status.

  • Solution to haphazard selection: get a list of all students from the university registrar & select from it at random. This list is the sampling frame.
9
Q

Sampling frame?

A

= a list of all sample units in a statistical population.

10
Q

Sampling frame attributes? (2)

A
  • In spatial sampling, one could randomly choose x and y coordinates.
  • Every sampling unit has some chance of being selected.
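
A minimal sketch of the spatial case, assuming a rectangular study area. The function name `random_points`, the area dimensions, and the seed are hypothetical choices for illustration only.

```python
import random

def random_points(n, x_max, y_max, seed=None):
    """Draw n random (x, y) locations inside a rectangle [0, x_max] x [0, y_max]."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return [(rng.uniform(0, x_max), rng.uniform(0, y_max)) for _ in range(n)]

# Hypothetical 1000 m x 500 m study area
points = random_points(5, x_max=1000.0, y_max=500.0, seed=1)
for x, y in points:
    print(f"x = {x:.1f} m, y = {y:.1f} m")
```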
11
Q

Types of sampling designs? (5)

A
  • Simple random sampling.
  • Stratified random sampling.
  • Systematic sampling.
  • Cluster sampling.
  • Double sampling.
12
Q

Simple random sampling attributes? (5)

A
  • We select n units from a population of N.
  • Each unit has the same probability of being selected.
  • Selection of each unit is independent.
  • Sampling without replacement (SWOR) produces more precise estimates.
  • Good to use when the attribute of interest is homogeneous.
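
A minimal sketch of selecting n units from a frame of N without replacement. The frame of student IDs, the function name `simple_random_sample`, and the seed are hypothetical; `random.sample` is the standard-library call that draws without replacement.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct units from the sampling frame (SWOR)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # every unit has the same chance of selection

# Hypothetical sampling frame with N = 100 units
frame = [f"student_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```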
13
Q

Details of Simple random sampling? (5)

A
  • N is assumed to be finite.
  • Possible to locate & identify each sampling unit & measure variables of interest (measurement error must be much smaller than the sampling error).
  • Sampling frame consists of distinct, non-overlapping sample units (has to do with the fact that each sampling unit is independent).
  • Sampling units (eg, plots) can be of different sizes, but this adds variability & complexity to the analysis.
  • If possible, sample without replacement, as it produces more precise estimates.
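
The precision gain from SWOR can be seen in the standard textbook variances of the sample mean. The notation is assumed here, not taken from the cards: \(\bar{y}\) is the sample mean, \(\bar{Y}\) the population mean, \(S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2\), and \(\sigma^2 = \frac{N-1}{N}S^2\).

```latex
\[
\operatorname{Var}_{\text{SWOR}}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{S^2}{n},
\qquad
\operatorname{Var}_{\text{SWR}}(\bar{y}) = \frac{\sigma^2}{n} = \frac{N-1}{N}\,\frac{S^2}{n}
\]
\[
\frac{\operatorname{Var}_{\text{SWOR}}(\bar{y})}{\operatorname{Var}_{\text{SWR}}(\bar{y})}
= \frac{N-n}{N-1} \le 1 \quad \text{for } n \ge 1
\]
```

The ratio equals 1 only when n = 1, so for any larger sample SWOR gives the smaller variance.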
14
Q

N?

A

= total number of units in a population.

15
Q

Thing to note about random sampling?

A

Can sometimes produce a clumped or patchy distribution of sampling units.

16
Q

Why use Simple random sampling?

A
17
Q

Stratified random sampling attributes? (5)

A
  • We designate homogeneous strata from the sampling frame.
  • Then we spread the sampling effort between the strata.
  • We can treat the strata as domains of study (eg, to compare between them).
  • Sample & generate estimates by stratum, but then combine estimates with an overall measure of uncertainty/precision.
  • Good option if variability within the strata < the variability between the strata (provides more precise estimates).
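
A minimal sketch of "estimate by stratum, then combine", using the standard stratified estimator of a mean (stratum means weighted by stratum size, with a finite population correction in the variance). The stratum sizes and measurements are made-up, and the function name `stratified_mean` is hypothetical.

```python
import math
import statistics

def stratified_mean(strata):
    """strata: list of (N_h, sample_values). Returns (estimate, standard error)."""
    N = sum(N_h for N_h, _ in strata)
    estimate = 0.0
    variance = 0.0
    for N_h, values in strata:
        n_h = len(values)
        W_h = N_h / N                      # stratum weight
        estimate += W_h * statistics.mean(values)
        fpc = 1 - n_h / N_h                # finite population correction
        variance += W_h ** 2 * fpc * statistics.variance(values) / n_h
    return estimate, math.sqrt(variance)

# Hypothetical strata: (stratum size, sampled measurements)
strata = [(600, [10.0, 12.0, 11.0]), (400, [20.0, 22.0, 21.0])]
est, se = stratified_mean(strata)
print(f"overall mean = {est:.2f}, SE = {se:.3f}")
```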
18
Q

Stratified random sampling design uses a number of ways to allocate sample units among strata, what are they? (3)

A
  • Proportional to size.
  • Proportional to variability.
  • Based on economic or logistical considerations.
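
A minimal sketch of the first two allocation rules. "Proportional to variability" is illustrated here with Neyman allocation (weights N_h × SD_h), which is one common choice and an assumption on my part, not something the cards specify. The stratum sizes, standard deviations, and the helper name `allocate` are hypothetical.

```python
def allocate(total_n, weights):
    """Split total_n sample units across strata in proportion to weights.

    Uses largest-remainder rounding so the allocations sum to total_n.
    """
    total_w = sum(weights)
    raw = [total_n * w / total_w for w in weights]
    alloc = [int(r) for r in raw]          # round down first
    leftovers = total_n - sum(alloc)
    # Give the remaining units to the strata with the largest fractional parts
    order = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftovers]:
        alloc[i] += 1
    return alloc

sizes = [500, 300, 200]   # hypothetical stratum sizes N_h
sds = [2.0, 5.0, 10.0]    # hypothetical stratum standard deviations

proportional = allocate(100, sizes)
neyman = allocate(100, [N_h * s for N_h, s in zip(sizes, sds)])
print(proportional, neyman)
```

Note how the smallest but most variable stratum receives the largest share under the variability-based rule.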
19
Q

Goal of Stratified random sampling design?

A

To improve precision through optimal allocation of sampling effort.

20
Q

Why use Stratified random sampling?

21
Q

Systematic sampling?

A

= we select sample units at regular intervals after a random start.

22
Q

Systematic sampling attributes? (4)

A
  • Each transect/plot is a sampling unit.
  • Done to reduce bias.
  • The mathematics is more complicated, but usually precision is better.
  • A potential problem arises if the arrangement of sample units coincides with an unknown cyclic pattern.
23
Q

Egs of Systematic sampling? (2)

A
  • A plot or transect every 50m.
  • We sample every kth person in a list.
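
A minimal sketch of the "every kth person" example with a random start. The list of 100 people, the interval k = 10, the seed, and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every kth unit from the frame after a random start in [0, k)."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start
    return frame[start::k]     # regular interval from there on

people = list(range(1, 101))   # hypothetical list of 100 people
sample = systematic_sample(people, k=10, seed=7)
print(sample)
```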
24
Q

Why use Systematic sampling design?

A

To have a well-spread-out sample.

25
Q

Cluster sampling attributes? (4)

A
  • No sampling frame available for individuals but there is for groups of individuals (eg, vegetation patches rather than individual plants).
  • Each sample unit is a collection/cluster of individual elements.
  • Good if variation within patches > variation between patches (provides more precise estimates).
  • Consists of various stages.
26
Q

Stages of Cluster sampling? (3)

A
  • 1-stage sampling.
  • 2-stage sampling.
  • Multi-stage cluster sampling.
27
Q

1-stage sampling?

A

= we randomly select clusters & then measure all elements within them.

28
Q

2-stage sampling?

A

= involves the random selection of clusters & random sample of elements within the clusters.
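
A minimal sketch contrasting 1-stage and 2-stage cluster sampling. The vegetation patches, plant labels, sample sizes, seed, and the function names `one_stage` and `two_stage` are all hypothetical.

```python
import random

def one_stage(clusters, n_clusters, rng):
    """Randomly select clusters, then keep ALL elements within them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}

def two_stage(clusters, n_clusters, n_elements, rng):
    """Randomly select clusters, then randomly sample elements within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_elements, len(clusters[c])))
        for c in chosen
    }

# Hypothetical frame of patches (clusters), each holding 8 plants (elements)
patches = {f"patch_{p}": [f"plant_{p}_{i}" for i in range(8)] for p in "ABCDE"}
rng = random.Random(3)
stage1 = one_stage(patches, 2, rng)      # 2 patches, all 8 plants each
stage2 = two_stage(patches, 2, 3, rng)   # 2 patches, 3 plants each
print(stage1)
print(stage2)
```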

29
Q

Why use Cluster sampling?

A

I think we use it (1) if the variation within the patches is greater than the variation between the patches & (2) if there is no sampling frame for individuals but there is one for groups.

30
Q

Double sampling?

A

= where we work with two measurements that are correlated, one that is easy to measure & a second that is harder to measure but is of more interest to the scientist.

31
Q

Double sampling attributes? (4)

A
  • One set of sample units consists of easy-to-conduct measurements, & we can do many of them.
  • A second, smaller set consists of sample units where we do both the easy-to-conduct measurement & the more difficult measurement that we’re actually interested in.
  • We estimate a relationship between the 2 measurements using something like regression.
  • We use the regression & the measurements from the 1st set of sample units to estimate the 2nd (harder-to-measure) quantity of interest.
32
Q

Eg of Double sampling?

A

We’re interested in grass biomass & how it varies across the veld, but it’s difficult to measure accurately. Grass height is easier to measure & it’s correlated with biomass.

33
Q

So, explain the Double sampling eg? (4)

A
  • Step 1: Get the 1st set of samples, which is grass height (n = lots).
  • Step 2: Get the 2nd set of samples, which are grass height & clipped dry biomass (n = few).
  • Step 3: Estimate the relationship in the 2nd set of samples (regression).
  • Step 4: Calculate biomass for samples in 1st set of samples.
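
The four steps above can be sketched as follows, assuming a simple straight-line (least-squares) relationship between grass height and biomass. All numbers are made-up illustrative values, and the helper name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Step 1: large set, grass height only (cm); hypothetical values
heights_large = [12.0, 18.0, 25.0, 30.0, 9.0, 22.0, 15.0, 27.0]

# Step 2: small set, grass height (cm) & clipped dry biomass (g); hypothetical values
heights_small = [10.0, 20.0, 30.0]
biomass_small = [25.0, 45.0, 65.0]

# Step 3: estimate the height-biomass relationship from the small set
slope, intercept = fit_line(heights_small, biomass_small)

# Step 4: predict biomass for every plot in the large set
predicted = [intercept + slope * h for h in heights_large]
mean_biomass = sum(predicted) / len(predicted)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, mean biomass = {mean_biomass:.2f}")
```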
34
Q

Why use Double sampling?

A

I think we can use it when our attribute of interest is difficult to measure but is correlated with a variable that is easy to measure.

35
Q

2 Elements to consider when planning field logistics?

A
  • Plot size.
  • Plot shape.
36
Q

What does the plot size & plot shape depend on? (2)

A
  • The objectives of the study.
  • The nature of the data/study system.
37
Q

Thing to note when deciding which plot size or plot shape to use?

A

It might require experimenting with different plot sizes & plot shapes to find an optimal size or shape.

38
Q

Criteria for deciding on plot size & shape? (3)

A
  • Statistical: the procedure that gives the best precision given the cost & area of sampling.
  • Ecological: the best efficiency to achieve the objectives of the study.
  • Logistical: the greatest ease of implementation in the field.
39
Q

4 Factors that influence plot shape?

A
  • Detection of individuals.
  • Distribution of individuals.
  • Edge effects (i.e., knowing where the subject is with respect to the plot boundary).
  • Data collection methods.
40
Q

Types of plot shapes? (3)

A
  • Long, narrow plots.
  • Square or circular plots.
  • Narrow, rectangular plots.
41
Q

Long, narrow plots attributes? (4)

A
  • Easy to lay out.
  • Have a lot of perimeter with respect to plot area.
  • More edge effects.
  • Problems with identifying whether animals are inside or outside plots.
42
Q

Square or circular plots attribute?

A

Have fewer problems with edge effects.

43
Q

Narrow, rectangular plots attributes? (3)

A
  • Increased detection of study subjects.
  • Increase chance of intersecting clumps/clusters.
  • Best for clusters (better precision in vegetation studies).
44
Q

Thing to note when you have a fixed budget?

A

There is a trade-off between the number of plots & plot size.

45
Q

For instance, what to do if you have a homogeneous population?

A

Use a few large plots, as they will capture the variability in the population.

46
Q

What to do if you have a high degree of heterogeneity?

A

A large number of smaller plots might be necessary.

  • If plots are too small, there will be too many zeros, leading to poor precision (high SE).