Week 8 - Sampling w/ unequal probabilities Flashcards

1
Q

For stratified sampling, which allocation method results in an equal probability of selection?

A

Proportional allocation only.
Equal, Neyman and optimal allocation do not.
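
A quick check of this answer, as a minimal sketch. The stratum sizes and the total sample size are made-up illustrative numbers, not from the course notes.

```python
# Hypothetical stratum sizes N_h and total sample size n (illustrative only).
stratum_sizes = [5000, 3000, 2000]
n = 500
N = sum(stratum_sizes)

# Proportional allocation: n_h = n * N_h / N, so every f_h = n_h / N_h equals n / N.
proportional = [n * N_h / N for N_h in stratum_sizes]
prop_fractions = [n_h / N_h for n_h, N_h in zip(proportional, stratum_sizes)]

# Equal allocation: n_h = n / H, so f_h differs whenever the stratum sizes differ.
H = len(stratum_sizes)
equal = [n / H] * H
equal_fractions = [n_h / N_h for n_h, N_h in zip(equal, stratum_sizes)]

print("proportional:", prop_fractions)
print("equal:       ", equal_fractions)
```

Under proportional allocation every sampling fraction equals n/N, so all units share the same selection probability; under equal allocation the smallest stratum is sampled at the highest rate.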

2
Q

Sometimes estimates can be improved by unequal probabilities. What are the 3 main reasons for unequal probabilities?

A
  1. Disproportionate stratification
  2. Multi-stage sampling
  3. Probability proportional to size (PPS) sampling
3
Q

Disproportionate stratification

A
  1. Size of sample drawn from a particular stratum is NOT proportional to the relative size of that stratum
  2. 2 or more strata will have DIFF. SAMPLING FRACTIONS, f=n/N
  3. May be desirable b/c we want to achieve precision requirements for small strata (which would not be achieved by equal prob. sampling b/c of the small size of those strata)
4
Q

Multi-stage cluster sampling + 1 example

[5m, 2019]

*may be desirable b/c want to achieve precision requirements (which would not be achieved by equal prob. sampling b/c of the small size of some parts)

A
  1. We take an SRS of clusters
  2. then an SRS of elements within those clusters
  3. then an SRS of smaller elements within those elements
  4. and so on until the final sample elements are reached.

Prob. of selection of person k in household j in cluster i

$\pi_{ijk} = \frac{n}{N} \cdot \frac{m_i}{M_i} \cdot \frac{l_{ij}}{L_{ij}}$

e.g. 4-stage cluster sample: we may seek a sample of European children for educational testing.
- Take an SRS of countries.
- Take an SRS of education authorities (or counties) in each country.
- Take an SRS of schools in each authority or county.
- Take an SRS of children in each school.
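
The three-stage selection probability πijk above can be checked with a few lines of code. This is only a sketch: the function name and all counts are hypothetical, chosen to show that the overall probability is the product of the stage-wise sampling fractions.

```python
from fractions import Fraction

def selection_probability(n, N, m_i, M_i, l_ij, L_ij):
    """Overall selection probability for a 3-stage SRS design:
    n of N clusters, m_i of M_i households, l_ij of L_ij persons."""
    return Fraction(n, N) * Fraction(m_i, M_i) * Fraction(l_ij, L_ij)

# Hypothetical counts: 10 of 100 clusters, 5 of 50 households, 1 person per household.
pi_household_of_4 = selection_probability(10, 100, 5, 50, 1, 4)
pi_household_of_2 = selection_probability(10, 100, 5, 50, 1, 2)

print(pi_household_of_4)  # 1/400
print(pi_household_of_2)  # 1/200
```

Taking one person per household gives people in smaller households a higher chance of selection, which is one way multi-stage designs end up with unequal probabilities.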

5
Q

Explain what is meant by probability proportional to size (PPS) sampling and explain briefly the purpose of its use. [6m, 2017]

*gives unbiased estimation also

A
  1. The PPS sampling scheme can be employed when {sampling} units VARY BY SIZE, which is measured by z, and the y variables of main concern are roughly PROPORTIONAL to z. [3m]
  2. PPS sampling scheme improves PRECISION compared to SRS
    - by giving larger units a greater chance of INCLUSION in the survey
  3. Can also have PPS with replacement sampling, which is easy to implement, or w/o replacement sampling, which can be done in various ways.
    [3m]
  • the probability of inclusion in the sample will be proportional to size,
  • so a village of 1500 residents will have 1/100th the chance of selection of a town of 150000.
6
Q

PPS sampling with replacement: Hansen-Hurwitz estimator (unbiased)

A

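$\hat{t}_{pps} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\pi_i}$

$\hat{\bar{y}}_{pps} = \hat{t}_{pps} / N$

$\widehat{\mathrm{Var}}(\hat{t}_{pps}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left( \frac{y_i}{\pi_i} - \hat{t}_{pps} \right)^2$

Here πi = zi/tz is the probability of selecting unit i on a single draw, and the sums run over the n draws.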
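
The Hansen-Hurwitz formulas above, as a minimal sketch. The function name and the sample values are made up for illustration; `p` holds the single-draw selection probabilities of the sampled units.

```python
def hansen_hurwitz(y, p):
    """Hansen-Hurwitz estimate of the population total and its estimated variance.

    y: sampled y values, one per draw (repeats allowed, sampling is with replacement)
    p: single-draw selection probability of each sampled unit
    """
    n = len(y)
    ratios = [yi / pi for yi, pi in zip(y, p)]
    t_hat = sum(ratios) / n
    var_hat = sum((r - t_hat) ** 2 for r in ratios) / (n * (n - 1))
    return t_hat, var_hat

# Made-up sample of n = 3 draws.
y = [20, 50, 30]
p = [0.1, 0.25, 0.2]
t_hat, var_hat = hansen_hurwitz(y, p)
print(t_hat, var_hat)
```

Dividing `t_hat` by the population size N gives the estimated mean.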

7
Q

PPS sampling without replacement: Horvitz-Thompson estimator (unbiased)

A

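$\hat{t}_{pps,wor} = \sum_{i=1}^{n} \frac{y_i}{\pi_i}$

(here πi is the probability that unit i is included in the sample, not the single-draw probability)

The std error is complicated to estimate, so the PPS w/ replacement variance estimator is usually used as an approximation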
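
A small sketch of the Horvitz-Thompson total. It assumes a design whose inclusion probabilities are exactly proportional to size, πi = n·zi/tz; the sizes, the sampled units and the y values are all made up.

```python
def horvitz_thompson_total(y, incl_prob):
    """Sum of y_i / pi_i over the sampled units (pi_i = inclusion probability)."""
    return sum(yi / pi for yi, pi in zip(y, incl_prob))

# Hypothetical population sizes z_i and a fixed sample size n = 2.
sizes = [10, 20, 30, 40]
n = 2
t_z = sum(sizes)

# Assumed design: pi_i = n * z_i / t_z, so the pi_i sum to n.
incl = [n * z / t_z for z in sizes]

# Suppose units 1 and 3 (0-based) were sampled, with made-up y values.
sample_units = [1, 3]
sample_y = [22, 41]
t_hat = horvitz_thompson_total(sample_y, [incl[i] for i in sample_units])
print(t_hat)
```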

8
Q

Steps for selecting a PPS sample

A
  1. Find the total of zi, the sizes of the N units in the pop, & the values of probability πi based on proportionality
    - i.e. let $t_z = \sum_{i=1}^{N} z_i$ and let $\pi_i = z_i / t_z$
  2. Generate RANDOM INTEGERS between 1 and the total tz
  3. For each random no. drawn, determine the unit i whose cumulative-size range contains it, and hence the corresponding yi

{I guess for selecting sample alone we don’t actually need πi but it’s needed for estimating the mean and total}
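
The steps above can be sketched with the cumulative-size method. This is an illustration under assumptions: the sizes are made-up positive integers, sampling is with replacement, and the function name is hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_with_replacement(sizes, n, rng):
    """Draw n unit indices (0-based) with probability proportional to size."""
    cumulative = list(accumulate(sizes))  # running totals of z_i
    total = cumulative[-1]                # t_z
    draws = []
    for _ in range(n):
        r = rng.randint(1, total)         # random integer in 1..t_z inclusive
        # Unit i is selected when cumulative[i-1] < r <= cumulative[i].
        draws.append(bisect_left(cumulative, r))
    probs = [z / total for z in sizes]    # pi_i = z_i / t_z, needed for estimation
    return draws, probs

sizes = [10, 20, 30]                      # made-up sizes z_i
draws, probs = pps_with_replacement(sizes, 5, random.Random(0))
print(draws, probs)
```

Each unit owns as many integers in 1..tz as its size, so a uniform random integer picks unit i with probability zi/tz.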

9
Q

Explain clearly why sampling with probability proportional to size may sometimes be preferred to simple random sampling.

[3m, 2012]

A
  1. Say we want to estimate the total of the Y-values.
  2. If larger units tend to have larger Y values, it makes sense to assign a higher probability to sampling them, as they contribute more to the total.
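
A numeric illustration of this argument, as a sketch on a made-up population where y is roughly proportional to size. To keep the comparison like for like it uses with-replacement sampling for both designs (an assumption), and the standard exact variance of the Hansen-Hurwitz total, (1/n) Σ pi (yi/pi − t)², where pi is the single-draw probability.

```python
def hh_variance(y, p, n):
    """Exact variance of the Hansen-Hurwitz total for n with-replacement draws."""
    t = sum(y)
    return sum(pi * (yi / pi - t) ** 2 for yi, pi in zip(y, p)) / n

# Made-up population: y is roughly proportional to the size z.
sizes = [1, 2, 3, 4]
y = [11, 19, 32, 38]
n = 2

p_pps = [z / sum(sizes) for z in sizes]  # draw probability proportional to size
p_srs = [1 / len(y)] * len(y)            # equal draw probabilities

var_pps = hh_variance(y, p_pps, n)
var_srs = hh_variance(y, p_srs, n)
print(var_pps, var_srs)
```

When yi/zi is nearly constant, the ratios yi/pi barely vary under PPS, so the estimated total is much more stable than under equal-probability draws.
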
10
Q

Give an intuitive reason that Cov(Yi, Yj) is negative.

[2m, 2012]

A
  • Since we sample W/O replacement, if 1 observation is larger than the mean, the remaining ones are on average smaller than the mean (and vice versa),
  • hence a negative covariance.
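
The sign can be verified by brute force on a tiny made-up population: enumerate every equally likely ordered pair of draws without replacement and compute the covariance exactly. The population values are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

population = [1, 2, 3, 10]  # made-up values
N = len(population)
mu = Fraction(sum(population), N)

# All ordered (first draw, second draw) pairs without replacement, equally likely.
pairs = list(permutations(population, 2))
cov = sum((a - mu) * (b - mu) for a, b in pairs) / len(pairs)

# Population variance with divisor N.
sigma2 = sum((v - mu) ** 2 for v in population) / N

print(cov)                  # -25/6
print(-sigma2 / (N - 1))    # -25/6
```

The enumeration matches the closed form −σ²/(N−1), which is negative whenever the population values are not all equal.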