Week 2 Flashcards
Target population
Finite population of units about which we need information. Units may be individuals.
e.g. all persons aged 18-24 living in England
Sampled/study population
The population we intend to study
e.g. all persons aged 18-24 living in England with a mobile phone
Sampling frame
A list of sampling units (or population elements) from which the sample is selected
e.g. electoral register, postcode address file
Population characteristic
Aggregate feature of population, which is a function of the values taken by one (or more) variables for different units in population.
e.g. the PROPORTION of smokers in the population of persons aged 18-24; AVERAGE satisfaction with teaching among LSE students; TOTAL expenditures on leisure activities by households in the UK.
Observation unit
An object on which a measurement is taken.
This is the basic unit of observation, also called an element, e.g. an individual, or a household when measuring average household income.
Sampling unit
The unit we actually sample. We may want to interview individuals but we do not have a list of individuals of our target population. Instead, we sample households first (sampling units) and then interview the individuals living in a household (observation unit). e.g. individuals sampled within addresses sampled within areas.
How to do simple random sampling (SRS)?
- List all the population units & assign a unique ID to each. 1, 2, …, N
- Use a random number generator to generate n distinct random numbers between 1 and N
- Draw the n units with those IDs from the list
Every possible sample of size n has the same probability of being selected
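The SRS steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function name `simple_random_sample`, the population of N = 20 IDs and the seed are made up for the example. It uses the standard library's `random.sample`, which draws without replacement.

```python
import random

def simple_random_sample(population_ids, n, seed=None):
    """Draw n distinct units from the list, each subset of size n equally likely."""
    rng = random.Random(seed)              # seed only makes the example repeatable
    return rng.sample(population_ids, n)   # sampling WITHOUT replacement

# Hypothetical population: units labelled 1, 2, ..., N with N = 20
ids = list(range(1, 21))
sample = simple_random_sample(ids, n=5, seed=1)
print(sorted(sample))
```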
How to do systematic sampling?
+ 1 circumstance where a systematic sample may lead to an imprecise estimation
- Define the set of all the population units, count number N & assign a unique ID to each. 1, 2, …, N
-> List units in some order; don’t want this to display periodicity w.r.t. variable of interest
// imprecise estimation if the variable of interest displays CYCLICAL behaviour for every kth university on the list
- Compute the step, k = N/n (integer)
-> greatest integer less than or equal to N/n
- Generate a random number R between 1 and k as the starting unit
- Select every kth unit afterwards, i.e. select the n units from the list according to: R, R+k, R+2k, …
^ the subsequent units are predetermined by the step
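The systematic sampling steps can be sketched in the same way. This is an illustrative sketch only: the function name `systematic_sample` and the values N = 100, n = 10 are assumptions for the example, and the units are assumed to be already listed in the order 1, 2, …, N.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the IDs R, R+k, R+2k, ... of a systematic sample of size n."""
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    rng = random.Random(seed)
    k = N // n                # step: greatest integer <= N/n
    R = rng.randint(1, k)     # random start between 1 and k (inclusive)
    # Only R is random; every later unit is fixed by the step k
    return [R + i * k for i in range(n)]

# Hypothetical list of N = 100 units, sample of n = 10, so k = 10
selected = systematic_sample(N=100, n=10, seed=0)
print(selected)
```

Because only the start R is random, there are just k possible samples. If the listed values repeat with a period equal to k, every selected unit sits at the same point of the cycle, which is the circumstance in which the estimate becomes imprecise.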
Pro & con of systematic sampling [2013]
Pro: simple to implement
Con: unbiased variance estimation may not be possible
{imprecise estimation if the variable of interest displays CYCLICAL behaviour for every kth step on the list}
How to measure the quality of an estimate?
Measured by survey error
- the estimate minus the quantity being estimated
e.g. y_S − y_U = 2.50 − 3.27 = −0.77, where y_S is the estimate from the sample and y_U is the population quantity being estimated
(1 - n/N) Finite population correction - when does it approach 1? What is its purpose?
Explain the circumstances under which it should be used. Would it make much difference in this situation? [2m, 2020]
- Approaches 1 if the sampling fraction n/N is small & the population size is large, i.e. if only a SMALL FRACTION of a LARGE POP. is surveyed
- Purpose: REDUCES the VARIANCE of the statistic{/estimate} when a LARGE proportion of the sampling frame has been sampled.
- The FPC should be used if the sample size is LARGE RELATIVE to the population size.
> negligible if n < 0.1N, i.e. if the sample size is small relative to the pop. size (from ST107)
- With f = n/N, the larger the sampling fraction, the more precise the estimator
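To see how the correction behaves, the sketch below evaluates 1 − n/N for a fixed sample size and several population sizes, and applies it to the standard variance of the sample mean under SRS without replacement, (1 − n/N) · S²/n. The function names and the values n = 100 and S² = 4 are illustrative assumptions, not figures from the exam question.

```python
def fpc(n, N):
    """Finite population correction, 1 - n/N."""
    return 1 - n / N

def variance_of_sample_mean(s2, n, N):
    """Variance of the sample mean under SRS without replacement."""
    return fpc(n, N) * s2 / n

n, s2 = 100, 4.0   # illustrative sample size and population variance
for N in (200, 1_000, 1_000_000):
    # As N grows with n fixed, the FPC approaches 1 and the
    # variance approaches the uncorrected value s2 / n
    print(N, fpc(n, N), variance_of_sample_mean(s2, n, N))
```

With half the population sampled (N = 200) the correction halves the variance, while for N = 1,000,000 it is so close to 1 that ignoring it makes almost no difference.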
Discuss whether we can calculate the sampling distribution of the sample mean for subjective sampling.
[3m, 2011]
- In subjective sampling, the probability of sampling a unit is typically different for each unit, and hard or impossible to estimate.
- Thus, we cannot obtain the sampling distribution of the sample mean either.