7.1: Sampling Techniques and the Central Limit Theorem Flashcards
Probability sampling
selecting a sample when we know the probability of each sample member in the overall population.
random sampling
Each item is assumed to have the same probability of being selected. If we have a population of data and use a computer to randomly select a number of observations from it, each data point has an equal probability of being selected.
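A minimal sketch of computer-based random selection, assuming a hypothetical population of 100 numbered observations. The helper name `simple_random_sample` is illustrative only; it relies on the standard library's `random.sample`, which draws without replacement so that every item is equally likely to be chosen.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; each item has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical population: observations labelled 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```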
Nonprobability sampling
based on either low cost and easy access to some data items, or on using the judgment of the researcher in selecting specific data items. Less randomness in selection may lead to greater sampling error.
Stratified random sampling
- classification system -> separate the population into smaller groups (strata) based on one or more distinguishing characteristics, then draw a random sample from every group
- Stratified sampling is often used in **bond indexing** because of the difficulty and cost of completely replicating the entire population of bonds
- In this case, bonds in a population are categorized (stratified) according to major bond risk factors including, but not limited to, duration, maturity, and coupon rate
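The idea above can be sketched as follows, assuming a hypothetical bond universe in which each bond carries a single duration bucket as its stratum. The function name, the bucket labels, and the 10% sampling fraction are assumptions made for illustration, not part of the source.

```python
import random
from collections import defaultdict

def stratified_sample(items, key, fraction, seed=None):
    """Group items into strata by key(item), then sample randomly from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one item per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical universe of 90 bonds: (bond_id, duration_bucket)
buckets = ["short", "medium", "long"]
bonds = [(i, buckets[i % 3]) for i in range(90)]
picked = stratified_sample(bonds, key=lambda b: b[1], fraction=0.1, seed=1)
print(picked)
```

Every stratum contributes to the sample, which is the contrast with cluster sampling, where only some subsets are selected.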
Cluster sampling
- also based on subsets of a population, but in this case, we are assuming that each subset (cluster) is representative of the overall population with respect to the item we are sampling.
one-stage cluster sampling
a random sample of clusters is selected, and all the data in those clusters comprise the sample.
two-stage cluster sampling
random samples from each of the selected clusters comprise the sample. Contrast this with stratified random sampling, in which random samples are selected from every subgroup.
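A rough sketch contrasting the two cluster designs, assuming hypothetical clusters named `region_0` to `region_4` with ten observations each. The `per_cluster` parameter is an illustrative switch: leaving it as `None` gives one-stage sampling, and setting it gives two-stage sampling.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Pick clusters at random; keep all their data (one-stage) or a random subset (two-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random sample of cluster labels
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)  # one-stage: every observation in the cluster
        else:
            k = min(per_cluster, len(members))
            sample.extend(rng.sample(members, k))  # two-stage: random subset
    return chosen, sample

# Hypothetical clusters of 10 observations each
clusters = {f"region_{i}": list(range(i * 10, i * 10 + 10)) for i in range(5)}
chosen1, one_stage = cluster_sample(clusters, 2, seed=0)
chosen2, two_stage = cluster_sample(clusters, 2, per_cluster=3, seed=0)
print(chosen1, len(one_stage), len(two_stage))
```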
Convenience sampling
selecting sample data based on ease of access, using data that are readily available. Because such a sample is typically not random, sampling error will be greater.
Judgmental sampling
each observation is selected from a larger dataset by the researcher, based on the researcher's experience and judgment
The standard error of the sample mean
the standard deviation of the distribution of the sample means
**Resampling** of the data
Previously, we used the sample variance to calculate the standard error of our estimate of the mean. The standard error provides better estimates of the distribution of sample means when the sample is unbiased and the distribution of sample means is approximately normal.
Two alternative methods of estimating the standard error of the sample mean involve
1. **Jackknife**
2. **Bootstrap**
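When the standard deviation of the population, σ, is known, the standard error of the sample mean is calculated as:
σ/√n, where n is the sample size.
Central limit theorem
for simple random samples of size n from a population with mean µ and finite variance σ^2, the sampling distribution of the sample mean approaches a normal distribution with mean µ and variance equal to (σ^2)/n as the sample size becomes large.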
- A sample is generally considered sufficiently large when n ≥ 30
- The mean of the population, µ, and the mean of the distribution of all possible sample means are equal.
- The variance of the distribution of sample means is σ^2/n , the population variance divided by the sample size.
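The properties above can be checked with a small simulation. This is a sketch under stated assumptions: the population is a hypothetical skewed one (exponential draws), samples of size n = 30 are drawn with replacement, and the number of trials is arbitrary. The mean of the sample means should land close to µ, and their standard deviation close to σ/√n.

```python
import random
import statistics

def sample_means(population, n, trials, seed=None):
    """Return the means of `trials` random samples of size n (drawn with replacement)."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)]

# Hypothetical, clearly non-normal population
pop_rng = random.Random(0)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population standard deviation

n = 30
means = sample_means(population, n, trials=5_000, seed=1)

print("population mean:", mu, "| mean of sample means:", statistics.fmean(means))
print("sigma / sqrt(n):", sigma / n ** 0.5, "| std dev of sample means:", statistics.stdev(means))
```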
Jackknife
calculates multiple sample means, each with one of the observations removed from the sample. The standard deviation of these sample means can then be used as an estimate of the standard error of sample means. The jackknife is a computationally simple tool and can be used when the number of observations available is relatively small. This method can remove bias from statistical estimates.
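A minimal sketch of the jackknife for the sample mean, using hypothetical return data. One assumption to flag: the code uses the usual jackknife formula, which rescales the spread of the leave-one-out means by (n − 1)/n inside the square root rather than taking their raw standard deviation. For the sample mean, this reproduces the familiar s/√n.

```python
import math
import statistics

def jackknife_se_mean(data):
    """Jackknife standard error of the mean from leave-one-out sample means."""
    n = len(data)
    total = sum(data)
    # Mean of the sample with observation x removed, for each x in turn
    loo_means = [(total - x) / (n - 1) for x in data]
    center = statistics.fmean(loo_means)
    # Usual jackknife scaling: (n - 1) / n times the sum of squared deviations
    return math.sqrt((n - 1) / n * sum((m - center) ** 2 for m in loo_means))

# Hypothetical small sample of periodic returns
returns = [0.02, -0.01, 0.03, 0.015, -0.005, 0.01, 0.025, 0.0]
se_jack = jackknife_se_mean(returns)
se_formula = statistics.stdev(returns) / math.sqrt(len(returns))
print(se_jack, se_formula)
```

The agreement with s/√n holds for the mean specifically; for other statistics the same leave-one-out recipe applies, but there is generally no closed-form shortcut to compare against.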