Part 5. Sampling & Estimation Flashcards
Simple Random Sampling
A method of selecting a sample in such a way that each item or person in the population being studied has the same likelihood of being included in the sample.
e.g. picking random numbers out of a bag.
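A minimal sketch of simple random sampling in Python using the standard library; the population and sample size here are illustrative assumptions:

```python
import random

random.seed(42)  # fixed seed so the draw is reproducible

# Hypothetical population: 1000 numbered items
population = list(range(1000))

# random.sample gives each item an equal chance of inclusion,
# without replacement
sample = random.sample(population, k=10)
print(sample)
```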
Systematic sampling
Another way to form an approximately random sample, by selecting every nth member from a population.
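A sketch of systematic sampling, again with an assumed population and sample size. A random starting point within the first interval is chosen, then every nth member is taken:

```python
import random

population = list(range(1, 1001))  # hypothetical population of 1000 members
n = 50                             # desired sample size
step = len(population) // n        # select every 20th member

start = random.randrange(step)     # random start within the first interval
sample = population[start::step]
print(len(sample))  # 50
```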
Sampling error
The difference between a sample statistic (the mean, variance, or standard deviation of the sample) and its corresponding population parameter (the true mean, variance or standard deviation of the population).
sampling error of the mean = sample mean (x̄) − population mean (μ)
Sampling distribution
(Of a sample statistic)
A probability distribution of all possible sample statistics computed from a set of equal-size samples that were randomly drawn from the same population.
Sampling distribution of the mean
Suppose a random sample of 100 bonds is selected from the population of a major municipal bond index consisting of 1,000 bonds, and then the mean return of the 100-bond sample is calculated.
Repeating this process many times will result in many different estimates of the population mean return.
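The bond example above can be simulated; the population of returns below is invented (normally distributed around a 5% mean) purely for illustration:

```python
import random
import statistics

random.seed(0)

# Hypothetical "index" of 1000 bond returns, mean ~5%, sd ~2%
population = [random.gauss(0.05, 0.02) for _ in range(1000)]

# Draw many 100-bond samples; each sample mean is one estimate
# of the population mean
sample_means = [
    statistics.mean(random.sample(population, 100)) for _ in range(2000)
]

# The estimates vary, but they centre on the true population mean
print(round(statistics.mean(population), 4))
print(round(statistics.mean(sample_means), 4))
```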
Stratified random sampling
Uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics.
From each subgroup (stratum), a random sample is taken and the results are pooled; the size of the sample from each stratum is based on its size relative to the population.
Stratified Sampling Example
Used often in bond indexing, due to the difficulty and cost of completely replicating the entire population of bonds.
The bonds in a population are categorised (stratified) according to major bond risk factors such as duration, maturity, coupon rate, and the like.
The samples are drawn from each separate category and combined to form a final sample.
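A sketch of proportional stratified sampling; the strata (duration buckets), their sizes, and the bond labels are assumptions for illustration:

```python
import random

random.seed(1)

# Hypothetical bond population stratified by duration bucket
strata = {
    "short":  [f"S{i}" for i in range(500)],   # 50% of population
    "medium": [f"M{i}" for i in range(300)],   # 30%
    "long":   [f"L{i}" for i in range(200)],   # 20%
}

total = sum(len(bonds) for bonds in strata.values())
sample_size = 100

# Sample from each stratum in proportion to its share of the
# population, then pool the results
sample = []
for name, bonds in strata.items():
    k = round(sample_size * len(bonds) / total)
    sample.extend(random.sample(bonds, k))

print(len(sample))  # 100
```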
Time series data
This consists of observations taken over a period of time at specific and equally spaced time intervals.
e.g. the set of monthly returns on Microsoft stock from January 1994 to January 2004.
Cross-sectional data
A sample of observations taken at a single point in time.
e.g. the sample of reported earnings per share of all Nasdaq companies as of Dec 31, 2004.
Longitudinal data
Observations over time of multiple characteristics of the same entity, such as unemployment, inflation and GDP growth rates for a country over 10 years.
Panel data
This contains observations over time of the same characteristic for multiple entities, such as debt/equity ratios for 20 companies over the most recent 24 quarters.
Central Limit Theorem
For simple random samples of size n from a population with mean μ and finite variance σ², the sampling distribution of the sample mean (x̄) approaches a normal probability distribution with mean μ and variance σ²/n as the sample size becomes large.
Useful as the normal distribution is relatively easy to apply to hypothesis testing, and construction of confidence intervals.
Inferences about the population mean can be made from the sample mean, regardless of the population's distribution, as long as the sample size is "sufficiently large", usually n ≥ 30.
Important properties of central limit theorem:
- If the sample size n is sufficiently large (n ≥ 30), the sampling distribution of the sample means will be approximately normal.
- When random samples of size n are repeatedly taken from the overall population, each random sample has its own mean, which is itself a random variable, and this set of sample means has a distribution that is approximately normal.
- The mean of the population (μ) and the mean of the distribution of all possible sample means are equal.
- The variance of the distribution of sample means is σ²/n, the population variance divided by the sample size.
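The properties above can be checked by simulation. Here the population is deliberately non-normal (uniform on [0, 1], so μ = 0.5 and σ² = 1/12), yet the sample means still behave as the theorem predicts; the sample size and number of trials are arbitrary choices:

```python
import random
import statistics

random.seed(2)

# Decidedly non-normal population: uniform on [0, 1]
# Population mean = 0.5, population variance = 1/12
n = 36  # sample size (>= 30)

sample_means = [
    statistics.mean(random.random() for _ in range(n)) for _ in range(5000)
]

print(round(statistics.mean(sample_means), 3))      # near mu = 0.5
print(round(statistics.variance(sample_means), 5))  # near sigma^2/n = (1/12)/36
```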
Standard deviation of the means of multiple samples:
This is less than the standard deviation of single observations.
If the standard deviation of monthly stock returns is 2%, the standard error (deviation) of the average monthly return over the next six months is 2%/√6 ≈ 0.82%.
The average of several observations of a random variable will be less widely dispersed (lower standard deviation) around the expected value than a single observation of the random variable.
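The 2%/√6 calculation from the example, worked in code:

```python
import math

monthly_sd = 0.02  # standard deviation of monthly returns (2%)
n = 6              # averaging over six monthly observations

# Standard error of the mean = sigma / sqrt(n)
standard_error = monthly_sd / math.sqrt(n)
print(round(standard_error * 100, 2))  # 0.82 (%)
```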
Desirable properties of an estimator:
- Unbiasedness
- Efficiency
- Consistency
Unbiasedness
An estimator for which the expected value of the estimator is equal to the parameter you are trying to estimate.
E(x̄) = μ, i.e. the expected value of the sample mean equals the population mean.
Efficiency
An unbiased estimator is efficient if the variance of its sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate.
i.e. sample mean, an unbiased and efficient estimator of population mean.
Consistency
An estimator for which the accuracy of the parameter estimate increases as the sample size increases, meaning the standard error of the sample mean falls and the sampling distribution bunches more closely around the population mean.
As the sample size approaches infinity, the standard error approaches zero.
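Consistency can be illustrated by tabulating the standard error σ/√n for growing n; the population standard deviation is an assumed figure:

```python
import math

sigma = 0.02  # hypothetical population standard deviation

# Standard error of the sample mean shrinks toward zero as n grows
errors = []
for n in (10, 100, 1000, 10000):
    se = sigma / math.sqrt(n)
    errors.append(se)
    print(n, round(se, 5))
```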
Point estimates
These are single (sample) values used to estimate population parameters.
Estimator = the formula used to compute the point estimate.
Confidence intervals
A range of values in which the population parameter is expected to lie.
Student’s t-distribution
A bell-shaped probability distribution that is symmetrical about its mean.
Appropriate for:
- Constructing confidence intervals based on small samples (n < 30) from populations with unknown variance and a normal or approximately normal distribution.
- When the population variance is unknown and the sample size is large enough that the central limit theorem will assure the sampling distribution is approximately normal.
Properties of Student's t-distribution:
- It is symmetrical.
- Defined by a single parameter, the degrees of freedom (df), equal to the number of sample observations minus 1 (n − 1) for sample means.
- Has more probability in the tails (fatter tails) than the normal distribution.
- As the degrees of freedom (sample size) get larger, the shape of the t-distribution more closely approaches a standard normal distribution.
Student t-distribution movement
As the number of observations increases (df increases), the t-distribution becomes more spiked and its tails become thinner.
As df increases without bound, the t-distribution converges to the standard normal distribution (z-distribution).
The thickness of the tails relative to those of the z-distribution is important in hypothesis testing, because thicker tails mean more observations away from the center of the distribution (more outliers).
Hypothesis testing using the t-distribution therefore makes it more difficult to reject the null relative to hypothesis testing using the z-distribution.
Confidence interval
These estimates result in a range of values within which the actual value of a parameter will lie, given a probability of 1 − α (alpha).
These are constructed by adding or subtracting an appropriate value from the point estimate.
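A sketch of a large-sample, z-based 95% confidence interval using only the standard library; the sample mean, standard deviation, and sample size are invented figures:

```python
import math
from statistics import NormalDist

sample_mean = 0.05  # hypothetical point estimate (5% mean return)
sample_sd = 0.02    # hypothetical sample standard deviation
n = 100             # large sample, so a z-based interval is reasonable
alpha = 0.05        # for a 95% confidence level (1 - alpha)

# Two-tailed critical value from the standard normal distribution
z = NormalDist().inv_cdf(1 - alpha / 2)  # approx. 1.96
standard_error = sample_sd / math.sqrt(n)

# Point estimate +/- (critical value x standard error)
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error
print(round(lower, 4), round(upper, 4))
```

For a small sample (n < 30) with unknown variance, the t-distribution's critical value would replace z, as noted under the t-distribution card above.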