(R10) Sampling and Estimation Flashcards
Define Simple Random Sampling and provide two methods
Each element of the population has an equal probability of being chosen; 1) use a random number generator or 2) select every kth element (systematic sampling)
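A minimal sketch of both selection methods, using only Python's standard library. The function names and the population of 100 numbered items are made up for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement; every element is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical population: 100 items labeled 1..100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=42)
sys_sample = systematic_sample(population, 10, seed=42)
```

Both calls return 10 elements here; the systematic sample is evenly spaced through the list, while the simple random sample is not.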
Sampling Distribution
The distribution of all distinct possible values that a statistic can assume when computed from samples of the same size randomly drawn from the same population
Sampling error
The difference between the observed value of a statistic and the quantity it is intended to estimate (Sample mean - population mean)
Stratified Random Sampling
- Separate the population into smaller groups (strata) based on one or more distinguishing characteristics; then use simple random sampling within each group
- Provides more precise estimates of the mean and variance than simple random sampling
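The two steps above can be sketched as follows. This is an illustration only: the function name, the proportional 10% allocation, and the bond data are assumptions, not part of the card.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, fraction, seed=None):
    """Group items into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        # Proportional allocation: the same fraction from every stratum.
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population: 60 short-maturity and 40 long-maturity bonds.
bonds = [("short", i) for i in range(60)] + [("long", i) for i in range(40)]
picked = stratified_sample(bonds, lambda b: b[0], 0.10, seed=1)
```

Because each stratum is sampled separately, the sample keeps the 60/40 split of the population (6 short and 4 long bonds).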
Three Data Types
- Time Series
- Cross-Sectional
- Panel
Time Series Data
Take a variable or multiple variables and observe how the variables change over a period of time
e.g., monthly returns on Microsoft stock from Jan 1994 to Dec 2004.
Cross-Sectional Data
Multiple observational units at a point in time
e.g., sales for 30 different companies for a particular quarter
Longitudinal Data
Observations over time of multiple characteristics of the same entity, such as unemployment, inflation, and GDP growth rates for a country over 10 years.
Panel Data
Time series + cross sectional. Data that contains observations over time of the same characteristic for multiple entities, such as debt/equity ratios for 20 companies over 24 quarters.
Standard Error Formula
Standard deviation divided by square root of n; the standard deviation of the distribution of the sample means.
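A small sketch of the formula, assuming the sample standard deviation is used when the population value is unknown. The function names and the numbers are illustrative.

```python
import math
import statistics

def standard_error(std_dev, n):
    """Standard error of the sample mean: standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(n)

def standard_error_from_sample(sample):
    """Same formula, with the sample standard deviation standing in for sigma."""
    return standard_error(statistics.stdev(sample), len(sample))

# Known population standard deviation of 20 and a sample of 100 observations.
se_known = standard_error(20.0, 100)

# Hypothetical sample of 8 observations.
se_sample = standard_error_from_sample([2, 4, 4, 4, 5, 5, 7, 9])
```

With a standard deviation of 20 and n = 100, the standard error is 20 / 10 = 2.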
Central Limit Theorem
Theorem that states that for simple random samples of size n from a population with mean u and finite variance sigma^2, the sampling distribution of the sample mean, Xbar, approaches a normal probability distribution with mean u and variance sigma^2 / n as the sample size becomes large.
Properties of CLT
- If the sample size, n, is sufficiently large (n>= 30), the sampling distribution of the sample means will be approximately normal.
- The mean of the population, u, and the mean of the distribution of all possible sample means are equal.
- The variance of the distribution of sample means is sigma^2 / n, the population variance divided by the sample size.
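These properties can be checked with a rough simulation. The sketch below assumes a skewed population (exponential with mean 1 and variance 1), samples of size 30, and 5,000 repetitions; all of these choices, and the function name, are illustrative.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return the means of `trials` independent samples, each of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

# Exponential population with rate 1: mean u = 1, variance sigma^2 = 1.
means = sample_means(lambda rng: rng.expovariate(1.0), n=30, trials=5000)

mean_of_means = statistics.fmean(means)      # should be close to u = 1
var_of_means = statistics.pvariance(means)   # should be close to sigma^2 / n = 1/30
```

Even though the population is far from normal, the sample means cluster around 1 with a variance near 1/30, and a histogram of `means` would look roughly bell-shaped.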
Desired Properties of an Estimator
- Unbiasedness - when the expected value of the estimator is equal to the parameter you are trying to estimate.
- Efficiency - the variance of its sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate.
- Consistency - the accuracy of the parameter estimate increases as the sample size increases.
Point Estimates
Sample mean and sample variance are point estimates
Confidence Interval Formula
Point estimate +/- reliability factor * standard error
C.I. = Xbar +/- z * (sigma / n^(1/2))
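A worked sketch of the interval, assuming a known population standard deviation so that a z reliability factor applies. The function name and the inputs (Xbar = 25, sigma = 20, n = 100, 95% confidence) are illustrative.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Point estimate +/- reliability factor * standard error."""
    # Two-sided reliability factor, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

low, high = z_confidence_interval(xbar=25.0, sigma=20.0, n=100)
```

Here the standard error is 20 / 10 = 2, so the interval is roughly 25 +/- 1.96 * 2, or about 21.08 to 28.92.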