10.1: Central Limit Theorem and Standard Error Flashcards
What is simple random sampling? Provide an example.
Simple random sampling is a method of selecting a sample in a way that each item or person in the population being studied has the same likelihood of being included in the sample.
An example of simple random sampling is placing 50 items into a hat, each numbered, and drawing from the hat 5 times consecutively. The 5 drawn numbers comprise a simple random sample from the population.
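The hat example can be sketched in a few lines of Python. This is only an illustration: the seed value and the variable names are assumptions, not part of the flashcard.

```python
import random

# Hypothetical population: 50 numbered items, as in the hat example.
population = list(range(1, 51))

# A seeded generator keeps the draw reproducible; the seed value is arbitrary.
rng = random.Random(42)

# random.sample draws without replacement, so every item has the same
# chance of being included and no item can be drawn twice.
sample = rng.sample(population, k=5)
print(sample)
```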
What is systematic sampling? Provide an example.
Systematic sampling approximates random sampling by selecting every nth member of a population.
Example: Selecting every 10th name from a client list of 1,000 names produces a sample of 100.
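A minimal sketch of selecting every nth member, assuming a random starting point within the first n members (a common convention, but not stated in the flashcard). The function name `systematic_sample` and the population of 100 are hypothetical.

```python
import random

def systematic_sample(population, step, rng):
    """Pick a random start among the first `step` members, then take every step-th member."""
    start = rng.randrange(step)
    return population[start::step]

population = list(range(1, 101))  # hypothetical population of 100 members
rng = random.Random(0)            # arbitrary seed for reproducibility

sample = systematic_sample(population, step=10, rng=rng)
print(sample)
```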
What is sampling error?
Sampling error is the difference between a sample statistic (mean, variance, or standard deviation of the sample) and its corresponding population parameter (true mean, variance, or standard deviation of the population).
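As a rough illustration, the sampling error of the mean is the sample mean minus the population mean. The population below is simulated and purely hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary seed

# Hypothetical population of 1,000 values (simulated, not real data).
population = [rng.gauss(100, 15) for _ in range(1000)]
population_mean = statistics.fmean(population)

# Draw one simple random sample of 25 observations.
sample = rng.sample(population, k=25)
sample_mean = statistics.fmean(sample)

# Sampling error of the mean = sample statistic - population parameter.
sampling_error = sample_mean - population_mean
print(f"sample mean {sample_mean:.2f}, population mean {population_mean:.2f}, "
      f"sampling error {sampling_error:.2f}")
```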
What is a sampling distribution?
The sampling distribution of a sample statistic is the probability distribution of all possible values of that statistic computed from equal-size samples randomly drawn from the same population.
Remember that each sample drawn will generally produce a different value of the sample statistic; the distribution of those values across samples is the sampling distribution. For the sample mean, it is approximately normal when the sample size is large.
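For a very small population, every possible equal-size sample can be listed, which makes the sampling distribution concrete. The sketch below assumes a made-up population of five values and samples of size 2 drawn without replacement.

```python
from collections import Counter
from itertools import combinations
from statistics import fmean

population = [2, 4, 6, 8, 10]  # hypothetical tiny population
sample_size = 2

# Enumerate every possible sample of the chosen size and record its mean.
sample_means = [fmean(s) for s in combinations(population, sample_size)]

# The sampling distribution: each possible sample mean and its probability.
distribution = Counter(sample_means)
total = len(sample_means)
for value in sorted(distribution):
    print(value, distribution[value] / total)
```

With samples this small the distribution is not close to normal, but its mean already equals the population mean.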
What is stratified random sampling?
Stratified random sampling uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics. Each group is called a stratum, and a random sample is taken from each stratum. The results are pooled.
The size of the samples from each stratum is based on the size of the stratum relative to the population.
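The proportional allocation described above might look like the following sketch. The strata names, their sizes, and the function `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, total_sample_size, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; with awkward sizes, rounding can make
        # the pooled total differ slightly from total_sample_size.
        k = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 items split into three strata.
strata = {
    "large_cap": [f"L{i}" for i in range(50)],
    "mid_cap": [f"M{i}" for i in range(30)],
    "small_cap": [f"S{i}" for i in range(20)],
}

rng = random.Random(7)  # arbitrary seed
sample = stratified_sample(strata, total_sample_size=10, rng=rng)

# Pool the per-stratum samples into one stratified random sample.
pooled = [item for members in sample.values() for item in members]
print(pooled)
```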
What is time-series data? Provide an example.
Time-series data consist of observations taken over a period of time at specific and equally spaced time intervals.
Example: Monthly returns on a stock from January 2002 to January 2008.
What is cross-sectional data? Provide an example.
Cross-sectional data are a sample of observations taken at a single point in time.
Example: The sample of reported EPS of all NASDAQ companies as of December 31, 1998.
What is longitudinal data? Provide an example.
Longitudinal data are observations over time of multiple characteristics of the same entity, such as a country's unemployment, inflation, and GDP growth rates.
What is panel data? Provide an example.
Panel data contain observations over time of the same characteristic for multiple entities, such as debt/equity ratios for 20 companies over the most recent 24 quarters.
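One way to keep the four data types apart is to compare their shapes. The sketch below uses plain Python dictionaries; every name and number in it is a made-up placeholder, not real data.

```python
# All figures are placeholders chosen only to show the shape of each data type.

# Time series: one characteristic, one entity, many periods.
time_series = {"2002-01": 0.012, "2002-02": -0.004, "2002-03": 0.021}

# Cross-sectional: one characteristic, many entities, one point in time.
cross_section = {"Company A": 1.10, "Company B": 0.85, "Company C": 2.40}

# Longitudinal: several characteristics, one entity, many periods.
longitudinal = {
    "Year 1": {"unemployment": 0.046, "inflation": 0.032, "gdp_growth": 0.027},
    "Year 2": {"unemployment": 0.050, "inflation": 0.028, "gdp_growth": 0.019},
}

# Panel: one characteristic, many entities, many periods.
panel = {
    "Company A": {"Q1": 0.50, "Q2": 0.55},
    "Company B": {"Q1": 1.20, "Q2": 1.10},
}

for name, data in [("time series", time_series), ("cross-section", cross_section),
                   ("longitudinal", longitudinal), ("panel", panel)]:
    print(name, "->", len(data), "top-level keys")
```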
Explain the central limit theorem. What are the 3 properties of the central limit theorem?
The central limit theorem states that for simple random samples of size n from a population with mean μ and finite variance σ², the sampling distribution of the sample mean x̄ approaches a normal probability distribution with mean μ and variance σ²/n as the sample size increases.
Properties:
- If the sample size n is sufficiently large (n >= 30 is the usual rule of thumb), the sampling distribution of the sample means will be approximately normal.
- The mean of the population and the mean of the distribution of all possible sample means are equal.
- The variance of the distribution of sample means is variance/n, which is the population variance over the sample size.
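These properties can be checked with a small simulation. The sketch below assumes a skewed exponential population with mean 1 and variance 1; the choice of distribution, the seed, and the number of samples are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(123)  # arbitrary seed
n = 30                    # sample size
num_samples = 5000        # number of repeated samples

# Exponential population with rate 1: clearly non-normal, mean 1, variance 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
variance_of_means = statistics.pvariance(sample_means)

print(f"mean of sample means:     {mean_of_means:.4f} (population mean is 1)")
print(f"variance of sample means: {variance_of_means:.4f} (variance/n is {1 / n:.4f})")
```

The simulated figures should land close to the population mean and to variance/n, though they will not match exactly because only a finite number of samples is drawn.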
What is the standard error of the sample mean? How is it calculated?
The standard error of the sample mean is the standard deviation of the distribution of the sample means.
It is calculated as: standard deviation of the population/square root of the size of the sample.
As the sample size increases, the standard error of the sample mean decreases because the sample mean gets closer, on average, to the true mean of the population.
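A short sketch of the calculation, assuming a population standard deviation of 20 (a made-up figure) and a hypothetical helper named `standard_error`:

```python
import math

def standard_error(population_std, sample_size):
    """Standard error of the sample mean: population std / sqrt(sample size)."""
    return population_std / math.sqrt(sample_size)

population_std = 20.0  # hypothetical population standard deviation

# Quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(population_std, n))
```

With these inputs the standard errors are 4.0, 2.0, and 1.0, which shows the decrease as the sample size grows.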
What are the 3 desirable properties of an estimator? (UEC) Explain each.
- Unbiasedness. An unbiased estimator is one for which the expected value of the estimator is equal to the parameter being estimated.
- Efficiency. An unbiased estimator is efficient if the variance of its sampling distribution is smaller than that of all other unbiased estimators of the parameter being estimated.
- Consistency. A consistent estimator is one for which the accuracy of the parameter estimate increases as the sample size increases.
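Unbiasedness can be illustrated by simulation. The sketch below is an assumption-laden example rather than part of the flashcard: it compares two estimators of variance on samples of size 5 from a standard normal population (true variance 1), one dividing by n - 1 and one dividing by n.

```python
import random

rng = random.Random(2024)  # arbitrary seed
n = 5                      # small sample size makes the bias easy to see
trials = 10000

def sum_squared_deviations(values):
    """Sum of squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

total_unbiased = 0.0
total_biased = 0.0
for _ in range(trials):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
    ss = sum_squared_deviations(sample)
    total_unbiased += ss / (n - 1)  # divides by n - 1
    total_biased += ss / n          # divides by n

avg_unbiased = total_unbiased / trials
avg_biased = total_biased / trials
print(f"average of n-1 estimator: {avg_unbiased:.3f}")
print(f"average of n estimator:   {avg_biased:.3f}")
```

Averaged over many samples, the n - 1 estimator should sit near the true variance of 1, while the n estimator should sit near (n - 1)/n = 0.8, so only the first is unbiased.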
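When are standard deviation and standard deviation/square root of the sample size used?
Standard deviation is used to describe the dispersion of individual observations. Standard deviation/square root of the sample size (the standard error) is used to describe the dispersion of a sample mean, such as when there is a relatively large sample size (or when there are multiple samples).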