(R10) Sampling and Estimation Flashcards by Kyle Greene

Define Simple Random Sampling and provide two methods

Each element of the population has an equal probability of being selected. Two methods: 1) draw elements with a random number generator, or 2) select every kth element (systematic sampling).
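
The two methods on this card can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the population of 100 labeled elements, and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every element is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th element, with k = len(population) // n."""
    k = len(population) // n
    if k == 0:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting offset in [0, k)
    return population[start::k][:n]   # every k-th element, capped at n items

# Hypothetical population: 100 elements labeled 1..100
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=42)
sys_sample = systematic_sample(population, 10, seed=42)
```

Both calls return 10 elements; the systematic sample is evenly spaced (here every 10th element), while the simple random sample is not.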

Sampling Distribution

The distribution of all distinct possible values that a statistic can assume when computed from samples of the same size randomly drawn from the same population

Sampling error

The difference between the observed value of a statistic and the quantity it is intended to estimate (e.g., sample mean minus population mean).

Stratified Random Sampling

Separate the population into smaller groups (strata) based on one or more distinguishing characteristics, then draw a simple random sample from each group and pool the results.
Provides more precise estimates of the mean and variance than simple random sampling.
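
A rough sketch of stratified random sampling follows. It assumes proportional allocation (each group contributes in proportion to its share of the population), which the card does not specify; the function name, the bond universe, and the maturity labels are hypothetical.

```python
import random

def stratified_sample(population, stratum_of, n, seed=None):
    """Group items by stratum, then draw a simple random sample from each group."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Assumed proportional allocation; rounding means the total
        # can differ slightly from n for awkward group sizes.
        size = max(1, round(n * len(members) / len(population)))
        sample.extend(rng.sample(members, min(size, len(members))))
    return sample

# Hypothetical bond universe: 60 short-maturity and 40 long-maturity bonds
bonds = [(i, "short") for i in range(60)] + [(i, "long") for i in range(60, 100)]
sample = stratified_sample(bonds, lambda bond: bond[1], 10, seed=1)
```

With these inputs the sample holds 6 short and 4 long bonds, mirroring the 60/40 split of the population.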

Three Data Types

Time Series
Cross-Sectional
Panel

Time Series Data

Take a variable or multiple variables and observe how the variables change over a period of time
e.g., monthly returns on Microsoft stock from Jan 1994 to Dec 2004.

Cross-Sectional Data

Multiple observational units at a point in time

e.g., sales for 30 different companies for a particular quarter.

Longitudinal Data

Observations over time of multiple characteristics of the same entity, such as unemployment, inflation, and GDP growth rates for a country over 10 years.

Panel Data

Time series + cross sectional. Data that contains observations over time of the same characteristic for multiple entities, such as debt/equity ratios for 20 companies over 24 quarters.
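
One way to picture how panel data combines the other two types is a small nested mapping. The company names, quarters, and debt/equity values below are made up purely for illustration.

```python
# Hypothetical debt/equity ratios, stored as panel[company][quarter]
panel = {
    "A": {"2023Q1": 0.50, "2023Q2": 0.55, "2023Q3": 0.60},
    "B": {"2023Q1": 1.20, "2023Q2": 1.10, "2023Q3": 1.00},
    "C": {"2023Q1": 0.80, "2023Q2": 0.80, "2023Q3": 0.90},
}

# Time series: one entity followed over time
time_series_a = panel["A"]

# Cross-section: many entities at a single point in time
cross_section_q2 = {company: series["2023Q2"] for company, series in panel.items()}
```

Fixing the company gives a time series; fixing the quarter gives a cross-section.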

Standard Error Formula

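Standard deviation divided by the square root of n: sigma / n^(1/2) when the population standard deviation is known, or s / n^(1/2) using the sample standard deviation when it is not. It is the standard deviation of the distribution of the sample means.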
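
As a small worked example of the formula, the sketch below computes the standard error both ways: from a known population standard deviation and from a sample. The function names and the return figures are assumptions for illustration only.

```python
import math
import statistics

def standard_error_known(sigma, n):
    """Standard error when the population standard deviation sigma is known."""
    return sigma / math.sqrt(n)

def standard_error_sample(sample):
    """Standard error estimated from the sample standard deviation (n - 1 divisor)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical inputs
se_known = standard_error_known(sigma=0.20, n=100)   # 0.20 / 10
returns = [0.02, -0.01, 0.03, 0.015, 0.00, 0.01, -0.005, 0.025, 0.02]
se_sample = standard_error_sample(returns)           # s / sqrt(9)
```

Quadrupling n halves the standard error, since n enters through its square root.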

Central Limit Theorem

Theorem stating that, for simple random samples of size n from a population with mean u and finite variance sigma^2, the sampling distribution of the sample mean, Xbar, approaches a normal probability distribution with mean u and variance sigma^2 / n as the sample size becomes large.

Properties of CLT

- If the sample size, n, is sufficiently large (n >= 30 is the usual rule of thumb), the sampling distribution of the sample means will be approximately normal.
- The mean of the population, u, and the mean of the distribution of all possible sample means are equal.
- The variance of the distribution of sample means is sigma^2 / n, the population variance divided by the sample size.
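
These properties can be checked with a quick simulation. The sketch below assumes a skewed population (exponential with mean 1 and variance 1), n = 30, and 5,000 repeated samples; all of these choices, and the seed, are illustrative rather than part of the card.

```python
import random
import statistics

rng = random.Random(0)
n = 30               # observations per sample
num_samples = 5000   # number of repeated samples

# Population: exponential with rate 1, so mean u = 1 and variance sigma^2 = 1
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)     # should be close to u = 1
var_of_means = statistics.pvariance(sample_means)  # should be close to sigma^2 / n = 1/30
```

Even though the population is far from normal, the sample means cluster around 1 with a variance near 1/30.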

Desired Properties of an Estimator

Unbiased - the expected value of the estimator is equal to the parameter you are trying to estimate.
Efficient - the variance of its sampling distribution is smaller than that of every other unbiased estimator of the same parameter.
Consistent - the accuracy of the parameter estimate increases as the sample size increases (the sampling distribution concentrates around the true parameter value).
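
Unbiasedness can be illustrated by simulation. The example below is an assumption-laden sketch: it compares the sample variance with an n - 1 divisor against the version with an n divisor, using standard normal data (true variance 1), samples of size 5, and 20,000 trials.

```python
import random
import statistics

rng = random.Random(1)
n, trials = 5, 20000
unbiased, biased = [], []

for _ in range(trials):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance = 1
    unbiased.append(statistics.variance(sample))      # divides by n - 1
    biased.append(statistics.pvariance(sample))       # divides by n

avg_unbiased = statistics.fmean(unbiased)  # close to 1, the true variance
avg_biased = statistics.fmean(biased)      # close to (n - 1) / n = 0.8
```

Averaged over many samples, the n - 1 version lands near the true variance, while the n version systematically underestimates it.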

Point Estimates

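A point estimate is a single sample value used to estimate a population parameter; the sample mean and sample variance are point estimates of the population mean and variance.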

Confidence Interval Formula

Point estimate +/- reliability factor * standard error

C.I. = Xbar +/- z_(alpha/2) * (sigma / n^(1/2))
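
A minimal sketch of a z-based interval for the mean, assuming the population standard deviation is known. The function name and the inputs (sample mean 8%, sigma 20%, n = 100, 95% confidence) are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided interval: point estimate +/- reliability factor * standard error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # reliability factor, about 1.96 at 95%
    half_width = z * sigma / math.sqrt(n)     # reliability factor * standard error
    return xbar - half_width, xbar + half_width

# Hypothetical inputs
low, high = z_confidence_interval(xbar=0.08, sigma=0.20, n=100)
```

Here the standard error is 0.02, so the interval runs from roughly 4.1% to 11.9%.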

Distribution with known variance, which table should be used to create confidence interval?

Use Z score

Distribution with unknown variance, which table should be used to create confidence interval?

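Use the t score if the sample is small (n < 30), which assumes the population is normally distributed; use the t or z score if the sample is large (n >= 30), with t being the more conservative choice.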

Level of Significance

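The probability, denoted by alpha, that the confidence interval does not contain the true parameter; the degree of confidence is 1 - alpha (e.g., alpha = 0.05 gives a 95% confidence interval).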

Characteristics of T-Distribution

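Centered at zero
Less peaked than the normal distribution, with fatter tails
As df increases, the shape becomes more peaked and the tails become thinner, approaching the standard normal distribution.
Published t-tables typically list one-tailed probabilities, so a two-sided interval at significance level alpha uses the column for alpha/2.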

Confidence intervals are affected by:

z score or t score - the reliability factor
alpha - level of significance (degree of confidence is 1 - alpha)
n - sample size (number of observations)

Data mining bias

Bias that refers to results where the statistical significance of the pattern is overestimated because the results were found through data-mining (the practice of hitting a data set over and over again until you hit gold)
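
The effect is easy to reproduce in a simulation. The sketch below assumes 200 hypothetical trading rules whose returns are pure noise (true mean zero), 60 observations each, and a rough two-sided 5% cutoff of |t| > 2; every one of these numbers is an illustrative choice.

```python
import math
import random
import statistics

rng = random.Random(7)
num_rules, n_obs = 200, 60
false_positives = 0

for _ in range(num_rules):
    # Returns with a true mean of zero: the rule has no real edge
    returns = [rng.gauss(0.0, 0.05) for _ in range(n_obs)]
    t_stat = statistics.fmean(returns) / (statistics.stdev(returns) / math.sqrt(n_obs))
    if abs(t_stat) > 2.0:   # rough two-sided 5% cutoff
        false_positives += 1
```

Roughly 5% of the worthless rules (around 10 of 200) would be expected to look significant by chance alone, which is why a pattern found after testing many rules overstates its own significance.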

Sample selection bias

Bias that occurs when some data is systematically excluded from the analysis, usually because it is unavailable (e.g., survivorship bias in mutual funds).
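
A toy simulation of the survivorship example, under loudly stated assumptions: 1,000 hypothetical funds with a true mean return of 5% and standard deviation of 10%, where any fund returning less than -5% is assumed to close and disappear from the database.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical fund returns: true mean 5%, standard deviation 10%
fund_returns = [rng.gauss(0.05, 0.10) for _ in range(1000)]

# Assumed rule: funds returning less than -5% close and drop out of the data
survivors = [r for r in fund_returns if r >= -0.05]

all_mean = statistics.fmean(fund_returns)    # mean over every fund
survivor_mean = statistics.fmean(survivors)  # mean over surviving funds only
```

Because the excluded funds are the worst performers, the survivor-only average overstates the return an investor in the full universe would have earned.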

Look-ahead bias

Occurs when a study tests a relationship using sample data that was not available on the test date (e.g., pairing stock prices/returns with accounting data that had not yet been released on that date).

Time-period bias

Results are specific to the time period studied and may not hold in other periods.

Unbiased estimator

When the expected value of the estimator is equal to the parameter you are trying to estimate.

Efficient Estimator

When the variance of its sampling distribution is smaller than that of every other unbiased estimator of the same parameter.

Consistent Estimator

The accuracy of the parameter estimate increases as the sample size increases (the sampling distribution concentrates around the true parameter value).

(R10) Sampling and Estimation Flashcards

(27 cards)