Sampling And Estimation Flashcards

0
Q

Sampling plan

A

The set of rules used to select a sample.

1
Q

Parameter

A

A descriptive measure computed from or used to describe a population of data, conventionally represented by Greek letters.

2
Q

Simple random sample

A

A subset of a larger population created in such a way that each element of the population has an equal probability of being selected to the subset.
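As a rough illustration, a simple random sample can be drawn with Python's standard library. The function name `simple_random_sample`, the seed, and the population of 100 numbered items are assumptions made for this sketch, not part of the card.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; each element is equally likely to be selected."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    return rng.sample(list(population), n)

population = list(range(1, 101))       # hypothetical population of 100 numbered items
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```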

3
Q

Systematic sampling

A

A procedure of selecting every kth member until reaching a sample of the desired size. The sample that results from this procedure should be approximately random.
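A minimal sketch of the every-kth-member procedure follows. It assumes the interval is k = N // n and that the starting point is chosen at random within the first interval; both choices, and the name `systematic_sample`, are illustrative rather than prescribed by the card.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th member after a random start, with k = len(population) // n."""
    items = list(population)
    k = len(items) // n                          # sampling interval (assumed rule)
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(k)     # random start inside the first interval
    return items[start::k][:n]                   # every k-th member, capped at n

population = list(range(1, 101))                 # hypothetical population of 100 numbered items
sample = systematic_sample(population, 10, seed=1)
print(sample)
```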

4
Q

Sampling error

A

The difference between the observed value of a statistic and the quantity it is intended to estimate.

5
Q

Stratified random sampling

A

In stratified random sampling, the population is divided into subpopulations (strata) based on one or more classification criteria. Simple random samples are then drawn from each stratum in sizes proportional to the relative size of each stratum in the population. These samples are then pooled to form a stratified random sample.
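The three steps (divide into strata, sample each stratum proportionally, pool) might be sketched as below. The bond-style strata, the function name `stratified_sample`, and the rounding rule for allocation are assumptions made for illustration.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Pool simple random samples drawn from each stratum in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    pooled = []
    for members in strata.values():
        # Proportional allocation; with awkward sizes, rounding can make
        # the pooled sample slightly larger or smaller than n.
        size = round(n * len(members) / total)
        pooled.extend(rng.sample(members, size))
    return pooled

# Hypothetical population classified by issuer type.
strata = {
    "government": [f"gov_{i}" for i in range(60)],
    "corporate": [f"corp_{i}" for i in range(30)],
    "municipal": [f"muni_{i}" for i in range(10)],
}
sample = stratified_sample(strata, 20, seed=7)
print(len(sample))
```

With strata of 60, 30, and 10 members and n = 20, the allocation is 12, 6, and 2 draws respectively.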

6
Q

Indexing

A

An investment strategy in which an investor constructs a portfolio to mirror the performance of a specified index.

7
Q

Monetary policy

A

Actions taken by a nation’s central bank to affect aggregate output and prices through changes in bank reserves, reserve requirements, or its target interest rate.

8
Q

Sharpe ratio

A

The average return in excess of the risk-free rate divided by the standard deviation of return; a measure of the average excess return earned per unit of standard deviation of return.
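A small sketch of the calculation, assuming a constant risk-free rate per period and using the sample standard deviation (n − 1 divisor). The return series and the rate are made-up numbers.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free_rate):
    """Mean return in excess of the risk-free rate per unit of standard deviation of return."""
    excess = [r - risk_free_rate for r in returns]
    return mean(excess) / stdev(returns)     # stdev uses the n - 1 divisor

returns = [0.02, 0.01, 0.03, -0.01, 0.05]    # hypothetical periodic returns
sr = sharpe_ratio(returns, 0.005)            # hypothetical risk-free rate per period
print(round(sr, 3))
```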

9
Q

Central limit theorem

A

Given a population described by any probability distribution having mean μ and finite variance σ², the sampling distribution of the sample mean $\bar{X}$ computed from samples of size n from this population will be approximately normal with mean μ (the population mean) and variance σ²/n (the population variance divided by n) when the sample size n is large.
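One way to see the theorem at work is a quick simulation. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ² = 1, clearly non-normal); the sample size of 50, the 5,000 trials, and the seed are arbitrary choices.

```python
import random
from statistics import mean, variance

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 50                   # size of each sample
trials = 5000            # number of samples drawn

# Population: exponential with rate 1, so mu = 1 and sigma^2 = 1 (skewed, not normal).
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

print(mean(sample_means))       # should be close to mu = 1
print(variance(sample_means))   # should be close to sigma^2 / n = 0.02
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying population is strongly skewed.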

10
Q

Standard error of the sample mean

A

For sample mean $\bar{X}$ calculated from a sample generated by a population with standard deviation σ, the standard error of the sample mean is given by one of two expressions:

Equation (1)

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

when we know σ, the population standard deviation, or by

Equation (2)

$s_{\bar{X}} = \dfrac{s}{\sqrt{n}}$

when we do not know the population standard deviation and need to use the sample standard deviation, s, to estimate it.

In practice, we almost always need to use Equation 2. The estimate of s is given by the square root of the sample variance, s², calculated as follows:

Equation (3)

$s^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
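Equations (2) and (3) can be combined in a few lines. This is only a sketch: `standard_error` is a hypothetical helper name and the five observations are invented.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Equation (2): s / sqrt(n), where s comes from the n - 1 divisor of Equation (3)."""
    return stdev(sample) / sqrt(len(sample))

sample = [4.0, 6.0, 8.0, 10.0, 12.0]   # hypothetical observations
se = standard_error(sample)
print(round(se, 4))
```

Here the sample variance is 40 / 4 = 10, so s = √10 and the standard error is √10 / √5 = √2, about 1.4142.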

11
Q

Properties of the distribution of the sample mean

A

The distribution of the sample mean $\bar{X}$ will be approximately normal.

The mean of the distribution of $\bar{X}$ will be equal to the mean of the population from which the samples are drawn.

The variance of the distribution of $\bar{X}$ will be equal to the variance of the population divided by the sample size.

12
Q

Estimator

A

An estimation formula; the formulas used to compute the sample mean and other sample statistics are examples of estimators.

13
Q

Point estimate

A

A single numerical estimate of an unknown quantity, such as a population parameter.

14
Q

Unbiased estimator

A

An unbiased estimator is one whose expected value (the mean of its sampling distribution) equals the parameter it is intended to estimate.
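A simulation can make the idea concrete. The sketch below compares the sample variance with divisor n − 1 against the version with divisor n, assuming a uniform(0, 1) population whose true variance is 1/12; the sample size, trial count, and seed are arbitrary.

```python
import random
from statistics import mean, pvariance, variance

rng = random.Random(0)       # fixed seed so the sketch is repeatable
n, trials = 5, 10000
true_variance = 1 / 12       # variance of a uniform(0, 1) population

samples = [[rng.random() for _ in range(n)] for _ in range(trials)]

avg_unbiased = mean(variance(s) for s in samples)   # divisor n - 1
avg_biased = mean(pvariance(s) for s in samples)    # divisor n

print(avg_unbiased, true_variance)                  # averages close to 1/12
print(avg_biased, true_variance * (n - 1) / n)      # averages close to (n - 1)/n of 1/12
```

Averaged over many samples, the n − 1 version centres on the true variance, while the divisor-n version falls short by a factor of (n − 1)/n.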

15
Q

Efficiency of an unbiased estimator

A

An unbiased estimator is efficient if no other unbiased estimator of the same parameter has a sampling distribution with smaller variance.

16
Q

Consistency of an estimator

A

A consistent estimator is one for which the probability of estimates close to the value of the population parameter increases as sample size increases.

17
Q

Confidence interval

A

A confidence interval is a range for which one can assert with a given probability 1 − α, called the degree of confidence, that it will contain the parameter it is intended to estimate. This interval is often referred to as the 100(1 − α)% confidence interval for the parameter.

18
Q

Construction of confidence intervals

A

A 100(1 − α)% confidence interval for a parameter has the following structure.

Point estimate ± Reliability factor × Standard error
where

Point estimate = a point estimate of the parameter (a value of a sample statistic)

Reliability factor = a number based on the assumed distribution of the point estimate and the degree of confidence (1 − α) for the confidence interval

Standard error = the standard error of the sample statistic providing the point estimate

19
Q

Confidence Intervals for the Population Mean (Normally Distributed Population with Known Variance)

A

A 100(1 − α)% confidence interval for population mean μ when we are sampling from a normal distribution with known variance σ² is given by

$\bar{X} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
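As a sketch of how this interval might be computed, the function below takes the known population standard deviation as an input and obtains the reliability factor from the standard normal distribution. The name `z_confidence_interval` and the four observations are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence=0.95):
    """Point estimate +/- z_{alpha/2} * sigma / sqrt(n), for known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # reliability factor
    half_width = z * sigma / sqrt(len(sample))     # reliability factor x standard error
    centre = mean(sample)                          # point estimate
    return centre - half_width, centre + half_width

sample = [9.0, 10.0, 11.0, 10.0]                   # hypothetical observations
low, high = z_confidence_interval(sample, sigma=2.0)
print(round(low, 2), round(high, 2))
```

With a sample mean of 10, σ = 2, and n = 4, the standard error is 1, so the 95 percent interval is roughly 8.04 to 11.96.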

20
Q

Reliability Factors for Confidence Intervals Based on the Standard Normal Distribution

A

We use the following reliability factors when we construct confidence intervals based on the standard normal distribution:

90 percent confidence intervals: Use $z_{0.05}$ = 1.65

95 percent confidence intervals: Use $z_{0.025}$ = 1.96

99 percent confidence intervals: Use $z_{0.005}$ = 2.58
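These factors can be reproduced from the inverse of the standard normal cumulative distribution function. The sketch uses `statistics.NormalDist` from the Python standard library; the helper name `z_reliability_factor` is hypothetical.

```python
from statistics import NormalDist

def z_reliability_factor(confidence):
    """Return z_{alpha/2} for a two-sided interval with the given degree of confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(z_reliability_factor(confidence), 3))
```

To three decimals the factors are 1.645, 1.960, and 2.576; the values 1.65 and 2.58 on the card are the same numbers rounded to two decimals.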

21
Q

Confidence Intervals for the Population Mean—The z-Alternative (Large Sample, Population Variance Unknown)

A

A 100(1 − α)% confidence interval for population mean μ when sampling from any distribution with unknown variance and when sample size is large is given by

$\bar{X} \pm z_{\alpha/2} \dfrac{s}{\sqrt{n}}$

22
Q

Degrees of freedom (df)

A

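The number of independent observations used in computing a statistic. For example, the sample variance has n − 1 degrees of freedom, because the deviations from the sample mean must sum to zero.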

23
Q

Confidence Intervals for the Population Mean (Population Variance Unknown)—t-Distribution.

A

If we are sampling from a population with unknown variance and either of the conditions below holds:

the sample is large, or

the sample is small but the population is normally distributed, or approximately normally distributed,

then a 100(1 − α)% confidence interval for the population mean μ is given by 

$\bar{X} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$

where the number of degrees of freedom for $t_{\alpha/2}$ is n − 1 and n is the sample size.
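A minimal sketch of the calculation follows. Because the Python standard library has no t-distribution quantile function, the reliability factor is assumed to be looked up from a t-table and passed in; the helper name and the observations are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(sample, t_critical):
    """Point estimate +/- t_{alpha/2} * s / sqrt(n).

    t_critical is t_{alpha/2} with n - 1 degrees of freedom, taken from a t-table.
    """
    n = len(sample)
    half_width = t_critical * stdev(sample) / sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width

sample = [4.0, 6.0, 8.0, 10.0, 12.0]    # hypothetical observations, n = 5
# For a 95 percent interval with 5 - 1 = 4 degrees of freedom, t_{0.025} = 2.776.
low, high = t_confidence_interval(sample, 2.776)
print(round(low, 3), round(high, 3))
```

The factor 2.776 is noticeably larger than the 1.96 used with the standard normal distribution, so the interval is wider for this small sample.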

24
Q

Data mining

A

The practice of determining a model by extensive searching through a dataset for statistically significant patterns. Also called data snooping.

25
Q

Out-of-sample test

A

A test of a strategy or model using a sample outside the time period on which the strategy or model was developed.

26
Q

Intergenerational data mining

A

A form of data mining that applies information developed by previous researchers using a dataset to guide current research using the same or a related dataset.

27
Q

Sample selection bias

A

Bias introduced by systematically excluding some members of the population according to a particular attribute—for example, the bias introduced when data availability leads to certain observations being excluded from the analysis.

28
Q

Survivorship bias

A

The bias resulting from a test design that fails to account for companies that have gone bankrupt, merged, or are otherwise no longer reported in a database.

29
Q

Look-ahead bias

A

A bias caused by using information that was unavailable on the test date.

30
Q

Time period bias

A

The possibility that when we use a time-series sample, our statistical conclusion may be sensitive to the starting and ending dates of the sample.