CFA 10: Sampling and Estimation Flashcards
sampling
Sampling
The process of obtaining a sample.
statistic
Sampling
A quantity computed from or used to describe a sample of data.
sampling plan
Sampling
A set of rules used to select a sample.
simple random sample
Sampling
A subset of a larger population created in such a way that each element of the population has an equal probability of being selected to the subset.
systematic sampling
Sampling
In systematic sampling, we select every kth member of the population until we have a sample of the desired size. The sample that results from this procedure should be approximately random.
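The every-kth-member rule can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the helper name `systematic_sample`, the random starting point, and the choice of k as population size divided by sample size are all assumptions made for the example.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th member after a random start (hypothetical helper)."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and len(population)")
    k = len(population) // sample_size   # sampling interval (assumed rule)
    rng = random.Random(seed)
    start = rng.randrange(k)             # random start within the first interval
    return [population[start + i * k] for i in range(sample_size)]

# Illustrative population: members numbered 1 to 100.
members = list(range(1, 101))
sample = systematic_sample(members, 10, seed=1)
```

With 100 members and a desired sample of 10, the interval k is 10, so the selected members are always exactly 10 positions apart.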
sampling error
Sampling
The difference between the observed value of a statistic and the quantity it is intended to estimate; for example, any difference between the sample mean and the population mean.
sampling distribution
Sampling
The sampling distribution of a statistic is the distribution of all the distinct possible values that the statistic can assume when computed from samples of the same size randomly drawn from the same population.
stratified random sampling
Sampling
The population is divided into subpopulations (strata) based on one or more classification criteria. Simple random samples are then drawn from each stratum in sizes proportional to the relative size of each stratum in the population. These samples are then pooled to form a stratified random sample.
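The steps in this definition (divide into strata, draw a simple random sample from each in proportion to its size, pool the results) can be sketched as follows. The helper `stratified_sample` and the three bond-maturity strata are hypothetical, chosen only for illustration. Note that rounding the per-stratum sizes may leave the pooled total slightly off the target when the proportions do not divide evenly.

```python
import random

def stratified_sample(strata, total_size, seed=None):
    """Pool proportional simple random samples from each stratum (hypothetical helper)."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    pooled = []
    for members in strata.values():
        # Stratum sample size proportional to the stratum's share of the population.
        n = round(total_size * len(members) / population_size)
        pooled.extend(rng.sample(members, n))  # simple random sample, no replacement
    return pooled

# Hypothetical strata: 60 short, 30 intermediate, and 10 long maturity bonds.
strata = {
    "short": [f"S{i}" for i in range(60)],
    "intermediate": [f"I{i}" for i in range(30)],
    "long": [f"L{i}" for i in range(10)],
}
sample = stratified_sample(strata, 20, seed=7)
```

Here the strata make up 60%, 30%, and 10% of the population, so a pooled sample of 20 contains 12, 6, and 2 members from them respectively.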
indexing
Sampling
An investment strategy in which an investor constructs a portfolio to mirror the performance of a specified index.
The Central Limit Theorem
Distribution of the Sample Mean
The central limit theorem states that for large sample sizes, for any underlying distribution with finite variance for a random variable, the sampling distribution of the sample mean for that variable will be approximately normal, with mean equal to the population mean for that random variable and variance equal to the population variance divided by sample size.
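A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 and variance 1 (a deliberately skewed, non-normal choice), a sample size of 50, and 5,000 repeated samples; all of these settings are illustrative assumptions, not part of the definition.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(42)   # fixed seed so the run is repeatable
sample_size = 50          # n, assumed "large" for this illustration
num_samples = 5000        # number of repeated samples drawn

# Exponential population with rate 1: mean 1, variance 1, strongly skewed.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

# The theorem predicts a mean near 1 and a variance near 1 / 50 = 0.02.
mean_of_means = fmean(sample_means)
variance_of_means = pvariance(sample_means)
```

Comparing `mean_of_means` with the population mean and `variance_of_means` with the population variance divided by `sample_size` shows how closely the simulated sampling distribution follows the stated result.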
estimators
Point and Interval Estimates of the Population Mean
An estimator is an estimation formula; the formulas used to compute the sample mean and other sample statistics are examples of estimators.
estimate
Point and Interval Estimates of the Population Mean
The particular value calculated from sample observations using an estimator.
point estimate
Point and Interval Estimates of the Population Mean
A single numerical estimate of an unknown quantity, such as a population parameter.
unbiased estimator
Point and Interval Estimates of the Population Mean
One whose expected value (the mean of its sampling distribution) equals the parameter it is intended to estimate.
efficiency (in an unbiased estimator)
Point and Interval Estimates of the Population Mean
An unbiased estimator is efficient if no other unbiased estimator of the same parameter has a sampling distribution with smaller variance.
consistency (in an estimator)
Point and Interval Estimates of the Population Mean
A consistent estimator is one for which the probability of estimates close to the value of the population parameter increases as sample size increases.
confidence interval
Point and Interval Estimates of the Population Mean
A confidence interval is a range for which one can assert with a given probability 1 − α, called the degree of confidence, that it will contain the parameter it is intended to estimate. This interval is often referred to as the 100(1 − α)% confidence interval for the parameter.
construction of confidence intervals
Point and Interval Estimates of the Population Mean
A 100(1 − α)% confidence interval for a parameter has the following structure:
Point estimate ± Reliability factor × Standard error
where
point estimate = a point estimate of the parameter (a value of a sample statistic)
reliability factor = a number based on the assumed distribution of the point estimate and the degree of confidence (1 − α) for the confidence interval
standard error = the standard error of the sample statistic providing the point estimate
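The three ingredients above can be combined in a short sketch for the population mean. It assumes a large sample, so that a standard normal (z) reliability factor is reasonable, and it uses the sample standard deviation divided by the square root of n as the standard error. The helper name `z_confidence_interval` and the simulated monthly returns are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, confidence=0.95):
    """Point estimate +/- reliability factor x standard error (hypothetical helper)."""
    n = len(sample)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    alpha = 1 - confidence
    # Two-sided z reliability factor, about 1.96 when confidence is 0.95.
    reliability_factor = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = reliability_factor * standard_error
    return point_estimate - half_width, point_estimate + half_width

# Simulated monthly returns, for illustration only (not real data).
rng = random.Random(0)
returns = [rng.gauss(0.01, 0.04) for _ in range(60)]
lower, upper = z_confidence_interval(returns, confidence=0.95)
```

For small samples or a different assumed distribution, only the reliability factor changes; the point estimate and the overall structure stay the same.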
degrees of freedom (df)
Point and Interval Estimates of the Population Mean
The number of independent observations used.
data mining
More on Sampling
The practice of determining a model by extensive searching through a dataset for statistically significant patterns.
out-of-sample test
More on Sampling
A test of a strategy or model using a sample outside the time period on which the strategy or model was developed.
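One simple way to set up such a test is to split the data at a cutoff date, develop the model on the earlier period, and evaluate it only on the later one. The sketch below shows just the split; the helper name, the cutoff date, and the year-end observations are hypothetical.

```python
from datetime import date

def split_in_and_out_of_sample(observations, cutoff):
    """Split (date, value) pairs at a cutoff date (hypothetical helper)."""
    in_sample = [(d, v) for d, v in observations if d <= cutoff]
    out_of_sample = [(d, v) for d, v in observations if d > cutoff]
    return in_sample, out_of_sample

# Hypothetical year-end observations for 2011 through 2020, for illustration only.
observations = [(date(year, 12, 31), 0.01 * (year - 2010)) for year in range(2011, 2021)]

# Develop the model on data through 2017; hold back 2018 to 2020 for testing.
in_sample, out_of_sample = split_in_and_out_of_sample(observations, date(2017, 12, 31))
```

Keeping the later period untouched during model development is what makes the evaluation out of sample.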
intergenerational data mining
More on Sampling
A form of data mining that applies information developed by previous researchers using a dataset to guide current research using the same or a related dataset.
sample selection bias
More on Sampling
The bias introduced when data availability leads to certain assets being systematically excluded from the analysis.
survivorship bias
More on Sampling
The bias resulting from a test design that fails to account for companies that have gone bankrupt, merged, or are otherwise no longer reported in a database.
look-ahead bias
More on Sampling
A bias caused by using information that was unavailable on the test date.
time-period bias
More on Sampling
The possibility that when we use a time-series sample, our statistical conclusion may be sensitive to the starting and ending dates of the sample.