Sampling and Estimation Flashcards
_____ _____ refers to selecting a sample when we know the probability of each sample member in the overall population.
Probability sampling
With _____ sampling, each item is assumed to have the same probability of being selected.
Random
_____ _____ sampling is an appropriate method if we want to estimate the mean profitability for a population of firms.
Simple random
_____ _____ is based on either low cost and easy access to some data items, or on using the judgment of the researcher in selecting specific data items.
Non-probability sampling
_____ _____ is the difference between a sample statistic (such as the mean, variance, or standard deviation of the sample) and its corresponding population parameter (the true mean, variance, or standard deviation of the population).
Sampling error
The _____ _____ of the sample statistic is a probability distribution of all possible sample statistics computed from a set of equal-size samples that were randomly drawn from the same population.
Sampling distribution
_____ sampling refers to selecting every nth member from a population.
Systematic
______ _____ sampling uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics.
Stratified random
_____ _____ is often used in bond indexing because of the difficulty and cost of completely replicating the entire population of bonds.
Stratified sampling
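The three probability sampling methods covered so far can be compared side by side. The following is a minimal Python sketch; the function names, the toy population of 100 items, and the two strata labels are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, k, rng):
    """Every item has the same probability of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every `step`-th item."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(0)  # fixed seed so the example is repeatable
population = list(range(100))
# Hypothetical classification, e.g. bonds grouped by maturity.
strata = {"short": list(range(0, 60)), "long": list(range(60, 100))}

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
```

With a 10% sampling fraction, the stratified sample holds 6 items from the 60-member stratum and 4 from the 40-member stratum, so each group is represented in proportion to its size.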
In _____ sampling, we are assuming that each subset is representative of the overall population with respect to the item we are sampling.
Cluster
In _____ cluster sampling, a random sample of clusters is selected and all the data in those clusters comprise the sample.
One-stage
In _____ cluster sampling, random samples from each of the selected clusters comprise the sample.
Two-stage
_____ cluster sampling can be expected to have greater sampling error than one-stage cluster sampling.
Two-stage
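The difference between one-stage and two-stage cluster sampling can be sketched in a few lines of Python. The function names and the five toy "regions" below are hypothetical, used only to illustrate the two procedures.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Select clusters at random and keep ALL data in the selected clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [x for name in chosen for x in clusters[name]]

def two_stage_cluster_sample(clusters, n_clusters, per_cluster, rng):
    """Select clusters at random, then draw a random sample within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], per_cluster))
    return sample

rng = random.Random(1)  # fixed seed so the example is repeatable
# Five hypothetical clusters of ten observations each.
clusters = {f"region_{i}": list(range(i * 10, i * 10 + 10)) for i in range(5)}

one_stage = one_stage_cluster_sample(clusters, 2, rng)
two_stage = two_stage_cluster_sample(clusters, 2, 3, rng)
```

Here the one-stage sample contains all 20 observations from the two chosen clusters, while the two-stage sample contains only 6 (3 from each chosen cluster).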
Lower cost and less time required to assemble the sample are the primary advantages of _____ sampling.
Cluster
_____ sampling may be most appropriate for a smaller pilot study.
Cluster
_____ sampling refers to selecting sample data based on its ease of access, using data that are readily available.
Convenience
_____ sampling refers to samples for which each observation is selected from a larger data set by the researcher, based on her experience and judgment.
Judgmental
Central limit theorem: If the sample size n is sufficiently large (n ≥ _____ ), the sampling distribution of the sample means will be approximately normal.
30
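A small simulation can make the central limit theorem concrete. The sketch below is illustrative only: the exponential population, the seed, and the number of repetitions are assumptions chosen for the example, not part of the flashcards.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is repeatable
n = 30                   # sample size at the usual "sufficiently large" threshold
num_samples = 5000       # number of repeated samples (arbitrary choice)

# Population: exponential with rate 1 (mean 1, variance 1), strongly right-skewed.
sample_means = []
for _ in range(num_samples):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)
```

Even though the population is far from normal, `mean_of_means` should land close to the population mean of 1 and `var_of_means` close to the population variance divided by n (1/30).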
Formula: The variance of the distribution of sample means (central limit theorem)
σ² / n
Formula: The standard error of the sample mean when the standard deviation of the population, σ, is known.
σ / √n
Formula: The standard error of the sample mean when the standard deviation of the population, σ, is unknown.
s / √n, where s is the sample standard deviation
The value of the standard error of the sample mean _____ as the sample size _____.
Decreases, increases
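The effect of sample size on the standard error is easy to check numerically. This is a minimal sketch; the function names, the population standard deviation of 20, and the small data set are hypothetical values picked for illustration.

```python
import math
import statistics

def standard_error_known_sigma(sigma, n):
    """Standard error of the sample mean when the population sigma is known."""
    return sigma / math.sqrt(n)

def standard_error_from_sample(sample):
    """Standard error of the sample mean using the sample standard deviation s."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sigma = 20.0  # assumed population standard deviation
errors = [standard_error_known_sigma(sigma, n) for n in (25, 100, 400)]
# errors == [4.0, 2.0, 1.0]: quadrupling n halves the standard error

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
se_estimate = standard_error_from_sample(data)
```

Because n sits under a square root, cutting the standard error in half requires four times as many observations.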
Name the three (3) desirable properties of an estimator.
Unbiasedness, efficiency, consistency
An _____ estimator is one for which the expected value of the estimator is equal to the parameter you are trying to estimate.
Unbiased
An _____ estimator is one for which the variance of its sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate.
Efficient
_____ _____ are single (sample) values used to estimate population parameters.
Point estimates
A _____ _____ is a range of values in which the population parameter is expected to lie.
Confidence interval
In the context of confidence intervals, α is called _____ .
Level of significance
In the context of confidence intervals, 1 - α is called _____ .
Degree of confidence
A _____ is usually constructed by adding or subtracting an appropriate value from the point estimate.
Confidence interval
In the context of confidence intervals, zα/2 is called _____ .
Reliability factor
For a 90% confidence interval, the reliability factor is equal to _____.
zα/2 = 1.645
For a 95% confidence interval, the reliability factor is equal to _____.
zα/2 = 1.960
For a 99% confidence interval, the reliability factor is equal to _____.
zα/2 = 2.575
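The reliability factors above can be reproduced from the standard normal distribution, and a confidence interval follows as point estimate ± reliability factor × standard error. The sketch below uses Python's standard-library `statistics.NormalDist`; the function names and the inputs (sample mean 80, σ = 15, n = 36) are hypothetical values used only for illustration.

```python
import math
from statistics import NormalDist

def reliability_factor(confidence):
    """z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Interval: sample mean +/- z * sigma / sqrt(n), for known population sigma."""
    margin = reliability_factor(confidence) * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

factors = {c: reliability_factor(c) for c in (0.90, 0.95, 0.99)}

# Hypothetical inputs: sample mean 80, population sigma 15, sample size 36.
low, high = z_confidence_interval(80.0, 15.0, 36, 0.99)
```

The computed 99% factor is roughly 2.576; tables commonly round it to 2.575 or 2.58, which is why published intervals can differ slightly in the second decimal place.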
“After repeatedly taking samples of CFA candidates, administering the practice exam, and constructing confidence intervals for each sample’s mean, 99% of the resulting confidence intervals will, in the long run, include the population mean.”
This is an example of a _____ interpretation.
Probabilistic
“We are 99% confident that the population mean score is between 73.55 and 86.45 for candidates from this population.”
This is an example of a _____ interpretation.
Practical
If the population has a normal distribution with a known variance, a confidence interval for the population mean is built using the _____ -statistic.
z-statistic
If the population has a normal distribution with an unknown variance, a confidence interval for the population mean is built using the _____ -statistic.
t-statistic
Name the two steps to finding a t-value in a table.
- Compute the degrees of freedom (n – 1)
- Find the appropriate level of alpha or significance (one-tailed test: α ; two-tailed test: α/2)
We cannot create a confidence interval if the distribution is _____ and the sample size is _____.
Non-normal, less than 30
The resampling method in which we calculate multiple sample means, each with one of the observations removed from the sample, is called _____.
Jackknife
The resampling method in which we draw repeated samples of size n from the full data set and directly calculate the standard deviation of these sample means as our estimate of the standard error is called _____.
Bootstrap
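Both resampling methods can be sketched for the standard error of the sample mean. This is an illustrative sketch, not a definitive implementation: the function names and data are hypothetical, the jackknife uses the standard (n - 1)/n scaling of the leave-one-out means, and the bootstrap assumes resampling with replacement, which is the usual convention.

```python
import math
import random
import statistics

def jackknife_standard_error(data):
    """Standard error from n leave-one-out sample means."""
    n = len(data)
    total = sum(data)
    # Mean of the sample with observation x removed, for each x.
    loo_means = [(total - x) / (n - 1) for x in data]
    center = sum(loo_means) / n
    return math.sqrt((n - 1) / n * sum((m - center) ** 2 for m in loo_means))

def bootstrap_standard_error(data, num_resamples, rng):
    """Standard deviation of the means of resamples of size n, drawn with replacement."""
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(num_resamples)]
    return statistics.stdev(means)

rng = random.Random(7)  # fixed seed so the example is repeatable
data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

jk_se = jackknife_standard_error(data)
bs_se = bootstrap_standard_error(data, 2000, rng)
```

For the sample mean, the jackknife estimate coincides with s / √n, while the bootstrap estimate varies slightly from run to run and depends on the number of resamples.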
Name the two potential limitations of larger sample sizes.
- May contain observations from a different population (distribution) and reduce the precision of our population parameter estimates
- Higher cost
The bias in which analysts repeatedly use the same database to search for patterns or trading rules until one that “works” is discovered is called _____.
Data snooping
The bias that occurs when some data is systematically excluded from the analysis, usually because of a lack of availability, is called _____.
Sample selection bias
Most mutual fund databases only include funds currently in existence. They do not include funds that have ceased to exist due to closure or merger.
This is an example of the _____ bias.
Survivorship
The _____ bias occurs when a study tests a relationship using sample data that was not available on the test date.
Look-ahead
The _____ bias can occur if the time period over which the data is gathered is either too short or too long.
Time-period