Sampling and Estimation Flashcards

1
Q

_____ _____ refers to selecting a sample when we know the probability of each sample member in the overall population.

A

Probability sampling

2
Q

With _____ sampling, each item is assumed to have the same probability of being selected.

A

Random

3
Q

_____ _____ sampling is an appropriate method if we want to estimate the mean profitability for a population of firms.

A

Simple random

4
Q

_____ _____ is based on either low cost and easy access to some data items, or on using the judgment of the researcher in selecting specific data items.

A

Non-probability sampling

5
Q

_____ _____ is the difference between a sample statistic (such as the mean, variance, or standard deviation of the sample) and its corresponding population parameter (the true mean, variance, or standard deviation of the population).

A

Sampling error

6
Q

The _____ _____ of the sample statistic is a probability distribution of all possible sample statistics computed from a set of equal-size samples that were randomly drawn from the same population.

A

Sampling distribution

7
Q

_____ sampling refers to selecting every nth member from a population.

A

Systematic
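
The "every nth member" rule can be sketched in a few lines of Python. This is only an illustration: the function name systematic_sample, the random starting offset, and the list of firm names are assumptions added here, not part of the card.

```python
import random

def systematic_sample(population, n, rng=None):
    """Return every n-th member, starting from a random offset in [0, n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = rng or random.Random()
    start = rng.randrange(n)  # assumed: random start so any member can be picked
    return population[start::n]

# Hypothetical population of 100 firms, taking every 10th one.
firms = [f"firm_{i:03d}" for i in range(100)]
sample = systematic_sample(firms, 10, random.Random(42))
print(len(sample))  # 10 members, one from each consecutive block of 10
```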

8
Q

______ _____ sampling uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics.

A

Stratified random

9
Q

_____ _____ is often used in bond indexing because of the difficulty and cost of completely replicating the entire population of bonds.

A

Stratified sampling
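
As a rough sketch of stratified random sampling applied to bonds: the population is split into groups by a classifying characteristic, and a random sample is drawn from each group in proportion to its size. The function name stratified_sample, the rating field, and the bond data below are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(items, key, fraction, rng=None):
    """Draw the same fraction at random from every stratum defined by key."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)  # classify each item into its stratum
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical bond universe: 20 AAA, 50 BBB, 30 CCC.
bonds = [
    {"id": i, "rating": "AAA" if i < 20 else "BBB" if i < 70 else "CCC"}
    for i in range(100)
]
sample = stratified_sample(bonds, lambda b: b["rating"], 0.10, random.Random(1))
counts = Counter(b["rating"] for b in sample)
print(counts["AAA"], counts["BBB"], counts["CCC"])  # 2 5 3
```

Because each stratum is sampled in proportion to its size, the sample keeps the rating mix of the full population.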

10
Q

In _____ sampling, we are assuming that each subset is representative of the overall population with respect to the item we are sampling.

A

Cluster

11
Q

In _____ cluster sampling, a random sample of clusters is selected and all the data in those clusters comprise the sample.

A

One-stage

12
Q

In _____ cluster sampling, random samples from each of the selected clusters comprise the sample.

A

Two-stage
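
The difference between one-stage and two-stage cluster sampling can be sketched as follows. The function names, the region clusters, and the sample sizes are assumptions chosen for illustration only.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Pick clusters at random and keep every member of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Pick clusters at random, then draw a random subsample within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sample

# Hypothetical population: 5 regions with 20 firms each.
clusters = {
    f"region_{r}": [f"region_{r}_firm_{i}" for i in range(20)] for r in range(5)
}
rng = random.Random(3)
one_stage = one_stage_cluster_sample(clusters, 2, rng)
two_stage = two_stage_cluster_sample(clusters, 2, 5, rng)
print(len(one_stage), len(two_stage))  # 40 10
```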

13
Q

_____ cluster sampling can be expected to have greater sampling error than one-stage cluster sampling.

A

Two-stage

14
Q

Lower cost and less time required to assemble the sample are the primary advantages of _____ sampling.

A

Cluster

15
Q

_____ sampling may be most appropriate for a smaller pilot study.

A

Cluster

16
Q

_____ sampling refers to selecting sample data based on its ease of access, using data that are readily available.

A

Convenience

17
Q

_____ sampling refers to samples for which each observation is selected from a larger data set by the researcher, based on her experience and judgment.

A

Judgmental

18
Q

Central limit theorem: If the sample size n is sufficiently large (n ≥ _____ ), the sampling distribution of the sample means will be approximately normal.

A

30
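
A small simulation can make the theorem concrete. The sketch below assumes a skewed population (exponential with mean 1 and variance 1), draws many samples of size n = 30, and checks that the sample means center on the population mean with variance close to σ²/n. The seed, trial count, and choice of distribution are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)
n = 30         # sample size from the card
trials = 5000  # number of repeated samples (arbitrary)

# Each sample comes from an exponential population: mean 1, variance 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print(f"mean of sample means:     {mean_of_means:.4f} (population mean is 1)")
print(f"variance of sample means: {var_of_means:.4f} (sigma^2 / n is {1 / n:.4f})")
```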

19
Q

Formula: The variance of the distribution of sample means (central limit theorem)

A

σ² / n, where σ² is the population variance and n is the sample size

20
Q

Formula: The standard error of the sample mean when the standard deviation of the population, σ, is known.

A

σ / √n

21
Q

Formula: The standard error of the sample mean when the standard deviation of the population, σ, is unknown.

A

s / √n, where s is the sample standard deviation

22
Q

The value of the standard error of the sample mean _____ as the sample size _____.

A

Decreases, increases
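
A quick numeric check of this relationship, using the standard formula (standard deviation divided by the square root of n). The standard deviation of 20 and the sample sizes are made-up values.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the sample mean: std_dev / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return std_dev / math.sqrt(n)

# Hypothetical standard deviation of 20; quadrupling n halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(20.0, n))
# 25 4.0
# 100 2.0
# 400 1.0
```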

23
Q

Name the three (3) desirable properties of an estimator.

A

Unbiasedness, efficiency, consistency

24
Q

An _____ estimator is one for which the expected value of the estimator is equal to the parameter you are trying to estimate.

A

Unbiased

25
An _____ estimator is one for which the variance of its sampling distribution is smaller than all the other unbiased estimators of the parameter you are trying to estimate.
Efficient
26
_____ _____ are single (sample) values used to estimate population parameters.
Point estimates
27
A _____ _____ is a range of values in which the population parameter is expected to lie.
Confidence interval
28
In the context of confidence intervals, α is called _____ .
Level of significance
29
In the context of confidence intervals, 1 − α is called _____ .
Degree of confidence
30
A _____ _____ is usually constructed by adding and subtracting an appropriate value from the point estimate.
Confidence interval
31
In the context of confidence intervals, zα/2 is called _____ .
Reliability factor
32
For a 90% confidence interval, the reliability factor is equal to _____.
zα/2 = 1.645
33
For a 95% confidence interval, the reliability factor is equal to _____.
zα/2 = 1.960
34
For a 99% confidence interval, the reliability factor is equal to _____.
zα/2 = 2.575
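
The reliability factors on the last three cards can be recovered from the standard normal distribution, and a z-based confidence interval is the point estimate plus or minus the reliability factor times the standard error. The sketch below uses Python's statistics.NormalDist; the helper names and the inputs (mean 80, σ 15, n 36) are hypothetical.

```python
import math
from statistics import NormalDist

def reliability_factor(confidence):
    """Two-sided z reliability factor for a given degree of confidence."""
    alpha = 1 - confidence  # level of significance
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_confidence_interval(mean, sigma, n, confidence):
    """Point estimate +/- reliability factor * standard error."""
    margin = reliability_factor(confidence) * sigma / math.sqrt(n)
    return mean - margin, mean + margin

for level in (0.90, 0.95, 0.99):
    print(level, round(reliability_factor(level), 3))

# Hypothetical inputs: sample mean 80, known sigma 15, n = 36.
low, high = z_confidence_interval(80.0, 15.0, 36, 0.99)
print(round(low, 2), round(high, 2))
```

The computed 99% factor is about 2.576, which tables commonly round to 2.575 or 2.58.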
35
"After repeatedly taking samples of CFA candidates, administering the practice exam, and constructing confidence intervals for each sample's mean, 99% of the resulting confidence intervals will, in the long run, include the population mean." This is an example of a _____ interpretation.
Probabilistic
36
"We are 99% confident that the population mean score is between 73.55 and 86.45 for candidates from this population." This is an example of a _____ interpretation.
Practical
37
If the population has a normal distribution with a known variance, a confidence interval for the population mean is built using the _____-statistic.
z-statistic
38
If the population has a normal distribution with an unknown variance, a confidence interval for the population mean is built using the _____-statistic.
t-statistic
39
Name the two steps to finding a t-value in a table.
1. Compute the degrees of freedom (n − 1). 2. Find the appropriate level of alpha or significance (one-tailed test: α; two-tailed test: α/2).
40
We cannot create a confidence interval if the distribution is _____ and the sample size is _____.
Non-normal, less than 30
41
The resampling method in which we calculate multiple sample means, each with one of the observations removed from the sample, is called _____.
Jackknife
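
A minimal sketch of the jackknife applied to the standard error of the mean: compute one mean per left-out observation, then scale the spread of those means. The scaling factor (n − 1)/n is the usual jackknife convention and is not stated on the card; the function name and data are hypothetical.

```python
import math
import statistics

def jackknife_standard_error(data):
    """Jackknife estimate of the standard error of the sample mean."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(data)
    # One mean per observation, each computed with that observation removed.
    loo_means = [(total - x) / (n - 1) for x in data]
    center = statistics.fmean(loo_means)
    return math.sqrt((n - 1) / n * sum((m - center) ** 2 for m in loo_means))

data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9]  # hypothetical observations
se_jackknife = jackknife_standard_error(data)
se_formula = statistics.stdev(data) / math.sqrt(len(data))
print(se_jackknife, se_formula)
```

For the sample mean specifically, this jackknife estimate matches s / √n up to floating-point rounding, which makes it a convenient sanity check.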
42
The resampling method in which we draw repeated samples of size n from the full data set and directly calculate the standard deviation of these sample means as our estimate of the standard error is called _____.
Bootstrap
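
A minimal sketch of the bootstrap estimate of the standard error of the mean. It assumes the repeated samples are drawn with replacement, as is standard for the bootstrap; the function name, the number of resamples, the seed, and the simulated return data are hypothetical.

```python
import random
import statistics

def bootstrap_standard_error(data, n_resamples=2000, rng=None):
    """Standard deviation of the means of resamples drawn with replacement."""
    rng = rng or random.Random()
    n = len(data)
    means = [
        statistics.fmean(rng.choices(data, k=n))  # resample of size n
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)

rng = random.Random(7)
returns = [rng.gauss(0.05, 0.20) for _ in range(40)]  # hypothetical returns
se_bootstrap = bootstrap_standard_error(returns, n_resamples=2000, rng=rng)
se_formula = statistics.stdev(returns) / len(returns) ** 0.5
print(se_bootstrap, se_formula)
```

The two printed values should be close but not identical, since the bootstrap figure depends on the random resamples.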
43
Name the two potential limitations of larger sample sizes.
1. May contain observations from a different population (distribution) and reduce the precision of our population parameter estimates 2. Higher cost
44
The bias in which analysts repeatedly use the same database to search for patterns or trading rules until one that "works" is discovered is called _____.
Data snooping
45
The bias that occurs when some data is systematically excluded from the analysis, usually because of a lack of availability, is called _____.
Sample selection bias
46
Most mutual fund databases only include funds currently in existence. They do not include funds that have ceased to exist due to closure or merger. This is an example of the _____ bias.
Survivorship
47
The _____ bias occurs when a study tests a relationship using sample data that was not available on the test date.
Look-ahead
48
The _____ bias can occur if the time period over which the data is gathered is either too short or too long.
Time-period