5.2: Confidence Intervals, Resampling, and Sampling Biases Flashcards

1
Q

Definition of Confidence Interval

A

A range for which one can assert with a given probability 1 − α, called the degree of confidence, that it will contain the parameter it is intended to estimate.

2
Q

How is a standard normal random variable denoted?

A

Z

3
Q

What does zα denote?

A

The point of the standard normal distribution such that α of the probability remains in the right tail.

4
Q

What are the considerations for Confidence Intervals for the Population Mean?

A

1) Whether the population is normally distributed,
2) The sample size,
3) Whether the population variance is known.

5
Q

How do you calculate a confidence interval for the population mean when the population is normally distributed with known variance?

A

x̄ ± zα/2 × σ / √n

σ: population standard deviation
n: sample size
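
A minimal Python sketch of this interval, using only the standard library. The function name and the input values (sample mean 5.0, σ = 2.0, n = 100) are illustrative assumptions, not part of the card.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for the population mean when sigma is known."""
    alpha = 1.0 - confidence
    # z_{alpha/2}: the point that leaves alpha/2 of the probability in the right tail
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs chosen only for illustration
lower, upper = z_confidence_interval(sample_mean=5.0, sigma=2.0, n=100)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # 95% CI: (4.608, 5.392)
```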

6
Q

Reliability Factors for Confidence Intervals Based on the Standard Normal Distribution:

90% confidence intervals: Use z0.05 = ?
95% confidence intervals: Use z0.025 = ?
99% confidence intervals: Use z0.005 = ?

A

z0.05 = 1.65
z0.025 = 1.96
z0.005 = 2.58
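
These reliability factors can be checked with Python's standard library, as a rough sketch. The variable names are illustrative. The unrounded 90% factor is about 1.645, which the card rounds to 1.65.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

reliability_factors = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves alpha/2 of the probability in the right tail
    reliability_factors[confidence] = std_normal.inv_cdf(1.0 - alpha / 2.0)

for confidence, z in reliability_factors.items():
    print(f"{confidence:.0%} confidence: z = {z:.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```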

7
Q

Formula for Confidence Intervals for the Population Mean—The z-Alternative (Large Sample, Population Variance Unknown):

A

x̄ ± zα/2 × s / √n

s: sample standard deviation

8
Q

In which cases is the t-distribution appropriate for calculating confidence intervals for the population mean?

A

1) When the population variance is unknown.

2) When the variance is unknown, even if the sample is large and a z reliability factor could be justified by the central limit theorem (CLT).

9
Q

In which cases can the t-distribution be used for calculating confidence intervals for the population mean?

A

The sample is large,

Or the sample is small, but the population is normally distributed, or approximately normally distributed.

10
Q

Confidence Intervals for the Population Mean (Population Variance Unknown) - t-Distribution formula?

A

x̄ ± tα/2 × s / √n

s: sample standard deviation
tα/2: t reliability factor with n − 1 degrees of freedom
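
A short Python sketch of the t-based interval. The standard library has no t quantile function, so the critical value is hard-coded from a t-table (2.064 for 24 degrees of freedom at the 95% level). The function name and inputs (x̄ = 10, s = 5, n = 25) are assumptions for illustration.

```python
from math import sqrt


def t_confidence_interval(sample_mean, sample_std, n, t_critical):
    """Return (lower, upper) using a caller-supplied t reliability factor."""
    margin = t_critical * sample_std / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample summary: n = 25, so degrees of freedom = n - 1 = 24.
# 2.064 is the t-table value for 24 degrees of freedom, 2.5% in each tail.
t_lower, t_upper = t_confidence_interval(10.0, 5.0, 25, t_critical=2.064)
print(f"95% t-interval: ({t_lower:.3f}, {t_upper:.3f})")  # (7.936, 12.064)

# For comparison, the z factor 1.96 gives a narrower interval on the same data.
z_lower, z_upper = t_confidence_interval(10.0, 5.0, 25, t_critical=1.96)
print(f"95% z-interval: ({z_lower:.3f}, {z_upper:.3f})")  # (8.040, 11.960)
```

The t-interval is wider because the t reliability factor exceeds the z factor for any finite number of degrees of freedom.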

11
Q

The Bootstrap Resampling Method is often referred to as?

A

Model-free or non-parametric resampling.

12
Q

How does the bootstrap resampling method work?

A

It mimics the process of random sampling from the population by repeatedly drawing resamples, with replacement, from the observed sample, treating that sample as if it were the population.

Often used to find standard error or construct confidence intervals of population parameters.

13
Q

What is the bootstrap formula for estimating the standard error of the sample mean?

(The bootstrap sampling distribution approximates the true sampling distribution.)

A

$$s_{\bar{X}} = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}_b - \bar{\theta}\right)^2}$$

$B$ denotes the number of resamples drawn from the original sample,

$\hat{\theta}_b$ denotes the mean of the $b$-th resample,

$\bar{\theta}$ denotes the mean across all the resample means.
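
The formula can be sketched in Python as follows. The sample values, the number of resamples (1,000), and the seed are arbitrary assumptions chosen for illustration. Each resample has the same size as the original sample and is drawn with replacement.

```python
import random
from math import sqrt
from statistics import mean


def bootstrap_standard_error(sample, num_resamples=1000, seed=0):
    """Estimate the standard error of the sample mean by bootstrapping."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    resample_means = []
    for _ in range(num_resamples):
        resample = rng.choices(sample, k=n)  # n draws with replacement
        resample_means.append(mean(resample))  # theta-hat_b
    grand_mean = mean(resample_means)  # theta-bar
    squared_devs = sum((m - grand_mean) ** 2 for m in resample_means)
    return sqrt(squared_devs / (num_resamples - 1))


# Hypothetical observed sample
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]
se = bootstrap_standard_error(sample)
print(f"Bootstrap standard error of the mean: {se:.3f}")
```

For the sample mean, the result should land close to the analytic value s / √n, which gives a quick sanity check.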

14
Q

What is the Jackknife resampling method?

A

A resampling method that repeatedly draws samples by taking the original observed data sample and leaving out one observation at a time (without replacement) from the set.
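
A minimal Python sketch of the leave-one-out idea. The standard error formula used here, sqrt((n − 1)/n × Σ(θ̂ᵢ − θ̄)²), is the usual jackknife estimator and is not stated on the card. The sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def jackknife_standard_error(sample):
    """Estimate the standard error of the sample mean by leaving one out."""
    n = len(sample)
    # One resample per observation: the original sample minus observation i
    leave_one_out_means = [mean(sample[:i] + sample[i + 1:]) for i in range(n)]
    grand_mean = mean(leave_one_out_means)
    squared_devs = sum((m - grand_mean) ** 2 for m in leave_one_out_means)
    return sqrt((n - 1) / n * squared_devs)


# Hypothetical observed sample
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]
jk_se = jackknife_standard_error(data)
print(f"Jackknife standard error of the mean: {jk_se:.4f}")  # 0.3796
```

Unlike the bootstrap, the jackknife involves no random draws, so it produces the same result on every run. For the sample mean it reproduces s / √n exactly.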

15
Q

What is Data Snooping?

A

Data Snooping relates to overuse of the same or related data through extensive searching through a dataset for statistically significant patterns.
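
A rough simulation can illustrate why extensive searching is a problem. In this sketch, every "strategy" is pure noise with a true mean return of zero, yet some still look statistically significant at the 5% level. The function name, the number of strategies, the number of periods, and the seed are all assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev


def count_spurious_signals(num_strategies=200, num_periods=60, seed=42):
    """Count pure-noise strategies whose mean return looks significant."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    significant = 0
    for _ in range(num_strategies):
        # Returns with a true mean of zero: there is no real pattern to find
        returns = [rng.gauss(0.0, 1.0) for _ in range(num_periods)]
        t_stat = mean(returns) / (stdev(returns) / sqrt(num_periods))
        if abs(t_stat) > 1.96:  # approximate two-sided 5% cutoff
            significant += 1
    return significant


spurious = count_spurious_signals()
print(f"{spurious} of 200 noise strategies appear significant")
```

Roughly 5% of the noise strategies are expected to pass the test by chance alone, so reporting only the winners of such a search overstates the evidence.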

16
Q

What is Sample Selection Bias?

A

Systematically excluding some members of the population according to a particular attribute.

Example: the bias introduced when limited data availability leads to certain observations being excluded from the analysis.

17
Q

Definition of Survivorship Bias is…

A

The exclusion of poorly performing or defunct companies from an index or database, biasing the index or database toward financially healthy companies.
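
The effect can be illustrated with a small simulation. The return distribution (mean 5%, standard deviation 20%), the rule that funds losing more than 10% drop out of the database, and the seed are all hypothetical assumptions.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical annual returns for 1,000 funds
all_returns = [rng.gauss(0.05, 0.20) for _ in range(1000)]

# Assumed rule: funds that lose more than 10% close and leave the database
survivors = [r for r in all_returns if r > -0.10]

all_mean = mean(all_returns)
survivor_mean = mean(survivors)
print(f"All funds:      mean return {all_mean:.2%} ({len(all_returns)} funds)")
print(f"Survivors only: mean return {survivor_mean:.2%} ({len(survivors)} funds)")
```

Because only the worst performers are removed, the average over the surviving funds is higher than the average over all funds.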

18
Q

Implicit Selection Bias is introduced through…

A

The presence of a threshold that filters out some unqualified members.

19
Q

What is the name of the bias that arises when hedge funds are added to databases and indexes only after they are initially successful?

A

Backfill Bias

20
Q

What is the name of the bias that occurs when a test uses information that was not available on the test date?

A

Look-ahead Bias

21
Q

What is the name for the possibility that, when we use a time-series sample, our statistical conclusion may be sensitive to the starting and ending dates of the sample?

A

Time-period Bias