Module 10.2: Confidence Intervals and t-Distribution Flashcards

1
Q

What are point estimates?

A

Single values used to estimate population parameters. A common point estimate is the sample mean, which is used to estimate the population mean.

2
Q

What is a confidence interval?

A

A range of values within which the population parameter is expected to lie, with a stated probability.

3
Q

What is Student’s t-distribution?

A

A bell-shaped probability distribution that is symmetrical about its mean. It is the appropriate distribution to use when constructing confidence intervals based on small samples from populations with unknown variance and a normal (or approximately normal) distribution.

4
Q

What are the properties of a Student’s t-distribution?

A

1) It is symmetrical.
2) It is defined by a single parameter, the degrees of freedom (df), which equal the number of sample observations minus 1 (n − 1).
3) It has more probability in the tails ("fatter tails") than the normal distribution.
4) As the degrees of freedom get larger, the shape of the t-distribution more closely approaches a standard normal distribution.
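
Properties 3 and 4 can be checked numerically. The sketch below is only an illustration: it uses the standard t density formula and Simpson's rule to approximate the upper-tail probability beyond 1.96, then compares it with the standard normal tail. The helper names `t_pdf` and `t_upper_tail` are made up for this example.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(x: float, df: float, steps: int = 2000) -> float:
    """Approximate P(T > x) for x >= 0.

    By symmetry, P(T > x) = 0.5 - integral of the density over [0, x].
    The integral uses Simpson's rule, so steps must be even.
    """
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


normal_tail = 1 - NormalDist().cdf(1.96)
print(f"normal:  P(Z > 1.96) = {normal_tail:.4f}")
for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: P(T > 1.96) = {t_upper_tail(1.96, df):.4f}")
```

The tail probability should fall toward the normal value as df grows, which is the "fatter tails that thin out" behavior listed above.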

5
Q

How does a t-distribution compare to a normal distribution?

A

It is flatter, with more area under the tails. As the degrees of freedom of a t-distribution increase, its shape approaches that of a normal distribution.

6
Q

When is it easier to reject the null hypothesis: when using the t-distribution or the normal distribution (z-distribution)?

A

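When using the normal (z) distribution. Because the t-distribution has more area under the tails, its critical values are larger in absolute value at the same significance level, so confidence intervals are wider and a test statistic must lie further from the center of the distribution before the null is rejected. The difference shrinks as the degrees of freedom increase.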
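
A quick way to check which distribution makes rejection easier is to compare critical values directly. In this sketch the t-values are copied from a standard t-table (2.5% in each tail), and both the test statistic of 2.10 and the helper `rejects` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Two-tailed 5% critical values from a standard t-table (tail probability 0.025).
T_CRITICAL_025 = {5: 2.571, 10: 2.228, 30: 2.042}

# Matching critical value for the standard normal distribution (about 1.960).
z_critical = NormalDist().inv_cdf(0.975)


def rejects(test_statistic: float, critical_value: float) -> bool:
    """Two-tailed test: reject the null when |statistic| exceeds the critical value."""
    return abs(test_statistic) > critical_value


stat = 2.10  # hypothetical test statistic
print(f"z  (critical {z_critical:.3f}): reject = {rejects(stat, z_critical)}")
for df, crit in T_CRITICAL_025.items():
    print(f"t, df={df:>2} (critical {crit:.3f}): reject = {rejects(stat, crit)}")
```

Every t critical value in the table exceeds the z value, so the same statistic can reject under z yet fail to reject under t when df is small.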

7
Q

What is a confidence interval? Also explain the level of significance.

A

A confidence interval estimate is a range of values within which the actual value of a parameter will lie with probability 1 − α. Here α is called the level of significance for the confidence interval, and 1 − α is the degree of confidence.

Formula: confidence interval = point estimate ± (reliability factor × standard error)

8
Q

What is the reliability factor?

A

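A number (such as a z-value or t-value) that depends on the sampling distribution of the point estimate and on the probability that the point estimate falls in the confidence interval, 1 − α.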

9
Q

How is a confidence interval calculated in a normal distribution with a known variance?

A

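sample mean ± zα/2 × standard error, where zα/2 is the reliability factor and the standard error is σ / √n (the known population standard deviation divided by the square root of the sample size).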
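
As a minimal sketch of this calculation, the function below builds the interval from a sample mean, a known population standard deviation, and the sample size. The function name and the input numbers are hypothetical; the reliability factor comes from the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Confidence interval for the mean of a normal population with known variance."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # reliability factor
    standard_error = population_sd / math.sqrt(n)  # sigma / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs: mean 80, known sigma 15, sample of 36 observations.
low, high = z_confidence_interval(sample_mean=80.0, population_sd=15.0, n=36)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 75.10 to 84.90
```

Here the standard error is 15 / 6 = 2.5, so the interval is roughly 80 ± 1.96 × 2.5.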

10
Q

What are the most commonly used standard normal distribution reliability factors?

A

1) zα/2 = 1.645 for 90% confidence intervals (the significance level is 10%, 5% in each tail).
2) zα/2 = 1.960 for 95% confidence intervals (the significance level is 5%, 2.5% in each tail).
3) zα/2 = 2.575 for 99% confidence intervals (the significance level is 1%, 0.5% in each tail).
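
These factors can be reproduced from the inverse standard normal CDF. The sketch below uses Python's standard library; the helper name `z_reliability_factor` is made up for this example.

```python
from statistics import NormalDist


def z_reliability_factor(confidence: float) -> float:
    """Return the z-value that leaves alpha/2 in each tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: z = {z_reliability_factor(confidence):.3f}")
```

To three decimals the 99% factor comes out as 2.576; the 2.575 quoted in many tables is a common rounding of the same value.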

11
Q

How do you calculate confidence intervals for a normal population with unknown variance?

A

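sample mean ± tα/2 × standard error, where tα/2 is the t reliability factor with n − 1 degrees of freedom and the standard error is s / √n (the sample standard deviation divided by the square root of the sample size).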
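
A small sketch of the same calculation in code. It assumes the t-value has already been looked up in a t-table for n − 1 degrees of freedom; the function name and the sample data are hypothetical.

```python
import math
from statistics import mean, stdev


def t_confidence_interval(sample, t_value):
    """Confidence interval for the mean when the population variance is unknown.

    t_value is the t reliability factor for df = len(sample) - 1,
    taken from a t-table by the caller.
    """
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # s / sqrt(n), s uses n - 1
    margin = t_value * standard_error
    center = mean(sample)
    return center - margin, center + margin


returns = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical sample, n = 5
# t-table value for df = 4 with 2.5% in each tail (95% confidence).
low, high = t_confidence_interval(returns, t_value=2.776)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

With only five observations the reliability factor (2.776) is much larger than the 1.960 used for a z-interval, so the interval is correspondingly wider.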

12
Q

What are the steps to pull the t-values from the t-table?

A

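1) Calculate the degrees of freedom, which is n − 1 (the sample size minus 1).
2) Find the appropriate tail probability: for a two-tailed confidence interval with significance level α, use α/2.
3) Look up the t-value in the table at the intersection of the df row and the tail-probability column.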
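
The three steps map onto a simple table lookup. The sketch below holds only a small excerpt of a standard one-tailed t-table (three df rows), and the names `T_TABLE` and `lookup_t` are hypothetical.

```python
# Excerpt of a one-tailed t-table: outer keys are df, inner keys are tail probabilities.
T_TABLE = {
    4: {0.05: 2.132, 0.025: 2.776, 0.005: 4.604},
    9: {0.05: 1.833, 0.025: 2.262, 0.005: 3.250},
    29: {0.05: 1.699, 0.025: 2.045, 0.005: 2.756},
}


def lookup_t(n: int, confidence: float) -> float:
    """Return the t reliability factor for a two-tailed interval."""
    df = n - 1                            # step 1: degrees of freedom
    tail = round((1 - confidence) / 2, 3) # step 2: alpha / 2 in each tail
    return T_TABLE[df][tail]              # step 3: read the table


print(lookup_t(n=10, confidence=0.95))  # 2.262
```

A sample size or confidence level outside this excerpt raises a `KeyError`, which is the code equivalent of needing a fuller table.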

13
Q

How do you construct a confidence interval for a distribution that is nonnormal? Can it be achieved?

A

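1) If the distribution is nonnormal but the population variance is known, the z-statistic can be used as long as the sample size is large (a common rule of thumb is n ≥ 30), because the central limit theorem makes the sampling distribution of the mean approximately normal.
2) If the distribution is nonnormal and the population variance is unknown, the t-statistic can be used as long as the sample size is large (n ≥ 30).
3) If the distribution is nonnormal and the sample is small, neither statistic gives a reliable confidence interval.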
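
These rules of thumb can be summarized as a small decision function. This is a sketch under stated assumptions: the name `choose_statistic` is hypothetical, and the cutoff of 30 (including whether the boundary itself counts as "large") is an assumption based on the common n ≥ 30 convention.

```python
from typing import Optional

LARGE_SAMPLE = 30  # assumed rule-of-thumb threshold for a "large" sample


def choose_statistic(population_is_normal: bool,
                     variance_known: bool,
                     n: int) -> Optional[str]:
    """Return "z", "t", or None when no reliable statistic is available."""
    is_large = n >= LARGE_SAMPLE
    if not population_is_normal and not is_large:
        return None  # small sample from a nonnormal population
    return "z" if variance_known else "t"


print(choose_statistic(population_is_normal=False, variance_known=True, n=50))   # z
print(choose_statistic(population_is_normal=False, variance_known=False, n=50))  # t
print(choose_statistic(population_is_normal=False, variance_known=False, n=12))  # None
```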

14
Q

What happens if the sampling of the population is not random?

A

The central limit theorem doesn’t apply, our estimates won’t have the desirable properties, and we can’t form unbiased confidence intervals.

15
Q

What are the two downsides to “larger is better” when it comes to selecting an appropriate sample size?

A

1) Risk: a larger sample may include observations from a different population.
2) Cost: a larger sample is more expensive to collect.

16
Q

What is data mining? What is data-mining bias?

A

Data mining occurs when analysts repeatedly use the same database to search for patterns or trading rules until one that “works” is discovered.

Data-mining bias refers to results where the statistical significance of the pattern is overestimated because the results were found through data mining.
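
A small simulation shows why significance is overstated. The sketch below tests 200 hypothetical "trading rules" whose returns are pure noise with a true mean of zero; all parameters (seed, counts, the rounded critical value of 2.0) are assumptions chosen only for illustration.

```python
import math
import random
from statistics import mean, stdev

random.seed(42)     # fixed seed so the run is repeatable
N_RULES = 200       # number of hypothetical rules tried on the same data set
N_OBS = 60          # observations per rule
T_CRITICAL = 2.0    # rounded two-tailed 5% critical value for 59 df

false_positives = 0
for _ in range(N_RULES):
    # Pure noise: the rule has no real predictive power (true mean is 0).
    returns = [random.gauss(0.0, 1.0) for _ in range(N_OBS)]
    t_stat = mean(returns) / (stdev(returns) / math.sqrt(N_OBS))
    if abs(t_stat) > T_CRITICAL:
        false_positives += 1

print(f"{false_positives} of {N_RULES} worthless rules look significant at 5%")
```

At a 5% significance level roughly 1 in 20 worthless rules is expected to pass by chance, so searching until something "works" almost guarantees a spurious finding.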

17
Q

What is sample selection bias?

A

Occurs when some data is systematically excluded from the analysis, usually because of a lack of availability.

18
Q

What is survivorship bias?

A

Occurs when the sample includes only investments that have “survived”, not the ones that have failed.
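
A toy example of the effect, using entirely hypothetical fund returns: dropping the failed funds from the sample inflates the average return.

```python
from statistics import mean

# Hypothetical annual returns (%) as (name, return, still_operating) tuples.
funds = [
    ("A", 8.0, True),
    ("B", 6.0, True),
    ("C", 7.0, True),
    ("D", -12.0, False),  # failed fund
    ("E", -9.0, False),   # failed fund
]

all_funds_mean = mean(ret for _, ret, _ in funds)
survivors_mean = mean(ret for _, ret, alive in funds if alive)

print(f"All funds:      {all_funds_mean:.1f}%")   # All funds:      0.0%
print(f"Survivors only: {survivors_mean:.1f}%")   # Survivors only: 7.0%
```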

19
Q

What is look-ahead bias?

A

Occurs when a study uses sample data that was not available on the test date, for example using actual reported figures at a date when only estimates were available.

20
Q

What is time-period bias?

A

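Occurs when the time period over which the data is collected is either too short or too long. Too short a period may capture results specific to that period; too long a period may span changes in the underlying relationship.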