Module 10.2: Confidence Intervals and t-Distribution Flashcards

Question 1

Q

What are point estimates?

Answer

A

Single values used to estimate population parameters. A popular point estimate is the sample mean, which is used to estimate the population mean.

Question 2

Q

What is a confidence interval?

Answer

A

A range of values in which the population parameter is expected to lie.

Question 3

Q

What is Student’s t-distribution?

Answer

A

A bell-shaped probability distribution that is symmetrical about its mean. It is the appropriate distribution to use when constructing confidence intervals based on small samples from populations with unknown variance and a normal distribution.

Question 4

Q

What are the properties of a Student’s t-distribution?

Answer

A

1) It is symmetrical.
2) It is defined by a single parameter, the degrees of freedom (df), where the degrees of freedom are equal to the number of sample observations minus 1 (n - 1).
3) It has more probability in the tails than the normal distribution.
4) As the degrees of freedom get larger, the shape of the t-distribution more closely approaches that of a standard normal distribution.

Question 5

Q

How does a t-distribution compare to a normal distribution?

Answer

A

It is flatter, with more area under the tails. As the degrees of freedom of a t-distribution increase, its shape approaches that of a normal distribution.
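
The comparison can be checked numerically. The following is a minimal sketch, assuming the textbook density formulas for the standard normal and Student's t-distributions; the function names `normal_pdf` and `t_pdf` are chosen here for illustration and do not come from the flashcards.

```python
import math

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # lgamma is used instead of gamma to avoid overflow for large df.
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

# Height at the center (x = 0) and in the tail (x = 3) for several df values.
for df in (5, 30, 1000):
    print(df, round(t_pdf(0.0, df), 4), round(t_pdf(3.0, df), 4))
print("normal", round(normal_pdf(0.0), 4), round(normal_pdf(3.0), 4))
```

With few degrees of freedom the t density is lower at the center and higher in the tail than the normal density, and the gap shrinks as df grows.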

Question 6

Q

When is it easier to reject the null: when using the t-distribution or the normal distribution (z-distribution)?

Answer

A

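When using the normal distribution (z-distribution). Because the t-distribution has more area under the tails, its critical values are larger in absolute value at the same level of significance (for example, 2.776 for df = 4 versus 1.960 for z at the 5% level, two-tailed). The test statistic must therefore lie further from the center of the distribution to reject the null, and confidence intervals based on t are wider.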

Question 7

Q

What is a confidence interval? Also explain the level of significance.

Answer

A

A confidence interval estimate is a range of values within which the actual value of a parameter will lie with probability 1 - α. Here α is called the level of significance for the confidence interval, and 1 - α is the degree of confidence.

Formula for confidence intervals = point estimate +/- (reliability factor x standard error)

Question 8

Q

What is the reliability factor?

Answer

A

A number that depends on the sampling distribution of the point estimate and on the probability that the point estimate falls in the confidence interval, (1 - α).

Question 9

Q

How is a confidence interval calculated in a normal distribution with a known variance?

Answer

A

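point estimate (sample mean) +/- zα/2 x standard error, where zα/2 is the reliability factor and standard error = σ / √n (σ is the known population standard deviation and n is the sample size).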
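
As an illustration of this formula, here is a minimal sketch. The function name `z_confidence_interval` and the input numbers are hypothetical and chosen only for the example; the reliability factor is passed in as a table value.

```python
import math

def z_confidence_interval(sample_mean, population_sd, n, z):
    """Return (lower, upper) for: mean +/- z * (sigma / sqrt(n))."""
    standard_error = population_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: sample mean 80, known sigma 15, n = 36, 95% confidence.
lower, upper = z_confidence_interval(80.0, 15.0, 36, z=1.960)
print(round(lower, 2), round(upper, 2))
```

Here the standard error is 15 / 6 = 2.5, so the interval is roughly 75.1 to 84.9.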

Question 10

Q

What are the most commonly used standard normal distribution reliability factors?

Answer

A

1) zα/2 = 1.645 for 90% confidence intervals (the significance level is 10%, 5% in each tail).
2) zα/2 = 1.960 for 95% confidence intervals (the significance level is 5%, 2.5% in each tail).
3) zα/2 = 2.575 for 99% confidence intervals (the significance level is 1%, 0.5% in each tail).
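
These factors can be reproduced from the standard normal distribution. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_reliability_factor` is made up for this example.

```python
from statistics import NormalDist

def z_reliability_factor(confidence: float) -> float:
    """Two-tailed z value that leaves alpha/2 in each tail."""
    alpha = 1.0 - confidence                         # level of significance
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper-tail cutoff

for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: z = {z_reliability_factor(confidence):.3f}")
```

The computed 99% value is about 2.576. The 2.575 on the card is a rounding that appears in some printed tables.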

Question 11

Q

How do you calculate confidence intervals for a normal population with unknown variance?

Answer

A

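point estimate (sample mean) +/- tα/2 x standard error, where tα/2 is the t reliability factor with n - 1 degrees of freedom and standard error = s / √n (s is the sample standard deviation).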
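
A minimal sketch of this calculation follows. The sample values are hypothetical, the function name `t_confidence_interval` is invented for the example, and the t-value is supplied by the caller from a t-table (2.776 is the two-tailed 95% value for df = 4).

```python
import math
import statistics

def t_confidence_interval(sample, t_value):
    """Return (lower, upper) for: mean +/- t * (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)        # sample standard deviation (n - 1 divisor)
    standard_error = s / math.sqrt(n)
    margin = t_value * standard_error
    return mean - margin, mean + margin

# Hypothetical sample of n = 5 observations, so df = 4.
sample = [10, 12, 9, 11, 13]
lower, upper = t_confidence_interval(sample, t_value=2.776)
print(round(lower, 3), round(upper, 3))
```

For this sample the mean is 11 and the interval is approximately 9.037 to 12.963.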

Question 12

Q

What are the steps to pull the t-values from the t-table?

Answer

A

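1) Calculate the degrees of freedom, which is n - 1 (the sample size minus 1).
2) Find the appropriate tail probability. For a two-tailed confidence interval this is α/2, where α is the level of significance.
3) Look up the t-value in the table at the intersection of the df row and the α/2 column.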
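
The lookup steps can be sketched in code. The dictionary below is only a small excerpt of a one-tailed t-table, and the names `T_TABLE` and `t_value` are hypothetical; a real table has many more rows and columns.

```python
# Excerpt of a t-table: df -> {one-tail probability: t-value}.
T_TABLE = {
    5:  {0.05: 2.015, 0.025: 2.571, 0.005: 4.032},
    10: {0.05: 1.812, 0.025: 2.228, 0.005: 3.169},
    20: {0.05: 1.725, 0.025: 2.086, 0.005: 2.845},
    30: {0.05: 1.697, 0.025: 2.042, 0.005: 2.750},
}

def t_value(n: int, alpha: float) -> float:
    """Look up the two-tailed t reliability factor for sample size n."""
    df = n - 1                   # step 1: degrees of freedom
    tail = round(alpha / 2, 4)   # step 2: probability in each tail
    return T_TABLE[df][tail]     # step 3: read the table (KeyError if not in the excerpt)

print(t_value(n=6, alpha=0.05))    # df = 5, 2.5% in each tail
print(t_value(n=21, alpha=0.10))   # df = 20, 5% in each tail
```

The excerpt raises a `KeyError` for any df or tail probability it does not contain, which is a deliberate simplification.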

Question 13

Q

How do you construct a confidence interval for a distribution that is nonnormal? Can it be achieved?

Answer

A

1) If the distribution is nonnormal but the population variance is known, the z-statistic can be used as long as the sample size is greater than 30.
2) If the distribution is nonnormal and the population variance is unknown, the t-statistic can be used as long as the sample size is greater than 30.
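
These rules, together with the earlier cards on normal populations, can be summarized as a small decision function. This is a sketch under stated assumptions: the name `reliability_statistic` is hypothetical, the cutoff follows the card's "greater than 30" wording (some texts use 30 or more), and the "not available" result for small nonnormal samples is inferred from the "as long as" condition rather than stated on the card.

```python
def reliability_statistic(is_normal: bool, variance_known: bool, n: int) -> str:
    """Return which statistic to use for the confidence interval."""
    large_sample = n > 30   # cutoff as worded on the card; an assumption
    if not is_normal and not large_sample:
        # Inferred: neither rule on the card applies to small nonnormal samples.
        return "not available"
    # Known variance -> z; unknown variance -> t.
    return "z" if variance_known else "t"

print(reliability_statistic(is_normal=False, variance_known=True, n=50))
print(reliability_statistic(is_normal=False, variance_known=False, n=50))
print(reliability_statistic(is_normal=False, variance_known=False, n=10))
```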

Question 14

Q

What happens if the sampling of the population is not random?

Answer

A

the central limit theorem doesn’t apply, our estimates won’t have the desirable properties, and we can’t form unbiased confidence intervals.

Question 15

Q

What are the two downsides to “larger is better” when it comes to selecting an appropriate sample size?

Answer

A

1) Risk of selecting observations from a different population.
2) Cost: a larger sample is more expensive to collect.

Question 16

Q

What is data mining? What is data mining bias?

Answer

A

Data mining occurs when analysts repeatedly use the same database to search for patterns or trading rules until one that “works” is discovered.

Data mining bias refers to results where the statistical significance of the pattern is overestimated because the results were found through data mining.
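
One way to see why significance is overestimated: if many rules are tested, some will look significant by chance alone. The sketch below assumes the tests are independent and that none of the rules has any real effect, which is a simplification; the function name is hypothetical.

```python
def prob_at_least_one_false_positive(num_rules: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_rules independent tests rejects by luck."""
    return 1.0 - (1.0 - alpha) ** num_rules

for k in (1, 10, 20, 100):
    print(k, round(prob_at_least_one_false_positive(k), 4))
```

Under these assumptions, testing 20 rules at the 5% level gives roughly a 64% chance that at least one appears to “work”.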

Question 17

Q

What is sample selection bias?

Answer

A

Occurs when some data is systematically excluded from the analysis, usually because of a lack of availability.

Question 18

Q

What is survivorship bias?

Answer

A

Occurs when the sample includes only investments that have “survived”, not the ones that have failed.

Question 19

Q

What is look-ahead bias?

Answer

A

Occurs when a study uses sample data that was not available on the test date. Using estimates instead of actuals.

Question 20

Q

What is time-period bias?

Answer

A

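Occurs when the time period over which the data is collected is either too short or too long. If it is too short, results may reflect phenomena specific to that period; if it is too long, the underlying relationships may have changed during the period.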