10.2: Confidence Intervals and t-distribution Flashcards

1
Q

What is a point estimate? How is it computed? Provide an example.

A

Point estimates are single sample values used to estimate population parameters.

Computation:
sample mean = sum of all sample values / sample size

The value generated is called the point estimate of the mean.
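
As a minimal sketch of the computation above, the snippet below takes a small sample and computes its mean. The return figures are hypothetical and exist only to illustrate the arithmetic.

```python
from statistics import mean

# Hypothetical sample of five annual returns, in percent (illustrative only).
sample_returns = [4.0, 7.5, -2.0, 10.0, 5.5]

# The sample mean serves as the point estimate of the population mean.
point_estimate = mean(sample_returns)
print(f"Point estimate of the mean: {point_estimate:.2f}%")
```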

2
Q

What is student’s t-distribution and when is it used? How does it compare to the normal distribution?

A

Student’s t-distribution is a bell-shaped probability distribution that is symmetrical about its mean. It is used when constructing confidence intervals based on small samples (where n < 30) from populations with a normal distribution and unknown variance.

Compared to the normal distribution, the t-distribution is flatter with fatter tails.

3
Q

What are the properties of student’s t-distribution?

A
  1. Symmetrical
  2. Defined by a single parameter, the degrees of freedom, which is equal to n - 1
  3. Has fatter tails than the normal distribution.
  4. As the degrees of freedom (and therefore the sample size) get larger, the shape of the t-distribution approaches a normal distribution.
4
Q

What happens to the t-distribution as the degrees of freedom increase? What happens when the degrees of freedom increase without bound?

A

As the degrees of freedom increase, the centre becomes more peaked and the tails become thinner.

When the degrees of freedom increase without bound, the t-distribution converges to the standard normal distribution (z-distribution).
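
The behaviour described above can be checked numerically. The sketch below evaluates the standard t-density formula at the centre (x = 0) and in the tail (x = 3) for several degrees of freedom and compares it with the standard normal density. The function name t_pdf and the chosen evaluation points are illustrative assumptions.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work in logs (lgamma) so that large df values do not overflow.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


z = NormalDist()  # standard normal distribution (mean 0, sd 1)

for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: peak={t_pdf(0, df):.4f}  density at x=3: {t_pdf(3, df):.5f}")
print(f"normal : peak={z.pdf(0):.4f}  density at x=3: {z.pdf(3):.5f}")
```

As df grows, the peak rises toward the normal peak and the tail density falls toward the normal tail density.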

5
Q

What is degrees of freedom?

A

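Degrees of freedom is the number of observations that are free to vary. For the t-distribution it is calculated as n - 1, where n is the sample size, because one degree of freedom is used up in estimating the sample mean.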

6
Q

What are fat tails an indication of?

A

Fat tails mean that outliers (observations far from the centre of the distribution) are more likely than they would be under a normal distribution.

7
Q

How are confidence intervals for a random variable that follows a t-distribution related to degrees of freedom?

A

For a given significance level, confidence intervals for a random variable that follows a t-distribution must be wider when there are fewer degrees of freedom (fatter tails) and narrower when there are more degrees of freedom (thinner tails).
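
To see why fewer degrees of freedom force a wider interval, the sketch below approximates two-tailed t critical values using only the standard library: it integrates the t-density with Simpson's rule and inverts the result by bisection. The helper names (t_pdf, t_cdf, t_critical), the step count, and the search range are assumptions chosen for illustration; a statistics library would normally be used instead.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: float, steps: int = 2000) -> float:
    """P(T <= x) for x >= 0: 0.5 plus Simpson's-rule integral of the pdf on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence: float, df: float) -> float:
    """Two-tailed critical value, found by bisection on [0, 100]."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Fewer degrees of freedom -> larger reliability factor -> wider interval.
for df in (4, 9, 29, 120):
    print(f"df={df:>3}: 95% two-tailed t critical value = {t_critical(0.95, df):.3f}")
```

The results can be compared against a standard t-table, which lists 2.776, 2.262, and 2.045 for 4, 9, and 29 degrees of freedom at the 95% level.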

8
Q

What is a confidence interval?

A

A confidence interval estimate is a range of values within which the actual value of a parameter is expected to lie with probability 1 - alpha, which is referred to as the degree of confidence.

9
Q

What is alpha?

A

Alpha is the level of significance for the confidence interval. The degree of confidence is 1 - alpha.

10
Q

How are confidence intervals constructed?

A

CIs are constructed by adding an appropriate value to, and subtracting it from, the point estimate:

Point estimate ± (reliability factor × standard error)

11
Q

How is the confidence interval for the population mean calculated, given that the population has a normal distribution with a known variance?

A

With known variance and a normal distribution, the CI is calculated as:

Point estimate of the population mean ± z reliability factor × (population standard deviation / square root of sample size)
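
A minimal sketch of this calculation follows. The function name z_confidence_interval and the input figures are hypothetical; the default reliability factor of 1.960 is the standard two-tailed value for a 95% interval.

```python
import math


def z_confidence_interval(sample_mean, sigma, n, reliability_factor=1.960):
    """CI for the mean of a normal population with known standard deviation sigma."""
    standard_error = sigma / math.sqrt(n)
    margin = reliability_factor * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs: sample mean 50, known sigma 10, sample size 100.
lower, upper = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```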

12
Q

What is the reliability factor for 90% CI?
What is the reliability factor for 95% CI?
What is the reliability factor for 99% CI?

A

Reliability factor for 90% CI = 1.645 (significance level is 10%, 5% in each tail)
Reliability factor for 95% CI = 1.960 (significance level is 5%, 2.5% in each tail)
Reliability factor for 99% CI = 2.575 (significance level is 1%, 0.5% in each tail)
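
These factors come from the inverse cumulative distribution function of the standard normal distribution, evaluated at 1 - alpha/2. The sketch below reproduces them with Python's standard library; the helper name z_reliability_factor is an assumption for illustration.

```python
from statistics import NormalDist


def z_reliability_factor(confidence: float) -> float:
    """Two-tailed z value that leaves alpha/2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


factors = {c: z_reliability_factor(c) for c in (0.90, 0.95, 0.99)}
for confidence, z in factors.items():
    print(f"{confidence:.0%} CI: z = {z:.3f}")
```

The 99% factor evaluates to roughly 2.576; tables often round it to 2.575 or 2.58.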

13
Q

How is the confidence interval for the population mean calculated, given that the population has a normal distribution with an unknown variance?

A

With unknown variance and normal distribution, CI is calculated as:

Point estimate of the population mean ± t reliability factor (corresponding to n - 1 degrees of freedom) × (sample standard deviation / square root of sample size)
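
A minimal sketch of this calculation, assuming a hypothetical sample of 10 observations. The t reliability factor of 2.262 is the standard t-table value for a two-tailed 95% interval with 9 degrees of freedom; a different sample size or confidence level needs a different table value.

```python
import math
from statistics import mean, stdev

# Hypothetical sample drawn from a normal population with unknown variance.
sample = [8, 10, 12, 9, 11, 10, 13, 7, 10, 10]

n = len(sample)
df = n - 1                      # degrees of freedom
t_factor = 2.262                # t-table value: 95% two-tailed, df = 9

x_bar = mean(sample)            # point estimate of the population mean
s = stdev(sample)               # sample standard deviation (divides by n - 1)
standard_error = s / math.sqrt(n)

lower = x_bar - t_factor * standard_error
upper = x_bar + t_factor * standard_error
print(f"95% CI with df={df}: ({lower:.3f}, {upper:.3f})")
```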

14
Q

How is the confidence interval created for a non-normal distribution?

A

If the sample size is less than 30 (n < 30), a reliable confidence interval cannot be constructed.

If the sample size is at least 30 (n ≥ 30):

  • if the variance is known, use the z-statistic
  • if the variance is unknown, use the t-statistic
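
The selection rules from this card and the two normal-population cards can be summarised as a small decision function. This is a sketch only: the function name choose_statistic and the convention of returning None when no statistic is available are assumptions, and the cut-off of 30 is taken as n ≥ 30.

```python
from typing import Optional


def choose_statistic(normal_population: bool, variance_known: bool, n: int) -> Optional[str]:
    """Return "z", "t", or None when no reliable statistic is available."""
    if not normal_population and n < 30:
        # Small sample from a non-normal population: no reliable interval.
        return None
    # Normal population (any n) or large sample (n >= 30):
    return "z" if variance_known else "t"


print(choose_statistic(normal_population=True, variance_known=False, n=12))   # t
print(choose_statistic(normal_population=False, variance_known=True, n=50))   # z
print(choose_statistic(normal_population=False, variance_known=False, n=12))  # None
```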
15
Q

What are the two limitations to using a larger sample size?

A
  1. Larger sample sizes may contain observations from a different population, which can reduce the precision of population parameter estimates.
  2. The cost of using a larger sample should be weighed against the value of the resulting increase in precision.
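
The cost trade-off in the second point follows from the standard error formula: precision improves with the square root of the sample size, so each halving of the standard error requires four times as many observations. A small sketch, assuming a hypothetical population standard deviation of 10:

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean."""
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (25, 100, 400, 1600):
    print(f"n={n:>4}: standard error = {standard_error(sigma, n):.2f}")
```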
16
Q

What is data mining? What are the warning signs of data mining?

A

Data mining occurs when analysts repeatedly use the same database to search for patterns or trading rules until one that works is discovered.

Warning signs:

  1. Evidence that many different variables were tested, most of which are unreported, until significant ones were found.
  2. The lack of any economic theory that is consistent with the empirical results.
17
Q

What is data-mining bias?

A

Data mining bias refers to results where the statistical significance of the pattern is overestimated because the results were found through data mining.
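
One way to see why significance is overstated: if many unrelated variables are each tested at a 5% significance level, the chance that at least one appears significant purely by luck grows quickly. The sketch below computes that probability under the simplifying assumption that the tests are independent; the function name is hypothetical.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests is significant by luck."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 100):
    p = prob_at_least_one_false_positive(k)
    print(f"{k:>3} variables tested: P(at least one spurious result) = {p:.3f}")
```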

18
Q

What is the best way to avoid data mining?

A

The best way to avoid data mining is to test a potentially profitable trading rule on a data set different from the one used to develop the rule.

19
Q

What is sample selection bias?

A

Sample selection bias occurs when some data is systematically excluded from the analysis because it is not available. This results in a non-random observed sample, and any conclusions drawn from this sample cannot be applied to the population.

20
Q

What is survivorship bias? What is an example of survivorship bias? What is the solution?

A

Survivorship bias results from excluding from the sample entities that no longer exist, so that the result is an overestimation.

Example: mutual fund databases that include only funds currently in existence and leave out funds that have ceased to exist.

Solution: use a sample of funds that all started at the same time, and do not drop funds that have since been removed.
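
The overestimation can be illustrated with a toy data set. The fund names, returns, and survival flags below are entirely hypothetical.

```python
from statistics import mean

# Hypothetical funds that all started at the same time: (name, return in %, still exists?)
funds = [
    ("Fund A", 8.0, True),
    ("Fund B", 6.0, True),
    ("Fund C", -4.0, False),   # closed during the period
    ("Fund D", 10.0, True),
    ("Fund E", -10.0, False),  # closed during the period
]

survivors_only = mean(r for _, r, alive in funds if alive)
full_sample = mean(r for _, r, _ in funds)

print(f"Average return, survivors only: {survivors_only:.1f}%")
print(f"Average return, full sample:    {full_sample:.1f}%")
```

Dropping the closed funds raises the measured average, which is the bias the card describes.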

21
Q

What is look-ahead bias?

A

Look-ahead bias occurs when a study tests a relationship using sample data that was not available on the test date.

22
Q

What is time-period bias?

A

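Time-period bias can result if the time period over which the data is gathered is either too short (results may reflect phenomena specific to that period) or too long (the underlying relationships may have changed).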