Week 4 Flashcards

1
Q
A

Stands for convergence in distribution

2
Q

Central Limit Theorem

A

establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (a bell curve) even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applied to many problems involving other types of distributions.

For an illustration (not a proof), cf. the Galton board.
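
The normalization described in this card can be checked with a small simulation. The following is a minimal sketch, not a proof: it assumes Uniform(0, 1) draws, and the sample size of 30, the seed, and the helper name standardized_sum are illustrative choices that do not come from the card.

```python
import random
import statistics

def standardized_sum(n, rng):
    """Sum n Uniform(0, 1) draws, then centre and scale the sum."""
    total = sum(rng.random() for _ in range(n))
    mean = n * 0.5            # expected value of the sum
    sd = (n / 12) ** 0.5      # SD of the sum; Var of Uniform(0, 1) is 1/12
    return (total - mean) / sd

rng = random.Random(0)        # fixed seed so the run is repeatable
samples = [standardized_sum(30, rng) for _ in range(20_000)]

sample_mean = statistics.fmean(samples)
sample_sd = statistics.pstdev(samples)
# For a standard normal variable, about 95% of the mass lies within +/- 1.96.
share_within_1_96 = sum(abs(z) <= 1.96 for z in samples) / len(samples)

print(sample_mean, sample_sd, share_within_1_96)
```

Although each individual draw is flat rather than bell-shaped, the standardized sums should show a mean near 0, a standard deviation near 1, and roughly 95% of values within 1.96 of zero.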

3
Q
A

Random sample sign

4
Q
A

α (the probability of a false positive) is called the size of a test

5
Q
A

power of a test

6
Q

Type I Error

A

A Type I error (sometimes called a Type 1 error) is the incorrect rejection of a true null hypothesis. The probability of a Type I error is usually denoted by the alpha symbol, α.
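
One way to see what the Type I error rate means is to simulate many samples for which the null hypothesis really is true and count how often a test rejects it. This is a sketch under stated assumptions: a two-sided z-test with known standard deviation 1, and illustrative values for ALPHA, N, and TRIALS that are not part of the card.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05      # chosen significance level (illustrative)
N = 25            # observations per simulated sample (illustrative)
TRIALS = 20_000   # number of simulated samples (illustrative)

rng = random.Random(1)
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-sided cutoff, about 1.96

rejections = 0
for _ in range(TRIALS):
    # Data are generated with mean 0 and SD 1, so H0: mean = 0 is true.
    sample = [rng.gauss(0, 1) for _ in range(N)]
    z = fmean(sample) / (1 / N ** 0.5)         # z statistic with known SD = 1
    if abs(z) > z_crit:
        rejections += 1                        # rejecting a true H0: a Type I error

type_i_rate = rejections / TRIALS
print(type_i_rate)
```

The observed rejection rate should land close to the chosen α of 0.05, because every rejection here is by construction a false positive.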

7
Q

Type II Error

A

A Type II error (sometimes called a Type 2 error) is the failure to reject a false null hypothesis. The probability of a Type II error is denoted by the beta symbol, β.
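
For a simple test, β can be computed directly rather than simulated. The sketch below assumes a one-sided z-test of H0: mean = 0 against a larger mean, with known standard deviation; the values of ALPHA, N, SIGMA, and TRUE_MEAN are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05      # significance level (illustrative)
N = 25            # sample size (illustrative)
SIGMA = 1.0       # known population SD (assumed)
TRUE_MEAN = 0.5   # actual mean, so H0: mean = 0 is false (assumed)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - ALPHA)      # reject H0 when z > z_crit

# When the true mean is TRUE_MEAN, the z statistic is normal with
# this mean and standard deviation 1.
shift = TRUE_MEAN * N ** 0.5 / SIGMA

beta = std_normal.cdf(z_crit - shift)       # P(fail to reject | H0 false)
power = 1 - beta                            # P(reject | H0 false)

print(beta, power)
```

With these assumed numbers β comes out a little under 0.2. Raising N or moving TRUE_MEAN further from 0 shrinks β, which is the same as increasing the power of the test.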

8
Q

Setting the alternative hypothesis

A

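The choice of the alternative hypothesis depends on the problem under analysis: use a two-sided alternative when a deviation from the null value in either direction matters, and a one-sided alternative when only one direction is of interest.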

9
Q

Defining the test statistic

A
10
Q

Computing the test statistic

A
11
Q

Critical value definition

A

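A critical value is a cutoff on the axis of the test statistic's distribution under the null hypothesis. It splits that axis into sections: one section (one-tailed test) or two sections (two-tailed test) form the “rejection region”; if your test statistic falls into that region, then you reject the null hypothesis.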
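
As a concrete case, the critical values of a z-test can be read off the standard normal distribution. This is a minimal sketch assuming a normal test statistic; the function names z_critical_values and in_rejection_region are hypothetical helpers, and the one-tailed branch assumes an upper-tail test.

```python
from statistics import NormalDist

def z_critical_values(alpha, two_sided=True):
    """Return the cutoff(s) that bound the rejection region of a z-test."""
    std_normal = NormalDist()
    if two_sided:
        cutoff = std_normal.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    return (std_normal.inv_cdf(1 - alpha),)    # upper-tail test only

def in_rejection_region(z, alpha, two_sided=True):
    """True when the test statistic z falls in the rejection region."""
    cutoffs = z_critical_values(alpha, two_sided)
    if two_sided:
        lower, upper = cutoffs
        return z < lower or z > upper
    return z > cutoffs[0]

lower, upper = z_critical_values(0.05)
print(lower, upper)                      # roughly -1.96 and 1.96
print(in_rejection_region(2.3, 0.05))    # True: 2.3 lies beyond the upper cutoff
```

The two-tailed test splits α across both tails, which is why its cutoff (about 1.96) is further out than the one-tailed cutoff (about 1.645) at the same α.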

12
Q
A
13
Q

Confidence intervals

A

A confidence interval expresses how much uncertainty there is in a particular statistic. Confidence intervals are often reported with a margin of error. For a poll or survey, the interval gives a range of values that plausibly contains what you would find if it were possible to survey the entire population. Confidence intervals are intrinsically connected to confidence levels: a 95% confidence level means that, over many repeated samples, about 95% of the intervals built this way would contain the true population value.
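
A confidence interval for a mean is the sample mean plus or minus a margin of error. The sketch below uses the normal approximation, which is an assumption that suits larger samples (a t quantile is the usual choice for small ones); the function name mean_confidence_interval and the simulated data are made up for illustration.

```python
import random
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    center = fmean(data)
    std_error = stdev(data) / n ** 0.5               # estimated SE of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error                           # the margin of error
    return center - margin, center + margin

# Hypothetical survey measurements, simulated for the example.
rng = random.Random(2)
data = [rng.gauss(50, 5) for _ in range(200)]

low_95, high_95 = mean_confidence_interval(data, 0.95)
low_99, high_99 = mean_confidence_interval(data, 0.99)
print(low_95, high_95)
print(low_99, high_99)
```

Asking for a higher confidence level widens the interval: the 99% interval always contains the 95% interval computed from the same data.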

14
Q

Test statistic formula

A

When you run a hypothesis test, you’ll use a distribution like a t-distribution or normal distribution. These have a known area, which enables you to calculate a probability value (p-value): the probability, assuming the null hypothesis is true, of a test statistic at least as extreme as the one you observed. A small p-value means your results would be unlikely under chance alone. The larger the absolute value of the test statistic, the smaller the p-value and the more likely you are to reject the null hypothesis.
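
The card names a formula without stating one, so as an illustration here is one common case: the one-sample z statistic, (sample mean - null mean) / (sigma / sqrt(n)). This is a sketch that assumes a known population standard deviation; the function names and the example numbers are hypothetical.

```python
from statistics import NormalDist

def z_statistic(sample_mean, null_mean, sigma, n):
    """One-sample z statistic, assuming the population SD sigma is known."""
    return (sample_mean - null_mean) / (sigma / n ** 0.5)

def two_sided_p_value(z):
    """Area in both tails of the standard normal beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical numbers: sample mean 52, H0 mean 50, sigma 5, n = 25.
z = z_statistic(sample_mean=52, null_mean=50, sigma=5, n=25)
p = two_sided_p_value(z)
print(z, p)    # z is 2.0 and p is about 0.046
```

A statistic further from zero leaves less area in the tails, so two_sided_p_value(3.0) is smaller than two_sided_p_value(2.0).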

15
Q

p-value

A

A p-value is used in hypothesis testing to help you decide whether to reject the null hypothesis (the commonly accepted fact). It is the probability, assuming the null hypothesis is true, of observing results at least as extreme as yours, so it measures the evidence against the null hypothesis. The smaller the p-value, the stronger the evidence that you should reject the null hypothesis.
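
For a small discrete problem the p-value can be computed exactly by adding up probabilities under the null hypothesis. The sketch below assumes a coin-flip experiment with H0: the coin is fair, and treats "at least as extreme" as "at least as far from the expected count", which is a reasonable reading for this symmetric case; the function names and the counts are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def exact_two_sided_p_value(observed, n, p_null=0.5):
    """P(count at least as far from its expected value as observed | H0)."""
    expected = n * p_null
    distance = abs(observed - expected)
    return sum(
        binomial_pmf(k, n, p_null)
        for k in range(n + 1)
        if abs(k - expected) >= distance
    )

# Hypothetical experiment: 100 flips of a coin assumed fair under H0.
p_60 = exact_two_sided_p_value(60, 100)
p_70 = exact_two_sided_p_value(70, 100)
print(p_60, p_70)
```

Sixty heads gives a p-value a little above 0.05, while seventy heads gives a far smaller one: the more extreme the result, the stronger the evidence against a fair coin.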

16
Q

Two types of chi-square statistic tests

A
  1. A chi-square goodness of fit test
  2. A chi-square test for independence
17
Q

A chi-square goodness of fit test - what?

A

determines whether sample data are consistent with a hypothesized population distribution.
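
The goodness of fit statistic is the sum over categories of (observed - expected)^2 / expected. The following sketch uses made-up counts for 120 rolls of a die assumed fair under the null hypothesis; the function name chi_square_gof is hypothetical, and the cutoff 11.070 is the standard table value for 5 degrees of freedom at α = 0.05.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness of fit statistic for matching count lists."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 120 rolls of a six-sided die.
observed = [18, 22, 16, 25, 19, 20]
expected = [20] * 6                 # a fair die gives 120 / 6 per face

statistic = chi_square_gof(observed, expected)
CRITICAL_5_DF = 11.070              # table value: df = 6 - 1 = 5, alpha = 0.05

print(statistic)
print(statistic > CRITICAL_5_DF)    # False: no evidence against a fair die
```

Here the statistic is far below the cutoff, so these counts are consistent with a fair die.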

18
Q

A chi-square test for independence - what?

A

compares two variables in a contingency table to see if they are related. In a more general sense, it tests whether the distributions of categorical variables differ from one another.
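
For a contingency table, the expected count in each cell under independence is (row total × column total) / grand total. This is a minimal sketch without a continuity correction; the function name chi_square_independence and the 2×2 table are hypothetical, and 3.841 is the standard table value for 1 degree of freedom at α = 0.05.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical 2x2 table: rows are two groups, columns are yes/no answers.
table = [[30, 20],
         [20, 30]]

statistic, dof = chi_square_independence(table)
CRITICAL_1_DF = 3.841               # table value: df = 1, alpha = 0.05

print(statistic, dof)
print(statistic > CRITICAL_1_DF)    # True: evidence the variables are related
```

Every expected count in this table is 25, so the statistic is 4.0, just above the cutoff.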

19
Q

Chi-square test statistics

A
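  • A very small chi-square test statistic means that your observed data fit your expected data extremely well. In other words, there is little evidence against the null hypothesis; in a test for independence, little evidence that the variables are related.
  • A very large chi-square test statistic means that the observed data do not fit the expected data very well. In other words, there is evidence against the null hypothesis; in a test for independence, evidence that the variables are related.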
20
Q

Error term

A

The error term includes everything that separates your model from actual reality. This means that it will reflect nonlinearities, unpredictable effects, measurement errors, and omitted variables.

Although the terms error and residual are often used interchangeably, there is an important formal difference. An error is the deviation of an observed value from the true (population) value, such as the value given by the true regression line, while a residual is the deviation of an observed value from the estimated value, such as the value fitted from the sample. This means that a residual is much easier to quantify: an error is generally unobservable, whereas a residual is observable.

The residual can be considered an estimate of the true error term.
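
The distinction is easiest to see in a simulation, where the true line is known and the errors can therefore be inspected, something that is not possible with real data. This sketch assumes a straight-line model with normally distributed errors; the true intercept and slope, the seed, and the sample of 50 points are all illustrative choices.

```python
import random
from statistics import fmean

rng = random.Random(3)
TRUE_INTERCEPT, TRUE_SLOPE = 2.0, 3.0     # known only because we simulate

xs = [i / 10 for i in range(50)]
errors = [rng.gauss(0, 1) for _ in xs]    # unobservable in real data
ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + e for x, e in zip(xs, errors)]

# Ordinary least squares fit of a line to the observed (x, y) pairs.
x_bar, y_bar = fmean(xs), fmean(ys)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    / sum((x - x_bar) ** 2 for x in xs)
)
intercept = y_bar - slope * x_bar

# Residuals: observed values minus the values fitted from the sample.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(slope, intercept)
print(sum(errors), sum(residuals))
```

The residuals track the errors closely but are not identical to them. One visible difference is that least-squares residuals from a fit with an intercept sum to zero (up to rounding), while the true errors generally do not.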