Statistics 2 Flashcards

1
Q

The process of using what we know about a sample to make probabilistic statements about the broader population.

A

Statistical Inference

2
Q

What does statistical inference rely on?

A
  • It relies on probability: based on a sample, we can estimate what is going on in the population.
3
Q

Population parameter

A

A quantity that describes the population, such as the population mean or a population proportion.

4
Q

Sample Statistic

A

A quantity computed from the sample; it provides an estimate of the population parameter.

5
Q

Distributions

A

Distributions are representations of how often each value occurs.

6
Q

Probability distributions

A
  • Lists the possible outcomes of an event and their probabilities.
  • Assigns a probability to each possible value of a random variable.
  • Each probability is a number between zero and one.
  • The sum of the probabilities of all possible values equals 1.
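The two numeric conditions above (each probability between zero and one, and a total of 1) can be checked mechanically. The following is a minimal sketch; the function name `is_valid_distribution` and the household figures are hypothetical, chosen only for illustration.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    values = list(probs.values())
    in_range = all(0.0 <= p <= 1.0 for p in values)
    # Compare the total with 1 using a tolerance to absorb floating-point error.
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical distribution: number of children in a household.
children = {0: 0.80, 1: 0.10, 2: 0.07, 3: 0.03}
print(is_valid_distribution(children))
```

A table that fails either condition, for example one whose probabilities total 1.1, is not a probability distribution.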
7
Q

Probability distributions: discrete variables

A
  • If the proportion of the population living in households without children is 80%…
  • That means that the probability that an adult selected randomly from the population lives in a household without children is 80%.
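One way to see the link between a population proportion and a probability is to simulate many random selections. This is only a sketch: the function `simulate_proportion`, the number of draws, and the seed are assumptions made for illustration, and the 80% figure is the one used in the card above.

```python
import random

def simulate_proportion(p, n, seed=0):
    """Draw n adults at random; each lives without children with probability p.

    Returns the share of simulated adults who live without children.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

# With many draws, the simulated share should land close to 0.80.
print(simulate_proportion(0.80, 100_000))
```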
8
Q

Significance tests

A
  • Hypothesis: a prediction that the parameter takes a particular numerical value or falls in a certain range of values. It is a statement about a population.
  • For example: the mean age of the UK population is 50.
  • A statistical significance test uses data to summarize the evidence about a hypothesis by comparing point estimates of the parameters with the values predicted by the hypothesis.
9
Q

Five parts of a significance test

A

Assumptions
Hypotheses
Test statistic
p-value
Conclusion

10
Q

Assumptions

A
  • Type of data: quantitative or categorical?
  • Randomization: the data are assumed to have been gathered with randomization, such as a random sample.
  • Population distribution: Some tests assume a certain distribution.
  • Sample size: Many tests use a sampling distribution that is a t distribution or approximately normal. If the sample size is large enough, the population distribution does not need to be normal.
11
Q

Null hypothesis

A
  • The null hypothesis, H0: a statement that the parameter takes a particular value.
  • For example: the mean age of the UK population is 50.
12
Q

The Alternative hypothesis

A

Ha: the parameter falls in some alternative range of values, representing an effect of some type. This is the research hypothesis.
For example: the mean age of the UK population is higher than, lower than, or different from 50.

13
Q

Hypothesis in a significance test analysis

A
  • Analyses the sample evidence about H0 by investigating if the data contradicts H0, suggesting that Ha is true.
  • Proof by contradiction
  • The null hypothesis is presumed to be true. If, under this presumption, the observed data would be very unusual, we reject the null.
14
Q

Test statistic

A

The test statistic summarizes how far the estimate falls from the parameter value in H0.
It is the number of standard errors between the estimate and the H0 value.
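As a rough illustration of "number of standard errors", the sketch below computes (estimate − H0 value) / standard error for a sample mean. The function name `compute_test_statistic` and the ages are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

def compute_test_statistic(sample, h0_value):
    """Number of standard errors between the sample mean and the H0 value."""
    n = len(sample)
    estimate = statistics.mean(sample)                         # point estimate
    standard_error = statistics.stdev(sample) / math.sqrt(n)   # assumed form of the SE
    return (estimate - h0_value) / standard_error

# Hypothetical ages, tested against H0: population mean age = 50.
ages = [52, 47, 55, 60, 49, 51, 58, 53]
print(compute_test_statistic(ages, 50))
```

A value near zero means the estimate is close to what H0 predicts; a large positive or negative value means it falls far from it.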

15
Q

P-value

A

The probability, presuming H0 is true, that the test statistic equals the observed value or a value even more extreme in the direction predicted by Ha.

  • The smaller the p-value, the stronger the evidence against H0 and for Ha.
  • A larger p-value means that, if H0 were true, the observed data would not be unusual.
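To make the definition concrete, the sketch below turns a test statistic into a p-value. It assumes the test statistic follows a standard normal distribution when H0 is true (a large-sample assumption; a t distribution would be used for small samples). The function name `p_value_from_z` and the `alternative` labels are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """Probability, under H0, of a test statistic at least as extreme as z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # Ha: parameter is higher than the H0 value
        return 1.0 - std_normal.cdf(z)
    if alternative == "less":         # Ha: parameter is lower than the H0 value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: parameter differs from the H0 value
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(p_value_from_z(1.96))  # roughly 0.05
```

The further the test statistic falls from zero in the direction of Ha, the smaller the resulting p-value.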
16
Q

Why does a smaller p-value indicate stronger evidence against H0?

A
  • Because the data would then be more unusual if H0 were true.
17
Q

What is the α-value/α-level?

A
  • The boundary value for the p-value, commonly 0.05, is called the α-value or α-level of the test.
  • The α-level is thus a number such that we reject H0 if the p-value is less than or equal to it.
  • The α-level is the significance level.
    Reject H0 if p <= α. Common levels are 0.05 and 0.01.
  • The smaller the α-level, the stronger the evidence must be to reject H0.
  • To avoid bias in the decision-making process, you select α before analysing the data.
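The decision rule "reject H0 if p <= α" is simple enough to write down directly. This is a minimal sketch with a hypothetical function name, `reject_h0`; the default of 0.05 is one of the common levels mentioned above.

```python
def reject_h0(p_value, alpha=0.05):
    """Return True if H0 is rejected at significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return p_value <= alpha

print(reject_h0(0.03))              # True: 0.03 <= 0.05
print(reject_h0(0.03, alpha=0.01))  # False: a smaller alpha demands stronger evidence
```

The same p-value can lead to different decisions at different α-levels, which is why α is fixed before looking at the data.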
18
Q

Conclusion of the test

A
  • The p-value summarises the evidence against H0.
  • To draw the conclusion of the test, we report and interpret the p-value.
  • If the p-value is sufficiently small, we reject H0 and accept Ha.
  • P <= 0.05: the results are significant at the 0.05 level.
  • If H0 were true, the chance of getting values this extreme in the sample data would be at most 0.05.
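A worked example can tie the steps together, from summary statistics to a conclusion. The sketch below is a large-sample two-sided z test of H0: mean age = 50. The function name `run_z_test` and the summary numbers (n = 400, sample mean 51.2, sample standard deviation 12) are hypothetical, and the normal sampling distribution is an assumption that relies on the large sample size.

```python
import math
from statistics import NormalDist

def run_z_test(n, sample_mean, sample_sd, h0_value, alpha=0.05):
    """Two-sided large-sample test; returns (z, p_value, reject_h0)."""
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - h0_value) / standard_error        # test statistic
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))     # two-sided p-value
    return z, p_value, p_value <= alpha                  # decision at level alpha

z, p, reject = run_z_test(n=400, sample_mean=51.2, sample_sd=12.0, h0_value=50.0)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With these made-up numbers the estimate lies two standard errors above 50, the p-value falls just under 0.05, and H0 is rejected at the 0.05 level but would not be at the 0.01 level.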