Lecture 3 Flashcards

1
Q

We now want to test hypotheses about the Bjs. Why, and what do we test?

A

So far we have only estimated the values of the Bjs; now we want to use the data to determine whether a hypothesis about them is likely to be false, e.g.
- H0: B1 = 0

But for hypothesis testing we need to know the full sampling distribution of the estimator, not just its mean and variance

2
Q

What do we know about the distribution of u?
- why is this even relevant?

A

MLR.4 - E[u|xi] = 0
MLR.5 - Var[u|xi] = o^2
- the estimator can be written as the true parameter plus a sum of the error terms ui weighted by coefficients wij, which depend on the sample Xn; so the variability in the Bj estimator is driven by the error term
- therefore, since the Bj estimator depends on the ui, its distribution inherits its characteristics from the distribution of the errors, which are random
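This decomposition can be checked by simulation: hold the regressors fixed, redraw only the errors, and watch the slope estimate vary around the true parameter. A minimal numpy sketch (all numbers hypothetical):

```python
import numpy as np

# Hold the sample Xn fixed and redraw only the errors u, so all the
# variability in the slope estimate comes from the error term.
rng = np.random.default_rng(0)
n, beta0, beta1, sigma = 200, 1.0, 2.0, 1.5
x = rng.uniform(0, 10, n)            # fixed regressors across replications

def ols_slope(y, x):
    # slope = sum((xi - xbar) * yi) / sum((xi - xbar)^2)
    xd = x - x.mean()
    return (xd @ y) / (xd @ xd)

slopes = []
for _ in range(5000):
    u = rng.normal(0, sigma, n)      # only the errors change
    y = beta0 + beta1 * x + u
    slopes.append(ols_slope(y, x))

slopes = np.array(slopes)
print(slopes.mean())                 # centred near the true beta1 = 2.0
print(slopes.std())                  # spread inherited from sd(u)
```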

3
Q

MLR.6

A

Assumption of Normality: the population error u is independent of the xi and is normally distributed with mean 0 and variance o^2
- u ~ N(0, o^2)
- MLR.6 implies MLR.4 and MLR.5, so it is a much stronger assumption
- we have now made a very specific distributional assumption for u: the familiar bell-shaped curve

4
Q

Is normality a reasonable assumption?
- how does it tie in with the CLT?

A

Since each error term can be seen as the sum of many small independent factors, the CLT suggests that if those factors are independent and similarly distributed, u will tend to be normally distributed
- the assumption may be violated in applications, but it is maintained for the convenience of statistical inference
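The CLT story can be illustrated directly: build each "error" as the sum of many small independent uniform factors and check that the result behaves like a normal variable. A sketch with made-up sizes:

```python
import numpy as np

# Each "error" is the sum of 50 small independent Uniform(-0.5, 0.5)
# factors; by the CLT the sum is approximately N(0, 50/12).
rng = np.random.default_rng(1)
n_factors, n_draws = 50, 100_000
u = rng.uniform(-0.5, 0.5, size=(n_draws, n_factors)).sum(axis=1)

sd = np.sqrt(n_factors / 12)
print(u.mean())                   # close to 0
print(u.var())                    # close to 50/12, about 4.17
print(np.mean(np.abs(u) < sd))    # close to the normal 68% within 1 sd
```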

5
Q

Normal sampling distributions under MLR.1-MLR.6

A

Bj^ ~ N(Bj, Var(Bj^|Xn))

(Bj^ - Bj)/sd(Bj^|Xn) ~ N(0,1)

The standardised random variable has mean 0 and variance 1 under MLR.1-MLR.4; adding MLR.6 makes it exactly normally distributed.
- the result holds regardless of Xn
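A simulation check of the standardisation result (illustrative numbers): with sigma known, the standardised slope should have mean 0 and variance 1 even when Xn itself changes across samples.

```python
import numpy as np

# z = (b1_hat - beta1) / sd(b1_hat | Xn), where in the one-regressor
# model sd(b1_hat | Xn) = sigma / sqrt(SST_x).
rng = np.random.default_rng(2)
n, beta1, sigma = 100, 0.5, 2.0

z = []
for _ in range(20_000):
    x = rng.normal(0, 3, n)          # Xn may differ across samples
    xd = x - x.mean()
    sst_x = xd @ xd
    u = rng.normal(0, sigma, n)
    y = 1.0 + beta1 * x + u
    b1_hat = (xd @ y) / sst_x
    z.append((b1_hat - beta1) / (sigma / np.sqrt(sst_x)))

z = np.array(z)
print(z.mean(), z.var())             # close to 0 and 1
```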

6
Q

Can we directly use the result

(Bj^ - Bj)/sd(Bj^|Xn) ~ N(0,1)?

A

No: the denominator depends on o = sd(u), which is unknown, but we can use the estimator o^ in its place
- replacing o with o^ gives us the standard error se(Bj^)

tBj^ = (Bj^ - Bj)/se(Bj^) ~ t(n-k-1)

The t distribution allows us to perform hypothesis tests even without knowing the true standard deviation o
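Putting the pieces together in code: estimate o with o^ from the residuals, form the standard error, and build the t statistic. A minimal sketch with simulated data (one regressor, so k = 1):

```python
import numpy as np

# Simulated data; in practice x and y come from the actual sample.
rng = np.random.default_rng(3)
n = 30
x = rng.normal(0, 1, n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, n)

# OLS fit
xd = x - x.mean()
b1 = (xd @ y) / (xd @ xd)
b0 = y.mean() - b1 * x.mean()
resid = y - b0 - b1 * x

# sigma_hat^2 = SSR / (n - k - 1); here k = 1, so df = n - 2
sigma_hat = np.sqrt(resid @ resid / (n - 2))
se_b1 = sigma_hat / np.sqrt(xd @ xd)

t_stat = (b1 - 0) / se_b1            # t statistic for H0: beta1 = 0
print(t_stat)                        # compare against a t_{n-2} critical value
```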

7
Q

Replacing o with o^ takes us from the standard normal to the _

A

The t distribution
- also bell shaped, but more spread out than the N(0,1)
- since we are plugging in a new estimator o^, which varies across samples, we introduce additional variability into the estimation, shifting us from the normal to a distribution that accounts for it
- the heavier tails are what absorb that extra variability
- as df = n - k - 1 grows (roughly df > 120), the t distribution approaches the standard normal
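The heavier tails show up in the critical values; scipy's t distribution shows them shrinking toward the N(0,1) value as df grows:

```python
from scipy import stats

# 2.5% upper-tail critical values (two-tailed test at 5%): the t value
# exceeds the normal 1.96 but converges to it as df = n - k - 1 grows.
for df in (5, 30, 120, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))
print("normal", round(stats.norm.ppf(0.975), 3))   # 1.96
```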

8
Q

So what does the t statistic tend to look like in practice?

A

Since the null is usually H0: Bj = 0:

tBj^ = Bj^/se(Bj^)

9
Q

Standard approach to hypothesis testing

A
  1. Choose a null hypothesis H0
  2. Choose an alternative hypothesis H1
  3. Find a good test statistic for testing H0 against H1
  4. Choose a significance level for the test
  5. Choose a critical value c, so the rejection rule t > c yields Type I errors at the chosen significance level:
    Pr(t > c | H0 true) = significance level
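The five steps, worked through for a hypothetical one-sided test of H0: Bj = 0 against H1: Bj > 0 (the t value and df are made up for illustration):

```python
from scipy import stats

alpha = 0.05                         # step 4: significance level
df = 40                              # n - k - 1 from the fitted model
c = stats.t.ppf(1 - alpha, df)       # step 5: Pr(t > c | H0 true) = alpha

t_observed = 2.1                     # step 3: hypothetical t statistic
print(round(c, 3))                   # critical value, about 1.684
print(t_observed > c)                # True -> reject H0 at the 5% level
```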
10
Q

To reduce the probability of a T1 error we must

A

Increase the critical value c, which usually means lowering the significance level

11
Q

What do we do for a 2 tailed test?

A

The rejection rule is:

|t| > c

- c is now chosen so that each tail carries half of the significance level

12
Q

What is a p value?
- high or low means what?
- what do we do when it is two tailed?

A

The smallest significance level at which we can still reject H0

I.e., it tells us the probability of obtaining a test statistic as extreme as, or more extreme than, the observed value, assuming H0 is true
- a lower p value is stronger evidence against H0: it means a low probability of seeing such an extreme result if H0 were true
- if the test is two tailed, multiply the one-tailed probability by 2
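Computing a two-tailed p value from a t statistic, with a made-up t value and df; scipy's survival function gives the upper-tail probability directly:

```python
from scipy import stats

t_obs, df = 2.3, 25                       # hypothetical statistic and df
p_one_tail = stats.t.sf(abs(t_obs), df)   # Pr(T > |t_obs|) under H0
p_two_tail = 2 * p_one_tail               # two-tailed: double it
print(p_two_tail)
print(p_two_tail < 0.05)                  # reject at the 5% level if True
```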

13
Q

What’s the use of confidence intervals?

A

Confidence intervals give a range of values within which the true population parameter is likely to fall, offering a way to gauge the precision of an estimate beyond a simple reject/don't-reject decision
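A 95% interval is Bj^ ± c·se(Bj^), with c the 97.5% quantile of the t distribution with n - k - 1 df. A sketch with hypothetical numbers, showing the duality with a two-sided test of H0: Bj = 0:

```python
from scipy import stats

b_hat, se, df = 0.72, 0.25, 60       # hypothetical estimate, se, and df
c = stats.t.ppf(0.975, df)           # about 2.00 for df = 60
lo, hi = b_hat - c * se, b_hat + c * se
print(round(lo, 2), round(hi, 2))

# Duality: 0 lies outside the interval exactly when |b_hat/se| > c,
# i.e. when H0: Bj = 0 is rejected at the 5% level.
print(lo > 0 or hi < 0)
```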

14
Q

Sometimes we want to test more than one restriction/hypothesis at once, involving multiple parameters. What do we do?

A

We use a new statistic to test these joint hypotheses
- we can't rely on the individual t statistics for each parameter, as each tests a hypothesis about a single parameter in isolation, which doesn't account for the joint restriction on multiple parameters together

Use the F statistic

15
Q

Give an example of a joint null hypothesis and its alternative

A

H0: B3 = B4 = 0, tested at the 5% level
H1: H0 is not true

16
Q

Why don't t stats work for multiple exclusion restrictions (MERs)?

A

If we want to test a joint hypothesis, we can't rely on the individual tests alone: each test has its own error rate, and combining them without proper control leads to an overall error rate that is too high
- this fails to account for size control

17
Q

What is size control?

A

Size control refers to controlling the probability of a Type I error in hypothesis testing; each individual test carries its own chance of a Type I error, so the combined error rate can exceed the intended 5%

E.g. if the probability that each test doesn't reject the null is 95%, the probability that neither rejects is 0.95 x 0.95 = 0.9025, leaving a 9.75% chance that at least one test rejects even if both parameters are truly 0, so the overall probability of rejecting the null exceeds 5%
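The arithmetic of the example in plain Python:

```python
# Two independent tests, each run at the 5% level, both nulls true.
p_no_reject_each = 0.95
p_neither_rejects = p_no_reject_each ** 2       # 0.9025
p_at_least_one_rejects = 1 - p_neither_rejects  # 0.0975

print(p_neither_rejects)
print(p_at_least_one_rejects)   # ~9.75%, well above the intended 5% size
```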

18
Q

How does F statistic work with different model fits
- unrestricted vs restricted?
- SSR?

A

Unrestricted:
- lsalary = B0 + B1years + B2… + u
Restricted:
- lsalary = B0 + B1years + B2gamesyr + u

Our test statistic compares the fit of the restricted and unrestricted models
- it's an algebraic fact that the SSR must (weakly) increase when xs are dropped, so SSRr >= SSRur; the question is whether SSR increases by enough to conclude the restrictions under H0 are false

19
Q

F statistic

A

F = ((SSRr - SSRur)/(dfr - dfur)) / (SSRur/dfur)

  = ((SSRr - SSRur)/q) / (SSRur/(n-k-1))

If F > c, reject H0
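Plugging in hypothetical SSR values (the dimensions loosely follow the lsalary example, with q = 3 exclusion restrictions):

```python
from scipy import stats

ssr_r, ssr_ur = 198.3, 183.2         # SSRr >= SSRur always holds
n, k, q = 353, 5, 3                  # k regressors unrestricted, q dropped
df_ur = n - k - 1

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
c = stats.f.ppf(0.95, q, df_ur)      # 5% critical value of F(q, n-k-1)
print(round(F, 2), round(c, 2))
print(F > c)                         # True -> reject H0 at the 5% level
```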

20
Q

What's the p value in F tests?

A

p value = Pr(F > Fobs | H0 is true)
- Stata automatically reports the p value with each test