L4 - Distribution of Regression Estimates Flashcards
How can we think of the OLS estimator?
- We can think of the OLS estimator as a mapping from the sample moments of the data to the parameters of interest.
- $Y_i = \alpha + \beta X_i + u_i$
- Parameters of interest –> $\alpha$, $\beta$, $\sigma_u^2$
- Sample moments –> $\bar{Y}$, $\bar{X}$, $\hat{\sigma}_Y^2$, $\hat{\sigma}_X^2$, $\hat{\sigma}_{XY}$
The mapping is as follows:
- $\hat{\beta} = \hat{\sigma}_{XY} / \hat{\sigma}_X^2$
- $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$
- $\hat{\sigma}_u^2 = \frac{1}{N-2}\sum_{i=1}^{N}\left(Y_i - \hat{\alpha} - \hat{\beta}X_i\right)^2 = \frac{N-1}{N-2}\left(\hat{\sigma}_Y^2 - \hat{\beta}^2\hat{\sigma}_X^2\right)$
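A minimal sketch of this mapping in Python (the data and parameter values below are simulated purely for illustration, not taken from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data for illustration: true alpha = 1, beta = 2, sigma_u = 0.5
N = 200
X = rng.normal(size=N)
u = 0.5 * rng.normal(size=N)
Y = 1.0 + 2.0 * X + u

# Sample moments (N-1 divisor, matching the formulas above)
X_bar, Y_bar = X.mean(), Y.mean()
s2_X = X.var(ddof=1)                  # sigma_X^2 (hat)
s2_Y = Y.var(ddof=1)                  # sigma_Y^2 (hat)
s_XY = np.cov(X, Y, ddof=1)[0, 1]     # sigma_XY (hat)

# The mapping from sample moments to the parameters of interest
beta_hat = s_XY / s2_X
alpha_hat = Y_bar - beta_hat * X_bar
s2_u = (N - 1) / (N - 2) * (s2_Y - beta_hat**2 * s2_X)

print(alpha_hat, beta_hat, s2_u)      # close to 1, 2 and 0.25
```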
What are the properties of estimators?
- An estimator is said to be unbiased if its expected value is equal to the (unknown) true value: $E(\hat{\beta}) = \beta$
- An estimator is said to be efficient if it has the lowest possible variance in the class of estimators under consideration
- In some circumstances there may be a trade-off between bias and efficiency
What is the distribution of the OLS estimator?
We have shown that the OLS estimator for the slope coefficient can be written:
$\hat{\beta} = \frac{\sum(X_i - \bar{X})(Y_i - \bar{Y})}{\sum(X_i - \bar{X})^2} = \frac{\sum(X_i - \bar{X})Y_i}{\sum(X_i - \bar{X})^2}$, since $\bar{Y}\sum(X_i - \bar{X}) = 0$.
Substituting for $Y_i$ and rearranging yields:
$\hat{\beta} = \beta + \frac{\sum(X_i - \bar{X})u_i}{\sum(X_i - \bar{X})^2}$
The OLS estimator is therefore a random variable whose distribution depends on the properties of the random variable u.
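A short simulation can make this concrete. The sketch below (illustrative parameter values; $X$ held fixed while $u$ is redrawn) checks the identity above numerically and shows that $\hat{\beta}$ varies from sample to sample:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, sigma_u, N = 1.0, 2.0, 0.5, 100  # illustrative true values
X = rng.normal(size=N)                        # held fixed across samples
w = (X - X.mean()) / ((X - X.mean()) ** 2).sum()

# Redraw the errors many times: beta-hat moves with u, exactly as the
# identity beta-hat = beta + sum_i w_i * u_i says it should.
beta_hats = []
for _ in range(5000):
    u = sigma_u * rng.normal(size=N)
    Y = alpha + beta * X + u
    b = ((X - X.mean()) * Y).sum() / ((X - X.mean()) ** 2).sum()
    assert np.isclose(b, beta + (w * u).sum())  # the identity holds numerically
    beta_hats.append(b)

print(np.mean(beta_hats), np.std(beta_hats))  # centred on beta; spread comes from u
```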
What are the Gauss-Markov assumptions?
Under a specific set of assumptions the OLS estimates can be shown to be the best linear unbiased estimates (BLUE).
1. The errors have expected value zero: $E(u_i) = 0 \ \forall i$
2. The errors are serially uncorrelated: $E(u_i u_j) = 0 \ \forall i \neq j$
3. The errors have constant variance: $E(u_i^2) = \sigma_u^2 \ \forall i$
4. The X variable is non-stochastic (fixed in repeated samples), so that $E(X_i u_i) = X_i E(u_i) = 0$
5. The errors follow a normal distribution: $u_i \sim N(0, \sigma_u^2)$ (this last assumption is not needed for the BLUE result itself, but is needed for exact statistical inference)
What does it mean to say that the OLS estimates are the best linear unbiased estimates (BLUE)?
E - Estimator - $\hat{\alpha}$ and $\hat{\beta}$ are estimators of the true values of $\alpha$ and $\beta$
L - Linear - $\hat{\alpha}$ and $\hat{\beta}$ are linear estimators, i.e. the formulae for $\hat{\alpha}$ and $\hat{\beta}$ are linear combinations of the random variables (Y and possibly X)
U - Unbiased - on average, the actual values of $\hat{\alpha}$ and $\hat{\beta}$ will be equal to their true values
B - Best - the OLS estimator $\hat{\beta}$ has minimum variance among the class of linear unbiased estimators: the Gauss-Markov theorem proves that OLS is best by examining an arbitrary alternative linear unbiased estimator and showing that in all cases it must have a variance no smaller than that of the OLS estimator
What can we derive from the Gauss-Markov assumptions?
- Using the Gauss-Markov assumptions we can derive the mean and the variance of the OLS estimator.
- $E(\hat{\beta}) = \beta$
- $V(\hat{\beta}) = \sigma_u^2 / \sum(X_i - \bar{X})^2$
- Under the GM assumptions OLS is the Best Linear Unbiased Estimator (BLUE).
- This means that the OLS estimator has the lowest possible variance in the class of linear unbiased estimators.
- Alternatively OLS is the most efficient estimator we can use when these assumptions are satisfied.
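As a check, a simulation under the GM assumptions should reproduce the variance formula. This is a sketch with illustrative numbers, not lecture code:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, sigma_u, N = 1.0, 2.0, 0.5, 100  # illustrative values
X = rng.normal(size=N)            # non-stochastic: fixed in repeated samples
Sxx = ((X - X.mean()) ** 2).sum()

# Simulate many samples and compute the OLS slope in each one
beta_hats = [
    ((X - X.mean()) * (alpha + beta * X + sigma_u * rng.normal(size=N))).sum() / Sxx
    for _ in range(20000)
]

print(np.var(beta_hats))      # simulated variance of beta-hat
print(sigma_u**2 / Sxx)       # V(beta-hat) from the formula: should agree closely
```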
What is statistical inference using the OLS estimator?
- If the errors in the regression model follow a normal distribution then the OLS estimator is a linear combination of normally distributed variables.
- Therefore the OLS estimator also follows a normal distribution: $\hat{\beta} \sim N\!\left(\beta,\ \sigma_u^2 / \sum(X_i - \bar{X})^2\right)$
- Statistical inference is the process of using data to make inferences about unknown population parameters
- Examples of statistical inference are hypothesis tests about the parameters or the derivation of intervals within which parameters are likely to lie
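For example, when $\sigma_u$ is known a 95% interval for $\beta$ is $\hat{\beta} \pm 1.96 \, \sigma_u / \sqrt{\sum(X_i - \bar{X})^2}$. A sketch with hypothetical numbers (all values below are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical numbers, for illustration only
beta_hat = 1.8      # estimated slope
sigma_u = 0.5       # error standard deviation, assumed known here
Sxx = 40.0          # sum of (X_i - X_bar)^2

se = sigma_u / np.sqrt(Sxx)
z = norm.ppf(0.975)                          # two-tailed 5% => 1.96
print(beta_hat - z * se, beta_hat + z * se)  # 95% interval for beta
```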
What do you need to conduct a hypothesis test?
To conduct a hypothesis test we need the following:
- A hypothesis to be tested (usually described as the null hypothesis) and an alternative against which it can be tested.
- A test statistic whose distribution is known under the null hypothesis.
- A decision rule which tells us when to reject the null hypothesis and when not to reject it.
What is the test statistic for β when performing a hypothesis test?
Under the null hypothesis $H_0: \beta = \beta_0$ the following random variable has a standard normal distribution:
$\frac{\hat{\beta} - \beta_0}{\sigma_u / \sqrt{\sum(X_i - \bar{X})^2}} \sim N(0,1)$
This is the familiar standardisation $Z = (x - \mu)/\sigma$.
If the error variance is known then we can calculate this test statistic and compare it with a critical value from the normal table to decide whether to reject the null.
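A sketch of the calculation with hypothetical numbers (the values of $\hat{\beta}$, $\sigma_u$ and $\sum(X_i-\bar{X})^2$ are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical numbers, for illustration only
beta_hat = 1.8      # estimated slope
beta_0 = 0.0        # value of beta under the null hypothesis
sigma_u = 0.5       # error standard deviation, assumed known
Sxx = 40.0          # sum of (X_i - X_bar)^2

z = (beta_hat - beta_0) / (sigma_u / np.sqrt(Sxx))
crit = norm.ppf(0.95)        # 5% one-tailed critical value (1.645)
print(z, crit, z > crit)     # reject H0 at the 5% level if z > crit
```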
What is a Type I Error?
- A Type I error is the error of rejecting a true null hypothesis.
- A larger significance level (larger tails) implies a larger probability of a Type I error.
What is a Type II Error?
- We cannot make the tails arbitrarily small, because shrinking them increases the likelihood of a Type II error: failing to reject a false null hypothesis.
What is the decision rule in hypothesis testing?
- The decision rule involves fixing the size of the test so as to balance the risks of Type I and Type II errors.
- In practice we fix the probability of a Type I error (the significance level); given the test, this determines the probability of a Type II error.
What is a critical value?
- A critical value is the value of the test statistic corresponding to a predetermined significance level. For example, a 5% critical value is the value of the test statistic which would yield a p-value of 0.05.
Critical values are often set at 10%, 5% or 1% levels. For example, for the standard normal distribution critical values for a 1-tailed test are:
10%: 1.282
5%: 1.645
1%: 2.326
- If the test statistic exceeds the critical value then the test is said to reject the null hypothesis at that particular critical value.
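These values can be reproduced from the standard normal quantile function, e.g. using scipy:

```python
from scipy.stats import norm

# One-tailed critical values for the standard normal distribution
for level in (0.10, 0.05, 0.01):
    print(f"{level:.0%}: {norm.ppf(1 - level):.3f}")  # 10%: 1.282, 5%: 1.645, 1%: 2.326
```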
What is a P-value?
The P value, or calculated probability, is the probability of finding the observed, or more extreme, results when the null hypothesis (H0) of a study question is true – the definition of ‘extreme’ depends on how the hypothesis is being tested. P is also described in terms of rejecting H0 when it is actually true, however, it is not a direct probability of this state.
- A small p-value therefore indicates strong evidence against the null hypothesis; it is not the probability that the null hypothesis is true.
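For a test statistic that is standard normal under $H_0$, the p-value is simply a tail probability; a sketch using scipy (the observed statistic is hypothetical):

```python
from scipy.stats import norm

z = 2.1                     # hypothetical observed test statistic
print(norm.sf(z))           # one-tailed p-value: P(Z > z) under H0
print(2 * norm.sf(abs(z)))  # two-tailed p-value
```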