Lecture 1 Flashcards
(18 cards)
Unbiasedness means
On average, the OLS estimators give us the true population parameters: if we repeated the sampling process indefinitely, the average of the OLS estimates would equal the true population values.
- bear in mind, it's a property of the method (the estimator), not of any one final number
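A minimal simulation sketch of this idea (the model and all parameter values below are illustrative assumptions, not from the lecture): draw many samples from a known model and check that the OLS slope estimates average out to the true β1.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0   # assumed "true" population parameters
n, reps = 50, 10_000      # sample size and number of repeated samples

slopes = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)                              # error with E[u|x] = 0
    y = beta0 + beta1 * x + u
    slopes[r] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # OLS slope

print(slopes.mean())  # ~2.0: the estimates are centered on the true beta1
```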
SLR.1
Assumption of linearity in parameters, i.e. the model can be written as:
y = β0 + β1x + u
SLR.2
Assumption of random sampling: a random sample of size n, {(xi, yi) : i = 1, …, n}
- we can treat y, x and u for each observation as i.i.d. random variables, meaning each observation is independent of the others and all come from the same distribution
- ensures that our sample is randomly drawn from the population, so the parameter estimates will be valid and representative of the population as a whole
SLR.3
Assumption of sample variation in the explanatory variable
- essentially assuming the values of the independent variable xi are not all the same in the sample; if they were, it would be impossible to estimate the relationship between x and y (see the formula below)
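The reason this assumption is needed is visible directly in the OLS slope formula, which divides by the total sample variation in x, SSTx:

```latex
\hat{\beta}_1
  = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}
  = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{SST_x},
\qquad \text{undefined when } SST_x = 0 .
```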
SLR.4
Assumption of zero conditional mean
- E[u|x] = 0
- mean of u doesn’t change with x
- crucial for showing the OLS estimator is unbiased; it implies (and is stronger than) zero correlation between u and x
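A short standard derivation (not specific to this lecture) of why the zero conditional mean assumption rules out correlation, using the Law of Iterated Expectations:

```latex
E[u] = E\big[\,E[u \mid x]\,\big] = E[0] = 0,
\qquad
\operatorname{Cov}(x,u) = E[xu] - E[x]\,E[u]
                        = E\big[\,x\,E[u \mid x]\,\big] - 0 = 0 .
```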
By showing β1^'s distribution is centered at β1…
We are showing the OLS estimator is unbiased, so on average across possible samples, E[β1^] = β1
Under SLR.1–SLR.4:
- E[β0^|Xn] = …
- E[β1^|Xn] = …
- β0 and β1
By the LIE (Law of Iterated Expectations):
E[β1^] = E[E[β1^|Xn]] = E[β1] = β1
Proof of unbiasedness in 3 steps:
- Obtain a convenient expression for the estimator
- Write estimator = population parameter + sampling error
- Show E[sampling error] = 0
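Applied to the slope estimator, the three steps look like this (a standard sketch, stated under SLR.1–SLR.4):

```latex
\hat{\beta}_1
  = \frac{\sum_i (x_i-\bar{x})\,y_i}{SST_x}            % step 1: convenient expression
  = \beta_1 + \frac{\sum_i (x_i-\bar{x})\,u_i}{SST_x}  % step 2: parameter + sampling error
\quad\Rightarrow\quad
E[\hat{\beta}_1 \mid X_n] = \beta_1 .                  % step 3: SLR.4 makes the sampling error's mean zero
```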
SLR.5
Assumption of homoskedasticity
- the error has the same variance given any value of x
- homoskedasticity is often unrealistic in practice, e.g. families with different incomes may well have different variances in their savings
Var(u|x) = σ^2 > 0 for all x
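A small simulation sketch of the savings example (all numbers are made-up illustration values): the error's standard deviation is made to grow with income, so Var(u|x) is not constant and SLR.5 fails.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
income = rng.uniform(10, 100, size=n)   # hypothetical family incomes
u = rng.normal(scale=0.1 * income)      # sd of u grows with income -> heteroskedastic
savings = 5 + 0.2 * income + u          # made-up coefficients

low, high = income < 30, income > 70
print(u[low].var(ddof=1), u[high].var(ddof=1))  # Var(u|x) is far larger at high incomes
```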
E[y|x] = …
Var(y|x) = …
- β0 + β1x
- σ^2
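The one-line reasoning behind these two answers, using SLR.4 and SLR.5:

```latex
E[y \mid x] = \beta_0 + \beta_1 x + E[u \mid x] = \beta_0 + \beta_1 x \quad \text{(SLR.4)},
\qquad
\operatorname{Var}(y \mid x) = \operatorname{Var}(u \mid x) = \sigma^2 \quad \text{(SLR.5)}.
```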
Var(β1^|Xn) = …
Var(β0^|Xn) = …
- σ^2/SSTx
- (σ^2 · Σ xi^2) / (n · SSTx)
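A Monte Carlo sketch (with assumed, illustrative parameter values) checking the slope formula: fix x across repetitions (conditioning on Xn), redraw u each time, and compare the empirical variance of β1^ with σ^2/SSTx.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, sigma = 30, 20_000, 1.5
beta0, beta1 = 1.0, 2.0            # assumed true parameters

x = rng.uniform(0, 10, size=n)     # fixed design: we condition on Xn
sst_x = ((x - x.mean()) ** 2).sum()

slopes = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(scale=sigma, size=n)
    slopes[r] = ((x - x.mean()) * y).sum() / sst_x   # OLS slope

print(slopes.var())       # empirical Var(beta1^ | Xn)
print(sigma**2 / sst_x)   # theoretical sigma^2 / SSTx -- should match closely
```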
The more noise in the relationship between y and x (i.e. the larger the variability in u), the harder it is to learn about β1, since it increases the variance of β1^.
By contrast, more variation in x is a good thing, as it reduces the variance of β1^.
The inverse, 1/Var(β1^|Xn) = SSTx/σ^2, is often referred to as the signal-to-noise ratio; a high signal-to-noise ratio implies high precision of β1^.
What is the sample variance of x? What happens as n gets large?
- sample variance: sx^2 = SSTx/(n−1) (dividing by n instead gives a biased estimator; the n−1 is a degrees-of-freedom correction)
- by the LLN, as n tends to infinity the sample variance tends to the true population variance of x
- rearranging for SSTx = (n−1)·sx^2 and substituting into the variance formula gives Var(β1^|Xn) = σ^2/((n−1)·sx^2)
- thus, as n grows, the variance of β1^ shrinks at rate 1/(n−1), formally showing why more data is better
We can't actually observe ui, so we have to estimate it.
We replace it with its estimate, the residual ui^, but the residuals tend to be smaller in magnitude than the true errors, because OLS chooses the estimates of β0 and β1 to fit the data as closely as possible.
- the residuals satisfy two constraints (Σ ui^ = 0 and Σ xi·ui^ = 0), so we lose two degrees of freedom
σ^2^ = SSR/(n−2)
The bias is corrected by the degrees-of-freedom adjustment.
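A quick simulation sketch of the degrees-of-freedom point (with illustrative numbers): SSR/n systematically underestimates σ^2, while SSR/(n−2) is centered on it.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, sigma = 10, 50_000, 2.0
x = rng.uniform(0, 5, size=n)          # fixed design

naive, corrected = np.empty(reps), np.empty(reps)
for r in range(reps):
    y = 1.0 + 0.5 * x + rng.normal(scale=sigma, size=n)
    b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # OLS slope
    b0 = y.mean() - b1 * x.mean()                 # OLS intercept
    ssr = ((y - b0 - b1 * x) ** 2).sum()
    naive[r], corrected[r] = ssr / n, ssr / (n - 2)

print(naive.mean(), corrected.mean(), sigma**2)  # ~3.2 vs ~4.0 vs 4.0
```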
Standard error of the regression
σ^ = (SSR/(n−2))^0.5
Standard error of β1^
se(β1^) = σ^/(SSTx)^0.5
Take the square root of the variance of β1^, replacing the unknown σ with its estimate σ^.
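Putting the last two cards together, a sketch computing se(β1^) for one simulated sample (model and numbers assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)  # assumed true model

b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)        # OLS estimates
b0 = y.mean() - b1 * x.mean()

ssr = ((y - b0 - b1 * x) ** 2).sum()
sigma_hat = np.sqrt(ssr / (n - 2))                 # standard error of the regression
sst_x = ((x - x.mean()) ** 2).sum()
se_b1 = sigma_hat / np.sqrt(sst_x)                 # se(beta1^) = sigma^ / sqrt(SSTx)

print(b1, se_b1)
```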
What actually is B1?
β1 is the ceteris paribus (all else equal) effect of x on y
What does the PRF show?
The straight line is the mean of y when x is fixed; the spread around it is caused by u.
- as n tends to infinity, the estimated regression line converges to this population line