Week 3: Chapters 5 & 9 Flashcards
GM assumptions 1-5: what kind of sampling property do they give?
finite-sample property
- Holds for any sample size n, so long as n > k+1
CLM assumptions 1-6
Exact sampling property
- Violation of MLR 6 invalidates our inference
Two implications of a large sample (as n approaches infinity)
- B^j has an approximately normal distribution as n approaches infinity
- t and F statistics have approximately t and F distributions as n approaches infinity
Unbiased VS Consistency
Unbiasedness:
- On average B^ equals B
- The midpoint of the distribution of B^ is B
- Says nothing about the spread of the distribution of B^
Consistency:
- As we add more observations, the distribution of B^ gets more tightly distributed around B
- Tells you about the spread of the distribution of B^ as n grows
Consistency definition
B^j is consistent for Bj when the sampling distribution of B^j becomes more and more tightly concentrated around Bj, collapsing to the single point Bj as n tends to infinity
Consistency formula P(Wn…)
For every ε > 0, P(|Wn − θ| > ε) → 0 as n → ∞, where Wn is an estimator of θ computed from a sample of size n
Show how consistency follows from unbiasedness
For the simple regression slope, B^1 = B1 + [Σ(xi − x̄)ui] / [Σ(xi − x̄)²], so plim B^1 = B1 + Cov(x, u)/Var(x). Under MLR.4, Cov(x, u) = 0, so plim B^1 = B1: the same assumption that delivers unbiasedness delivers consistency
What happens when Cov(u,x) / Var ( x) = 0?
Plim B^1 = B1
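The collapse of B^1's distribution around B1 can be seen in a quick simulation. This is a stdlib-only sketch under a made-up data-generating process (y = 1 + 2x + u, with x and u independent standard normals, so Cov(x, u) = 0):

```python
import random

random.seed(42)

def ols_slope(x, y):
    """OLS slope for simple regression: sample Cov(x, y) / Var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

B1 = 2.0  # true slope (made-up value for the demo)
for n in (50, 500, 5000):
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [random.gauss(0, 1) for _ in range(n)]   # E(u) = 0, Cov(x, u) = 0
    y = [1.0 + B1 * xi + ui for xi, ui in zip(x, u)]
    print(n, round(ols_slope(x, y), 3))          # estimates tighten around 2.0
```

As n grows the printed estimates settle near the true value of 2.0, which is consistency at work.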
MLR4’
E(u) = 0 and Cov(xj, u) = 0 for each j
- Zero Mean and Zero Correlation assumption
- Weaker than MLR.4: requires only that each xj is uncorrelated with u and that E(u) = 0, not the full E(u|x1, …, xk) = 0
- OLS is still consistent under MLR.4'
What does MLR1-4 Imply
OLS estimators are unbiased and consistent
What does MLR1-4’ Imply
OLS estimators are consistent
What does RESET test stand for and what does it test for?
Regression Specification Error Test
- Tests for non-linearities within the model
- RESET tests for functional form misspecification brought about by the exclusion of higher order polynomials of our x’s
RESET equation and why there is no y^4
y = b0 + b1x1 + … + bkxk + δ1ŷ² + δ2ŷ³ + error, where ŷ are the fitted values from the original model
- No ŷ⁴ term: higher powers eat up degrees of freedom for little extra power
What is the null of RESET test?
What does it mean when we fail to reject it?
H0: δ1 = δ2 = 0
If we fail to reject, there is no evidence of functional form misspecification, so the original functional form is adequate
What are the drawbacks of the RESET test?
A drawback of RESET is that it gives no real direction on how to proceed if the model is rejected. It is purely a test of functional form, not a general misspecification test
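The intuition behind RESET can be illustrated with a stdlib-only sketch (this is not the actual RESET F-test, and both data-generating processes are made up): when a quadratic term is wrongly omitted, the residuals of the linear fit are correlated with the squared fitted values, which is exactly what RESET's added ŷ² and ŷ³ regressors pick up.

```python
import random

random.seed(7)

def fit_simple_ols(x, y):
    """Return (intercept, slope) from simple-regression OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
         sum((xi - mx) ** 2 for xi in x)
    return my - b1 * mx, b1

def corr(a, b):
    """Sample correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]

# Fit a LINEAR model to each y, then check whether the residuals
# are correlated with the squared fitted values (RESET's added regressor).
for label, y in [
    ("linear truth:   ", [1 + 2 * xi + ui for xi, ui in zip(x, u)]),
    ("quadratic truth:", [1 + 2 * xi + 3 * xi ** 2 + ui for xi, ui in zip(x, u)]),
]:
    b0, b1 = fit_simple_ols(x, y)
    yhat = [b0 + b1 * xi for xi in x]
    resid = [yi - yh for yi, yh in zip(y, yhat)]
    print(label, round(corr(resid, [yh ** 2 for yh in yhat]), 3))
```

The correlation is near zero when the linear model is correct and clearly nonzero when the quadratic term was omitted, which is the signal RESET formalizes with an F-test.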
Mizon-Richard Test
- Tests non-nested alternatives (e.g. levels vs. logs) by nesting both inside a comprehensive model
- y = g0 + g1x1 + g2x2 + g3log(x1) + g4log(x2) + u (g denotes gamma coefficients)
Null and alternative hypothesis for MR test
H0: g3 = g4 = 0 (the levels model is correctly specified)
H0: g1 = g2 = 0 (the log model is correctly specified)
Davidson-MacKinnon Test
States that the fitted values from one model should be insignificant when added to the other model. Let y^ be the fitted values from the levels model and yˇ the fitted values from the log model:
1. y = b0 + b1x1 + b2x2 + d1yˇ + error
2. y = b0 + b1log(x1) + b2log(x2) + q1yˆ + error
H0 : d1 = 0 and H0 : q1 = 0
How do you know if its a good proxy variable?
The closer Cor(proxy, variable) is to 1, the better the proxy
The proxy variable final model
y = α0 + b1x1 + b2x2 + α3x3 + e, where x3 is the proxy for the unobserved variable
What can we use instead of a proxy variable?
Lagged dependent variable
- accounts for historical factors that cause current differences
Examples of measurement errors
Self-reported income, weight, etc.
These always contain some error: people misreport, misremember, or misinterpret the question
Proxy VS Measurement Error
- In the proxy case, the omitted variable is important to the extent to which it affects our other independent variables
- In the measurement error case, the mismeasurement in the independent variable is the issue
Measurement error in the dependent variable:
- Model
- Implications
y* = 𝛽0 + 𝛽1x1 + . . . + 𝛽kxk + u (true model; y* is unobserved)
y = 𝛽0 + 𝛽1x1 + . . . + 𝛽kxk + v (estimable model, with v = u + e0)
e0 = y − y* (measurement error)
GM assumptions still hold as long as e0 is uncorrelated with the x's, so OLS is still unbiased and consistent (just with a larger error variance)
Measurement error in explanatory variables:
y = 𝛽0 + 𝛽1x1* + u
- CANNOT observe the true x1*, so we observe x1 instead
- e1 = x1 − x1*, and we assume E(e1) = 0 and E(y|x1*, x1) = E(y|x1*) (x1 adds nothing once x1* is controlled for)
When does ME in the dependent variable yield consistent estimators?
For consistent estimates of Bj, we require E(e0|x) = E(e0) = 0
- e0 is independent of x and has zero mean
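A quick hedged simulation of this case (made-up model, stdlib only): adding noise e0 to y that is independent of x leaves the slope estimate centered on the true value, only with more sampling noise.

```python
import random

random.seed(1)

def ols_slope(x, y):
    """OLS slope for simple regression: sample Cov(x, y) / Var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
           sum((xi - mx) ** 2 for xi in x)

B1, n = 2.0, 10000
x = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
e0 = [random.gauss(0, 2) for _ in range(n)]   # ME in y: mean 0, independent of x

y_star = [1.0 + B1 * xi + ui for xi, ui in zip(x, u)]   # true y*
y_obs = [ys + ei for ys, ei in zip(y_star, e0)]         # observed y = y* + e0

print(round(ols_slope(x, y_star), 3))  # slope using the true y*
print(round(ols_slope(x, y_obs), 3))   # still close to 2.0, just noisier
```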
When does ME in an independent variable yield consistent estimators?
- If e1 is uncorrelated with the observed x1 (Cov(x1, e1) = 0), OLS remains consistent
- If e1 is uncorrelated with the true x1* (Cov(x1*, e1) = 0, the CEV case), OLS is inconsistent
e1 is uncorrelated with x1: Cov(x1, e1) = 0, what is the model for this?
Substituting x1* = x1 − e1 into y = b0 + b1x1* + u gives y = b0 + b1x1 + (u − b1e1); since Cov(x1, e1) = 0, the composite error is uncorrelated with x1 and OLS is consistent
e1 is uncorrelated with x1*: Cov(x1*, e1) = 0, create the model and what is it called?
- Show how plim b^1 ≠ b1
Classical errors-in-variables (CEV) assumption
- y = b0 + b1x1 + (u − b1e1), but now x1 = x1* + e1 is correlated with the error through e1
- plim b^1 = b1 + Cov(x1, u − b1e1)/Var(x1)
- Cov(x1, u − b1e1) = Cov(x1, −b1e1) = −b1Cov(x1, e1) = −b1Var(e1), since Cov(x1, e1) = Cov(x1* + e1, e1) = σ²e1
- So plim b^1 = b1 − b1σ²e1/(σ²x1* + σ²e1) = b1[σ²x1*/(σ²x1* + σ²e1)]
Given this, OLS produces biased and inconsistent estimators
Show what the attenuation bias is
plim b^1 = b1[σ²x1*/(σ²x1* + σ²e1)]; the scaling factor is always between 0 and 1, so b^1 is biased toward zero, and the attenuation worsens as Var(e1) grows relative to Var(x1*)
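The attenuation factor can be checked numerically. A hedged stdlib sketch with a made-up model where Var(x1*) = Var(e1) = 1, so the factor is 1/2 and the true slope of 2.0 should shrink toward 1.0:

```python
import random

random.seed(3)

def ols_slope(x, y):
    """OLS slope for simple regression: sample Cov(x, y) / Var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
           sum((xi - mx) ** 2 for xi in x)

B1, n = 2.0, 20000
x_star = [random.gauss(0, 1) for _ in range(n)]  # true x1*, Var = 1
e1 = [random.gauss(0, 1) for _ in range(n)]      # CEV: e1 indep. of x1*, Var = 1
x_obs = [xs + ei for xs, ei in zip(x_star, e1)]  # observed x1 = x1* + e1
y = [1.0 + B1 * xs + random.gauss(0, 1) for xs in x_star]

# Attenuation factor: Var(x1*) / (Var(x1*) + Var(e1)) = 1/2
print(round(ols_slope(x_star, y), 3))  # ~ 2.0 with the true regressor
print(round(ols_slope(x_obs, y), 3))   # ~ 1.0: biased toward zero
```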
What are some examples of issues that will lead to bias in the Random Sampling (MLR2) assumption?
- Missing data
- Non-random samples
- Outliers
What happens when data is missing at random?
Estimators are less precise (SSTx is lower in smaller samples), BUT they are still unbiased
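A hedged stdlib sketch of missing-at-random data (made-up model): throwing away observations completely at random leaves the slope estimate unbiased, only noisier because the subsample is smaller.

```python
import random

random.seed(5)

def ols_slope(x, y):
    """OLS slope for simple regression: sample Cov(x, y) / Var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
           sum((xi - mx) ** 2 for xi in x)

B1, n = 2.0, 10000
x = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + B1 * xi + random.gauss(0, 1) for xi in x]

# Keep ~30% of observations, chosen completely at random
keep = [random.random() < 0.3 for _ in range(n)]
x_sub = [xi for xi, k in zip(x, keep) if k]
y_sub = [yi for yi, k in zip(y, keep) if k]

print(round(ols_slope(x, y), 3))          # full sample
print(round(ols_slope(x_sub, y_sub), 3))  # subsample: still centered on 2.0
```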
Exogenous Sample Selection
Sample selection based on the independent variables, can still be unbiased
Endogenous Sample Selection
Sample selection based on the dependent variables, will lead to biased coefficients