Misspecification and Model Selection Flashcards
3 types of misspecification
Omitted relevant variables (underfitting)
Inclusion of irrelevant variables (overfitting)
Incorrect functional form
Omitting relevant variables:
In a misspecified model, what happens to our estimate of β₁?
β~₁, which is β₁ plus a bias term
What is the final expectation of β~₁
E(β~₁) = β₁+β₂ x Cov(Xi,Zi)/Var(Xi) ≠ β₁
Zi is the relevant variable that has been omitted.
This ≠ β₁ in general, since Zi is omitted: the model is misspecified and OLS is biased!
So OLS is biased in this underspecified model:
Unless.. (2)
Cov(Xi,Zi)=0 (Zi unrelated to Xi)
β₂=0 i.e Zi is not actually relevant!
How do we know the sign of bias, i.e know if we are overestimating or underestimating β₁?
If covariance and β₂ have same signs i.e both > 0 or both <0 , we get positive bias i.e overestimating β₁.
if they have opposite signs i.e Cov>0 but β₂<0, or Cov<0 and β₂>0 , we get negative bias, underestimate β₁
(and of course if either =0 no bias! as mentioned in previous FC)
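The bias formula and its sign can be checked with a small simulation (a sketch: the values b1 = 2, b2 = 0.5 and the 0.8 loading of X on Z are illustrative assumptions, chosen so Cov > 0 and β₂ > 0 give positive bias):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# True model: y = b0 + b1*x + b2*z + eps, with x and z positively correlated
b0, b1, b2 = 1.0, 2.0, 0.5
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)           # Cov(x, z) > 0
y = b0 + b1 * x + b2 * z + rng.normal(size=n)

# Misspecified OLS of y on x alone: slope = Cov(x, y) / Var(x)
b1_tilde = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Omitted-variable-bias formula: b1 + b2 * Cov(x, z) / Var(x)
predicted = b1 + b2 * np.cov(x, z)[0, 1] / np.var(x, ddof=1)

print(b1_tilde, predicted)   # both ≈ 2.24 > 2.0: positive bias, since Cov > 0 and b2 > 0
```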
Suppose we estimate wage on extra year of education.
True model is
ln(wagei) = β₀+ β₁ Educi + β₂ Abilityi + εi
But of course ability is hard to measure. So we omit it.
What is our β~₁ (use formula)
β~₁=β₁+β₂ x Cov(Educ,Ability)/Var(Educ)
using this, would our estimate of β₁ be positive (overestimated) or negative (underestimated) bias?
Since we expect Cov(Educ,Ability) > 0 (education and ability positively correlated), and β₂ to be > 0…
β~₁=β₁+β₂ x Cov(Educ,Ability)/Var(Educ) > β₁
Positively biased! Overestimated
So β₁ is likely upward biased (the estimated return to education is bigger than its true value)
How can we proceed from here (3)
Measure ability (HARD!)
Experiment: give random amount of education to people
Quasi experiments i.e replicate in natural settings
Detecting an omitted relevant variable
Consider true model again
Yi = β₀ + β₁ Xi + β₂ Zi + εi
But we estimate the misspecified model
Yi = β₀ + β₁ Xi + vi
How to test? Natural suggestion would be to specify
vi = γ₀ + γ₁Zi + εi (Z is contained in the error term v)
And test whether γ₁=0 (if not, Z is relevant!)
Problems with this suggestion (2)
vi is not observed.
Eval: could take residuals from the misspecified model as an estimate of vi (v^i)
We dont know what Z is (otherwise would’ve included it in our model i.e the true model)
So there is no good test for omitted relevant variables: instead, use economic theory and intuition to think about which variables might be omitted and what bias they would bring to the parameters
2nd misspecification: Including irrelevant variables
Consider model:
Yi = β0 + β1 Xi + β2 Ii + εi
Where I is irrelevant. What does it mean for the coefficient β₂
The true population coefficient β₂ is 0
So I is irrelevant, and so we should get β^₂ ≈ 0 (E(β^₂)=0)
What happens for our estimate of β₁ , and why?
Nothing - still unbiased.
The true model is just a restricted version of the estimated model (since β₂=0), so including I does not harm unbiasedness!
So under classical assumptions: what does this mean for E(β^j)?
E(β^j) = βj for all values
e.g
E(βˆ₀) = β₀, E(βˆ₁) = β₁, E(βˆ₂) = β₂ = 0
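A quick Monte Carlo sketch of this result (the true coefficients, sample size, and the correlation between X and the irrelevant regressor are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 2000
b0, b1 = 1.0, 2.0        # true model: y = b0 + b1*x + eps, so beta2 = 0

b1_hats, b2_hats = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    irrel = 0.5 * x + rng.normal(size=n)   # irrelevant, though correlated with x
    y = b0 + b1 * x + rng.normal(size=n)   # y does not depend on irrel
    D = np.column_stack([np.ones(n), x, irrel])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    b1_hats.append(coef[1])
    b2_hats.append(coef[2])

print(np.mean(b1_hats), np.mean(b2_hats))  # ≈ 2.0 and ≈ 0.0: no bias
```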
So bias is not an issue for inclusion of irrelevant variables: but what is?
bias-variance tradeoff
Bias variance tradeoff captured: First part: Bias
True model
Yi = β0 + β1 Xi + β2 Zi + εi
where β₂ may = 0
What are 2 estimates of β₁?
β^₁ from: Y^i = β^₀ + β^₁Xi + β^₂Zi
β~₁ from: Y~i = β~₀+β~₁Xi
Recall omitted relevant variable bias: if β₂≠0 and Cov(Xi,Zi)≠0, β~₁ is biased, while β^₁ isn't.
2nd part of Bias-Variance tradeoff:
Variance of the 2 estimators
B) which has a lower variance (unless)…
Var(β~₁) = σ²/Σ(Xi - Xbar)²
Var(β^₁) = σ²/[(1-R²zx)Σ(Xi - Xbar)²]
β~₁ has the lower variance, unless R²zx=0 (Xi & Zi uncorrelated, in which case the two are equal)
So β^₁ is better for unbiasedness (unless β₂=0 or Cov=0, in which case β~₁ is also unbiased), while β~₁ is better for variance (unless R²zx=0)
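The variance side of the tradeoff can also be simulated; in this illustrative setup (β₂=0, the 0.9 loading of Z on X gives R²zx ≈ 0.45), the ratio of the two estimator variances should come out near 1/(1-R²zx) ≈ 1.8:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 300, 4000
tilde, hat = [], []

for _ in range(reps):
    x = rng.normal(size=n)
    z = 0.9 * x + rng.normal(size=n)       # correlated with x, but beta2 = 0
    y = 1.0 + 2.0 * x + rng.normal(size=n)

    # short regression (omit z): beta1_tilde = Cov(x, y) / Var(x)
    tilde.append(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

    # long regression (include the irrelevant z): beta1_hat
    D = np.column_stack([np.ones(n), x, z])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    hat.append(coef[1])

ratio = np.var(hat) / np.var(tilde)
print(ratio)   # ≈ 1 / (1 - R²zx) ≈ 1.8 for this setup
```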
Bias variance trade off summary
When β₂≠0:
β^₁ is unbiased
β~₁ is biased
But var(β~₁) < var(β^₁), so there is a genuine tradeoff
When β₂=0:
Both are unbiased
var(β~₁) < var(β^₁) (again, so β~₁ is clearly preferred, i.e. not including the irrelevant variable is better, since it estimates the effect of Xi on Y more precisely)
So using this… when does the bias variance tradeoff exist
When β₂≠0
What estimator is preferred in large samples
β^₁, as it is unbiased unconditionally, and in large samples the variance disadvantage fades since variance decreases with sample size (N)
3rd misspecification: functional form misspecification
when we do not properly account for the shape of the relationship between dependent and independent variables.
example: return to experience in a wage equation
ln(wagei) = β₀+ β₁ agei + νi
But true model includes a squared term: ln(wagei)=β₀+β₁agei+β₂age²i +εi
To capture diminishing marginal returns (DMR)! (so the shape is quadratic, not linear)
so it is a type of omitted relevant variables issue (not adding age²)
thus possibly introducing bias (positive bias if Cov and β₂ share same signs, negative for opposite)
2 tests for functional form misspecification
Ramsey RESET (REgression specification error test)
Davidson-MacKinnon test (test for non-nested alternatives)
Ramsey reset:
Assume general model
Yi =β₀ +β₁ X₁i +…+βk Xki +εi
We want to test whether to include any second-order terms (squared variables, or interaction terms e.g. X x Z)
We could include them all in the model, but why don’t we?
A lot of variables = lots of parameters (k) to estimate: high k loses degrees of freedom
So how does Ramsey RESET work
(And include hypothesis test)
Takes fitted values (Y^i)
Y^i =β^₀ +β^₁ X₁i +…+β^k Xki
Then take polynomial terms of these as additional regressors
So for a second order (testing the squared variables)
Yi =β₀ +β₁ X₁i +…+βk Xki + δ₁Y^²i + ui
H₀:δ₁=0 (no functional form problem)
H₁:δ₁≠0 (functional form problem i.e need to include squared terms)
This process can be extended to higher orders
Up to 4th order (variable⁴)
What would the hypotheses be for a 4th order misspecification, and what test statistic formula
H0 :δ1 =δ2 =δ3 =0
H1: δj ≠ 0 for at least one j
Since a joint significance test, use F test
F = [(RSSr - RSSu)/q] / [RSSu/(n - (k+1))]
q = 3, the number of restrictions (one per equality in H₀)
k is the number of parameters in the new unrestricted model with the polynomial terms (δ₁Y^², etc.) as added regressors
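The RESET procedure can be sketched in a few lines of numpy (shown here for a simple regression with one regressor; the data-generating processes are illustrative assumptions, and for real work statsmodels provides a ready-made `linear_reset` diagnostic):

```python
import numpy as np

def reset_test(y, x, order=2):
    """Ramsey RESET for a regression of y on x: re-fit with powers of the
    fitted values (yhat^2 ... yhat^order) added, then F-test them jointly."""
    n = len(y)
    Xr = np.column_stack([np.ones(n), x])            # restricted model
    br, *_ = np.linalg.lstsq(Xr, y, rcond=None)
    yhat = Xr @ br
    rss_r = np.sum((y - yhat) ** 2)

    powers = np.column_stack([yhat ** p for p in range(2, order + 1)])
    Xu = np.column_stack([Xr, powers])               # unrestricted model
    bu, *_ = np.linalg.lstsq(Xu, y, rcond=None)
    rss_u = np.sum((y - Xu @ bu) ** 2)

    q = order - 1                                    # number of restrictions
    k = Xu.shape[1] - 1                              # slopes in unrestricted model
    return ((rss_r - rss_u) / q) / (rss_u / (n - (k + 1)))

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y_lin = 1 + 2 * x + rng.normal(size=500)                 # correctly specified
y_quad = 1 + 2 * x + 1.5 * x**2 + rng.normal(size=500)   # x² omitted by the fit

print(reset_test(y_lin, x))    # small F: no sign of misspecification
print(reset_test(y_quad, x))   # large F: functional form problem detected
```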
In Ramsey RESET, the restricted model is nested within the unrestricted one (we just say some parameters jointly equal zero)
What if the models were non-nested? e.g. test
Yi =β₀ +β₁ X₁i +…+βk Xki +ui
Against
Yi =β₀+β₁ ln(X₁i)+…+βk ln(Xki)+vi
How does the Davidson-MacKinnon test work, and what is the test statistic?
Obtain 2 fitted values for both equations
Yˆi =βˆ₀ +βˆ₁ X₁i +…+βˆk Xki
Yˇi =βˇ₀ + βˇ₁ ln(X₁i)+…+βˇk ln(Xki)
Then estimate the 2 models to get
Yi =β₀ +β₁ X₁i +…+βk Xki +θ₁ Yˇi + ui
Yi =β₀ +β₁ ln(X₁i)+…+βk ln(Xki)+θ₂ Yˆi +vi
(So the hat signs flip, add Y^i to log model, add Yˇi to linear model)
Then test using t-tests to see whether θ₁ and θ₂ are different from zero
If θ₁ ≠ 0: evidence against the model-in-levels (we shouldn't use the model without logs)
If θ₂ ≠ 0: evidence against the model-in-logarithms (we shouldn't use the logarithm-based model)
Potential problem: a clear winner may not emerge, since both or neither may get rejected!
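A sketch of the cross-fitting procedure above, assuming the true model is in logs so the t-tests pick the right winner (the data, coefficients, and `ols_fit` helper are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
x = rng.uniform(1, 10, size=n)
y = 1 + 2 * np.log(x) + rng.normal(size=n)    # true model is in logs (assumption)

def ols_fit(y, cols):
    """OLS with intercept; returns coefficients, fitted values, t-stats."""
    D = np.column_stack([np.ones(len(y))] + cols)
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ b
    s2 = resid @ resid / (len(y) - D.shape[1])           # error variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(D.T @ D)))   # classical s.e.
    return b, D @ b, b / se

# Step 1: fitted values from each candidate model
_, yhat_lev, _ = ols_fit(y, [x])              # model in levels
_, yhat_log, _ = ols_fit(y, [np.log(x)])      # model in logarithms

# Step 2: cross-add the rival model's fitted values and t-test theta
_, _, t_lev = ols_fit(y, [x, yhat_log])           # theta1 = coef on yhat_log
_, _, t_log = ols_fit(y, [np.log(x), yhat_lev])   # theta2 = coef on yhat_lev

# theta1 significant -> reject levels; theta2 insignificant -> keep logs
print(t_lev[2], t_log[2])
```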
What can we do if neither model is rejected?
Use additional tools such as adjusted R²
What if both are rejected (neither model is good)
There may be a third functional form specification we have not considered