Econometrics I Revision Flashcards by Daniel Toman

Explain and derive the OLS estimator

Minimise the sum of squared deviations between the actual and the fitted values of y. You choose β to minimise the squared distance between y and its linear approximation Xβ, the linear model for its conditional expectation E(y|X).

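S(β) = (y-Xβ)’(y-Xβ) = y’y - 2β’X’y + β’X’Xβ
First-order condition, dS/dβ = 0:
-2X’y + 2X’Xβhat = 0
βhat = (X’X)^-1 X’y
Second-order condition, d2S/dβdβ’:
2X’X is positive definite (given X has full column rank), so βhat is a minimum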
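
The derivation above can be checked numerically. The following is a minimal sketch on simulated data; the sample size, seed, and "true" coefficients are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
beta = np.array([1.0, 2.0, -0.5])       # hypothetical true coefficients
y = X @ beta + rng.normal(size=n)

# First-order condition: solve the normal equations X'X b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Second-order condition: 2X'X is positive definite if all eigenvalues are positive
eigenvalues = np.linalg.eigvalsh(2 * X.T @ X)
print(beta_hat, eigenvalues.min() > 0)
```

Solving the normal equations with `np.linalg.solve` avoids forming (X’X)^-1 explicitly, which is usually the more stable choice.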

Assumptions:

Explain and derive MM estimator

The MM estimator finds the value of β for which the sample counterpart of the population moment condition E(X’e)=0 is satisfied:
1/n X’(y-Xβhat) = 1/n (X’y-X’Xβhat) = 0

βhat = (X’X)^-1 X’y
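
As a quick check, the sample moment condition can be evaluated at the estimate. This is a sketch on simulated data; the coefficients and seed are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.5]) + rng.normal(size=n)   # hypothetical true coefficients

# Value of b that solves (1/n) X'(y - Xb) = 0
beta_mm = np.linalg.solve(X.T @ X, X.T @ y)

# Sample moment evaluated at the estimate: zero up to rounding error
sample_moment = X.T @ (y - X @ beta_mm) / n
print(sample_moment)
```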

What does (X’X)^-1 X’X equal?

I - the identity matrix

Explain the law of iterated expectations

E[E[X1|X2]] = E[X1]
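
A small discrete example may make this concrete. The joint distribution below is made up for illustration; exact fractions are used so the two sides can be compared without rounding.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X1 = a, X2 = b)
joint = {(0, 0): F(1, 4), (1, 0): F(1, 4), (0, 1): F(1, 8), (1, 1): F(3, 8)}

# Marginal distribution of X2
p_x2 = {}
for (a, b), p in joint.items():
    p_x2[b] = p_x2.get(b, 0) + p

def cond_mean_x1(b):
    """E[X1 | X2 = b]."""
    return sum(a * p for (a, bb), p in joint.items() if bb == b) / p_x2[b]

lhs = sum(cond_mean_x1(b) * p for b, p in p_x2.items())   # E[E[X1|X2]]
rhs = sum(a * p for (a, _), p in joint.items())            # E[X1]
print(lhs, rhs)   # both are 5/8
```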

Explain what random sampling means

The population model has been specified and an independent, identically distributed (iid) sample of observations (yi, xi) can be drawn from that population

Explain what an unbiased estimator means

E(βhat) = β: on average across repeated samples, the estimator equals the true parameter value

Explain zero conditional mean assumption

E(e|X)=0: the error has mean zero at every value of the regressors. This implies the population orthogonality condition E(X’e)=0

Explain when X’X is nonsingular

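X’X is nonsingular when X has full column rank, i.e. no column of X is an exact linear combination of the others. If X’X is nonsingular, the linear projection of y on X always exists and is unique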

Show the equation when an estimator is consistent

plim(βhat) = β

What are the assumptions of the linear regression model?

Population orthogonality condition: E(e|X)=0
Full rank: X is an n x (k+1) matrix with rank k+1
Linearity: the true model is y=Xβ+e
Spherical disturbances: homoscedasticity and non-autocorrelation, E(ee’|X) = σ^2 I

Explain the BLUE properties

Best: βhat is more efficient than any other linear unbiased estimator β~. V(β~) - V(βhat) is a positive semidefinite matrix

Linear: βhat is a linear function of y

Unbiased: E(βhat) = β

Explain multicollinearity

Perfect multicollinearity means the columns of X are linearly dependent: X’X is not invertible and the OLS parameters are not identified
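
A minimal sketch of the rank failure, using a made-up design matrix in which one regressor is an exact multiple of another:

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2 * x1                                   # exact linear function of x1
X = np.column_stack([np.ones(5), x1, x2])

rank = np.linalg.matrix_rank(X)               # 2, although X has 3 columns
print(rank, X.shape[1])
```

Because the rank is below the number of columns, X’X is singular and the normal equations have no unique solution.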

Explain asymptotic inference

If the small sample distribution of an estimator is unknown we can use an asymptotic approximation

Explain the Central Limit Theorem

If we have a sequence of iid random variables with finite mean μ and variance σ^2, then no matter what their distribution, the standardised sample mean converges in distribution to a normal: sqrt(n)(xbar - μ)/σ → N(0,1)
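
A simulation sketch of the theorem, assuming exponential draws (a skewed distribution with mean 1 and variance 1). The sample size, number of replications, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 5000
draws = rng.exponential(scale=1.0, size=(reps, n))   # mean 1, variance 1, skewed

# Standardised sample mean for each replication: sqrt(n) * (xbar - mu) / sigma
z = np.sqrt(n) * (draws.mean(axis=1) - 1.0) / 1.0

# Under a standard normal, roughly 95% of values lie within +/- 1.96
share_within_196 = np.mean(np.abs(z) < 1.96)
print(z.mean(), z.std(), share_within_196)
```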

Show that βhat is an unbiased estimator of β

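βhat = (X’X)^-1 X’y
βhat = (X’X)^-1 X’(Xβ+e)
Expand:
βhat = (X’X)^-1 X’Xβ + (X’X)^-1 X’e
(X’X)^-1 X’X is the identity matrix, so:
βhat = β + (X’X)^-1 X’e
Take expectations conditional on X, using E(e|X)=0:
E(βhat|X) = β + (X’X)^-1 X’E(e|X) = β
Because of the law of iterated expectations, E(E(X1|X2)) = E(X1):
E(βhat) = E(E(βhat|X)) = E(β) = β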
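
Unbiasedness can be illustrated by Monte Carlo: hold X fixed, redraw the errors many times, and average the estimates. This is a sketch with assumed coefficients and seed, not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 5000
beta = np.array([1.0, 2.0])                             # hypothetical true coefficients
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # held fixed across replications
A = np.linalg.solve(X.T @ X, X.T)                       # (X'X)^-1 X', shape (2, n)

E = rng.normal(size=(reps, n))      # errors drawn with E(e|X) = 0
Y = X @ beta + E                    # each row is one simulated sample of y
estimates = Y @ A.T                 # each row is beta_hat for one sample
mean_estimate = estimates.mean(axis=0)
print(mean_estimate)                # close to beta
```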

Show that βhat is a consistent estimator of β

An estimator βhat is consistent if and only if plim(βhat) = β
βhat = (X’X)^-1 X’y = β + (X’X)^-1 X’e = β + (X’X/n)^-1 (X’e/n)
plim(βhat) = plim(β) + plim((X’X/n)^-1) plim(X’e/n)
Using the law of large numbers, with xi the regressor vector for observation i:
plim(X’X/n) = Q = E(xi xi’), assumed nonsingular
plim(X’e/n) = E(xi ei) = 0
plim(βhat) = β + Q^-1 · 0 = β
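
Consistency can be illustrated by letting n grow and watching the estimation error shrink. The sketch below uses simulated data with assumed coefficients; the helper `ols_error` is a name introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
beta = np.array([1.0, 2.0])          # hypothetical true coefficients

def ols_error(n):
    """Largest absolute estimation error of OLS in one simulated sample of size n."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.normal(size=n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return np.max(np.abs(b - beta))

errors = {n: ols_error(n) for n in (100, 10_000, 1_000_000)}
print(errors)
```

A single draw need not be monotone in n, but the error at the largest sample size should be very small.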

Explain the assumptions of unbiasedness and consistency of a linear estimator

Linearity in parameters
Full rank of X: X is an nxk matrix with rank k
Random sampling
Expected value of errors conditional on X is zero: E(e|X) = 0

What are the consequences for OLS estimation if the actual variance covariance matrix is σ^2 Ω instead of σ^2 I?

The new variance is: (see booklet)
Whilst βhat remains unbiased and consistent, it will no longer be best or asymptotically efficient. This implies there is an alternative estimator with smaller variance (GLS).

Although βhat is unbiased, its estimated variance will be based on the wrong expression, i.e. σ^2(X’X)^-1. This implies that standard t, F, or Wald tests will lead to inaccurate results.

Provide an illustration of what the matrix Ω would look like if the errors are heteroscedastic. How would you proceed?

see diagram in booklet.

How to correct for it:
- Correct the standard errors. Use White’s heteroscedasticity-robust standard errors (explain?)
- If you know the precise form of heteroskedasticity, you can correct the estimator for it using GLS/WLS:

OLS weights all observations equally. With GLS, we assume we know the form of heteroskedasticity, Var(e_i|x_i) = σ^2 h_i. Dividing the equation by sqrt(h_i) gives a transformed model with homoskedastic errors. Applying least squares to the transformed equation is weighted least squares: observations with larger h_i receive less weight.
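
The equivalence between "divide by sqrt(h_i), then run OLS" and the GLS formula can be verified directly. In this sketch the form h_i = x_i^2 is an assumption chosen for illustration, as are the coefficients and seed.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
h = x ** 2                                   # assumed known form: Var(e_i|x_i) = sigma^2 * h_i
y = X @ np.array([1.0, 2.0]) + np.sqrt(h) * rng.normal(size=n)

# WLS: divide every variable (including the constant) by sqrt(h_i), then run OLS
w = 1.0 / np.sqrt(h)
Xs, ys = X * w[:, None], y * w
beta_wls = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)

# GLS formula with Omega = diag(h): (X' Omega^-1 X)^-1 X' Omega^-1 y
Omega_inv = np.diag(1.0 / h)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
print(beta_wls, beta_gls)                    # identical up to rounding
```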

What are the implications for statistical inference if the error terms are correlated within clusters in data? Give an empirical example. How would you proceed?

Error terms are no longer independent across observations. OLS is no longer efficient, and the conventional standard errors are typically biased downward, so test statistics overstate significance.
e.g. test scores and parental income. If there are school or teacher effects, errors will be correlated within a school or a class.
You can proceed by clustering standard errors at the level at which you expect correlation (school/class level).
You can measure the size of the bias via the Moulton factor:
- See booklet for equation: How much larger is the clustered SE compared with the normal one?

As ρu and the number of observations per group rise, so does the magnitude of the bias.
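
As a rough sketch only: in the textbook special case of equal-sized groups and a regressor that is constant within each group, the variance inflation is 1 + (n - 1)ρ, where n is the group size and ρ the within-group error correlation. The booklet's equation may be a more general version. The function name and example numbers below are hypothetical.

```python
import math

def moulton_se_ratio(group_size, rho):
    """Ratio of clustered to conventional SE in the equal-group-size special case."""
    variance_inflation = 1 + (group_size - 1) * rho
    return math.sqrt(variance_inflation)

# Hypothetical class size of 26 and within-class error correlation of 0.04
ratio = moulton_se_ratio(group_size=26, rho=0.04)
print(ratio)   # about 1.41: conventional SEs understate by roughly 30%
```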

Assume Z, an nxm matrix of valid instruments, can be chosen for X. What conditions must the instruments in Z satisfy?

Exogeneity (exclusion restriction): Z is uncorrelated with the error term, E(Z’e)=0.
Relevance: Z has to be partially correlated with the endogenous regressor X, i.e. correlated with X after controlling for the other exogenous variables.
- see booklet for equation

Derive a consistent estimator for β using 2SLS

Suppose we have m instrumental variables Z1…Zm for X such that E(Zh’e)=0 for each h, and each Zh is partially correlated with X. Any linear combination of Z1…Zm is then also a valid instrument, so there are countless possible IV estimators.

Of all possible linear combinations, 2SLS chooses the one most correlated with X. This is the linear projection of X on Z and the other exogenous variables X1, X2:
X = δ0 + δ1X1 + δ2X2 + θ1Z1 + … + θmZm + r
Since r has zero mean and is uncorrelated with all the exogenous variables, the projection is itself a valid instrument:
X* = δ0 + δ1X1 + δ2X2 + θ1Z1 + … + θmZm

We can consistently estimate the reduced form by OLS, under the standard OLS assumptions:
Xhat = δhat0 + δhat1X1 + δhat2X2 + θhat1Z1 + … + θhatmZm

The IV estimator solves the sample analogue of E(X*’e) = 0:
1/n Xhat’ehat = 1/n Xhat’(y-Xβhat) = 0

βhat = (Xhat’X)^-1 Xhat’y
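
The two stages can be sketched on simulated data. Everything about the data-generating process below (two instruments, an unobserved confounder u, true slope 2.0) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)     # instruments
u = rng.normal(size=n)                               # unobserved confounder
x = 0.8 * z1 + 0.5 * z2 + u + rng.normal(size=n)     # endogenous regressor
e = u + rng.normal(size=n)                           # error, correlated with x through u
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])

# First stage: fitted values from the projection of X on Z
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
# Second stage in IV form: beta_hat = (X_hat'X)^-1 X_hat'y
beta_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)         # inconsistent in this setup
print(beta_2sls, beta_ols)
```

Regressing y on X_hat by OLS gives the same coefficients, because X_hat’X = X_hat’X_hat.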

Derive a consistent estimator for β using Generalised Method of Moments

We have m population moment conditions
E(Z’e) = (…) = 0
With sample analogues:
- See booklet for equation
Solve for β by minimizing the quadratic form:
Q(β) = (…)
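The efficient choice of W is the inverse of the asymptotic covariance matrix of the moment conditions.

Since the ei are i.i.d., this covariance matrix is σ^2 E(zi zi’), so σ^2 is a common scalar that weights every moment the same and W can be taken proportional to (Z’Z/n)^-1.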

Sub into quadratic form:
- See booklet
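
As a hedged illustration, the sketch below uses the standard linear GMM setup: sample moments g(b) = Z’(y - Xb)/n and objective g(b)’ W g(b), whose minimiser is (X’Z W Z’X)^-1 X’Z W Z’y. The booklet's notation may differ. With W proportional to (Z’Z)^-1, the result coincides with 2SLS. The simulated data and the helper name `gmm` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # m = 3 moment conditions
u = rng.normal(size=n)
x = Z[:, 1] + 0.5 * Z[:, 2] + u + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])                          # k = 2 parameters
y = X @ np.array([1.0, 2.0]) + u + rng.normal(size=n)

def gmm(W):
    """Minimiser of g(b)' W g(b), where g(b) = Z'(y - Xb)/n."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ Z.T @ X, XZ @ W @ Z.T @ y)

W_iid = np.linalg.inv(Z.T @ Z)               # weight implied by iid errors, up to scale
beta_gmm = gmm(W_iid)

X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
beta_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
print(beta_gmm, beta_2sls)                   # identical up to rounding
```

Rescaling W by any positive constant leaves the estimate unchanged, which is why σ^2 drops out.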

Give an example of an application where you need to develop a consistent β estimator when there is an instrumental variable. Explain the identification problem and propose two potential instruments that could be used

The effect of education on earnings. Both are affected by underlying ability, which you cannot measure.

As a result, OLS estimates will be biased and inconsistent because of omitted variable bias (OVB): education is correlated with the error term.

You can solve this with an instrumental variable that affects education but affects earnings only through education:
- Length of compulsory schooling. This is not based on ability.
- Distance from school. Someone living close to a school may undergo more education.

Explain what is meant by the weak instruments problem. What are the consequences for instrumental variables estimation?

The relationship between Z and X is weak. As a result, the 2SLS estimate is biased towards the OLS estimate: the weaker the relationship between X and Z, the closer 2SLS is to OLS. To gauge the size of the bias, use an F-test on the instruments in the first stage. A low F implies a high level of bias. Adding useless instruments will increase the bias.
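
A minimal sketch of the first-stage F statistic, assuming a single endogenous regressor and no other exogenous regressors besides a constant. The function name and simulated data are hypothetical; a commonly cited rule of thumb treats a first-stage F below about 10 as a warning sign.

```python
import numpy as np

def first_stage_F(x, Z):
    """F statistic for H0: all instrument coefficients are zero in x = a + Z*theta + r."""
    n, m = Z.shape
    Zc = np.column_stack([np.ones(n), Z])
    resid = x - Zc @ np.linalg.lstsq(Zc, x, rcond=None)[0]
    ssr_u = resid @ resid                       # unrestricted sum of squared residuals
    ssr_r = np.sum((x - x.mean()) ** 2)         # restricted: constant only
    return ((ssr_r - ssr_u) / m) / (ssr_u / (n - m - 1))

rng = np.random.default_rng(8)
n = 1000
Z = rng.normal(size=(n, 1))
F_strong = first_stage_F(1.0 * Z[:, 0] + rng.normal(size=n), Z)    # strong instrument
F_weak = first_stage_F(0.02 * Z[:, 0] + rng.normal(size=n), Z)     # nearly irrelevant instrument
print(F_strong, F_weak)
```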

Outline how you would estimate panel data if you believe the individual effects are random and independent of X. How do you know if parameters are identified?

Use the RE estimator. It assumes the ci are random draws from the population, like the Xs, and that Cov(xik,ci)=0. Both FE and RE assume strict exogeneity of X conditional on c. Under RE, pooled OLS is still unbiased but no longer the most efficient estimator, because ci makes the composite errors correlated within each unit. Use feasible GLS to obtain the RE estimator, which is consistent and asymptotically efficient. Feasible GLS: OLS weights all observations equally. GLS instead transforms the equation so that the errors become spherical and then applies least squares to the transformed equation. The easiest example is heteroskedasticity of known form h_i, where dividing the equation by sqrt(h_i) gives homoskedastic errors and a weighted regression. In practice the form has to be estimated (h_i hat), which makes the procedure feasible GLS (FGLS).

Outline how you would estimate panel data if you believe the effects are correlated with the regressors?

Use an FE estimator. This allows correlation between ci and Xi and treats the ci as fixed parameters to be estimated. Both FE and RE assume strict exogeneity of X conditional on c. The FE estimator can be computed by OLS with a dummy variable for each cross-sectional unit.
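
The dummy-variable regression gives the same slope as OLS on data demeaned within each unit (the within transformation). The sketch below checks this on a simulated balanced panel; the panel dimensions, true slope of 2.0, and seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
N, T = 50, 5                                        # units and time periods (balanced panel)
ids = np.repeat(np.arange(N), T)
c = rng.normal(size=N)                              # individual effects
x = c[ids] + rng.normal(size=N * T)                 # regressor correlated with c_i
y = 2.0 * x + c[ids] + rng.normal(size=N * T)

# Dummy-variable regression: OLS of y on x and one dummy per unit (no separate constant)
D = (ids[:, None] == np.arange(N)[None, :]).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0][0]

# Within estimator: subtract unit means, then OLS on the demeaned data
def demean(v):
    return v - (np.bincount(ids, weights=v) / T)[ids]

xd, yd = demean(x), demean(y)
beta_within = (xd @ yd) / (xd @ xd)
print(beta_lsdv, beta_within)                       # identical up to rounding
```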

Outline a test that would allow you to choose between random and fixed effects approaches?

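Hausman test. The null hypothesis is that the individual effects are uncorrelated with the regressors, so RE is consistent and efficient. FE is consistent under both the null and the alternative. A low p-value means the FE and RE estimates differ significantly: RE is invalid, use FE.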
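
A sketch of the test statistic in its usual form, H = (b_FE - b_RE)’ [V_FE - V_RE]^-1 (b_FE - b_RE), which is compared with a chi-squared distribution with as many degrees of freedom as coefficients tested. The estimates and variances below are invented purely for illustration.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman statistic from FE and RE estimates and their covariance matrices."""
    d = b_fe - b_re
    return float(d @ np.linalg.solve(V_fe - V_re, d))

# Hypothetical estimates, for illustration only
b_fe, b_re = np.array([1.0, 2.0]), np.array([0.8, 2.1])
V_fe, V_re = np.diag([0.05, 0.02]), np.diag([0.01, 0.01])

H = hausman(b_fe, b_re, V_fe, V_re)
critical_5pct = 5.991                 # chi-squared 5% critical value, 2 degrees of freedom
reject_re = H > critical_5pct
print(H, reject_re)
```

With these made-up numbers H is about 2, below the critical value, so the null is not rejected and RE would be retained.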
