Econometrics I Revision Flashcards

1
Q

Explain and derive the OLS estimator

A

Minimise the sum of squared deviations between the actual values of y and their fitted values. You choose β so that Xβ is the best linear approximation to y, i.e. to its conditional expectation E(y|X), in the least-squares sense.

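S(β) = (y-Xβ)’(y-Xβ)
First-order condition, dS/dβ = 0:
-2X’y + 2X’Xβhat = 0
βhat = (X’X)^-1 X’y
Second-order condition, d2S/dβdβ’:
2X’X is positive definite when X has full column rank, so βhat is a minimum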
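
As a rough numerical check of the derivation, the sketch below solves the normal equations on simulated data. The sample size, coefficients and variable names are illustrative assumptions, not part of the card.

```python
import numpy as np

# Simulated data; the "true" coefficients are arbitrary illustrative values.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# First-order condition: X'X beta_hat = X'y (the normal equations).
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Second-order condition: 2X'X is positive definite iff all eigenvalues > 0.
eigenvalues = np.linalg.eigvalsh(2 * XtX)

# Cross-check against NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, eigenvalues.min() > 0)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is usually the numerically safer choice.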

Assumptions:

2
Q

Explain and derive MM estimator

A

The MM estimator finds the value of β that ensures the sample counterpart of the population moment condition E(X’e)=0 is satisfied:
1/n X’(y-Xβhat) = 1/n (X’y - X’Xβhat) = 0

βhat = (X’X)^-1 X’y
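
A minimal sketch of the same idea in code: the sample moment is evaluated at the MM solution and at an arbitrary other value. The simulated data and the helper name `sample_moment` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.5]) + rng.normal(size=n)

def sample_moment(beta):
    """Sample counterpart of E(X'e): (1/n) X'(y - X beta)."""
    return X.T @ (y - X @ beta) / n

# Solving the sample moment condition gives the same formula as OLS.
beta_mm = np.linalg.solve(X.T @ X, X.T @ y)

print(sample_moment(beta_mm))       # numerically zero at the MM solution
print(sample_moment(np.zeros(2)))   # generally non-zero at other values of beta
```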

3
Q

What does (X’X)^-1 X’X equal?

A

I - the identity matrix

4
Q

Explain the law of iterated expectations

A

E[E[X1|X2]] = E[X1]
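
The identity can be checked exactly on a small discrete example. The joint distribution below is hypothetical and chosen only so that the probabilities sum to one.

```python
import numpy as np

# Hypothetical joint pmf: rows index values of X1, columns index values of X2.
x1_values = np.array([0.0, 1.0, 2.0])
joint = np.array([[0.10, 0.20],
                  [0.30, 0.10],
                  [0.05, 0.25]])

p_x2 = joint.sum(axis=0)                      # marginal pmf of X2
e_x1_given_x2 = (x1_values @ joint) / p_x2    # E[X1 | X2 = each value]
lhs = e_x1_given_x2 @ p_x2                    # E[ E[X1 | X2] ]
rhs = x1_values @ joint.sum(axis=1)           # E[X1]
print(lhs, rhs)
```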

5
Q

Explain what random sampling means

A

The population model has been specified and an independent, identically distributed (iid) sample can be drawn

6
Q

Explain what an unbiased estimator means

A

E(βhat) = β

7
Q

Explain zero conditional mean assumption

A

E(e|X)=0: the error has mean zero for every value of the regressors. It implies the population orthogonality condition E(X’e)=0

8
Q

Explain when X’X is nonsingular

A

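X’X is nonsingular when X has full column rank, i.e. no column of X is an exact linear combination of the others. If it is nonsingular, the linear projection of y on X always exists and is unique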

9
Q

Show the equation when an estimator is consistent

A

plim(βhat) = β

10
Q

What are the assumptions of the linear regression model?

A
  1. Zero conditional mean: E(e|X)=0, which implies the population orthogonality condition E(X’e)=0
  2. Full rank: X is an n x (k+1) matrix with rank k+1
  3. Linearity: the true model is y=Xβ+e
  4. Spherical disturbances: homoscedasticity and non-autocorrelation, E(ee’|X) = σ^2 I
11
Q

Explain the BLUE properties

A

Best: βhat is more efficient than any other linear unbiased estimator β~. V(β~) - V(βhat) is a positive semidefinite matrix

Linear: βhat is a linear function of y

Unbiased: E(βhat) = β

12
Q

Explain multicollinearity

A

Perfect multicollinearity means the columns of X are linearly dependent. X’X is then not invertible and the OLS parameters are not identified
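
A small sketch of what failure of identification looks like, using a made-up design matrix whose third column is exactly twice the second.

```python
import numpy as np

x = np.arange(1.0, 7.0)
X = np.column_stack([np.ones_like(x), x, 2.0 * x])  # third column = 2 * second
rank = np.linalg.matrix_rank(X)
print(rank, X.shape[1])   # rank is below the number of columns

# Two different coefficient vectors give identical fitted values,
# so the data cannot tell them apart.
b1 = np.array([1.0, 3.0, 0.0])
b2 = np.array([1.0, 1.0, 1.0])
print(np.allclose(X @ b1, X @ b2))
```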

13
Q

Explain asymptotic inference

A

If the small sample distribution of an estimator is unknown we can use an asymptotic approximation

14
Q

Explain the Central Limit Theorem

A

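If we have a sequence of iid random variables with finite mean μ and finite variance σ^2, then no matter what their distribution, the standardised sample mean sqrt(n)(xbar - μ)/σ converges in distribution to a standard normal as n goes to infinity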
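
A simulation sketch of the theorem, assuming Exponential(1) draws (a skewed distribution with mean 1 and variance 1). The sample size and number of replications are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 10_000
# Exponential(1) draws: skewed, with mean 1 and variance 1.
draws = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (draws.mean(axis=1) - 1.0) / 1.0   # standardised sample means

print(z.mean(), z.std())
# The share within +/-1.96 should be close to the normal value of 0.95.
coverage = np.mean(np.abs(z) < 1.96)
print(coverage)
```

The individual draws stay exponential however large n is; it is the standardised mean whose distribution approaches the normal.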

15
Q

Show that βhat is an unbiased estimator of β

A

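βhat = (X’X)^-1 X’y
Substitute y = Xβ+e:
βhat = (X’X)^-1 X’(Xβ+e)
Expand:
βhat = (X’X)^-1 X’Xβ + (X’X)^-1 X’e
(X’X)^-1 X’X is the identity matrix, so:
βhat = β + (X’X)^-1 X’e
Take expectations conditional on X and use E(e|X)=0:
E(βhat|X) = β + (X’X)^-1 X’E(e|X) = β
By the law of iterated expectations, E(E(X1|X2)) = E(X1):
E(βhat) = E(E(βhat|X)) = E(β) = β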
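
Unbiasedness is a statement about the average of βhat over repeated samples, which a Monte Carlo sketch can illustrate. The design below (fixed X, normal errors, 5000 replications) is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 5000
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # held fixed across replications
A = np.linalg.solve(X.T @ X, X.T)                      # (X'X)^-1 X'

estimates = np.empty((reps, 2))
for r in range(reps):
    e = rng.normal(size=n)              # E(e|X) = 0 by construction
    estimates[r] = A @ (X @ beta + e)   # beta_hat = beta + (X'X)^-1 X'e

# Individual estimates vary, but their average is close to beta.
print(estimates.mean(axis=0))
```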

16
Q

Show that βhat is a consistent estimator of β

A

An estimator βhat is consistent if and only if plim βhat = β
βhat = (X’X)^-1 X’y = β + (X’X)^-1 X’e = β + (X’X/n)^-1 (X’e/n)
plim βhat = β + plim((X’X/n)^-1) plim(X’e/n) = β + (plim(X’X/n))^-1 plim(X’e/n)
Using the law of large numbers, with xi’ denoting row i of X:
plim(X’X/n) = Q = E(xi xi’), assumed nonsingular
plim(X’e/n) = E(xi ei) = 0
plim βhat = β + (Q^-1 x 0) = β
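
Consistency concerns a single estimate as n grows. The sketch below, with an assumed data-generating process and non-normal (Student t) errors, reports the largest coefficient error at three sample sizes.

```python
import numpy as np

rng = np.random.default_rng(4)
beta = np.array([1.0, 2.0])

def ols_error(n):
    """Largest absolute deviation of the OLS estimate from beta at sample size n."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.standard_t(df=5, size=n)   # non-normal errors with mean zero
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return np.max(np.abs(b - beta))

errors = {n: ols_error(n) for n in (100, 10_000, 1_000_000)}
print(errors)
```

The error need not fall at every step in a single run, but it shrinks towards zero as n becomes large.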

17
Q

Explain the assumptions of unbiasedness and consistency of a linear estimator

A
  • Linearity in parameters
  • Full rank of X: X is an nxk matrix with rank k
  • Random sampling
  • Expected value of errors conditional on X is zero: E(e|X) = 0
18
Q

What are the consequences for OLS estimation if the actual variance covariance matrix is σ2Ω instead of σ2I?

A

The new variance is: (see booklet)
Whilst βhat remains unbiased and consistent, it will no longer be best or asymptotically efficient. This implies there is an alternative linear unbiased estimator (GLS) with smaller variance.

Although βhat is unbiased, the conventional variance estimator will be based on the wrong expression, i.e. σ^2(X’X)^-1. This implies that standard t, F, or Wald tests will give misleading results

19
Q

Provide an illustration of what the matrix Ω would look like if the errors are heteroscedastic. How would you proceed?

A
  • see diagram in booklet.

How to correct for it:
- Correct the standard errors. Use White’s heteroscedasticity-robust standard errors, which remain valid without knowing the form of the heteroscedasticity.
- If you know the precise form of heteroskedasticity, you can correct the estimator for it using GLS/WLS:

  • OLS weights all observations equally. With GLS we assume the form of the heteroskedasticity is known, Var(e_i|x_i) = σ^2 h_i. Dividing every term of the equation by sqrt(h_i) gives a transformed model with homoskedastic errors. Applying least squares to the transformed equation is weighted least squares: each observation is weighted by 1/h_i.
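
Both corrections can be sketched in a few lines of NumPy. The simulated model assumes h_i = x_i^2 is known, and the robust standard errors use the basic White (HC0) sandwich form without any small-sample adjustment. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
h = x ** 2                                   # assumed known: Var(e_i|x_i) = sigma^2 * h_i
y = X @ np.array([1.0, 2.0]) + np.sqrt(h) * rng.normal(size=n)

# OLS with conventional and White (HC0) standard errors.
XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
resid = y - X @ b_ols
se_conv = np.sqrt(np.diag(resid @ resid / (n - 2) * XtX_inv))
meat = (X * resid[:, None] ** 2).T @ X       # sum over i of e_i^2 * x_i x_i'
se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

# WLS: divide every variable (including the constant) by sqrt(h_i), then run OLS.
w = 1.0 / np.sqrt(h)
Xw, yw = X * w[:, None], y * w
b_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)

print(b_ols, se_conv, se_white, b_wls)
```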
20
Q

What are the implications of statistical inference if the error terms are correlated within clusters in data? Give an empirical example. How would you proceed?

A

Error terms are no longer independent. OLS is no longer efficient and, when the within-cluster correlation is positive, the conventional standard errors are biased downward, so test statistics overstate significance.
e.g. test scores and parental income. If there are school or teacher effects, the errors will be correlated within a school or a class.
You can proceed by clustering standard errors at the level at which you suspect correlation (school/class level).
You can measure the size of the bias via the Moulton factor:
- See booklet for equation: How much larger is the clustered SE compared with the conventional one?

As ρu and the number of observations per group rise, so does the magnitude of the bias.
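
A sketch of clustered standard errors on simulated data, assuming 50 hypothetical schools of 20 pupils, a regressor that varies only at school level, and a school-level error component. It uses the basic cluster-robust sandwich with no finite-sample correction, and it does not reproduce the booklet's Moulton formula.

```python
import numpy as np

rng = np.random.default_rng(6)
G, m = 50, 20                                  # hypothetical: 50 schools, 20 pupils each
school = np.repeat(np.arange(G), m)
x = rng.normal(size=G)[school]                 # regressor varies only at school level
e = rng.normal(size=G)[school] + rng.normal(size=G * m)   # school effect + pupil noise
X = np.column_stack([np.ones(G * m), x])
y = X @ np.array([1.0, 0.5]) + e

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b

# Conventional standard errors (assume independent, homoskedastic errors).
se_conv = np.sqrt(np.diag(resid @ resid / (G * m - 2) * XtX_inv))

# Cluster-robust sandwich: sum over clusters of X_g' e_g e_g' X_g.
meat = np.zeros((2, 2))
for g in range(G):
    s = X[school == g].T @ resid[school == g]  # score summed within cluster g
    meat += np.outer(s, s)
se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_conv, se_cluster)
```

In this design the clustered standard error on the slope comes out well above the conventional one, which is the downward bias described above.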

21
Q

Assume z, a nxm matrix of valid instruments can be chosen for x. What conditions must the instruments in z satisfy?

A
  1. Exogeneity (exclusion restriction). z is uncorrelated with the error term: E(z’e)=0
  2. Relevance. z has to be partially correlated with the endogenous regressor in x, once the other exogenous regressors are controlled for.
    - see booklet for equation
22
Q

Derive a consistent estimator for β using 2SLS

A

Suppose we have m instrumental variables Z1…Zm for X, each satisfying E(Zh’e)=0 and partially correlated with X. Any linear combination of Z1…Zm is also a valid instrument, so there are infinitely many possible IV estimators.

Of all possible linear combinations, 2SLS chooses the one most correlated with X. This is the linear projection of X on the exogenous variables (the first stage, or reduced form):
X = δ0 + δ1X1 + δ2X2 + θ1Z1 + … + θmZm + rk
Since rk has zero mean and is uncorrelated with all the exogenous variables, the fitted part is a linear combination of exogenous variables only, and so is uncorrelated with e:
X* = δ0 + δ1X1 + δ2X2 + θ1Z1 + … + θmZm

We can consistently estimate the reduced form by OLS under the standard assumptions, giving fitted values:
Xhat = δhat0 + δhat1X1 + δhat2X2 + θhat1Z1 + … + θhatmZm

The IV estimator uses Xhat as the instrument and solves the sample analogue of E(Xhat’e)=0:
1/n Xhat’ehat = 1/n Xhat’(y-Xβhat) = 0

βhat = (Xhat’X)^-1 Xhat’y
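
The two stages can be sketched on simulated data. The design below assumes one endogenous regressor, one instrument and an unobserved confounder u. The true slope of 2 and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
z = rng.normal(size=n)                    # instrument
u = rng.normal(size=n)                    # unobserved confounder
x = z + u + rng.normal(size=n)            # endogenous regressor
e = u + rng.normal(size=n)                # error correlated with x through u
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# First stage: project X on Z to get X_hat.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
# Second stage: beta_hat = (X_hat'X)^-1 X_hat'y.
b_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

print(b_ols, b_2sls)
```

In this design OLS overstates the slope because x and e share the component u, while the 2SLS estimate lands close to 2.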

23
Q

Derive a consistent estimator for β using Generalised Method of Moments

A

We have m population moment conditions
E(Z’e) = (…) = 0
With sample analogues:
- See booklet for equation
Solve for β by minimizing the quadratic form:
Q(β) = (…)
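The efficient choice of W is the inverse of the asymptotic covariance matrix of the sample moment conditions

If the ei are i.i.d. and homoskedastic, this covariance matrix is proportional to σ^2. σ^2 then scales every moment equally and does not affect the minimiser.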

Sub into quadratic form:
- See booklet
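
The booklet's expressions are not reproduced here. As a hedged illustration, the sketch below uses the textbook closed form for linear GMM, assuming sample moments g(b) = Z’(y-Xb)/n and the homoskedastic weighting matrix W = (Z’Z/n)^-1, and compares the result with 2SLS. The function names `gmm_linear` and `objective` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5000
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # constant + 2 instruments
u = rng.normal(size=n)
x = Z[:, 1] + 0.5 * Z[:, 2] + u + rng.normal(size=n)         # endogenous regressor
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 2.0]) + u + rng.normal(size=n)

def gmm_linear(W):
    """Minimiser of g(b)' W g(b), where g(b) = Z'(y - Xb)/n."""
    XZ = X.T @ Z / n
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y / n))

def objective(b, W):
    """Value of the quadratic form at b."""
    g = Z.T @ (y - X @ b) / n
    return g @ W @ g

W = np.linalg.inv(Z.T @ Z / n)            # homoskedastic case, sigma^2 dropped
b_gmm = gmm_linear(W)

# 2SLS for comparison.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
print(b_gmm, b_2sls)
```

With this choice of W the GMM and 2SLS estimates coincide, which is why σ^2 can be dropped from the weighting matrix.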

24
Q

Give an example of an application where you need to develop a consistent β estimator when there is an instrumental variable. Explain the identification problem and propose two potential instruments that could be used

A

The effect of education on earnings. Both are affected by underlying ability, which you cannot measure.

As a result, OLS estimates will be biased and inconsistent because of omitted variable bias (OVB).

You can solve this with an instrumental variable that affects education but affects earnings only through education:
- Years of compulsory schooling required by law. This is not based on ability
- Distance from school. Someone living closer to school may undergo more education.

25
Q

Explain what is meant by the weak instruments problem. What are the consequences for instrumental variables estimation

A

The relationship between Z and X is weak (low partial correlation). As a result, the 2SLS estimator is biased in finite samples, towards the OLS estimates.

The weaker the relationship between X and Z, the closer the 2SLS estimate is to OLS.

To gauge the size of the problem, use an F-test of the joint significance of the instruments in the first-stage regression. A low F implies a high level of bias. Adding useless instruments will increase bias.
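
A sketch of the first-stage F statistic for a single instrument, computed from restricted and unrestricted residual sums of squares. The first-stage coefficients 1.0 and 0.02 are arbitrary values standing in for a strong and a weak instrument.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 1000

def first_stage_F(pi):
    """F statistic for excluding the single instrument z from the first stage."""
    z = rng.normal(size=n)
    x = pi * z + rng.normal(size=n)
    Z_u = np.column_stack([np.ones(n), z])     # unrestricted: constant + instrument
    Z_r = np.ones((n, 1))                      # restricted: constant only
    rss = []
    for M in (Z_r, Z_u):
        coef, *_ = np.linalg.lstsq(M, x, rcond=None)
        rss.append(np.sum((x - M @ coef) ** 2))
    q, k = 1, Z_u.shape[1]                     # restrictions, unrestricted parameters
    return ((rss[0] - rss[1]) / q) / (rss[1] / (n - k))

F_strong = first_stage_F(1.0)
F_weak = first_stage_F(0.02)
print(F_strong, F_weak)
```

A commonly quoted rule of thumb treats a first-stage F below about 10 as a warning sign of weak instruments.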

26
Q

Outline how you would estimate panel data if you believe the effects are random and independent of X. How do you know if parameters are identified

A

RE estimator. Assumptions of common trends. It assumes the ci are random draws from the population, like the Xs. We also assume Cov(xik,ci)=0.

Both FE and RE assume strict exogeneity of X conditional on c.

Under RE, pooled OLS is still unbiased but no longer the most efficient estimator, because the composite error ci + eit is correlated within each unit over time. Use feasible GLS to obtain the RE estimator, which is asymptotically efficient under the RE assumptions.

Feasible GLS:
OLS weights all observations equally. The easiest example is heteroskedasticity. With GLS we assume the form of the heteroskedasticity is known, Var(e_i|x_i) = σ^2 h_i. Dividing every term of the equation by sqrt(h_i) gives a transformed model with homoskedastic errors. Applying least squares to the transformed equation is weighted least squares.

In practice h_i is unknown, so we replace it with an estimate, hhat_i. This is feasible GLS (FGLS).

27
Q

Outline how you would estimate panel data if you believe the effects are correlated with the regressors?

A

Use a FE estimator. This allows correlation between ci and Xi (eq.), and treats the ci as fixed parameters to be estimated.

Both FE and RE assume strict exogeneity of X conditional on c.

The FE estimator can be computed by OLS with a dummy variable for each cross-sectional unit (least squares dummy variables), or equivalently by OLS on data demeaned within each unit
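
The sketch below computes the FE slope twice on a simulated balanced panel, once with unit dummies and once by demeaning within units, and compares both with pooled OLS. The panel dimensions, the true slope of 2 and the way x depends on c_i are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)
N, T = 200, 5
unit = np.repeat(np.arange(N), T)
c = rng.normal(size=N)                          # unit effects
x = 0.8 * c[unit] + rng.normal(size=N * T)      # regressor correlated with c_i
y = 2.0 * x + c[unit] + rng.normal(size=N * T)

# LSDV: OLS of y on x plus one dummy per unit (no separate constant).
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
b_lsdv = coef[0]

# Within estimator: subtract unit means, then OLS on the demeaned data.
def demean(v):
    means = np.bincount(unit, weights=v) / T
    return v - means[unit]

xd, yd = demean(x), demean(y)
b_within = (xd @ yd) / (xd @ xd)

# Pooled OLS ignores c_i and is biased in this design.
Xp = np.column_stack([np.ones(N * T), x])
b_pooled = np.linalg.solve(Xp.T @ Xp, Xp.T @ y)[1]
print(b_lsdv, b_within, b_pooled)
```

The dummy-variable and within estimates agree up to rounding, so demeaning is the cheaper route when N is large.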

28
Q

Outline a test that would allow you to choose between random and fixed effects approaches?

A

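Hausman test. The null hypothesis is that ci is uncorrelated with the regressors, so both RE and FE are consistent and RE is efficient. Under the alternative only FE is consistent. A low p-value rejects the null: RE is invalid, use FE.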