Econometrics II Flashcards
Explain the Classic Linear Model Assumptions and what they mean for estimation
- Linear in parameters
- Zero conditional mean: E(ut|X) = 0
- No perfect collinearity
(These three imply the OLS estimators are unbiased)
- Homoskedasticity
- No serial correlation
(These five are the Gauss-Markov assumptions; under them, OLS is BLUE, the best linear unbiased estimator)
- Normality
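Compactly, in matrix form (a standard textbook restatement, not from the original card): the model is y = Xβ + u with E(u|X) = 0, X of full column rank, Var(u|X) = σ²I (homoskedasticity and no serial correlation), and u|X ~ N(0, σ²I) once normality is added.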
Explain the conditions for a time series being covariance stationary (useful for variance etc.)
1) Constant mean
2) Constant variance
3) Constant autocovariance for each lag k (depending only on k, not on time t)
So the key properties of the distribution do not change over time
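In symbols (a compact restatement of the three conditions, using the notation of the cards below): E(Yt) = μ for all t, Var(Yt) = σ² < ∞ for all t, and Cov(Yt, Yt-k) = γk for every t and each lag k.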
Explain the conditions for {et} to be a white noise process
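{et} is white noise if it has:
- Zero mean: E(et) = 0
- Constant variance: Var(et) = σ²
- No autocorrelation: Cov(et, es) = 0 for all t ≠ s
(If in addition et ~ N(0, σ²), it is Gaussian white noise.)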
Derive the mean, variance, covariance, and ACF of an MA(1) model
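A standard derivation sketch, writing the MA(1) as Yt = μ + et + θet-1 with {et} white noise with variance σ²:
- Mean: E(Yt) = μ, since E(et) = 0 at all dates
- Variance: γ0 = E[(et + θet-1)²] = σ² + θ²σ² = (1 + θ²)σ²
- Covariance: γ1 = E[(et + θet-1)(et-1 + θet-2)] = θσ²; γk = 0 for k ≥ 2, since the e-terms no longer overlap
- ACF: ρ1 = γ1/γ0 = θ/(1 + θ²), and ρk = 0 for k ≥ 2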
Derive the mean, variance, covariance, and ACF of an AR(1) model
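A standard derivation sketch for Yt = c + φYt-1 + et with |φ| < 1, {et} white noise with variance σ², and Yt covariance stationary:
- Mean: taking expectations, μ = c + φμ, so μ = c/(1 − φ)
- Variance: γ0 = φ²γ0 + σ² (Yt-1 is uncorrelated with et), so γ0 = σ²/(1 − φ²)
- Covariance: multiplying the deviation form by Yt-k and taking expectations gives γk = φγk-1 = φ^k γ0
- ACF: ρk = γk/γ0 = φ^k, a geometric decay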
Describe the ACF and PACF of MA(1) and AR(1) processes. How do you decide if different models provide a similarly good fit?
For an AR(1), the ACF decays geometrically while the PACF cuts off after lag 1; for an MA(1), the ACF cuts off after lag 1 while the PACF decays. When several models fit similarly well, we prefer the more parsimonious one. Implement diagnostic tests: either through overfitting, estimating ARMA(p,q+1) or ARMA(p+1,q) and testing the significance of the extra parameter, or through a Ljung-Box test, a Q-test on the residual autocorrelations; if rejected, the model-building process has to be repeated.
Also test goodness of fit with AIC and BIC, which measure the trade-off between fit and parsimony: the lower the value, the better (see the sketch below).
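A minimal Python sketch of this comparison, assuming statsmodels is installed; the simulated series and candidate orders are illustrative:

```python
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Simulate an AR(1) with phi = 0.6 as an illustrative series
y = ArmaProcess(ar=[1, -0.6], ma=[1]).generate_sample(nsample=500)

for order in [(1, 0, 0), (1, 0, 1)]:          # candidate ARMA(p,q) fits
    res = ARIMA(y, order=order).fit()
    # Ljung-Box Q-test on the residuals: a small p-value rejects the null of
    # no remaining autocorrelation, i.e. the model is inadequate
    lb_pvalue = acorr_ljungbox(res.resid, lags=[10])["lb_pvalue"].iloc[0]
    print(order, f"AIC={res.aic:.1f}", f"BIC={res.bic:.1f}",
          f"Ljung-Box p={lb_pvalue:.3f}")
```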
Explain the forecasting principles, and how forecasts work with ARMA models. How do you determine forecast error?
How do you evaluate which forecast to choose?
An optimal forecast minimizes the expected loss; under the standard quadratic loss function, this is the mean squared error.
Forecasting with ARMA models is based on linear projection: an AR forecast is a linear function of the p most recent values of Y, while an MA forecast (via its inverted AR form) is a linear function of the infinite past of Y.
The forecast error is the difference between the realised value and its forecast made at time t.
To evaluate which forecast to choose, split the sample into an estimation period and a holdout period, compute one-step-ahead forecasts through the end of the holdout sample, calculate each model's MSPE, and pick the lowest (see the sketch below).
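A minimal sketch of this holdout comparison in Python, assuming statsmodels; the series and candidate orders are illustrative placeholders:

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

y = ArmaProcess(ar=[1, -0.6], ma=[1]).generate_sample(nsample=200)
split = int(len(y) * 0.8)                      # estimation vs. holdout period

def mspe(order):
    """Mean squared prediction error from rolling one-step-ahead forecasts."""
    errors = []
    for t in range(split, len(y)):
        res = ARIMA(y[:t], order=order).fit()  # re-estimate on data up to t
        forecast = res.forecast(steps=1)[0]    # one-step-ahead forecast
        errors.append(y[t] - forecast)         # forecast error at t
    return np.mean(np.square(errors))

for order in [(1, 0, 0), (0, 0, 1)]:
    print(order, "MSPE:", round(mspe(order), 3))
```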
Explain the nonstationary processes
1) Random walk: Yt = Yt-1 + εt. The change in the variable between today and tomorrow is completely random
2) Random walk with drift: Yt = α0 + Yt-1 + εt. We insert a constant (i.e. a drift), so Yt consists of two nonstationary components: a deterministic trend component α0·t and a stochastic trend Σεi
3) Trend-stationary process: Yt = α0 + α1t + εt
Test for unit roots using the Dickey-Fuller test: in Yt = θYt-1 + εt, the null hypothesis of a unit root is θ = 1 (equivalently, after subtracting Yt-1 from both sides, a zero coefficient on Yt-1)
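A minimal sketch of the test in Python, assuming statsmodels; the simulated random walk is illustrative:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(42)
y = np.cumsum(rng.standard_normal(500))   # random walk: Yt = Yt-1 + et

stat, pvalue, *_ = adfuller(y)            # (augmented) Dickey-Fuller test
# A large p-value means we cannot reject the null of a unit root
print(f"ADF statistic: {stat:.2f}, p-value: {pvalue:.3f}")
```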
Outline SEM and the relevant assumptions
Allow some RHS variables to be endogenous. After writing down the structural equations, we see the system is interdependent. Completeness of a system requires that the number of endogenous variables equals the number of equations (see the example below).
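A textbook illustration (a standard supply-demand example, not from the original card): quantity q and price p are determined jointly, so both are endogenous; with income x exogenous, the complete system has two equations for the two endogenous variables:
- Demand: qt = α0 + α1pt + α2xt + ε1t
- Supply: qt = β0 + β1pt + ε2t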
Why can you not use OLS with SEM? What do you do instead?
Solving for the reduced form shows that the endogenous RHS variables are correlated with the structural error terms. This means the assumptions of the classical regression model are violated, so the OLS estimator is inconsistent.
Instead, we use an instrumental variable (IV): the exogenous variable x is uncorrelated with ε but correlated with the endogenous regressor, so we use it as an instrument for that regressor.
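In the simplest case with one endogenous regressor Y2 and one instrument x (a standard textbook sketch), the moment condition E(x·ε) = 0 yields
β̂IV = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)(Y2t − Ȳ2),
which is consistent provided x is also correlated with Y2 (relevance).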
Explain the problem of identification with SEM
A structure is unidentified if you have theories that are observationally equivalent. This is when you derive the same reduced form from different structures. Identification is NOT affected by the size of the sample.
As a result, we can construct "false" structural matrices and disturbances whose reduced-form coefficient matrix is identical to that of the true structure.
Explain how to solve the identification problem with SEM
We need non-sample information.
Additional information comes in several forms:
1) Normalisations: In each equation, one variable has a coefficient of 1. With this, there are M(M-1), not M^2 undetermined values in Γ
2) Identities: Variable definitions or equilibrium conditions imply all coefficients are known
3) Exclusions: Omission of variables from equations - places zeroes in B and Γ
4) Linear restrictions: restrictions on structural parameters may also rule out false structures
5) Restrictions on the disturbance covariance matrix: similar to restrictions on the slope parameters
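In matrix notation these restrictions refer to (a standard setup, assuming Γ collects the coefficients on the M endogenous variables and B those on the predetermined variables):
- Structural form: YΓ + XB = E
- Reduced form: Y = XΠ + V, with Π = −BΓ⁻¹
Identification asks whether (Γ, B) can be recovered uniquely from Π.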
Explain the rank and order conditions for identifying SEM
For there to be a solution, there must be at least as many equations as unknowns.
Order condition for identification of equation j:
- Kj* >= Mj, where Kj* is the number of predetermined variables excluded from equation j and Mj is the number of endogenous variables included on its right-hand side
- This condition is necessary but not sufficient. It ensures there is at least one solution, but doesn’t mean there is only one
Rank condition for identification:
- (see equation below)
- Imposes a restriction on a submatrix of the reduced-form coefficient matrix. It ensures there is exactly one solution for the structural parameters, given the reduced form
A rule-of-thumb for checking the rank and order conditions is to check whether every equation has its own predetermined variable. There are three cases:
1) Underidentified: Kj* < Mj, or the rank condition fails
2) Exactly identified: Kj* = Mj and the rank condition is met
3) Overidentified: Kj* > Mj and the rank condition is met
Explain two fundamental methods of estimating SEM
1) Single-equation methods: limited-information estimators constructed for each equation individually:
- 2SLS. The endogenous variables are regressed on the exogenous variables in the first stage; in the second stage, the system's equations are estimated by OLS using the first-stage fitted values. Consistently estimates the parameters, so it is almost universally used (see the sketch after this list)
- LIML. A least variance ratio estimator. Disadvantages: based on the normal distribution, computationally more expensive. Advantage: invariant to normalization
- Less sensitive to specification error
2) System methods: full-information estimators applied to all equations simultaneously:
- 3SLS. Residuals are used to estimate the cross-equation error-covariance matrix, from which a generalized least squares (GLS) estimator is constructed
- FIML. Assumes the errors are normally distributed
- Asymptotically more efficient when there is no specification error
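A minimal numpy sketch of the two stages of 2SLS for a single equation, using simulated data; all variable names and coefficient values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
z1, z2 = rng.standard_normal((2, n))             # excluded instruments
u = rng.standard_normal(n)
Y2 = 0.8 * z1 + 0.5 * z2 + u                     # endogenous: depends on u
y = 1.0 + 2.0 * Y2 + u + rng.standard_normal(n)  # structural equation

Z = np.column_stack([np.ones(n), z1, z2])        # instrument matrix
# First stage: regress the endogenous variable on all instruments
pi_hat, *_ = np.linalg.lstsq(Z, Y2, rcond=None)
Y2_hat = Z @ pi_hat                              # first-stage fitted values
# Second stage: OLS with fitted values replacing the endogenous regressor
# (note: second-stage OLS standard errors would still need correcting)
X_hat = np.column_stack([np.ones(n), Y2_hat])
beta_2sls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
print("2SLS estimates (const, slope):", beta_2sls)   # slope should be near 2
```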
How would you analyse stability properties of a VAR(1) model?
Xt = A0 + A1Xt-1 + et
VAR in reduced form can be written using the lag polynomial:
(I-A1L)Xt = A0 + et
So we analyse stability based on the lag polynomial (I - A1L). We get the characteristic equation of our bivariate system: det(I - A1z) = 0
Stable when |z1| > 1 and |z2| > 1, i.e. both roots lie outside the unit circle (equivalently, both eigenvalues of A1 lie inside the unit circle)
It has a VMA representation:
xt = μ + Σ A1^i · et-i (sum over i from 0 to ∞)
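A minimal numpy sketch of the stability check; the coefficient matrix A1 is an illustrative example, and the check uses the equivalence above (the roots of det(I - A1z) = 0 are the reciprocals of the nonzero eigenvalues of A1):

```python
import numpy as np

A1 = np.array([[0.5, 0.2],
               [0.1, 0.4]])                 # illustrative VAR(1) coefficients

eigvals = np.linalg.eigvals(A1)
roots = 1 / eigvals                         # roots of det(I - A1 z) = 0
                                            # (eigenvalues are nonzero here)
print("|eigenvalues|:", np.abs(eigvals))    # all < 1 -> stable
print("|roots|:      ", np.abs(roots))      # all > 1 -> stable
print("stable:", bool(np.all(np.abs(eigvals) < 1)))
```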