Chapter 3 - Basic concepts - Part B Flashcards
AR(p) is given by
y_t = φ1 y_(t−1) + φ2 y_(t−2) + ··· + φp y_(t−p) + ε_t,
where φ1, φ2, …, φp are unknown parameters, and ε_t is a standard white noise process.
MA(q) is given by
y_t = ε_t + θ1 ε_(t−1) + … + θq ε_(t−q),
where θ1, θ2, …, θq are unknown parameters.
ARMAX(p, q) is given by
y_t = α + φ1 y_(t−1) + ··· + φp y_(t−p) + ε_t + θ1 ε_(t−1) + … + θq ε_(t−q) (+ β1 x_(1,t) + β2 x_(2,t) + ··· + βk x_(k,t)),
where x_(i,t), i = 1, …, k denote (exogenous) regressors.
How to implement ARMAX models in practice?
We estimate AR(p) with
OLS
We estimate MA(q) with
nonlinear least squares or maximum likelihood.
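The OLS estimation of an AR(p) can be sketched with plain numpy (a minimal sketch on simulated data; the function name fit_ar_ols and all parameter values are made up for illustration):

```python
import numpy as np

def fit_ar_ols(y, p):
    """Estimate an AR(p) model by OLS: regress y_t on (y_{t-1}, ..., y_{t-p})."""
    T = len(y)
    # Build the lagged regressor matrix X (no intercept, as in the slides)
    X = np.column_stack([y[p - j - 1:T - j - 1] for j in range(p)])
    yy = y[p:]
    # b_OLS = (X'X)^{-1} X'y, computed via least squares
    b, *_ = np.linalg.lstsq(X, yy, rcond=None)
    return b

rng = np.random.default_rng(0)
# Simulate an AR(1) with phi1 = 0.7
T, phi1 = 5000, 0.7
eps = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + eps[t]

phi_hat = fit_ar_ols(y, 1)[0]
print(phi_hat)  # close to the true value 0.7
```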
Remember, b_OLS =
(X′X)^{-1} X′y (see formula slide 6).
Derive the unbiasedness and the variance of b_OLS, and give the assumptions that were made.
Assuming E[ε] = 0 and fixed regressors X, E[b_OLS] = β (unbiasedness), and with Ω = E[εε′],
V[b_OLS] = (X′X)^{-1} X′ΩX (X′X)^{-1}
Standard errors of b_OLS are the square roots of the diagonal elements of
V[b_OLS] = (X′X)^{-1} X′ΩX (X′X)^{-1}
If ε_t is homoskedastic (E[ε_t^2] = σ^2 for all t) and uncorrelated (E[ε_t ε_s] = 0 for all t ≠ s), then Ω = σ^2 I, such that
V[b_OLS] = σ^2 (X′X)^{-1}
If ε_t is heteroskedastic (E[ε_t^2] = σ_t^2) and uncorrelated, we have Ω = diag(σ_1^2, σ_2^2, …, σ_T^2), such that
V[b_OLS] = (X′X)^{-1} (Σ_t σ_t^2 x_t x_t′) (X′X)^{-1} (see formula slide 9).
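A minimal numpy sketch of both variance formulas, the homoskedastic one and the heteroskedasticity-robust sandwich, on simulated regression data (all names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 500, 2
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
beta = np.array([1.0, 0.5])
y = X @ beta + rng.standard_normal(T)  # homoskedastic errors, sigma^2 = 1

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
resid = y - X @ b_ols
sigma2_hat = resid @ resid / (T - k)

# Homoskedastic case: V[b_OLS] = sigma^2 (X'X)^{-1}
se_homo = np.sqrt(np.diag(sigma2_hat * XtX_inv))

# Heteroskedasticity-robust sandwich (X'X)^{-1} X' Omega X (X'X)^{-1},
# with Omega replaced by diag(resid_1^2, ..., resid_T^2) (White's estimator)
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(se_homo, se_robust)  # similar here, since the errors are homoskedastic
```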
What is a difference between the regressors of the classical linear regression model and an AR(p) model?
The regressors of AR(p) are stochastic, as opposed to fixed.
What does it imply when regressors are stochastic?
It means that exact finite-sample results which hold in the ‘classical’ linear regression model yt = x′tβ + εt with fixed xt’s do not hold in the time series context.
⇒ Asymptotic results continue to hold, however.
For example, the OLS estimator of φ1 in the AR(1) model yt = φ1yt−1 + εt, is not unbiased but remains consistent.
Suppose a series y_t is generated from the AR(1) model y_t = φ1 y_(t−1) + ε_t.
What is the distribution of the OLS estimator of φ1?
See derivation:
√T (^φ1 − φ1) ∼ N(0, σ^2 γ0^{-1}) (ASYMPTOTIC relation!!)
For the AR(1), γ0 = σ^2/(1 − φ1^2), so the asymptotic variance equals 1 − φ1^2.
Asymptotic approximation.
The exact small-sample distribution can be obtained by means of:
a. Monte Carlo simulation
b. see slide 15.
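A Monte Carlo sketch of the small-sample distribution of the OLS estimator of φ1 (simulated data, illustrative parameter values). In short samples the estimate tends to come out below the true φ1, a well-known small-sample bias:

```python
import numpy as np

rng = np.random.default_rng(2)
T, phi1, R = 50, 0.9, 2000  # small sample on purpose; R Monte Carlo replications

phi_hats = np.empty(R)
for r in range(R):
    eps = rng.standard_normal(T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi1 * y[t - 1] + eps[t]
    # OLS estimate of phi1: regress y_t on y_{t-1}
    phi_hats[r] = (y[:-1] @ y[1:]) / (y[:-1] @ y[:-1])

bias = phi_hats.mean() - phi1
print(bias)  # negative: the OLS estimator is biased downward in small samples
```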
Two ways of model selection
- model selection criteria based on in-sample fit
- out-of-sample forecasting
Give two model selection criteria
AIC : AIC(k) = T log σ̂^2 + 2k
SIC : SIC(k) = T log σ̂^2 + k log T
=> Select ARMA orders p and q that minimize AIC(k) or SIC(k)
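A sketch of order selection with the AIC/SIC formulas above, using AR(p) models fitted by OLS on simulated data (numpy only; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
eps = rng.standard_normal(T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + eps[t]  # true order is p = 2

def ar_sigma2(y, p, pmax):
    """Residual variance of an AR(p) fitted by OLS on a common effective sample."""
    n = len(y)
    yy = y[pmax:]
    if p == 0:
        return yy @ yy / len(yy)
    X = np.column_stack([y[pmax - j:n - j] for j in range(1, p + 1)])
    b, *_ = np.linalg.lstsq(X, yy, rcond=None)
    e = yy - X @ b
    return e @ e / len(yy)

pmax = 6
Teff = T - pmax
aic = [Teff * np.log(ar_sigma2(y, p, pmax)) + 2 * p for p in range(pmax + 1)]
sic = [Teff * np.log(ar_sigma2(y, p, pmax)) + p * np.log(Teff) for p in range(pmax + 1)]
p_aic, p_sic = int(np.argmin(aic)), int(np.argmin(sic))
print(p_aic, p_sic)  # both typically recover the true order p = 2
```

Note that SIC penalizes extra parameters more heavily (log T > 2 for T > 7), so it tends to select more parsimonious models than AIC.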
Misspecification tests and diagnostic measures
Many tests aim to test whether the residuals of the ARMA model satisfy the white noise properties E[ε_t^2] = σ^2 and E[ε_t ε_s] = 0 for t ≠ s.
Misspecification tests : test of no residual autocorrelation
see slide 18 and 19
Misspecification tests : test for homoskedasticity
often based on autocorrelations of squared residuals.
⇒ If rejected, standard errors of parameters should be adjusted or heteroskedasticity should be modelled explicitly.
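One common version is a McLeod-Li-type test: a Ljung-Box statistic computed on the squared residuals. A minimal numpy sketch (the function name and all values are illustrative):

```python
import numpy as np

def mcleod_li(resid, m):
    """Ljung-Box statistic on squared residuals (McLeod-Li-type test).
    Under the null of homoskedastic white noise it is approximately chi^2(m)."""
    e2 = resid ** 2 - np.mean(resid ** 2)
    T = len(e2)
    gamma0 = e2 @ e2 / T
    stat = 0.0
    for j in range(1, m + 1):
        # j-th sample autocorrelation of the squared residuals
        rho_j = (e2[:-j] @ e2[j:] / T) / gamma0
        stat += rho_j ** 2 / (T - j)
    return T * (T + 2) * stat

rng = np.random.default_rng(4)
resid = rng.standard_normal(2000)  # homoskedastic residuals: expect no rejection
lb = mcleod_li(resid, 5)
print(lb)  # compare to the chi^2(5) 5% critical value, about 11.07
```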
Misspecification tests : test of normality
see slide 20
3 different types of forecasts
- A point forecast of y_T+h
- An interval forecast : (^L_T+h|T,^U_T+h|T)
- A density forecast : f(yT+h|YT)
optimal h-step ahead point forecast depends on a …
loss function
We should use the point forecast ^y_(T+h|T) that minimizes the expected value of the loss function.
The form of the loss function depends on the variable that we are forecasting.
In many cases, the relevant loss function is difficult to specify. Thus, we use the forecast error A. Most often, we assume that the forecast user has a B, that is C. Minimising C, or the D, we find that the optimal point forecast is the E, that is F.
a. e_(T+h|T) = y_(T+h) − ^y_(T+h|T)
b. squared loss function
c. Loss_(T+h|T) = e_(T+h|T)^2
d. mean squared prediction error [MSPE]
e. conditional mean of y_(T+h)
f. ^y_(T+h|T) = E[y_(T+h) | Y_T]
Derive the optimal point forecast of y_T+1 for an AR(1) model,
Given :
- E[εt|Yt−1] = 0
- E[εt^2 |Yt−1] = σ^2
see slide 25.
^y_(T+1|T) = E[y_(T+1) | Y_T]
= E[φ1 y_T + ε_(T+1) | Y_T]
= φ1 y_T
Derive the relationship between e_T+1|T and ε_T+1 in the one-step ahead point forecast in the AR(1) model.
What conclusions can you draw?
- e_(T+1|T) = y_(T+1) − ^y_(T+1|T) = y_(T+1) − φ1 y_T = ε_(T+1)
- Hence, the variance of the forecast error V[e_(T+1|T)] is equal to σ^2, which is the variance of ε_t and also the conditional variance V[y_(T+1) | Y_T].
What does the two-step-ahead point forecast of an AR(1) model look like?
And three steps ahead?
^y_(T+2|T) = E[φ1 y_(T+1) + ε_(T+2) | Y_T] = φ1 ^y_(T+1|T) = φ1^2 y_T, and likewise ^y_(T+3|T) = φ1^3 y_T (see slides 27-28).
Generalise the point forecasts of the AR(1) model for h-steps ahead.
^y_(T+h|T) = φ1^h y_T (see slide 29).
Consider the AR(1) model with intercept and with E[ε_t|Y_t−1] = 0 and E[ε_t^2 |Y_t−1] = σ^2.
What are the optimal point forecast of y_T+1, y_T+2, and for 3 steps ahead?
What is the general h-steps ahead point forecast?
what happens when h converges to infinity (assuming |φ1| < 1) ?
^y_(T+1|T) = α + φ1 y_T, ^y_(T+2|T) = α(1 + φ1) + φ1^2 y_T, ^y_(T+3|T) = α(1 + φ1 + φ1^2) + φ1^3 y_T.
In general, ^y_(T+h|T) = α(1 + φ1 + ··· + φ1^(h−1)) + φ1^h y_T.
As h → ∞ (assuming |φ1| < 1), ^y_(T+h|T) → α/(1 − φ1), the unconditional mean (see slide 30).
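The recursion ^y_(T+h|T) = α + φ1 ^y_(T+h−1|T) makes the convergence easy to verify numerically (a small sketch with hypothetical parameter values):

```python
alpha, phi1, y_T = 2.0, 0.8, 5.0   # hypothetical parameter values
mu = alpha / (1 - phi1)            # unconditional mean: 2 / 0.2 = 10

# Recursive point forecasts: y_hat(T+h|T) = alpha + phi1 * y_hat(T+h-1|T)
forecasts = []
f = y_T
for h in range(1, 31):
    f = alpha + phi1 * f
    forecasts.append(f)

print(forecasts[0])   # alpha + phi1 * y_T = 6.0
print(forecasts[-1])  # close to the unconditional mean mu = 10
```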
The convergence of point forecasts (with intercept) when the forecast horizon increases is even easier to see by rewriting the AR(1) model as A.
Using this representation, derive the conclusion that as h increases, the point forecast of the AR(1) model with intercept converges to the B.
a. yt −μ = φ1(yt−1 −μ)+εt
b. unconditional mean.
see slide 32 for the derivation.
What is the effect of estimation uncertainty?
In practice, the true value(s) of the model parameter(s) are unknown. Instead, we have to use estimated parameters.
^y_(T+1|T) = ^φ1 y_T.
Hence, e_(T+1|T) = y_(T+1) − ^y_(T+1|T)
= φ1 y_T + ε_(T+1) − ^φ1 y_T
= ε_(T+1) + (φ1 − ^φ1) y_T.
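A Monte Carlo sketch of this effect (simulated data, illustrative parameter values): with an estimated ^φ1, the variance of the one-step-ahead forecast error typically comes out above σ^2:

```python
import numpy as np

rng = np.random.default_rng(5)
T, phi1, R = 20, 0.6, 5000  # short sample, so estimation error matters

errors = np.empty(R)
for r in range(R):
    eps = rng.standard_normal(T + 1)
    y = np.zeros(T + 1)
    for t in range(1, T + 1):
        y[t] = phi1 * y[t - 1] + eps[t]
    ys = y[:T]  # estimation sample
    phi_hat = (ys[:-1] @ ys[1:]) / (ys[:-1] @ ys[:-1])
    # One-step-ahead forecast error with the ESTIMATED parameter:
    # e = eps_{T+1} + (phi1 - phi_hat) * y_T
    errors[r] = y[T] - phi_hat * y[T - 1]

print(errors.var())  # typically a bit above sigma^2 = 1
```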
What is the effect of estimation uncertainty in point forecasts?
It mostly affects the variance of the forecast error V[e_(T+1|T)].
see slide 35.
What are the effects of model misspecification in point forecasts ?
see slide 37.
How to evaluate point forecasts?
Two ways: absolute or relative evaluation.
Absolute evaluation: what is the quality of the forecasts from one specific model?
Relative evaluation: what is the quality of the forecasts from multiple competing models, relative to each other?
What are the 3 desirable properties of the point forecasts ?
- Unbiasedness: forecast errors have zero mean, E[e_(t+1|t)] = 0. => Straightforward to examine by testing whether the mean of the forecast errors differs significantly from 0.
- Accuracy: the MSPE should be as small as possible.
Recall that the point forecast ^y_(t+1|t) (usually) is taken to be the one which minimizes
E[e_(t+1|t)^2 ] = E[(y_(t+1) − ^y_(t+1|t) )^2]
Note that the MSPE can be decomposed as
MSPE = V[e_(t+1|t)] + (E[e_(t+1|t)])^2 = variance + squared bias
- Efficiency / optimality: it should not be possible to forecast the forecast error itself with any information available at time t. see slide 44.
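A minimal numpy sketch of the unbiasedness check and the MSPE decomposition above (the forecast errors are simulated here, so everything is illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical one-step-ahead forecast errors (simulated; truly mean zero)
e = rng.standard_normal(200)

# Unbiasedness: t-test of H0: E[e] = 0
t_stat = e.mean() / (e.std(ddof=1) / np.sqrt(len(e)))

# Accuracy: sample MSPE, and its decomposition into variance + squared bias
mspe = np.mean(e ** 2)
decomp = e.var() + e.mean() ** 2  # identical to mspe by construction
print(t_stat)  # compare to +/- 1.96 at the 5% level
```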
Evaluating point forecasts
see slide 42-43
Comparing predictive accuracy
slide 45
How to construct density forecasts?
slide 47
How to construct interval forecasts?
slide 48