9 Flashcards
How to fix violations of the no-autocorrelation / serial-correlation assumption, which stipulates that residuals do not correlate with their own lags?
Detect it with the Durbin-Watson test and/or the Breusch-Godfrey test. As discussed, it can be fixed by:
1. Fixing functional misspecifications (including making sure the data is stationary) or adding more variables / observations
2. Using an alternative regression form
3. Using a robust standard error:
A - HAC (robust to both heteroskedasticity and autocorrelation)
B - HC (robust to heteroskedasticity only)
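As a minimal sketch (pure NumPy, with made-up residuals), the Durbin-Watson statistic from the test above can be computed directly; values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared changes in the residuals
    divided by the sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Smoothly drifting (positively autocorrelated) residuals -> statistic near 0
e_pos = np.array([1.0, 1.1, 1.2, 1.1, 1.0, 0.9, 0.8, 0.9, 1.0, 1.1])
print(durbin_watson(e_pos))

# Alternating (negatively autocorrelated) residuals -> statistic near 4
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 12 / 4 = 3.0
```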
What is pseudo-causality?
Pseudo-causality – if a change in one variable at time t is preceded by a change in another variable at time t−k, where k < t. Pseudo-causality provides some evidence for a possible causal relationship.
What is Granger causality?
Granger causality is a circumstance in which one time series variable consistently and predictably changes before another variable.
What is important about Granger causality?
To run Granger causality you need to do several things, but the general idea behind it is that you build two regressions with many lags: in one, Y is the dependent variable and X the independent one, and in the other X is the dependent variable and Y the independent one.
To use the standard Granger causality test, the X and Y variables must be stationary. If they are not, take an appropriate number of differences until they become stationary.
What are the steps to run Granger causality?
- Check the stationarity of the variables you will be using.
- Make them stationary if they are not.
- Using the two variables, build several regressions with different numbers of lags (i.e. each regression contains only the dependent and the independent variable, with the lags taken at different orders).
- Identify which number of lags produces the best regression using AIC (Akaike information criterion), BIC (Schwarz criterion), or similar.
- Create two regressions with the selected optimal number of lags: in one, Y is the dependent variable, while in the other it is X.
- Figure out whether X explains Y better or vice versa to identify what Granger-causes what. If you get results where everything causes everything, this may imply a cyclical Granger-causal relationship between the variables (i.e. there is a feedback loop).
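The steps above can be sketched as a minimal F-test in pure NumPy. The data are simulated (x drives y with one lag), and a full test would compare the statistic against an F critical value rather than a rough threshold:

```python
import numpy as np

def granger_f(y, x, lags):
    """F-statistic for H0: lags of x do not help predict y, given lags of y
    (a minimal Granger-causality test; restricted vs unrestricted OLS)."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    T = len(y)
    Y = y[lags:]  # target aligned on t = lags .. T-1
    y_lags = np.column_stack([y[lags - i:T - i] for i in range(1, lags + 1)])
    x_lags = np.column_stack([x[lags - i:T - i] for i in range(1, lags + 1)])
    const = np.ones((T - lags, 1))
    Xr = np.hstack([const, y_lags])            # restricted: own lags only
    Xu = np.hstack([const, y_lags, x_lags])    # unrestricted: adds lags of x
    ssr = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    ssr_r, ssr_u = ssr(Xr), ssr(Xu)
    df_u = (T - lags) - Xu.shape[1]
    return ((ssr_r - ssr_u) / lags) / (ssr_u / df_u)

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.zeros(200)
for t in range(1, 200):                       # y is driven by lagged x,
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()  # so x -> y

print(granger_f(y, x, lags=1))  # large: x Granger-causes y
print(granger_f(x, y, lags=1))  # small: y does not Granger-cause x
```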
Cross-sectional forecasting steps
- Specify and estimate an equation that has as its dependent variable the item that we wish to forecast.
- Obtain values for each of the independent variables for the observations for which we want a forecast and substitute them into our forecasting equation.
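The two steps can be sketched with ordinary least squares in NumPy; the variables and numbers below are hypothetical illustrations, not from the text:

```python
import numpy as np

# Hypothetical cross-section: forecast a store's weekly sales from its
# size and local income (sales built to follow an exact linear relation).
size   = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
income = np.array([3.0, 2.0, 4.0, 5.0, 6.0])
sales  = 2.0 + 1.5 * size + 0.5 * income

# Step 1: specify and estimate sales = b0 + b1*size + b2*income
X = np.column_stack([np.ones_like(size), size, income])
beta = np.linalg.lstsq(X, sales, rcond=None)[0]

# Step 2: substitute the new observation's regressors into the equation
forecast = np.array([1.0, 6.0, 7.0]) @ beta    # size = 6, income = 7
print(float(forecast))                          # 2 + 1.5*6 + 0.5*7 = 14.5
```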
Forecasting accuracy methods
- Mean absolute percentage error (MAPE) β the simplest method that takes the mean of the percentage errors in absolute values.
- Root mean square error criterion (RMSE) β an alternative to MAPE that is calculated by squaring the forecasting error for each time period, averaging these squared amounts, and then taking the square root of this average. One advantage of the RMSE is that it penalizes large errors.
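Both criteria are easy to compute directly; a sketch with hypothetical actual and forecast values:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs((a - f) / a))

def rmse(actual, forecast):
    """Root mean square error; squaring penalizes large errors."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return np.sqrt(np.mean((a - f) ** 2))

actual   = np.array([100.0, 200.0, 400.0])
forecast = np.array([110.0, 180.0, 400.0])
print(mape(actual, forecast))   # (10% + 10% + 0%) / 3
print(rmse(actual, forecast))   # sqrt((100 + 400 + 0) / 3)
```

Note how the single 20-unit error dominates the RMSE, while MAPE weights each period's percentage error equally.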
Some things to consider when forecasting?
- The future values of the independent variables are unknown; it is unrealistic to expect to know them
- Point estimation is not very accurate
- Feedback loops
What is ARIMA?
AutoRegressive Integrated Moving Average and it is a forecasting technique that uses lags of the dependent variable for prediction, completely ignoring the independent variables.
Why and when to use ARIMA
It is a valuable tool when:
1. Little or nothing is known about the dependent variable being forecasted
2. Not clear how different independent variables affect the dependent variable
3. All that is needed is a relatively short-term forecast
What are elements of ARIMA?
Autoregressive (AR) – expresses the dependent variable Y_t as a function of its own past values.
Integrated (I) – refers to the differencing that has to be applied to make the data stationary.
Moving average (MA) – expresses the dependent variable Y_t as a function of past values of the error term.
What are different ARIMA equations?
- First-order autoregressive model (a.k.a. ARIMA(1,0,0)) - this model assumes stationarity and that the value of Y can be predicted from its lagged value
- Random walk (a.k.a. ARIMA(0,1,0)) - if the series Y is not stationary, the simplest possible model for it is a random walk model
- Differenced first-order autoregressive model (a.k.a. ARIMA(1,1,0)) β if the errors of a random walk model are autocorrelated, perhaps the problem can be fixed by adding one lag of the dependent variable to the prediction equation
- Simple exponential smoothing model with growth (a.k.a. ARIMA(0,1,1)) β Another strategy for correcting autocorrelated errors in a random walk model is suggested by the simple exponential smoothing model
- Differenced first-order autoregressive model with exponential smoothing (a.k.a. ARIMA(1,1,1)) – this model assumes there is an autocorrelation process, that the data becomes stationary after a single difference, and that the model needs smoothing with error terms
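As a concrete instance, the differenced first-order autoregressive model forecasts the next value as the last value plus phi times the last observed change (phi and the series values below are hypothetical illustration numbers):

```python
# ARIMA(1,1,0) one-step forecast: y_hat_t = y_{t-1} + phi * (y_{t-1} - y_{t-2}).
# With phi = 0 this collapses to the random walk, ARIMA(0,1,0).
def arima_110_forecast(y_prev2, y_prev1, phi):
    return y_prev1 + phi * (y_prev1 - y_prev2)

print(arima_110_forecast(100.0, 104.0, 0.5))  # 104 + 0.5 * (104 - 100) = 106.0
print(arima_110_forecast(100.0, 104.0, 0.0))  # random walk: repeats 104.0
```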
How to decide on the starting value of AR (ARIMA)?
The lag at which the PACF (partial autocorrelation) plot cuts off, i.e. its last significant candle
How to decide on I (ARIMA)?
- Run the Augmented Dickey-Fuller and KPSS tests on your variable
- Take the first difference (of the raw series)
- Run the tests again; if the variable is now stationary, use I(1); if not, repeat the process until it becomes stationary. If d is the number of times you took the difference, then in the end you use I(d).
If the variable is stationary in raw terms, then we use I(0) and instead of ARIMA we use ARMA.
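The ADF and KPSS tests themselves need a statistics library, but the differencing step is simple to illustrate: a series with a linear trend (hypothetical here) is non-stationary in levels, while its first difference is constant, so d = 1 and I(1) applies:

```python
import numpy as np

y = 2.0 * np.arange(10) + 5.0    # trending series: 5, 7, 9, ...
d1 = np.diff(y)                  # first difference
print(d1)                        # constant at 2.0 -> stationary after one difference
print(np.var(y) > np.var(d1))    # the trend's variance is gone
```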
How to decide on the starting value of MA (ARIMA)?
Use the ACF (autocorrelation) plot and count its significant candles, i.e. the lag at which it cuts off