Time_series Flashcards

Question

Answer 1

Mean Absolute Error (MAE): the average of the absolute differences between the predicted values and the observed values. Root Mean Square Error (RMSE): the square root of the average of the squared differences between the predicted values and the observed values. MAE is easier to understand and interpret, but RMSE works well in situations where large errors are especially undesirable, because the errors are squared before they are averaged, which penalizes large errors more heavily.
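As a minimal sketch of the two definitions, the following uses only the Python standard library. The function names `mae` and `rmse` and the sample numbers are illustrative assumptions, not part of the original card.

```python
import math

def mae(observed, predicted):
    """Mean Absolute Error: average of |predicted - observed|."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root Mean Square Error: square root of the average squared difference."""
    return math.sqrt(
        sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

# Hypothetical data: the errors are 1, 0, -2 and 4.
observed = [3, 5, 2, 7]
predicted = [2, 5, 4, 3]

print(mae(observed, predicted))   # 1.75
print(rmse(observed, predicted))  # about 2.29, pulled up by the single error of 4
```

The single large error raises RMSE well above MAE, which is the penalizing effect described above.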

Answer 2

The primary difference between an AR and an MA model lies in the correlation between values of the time series at different time points. For an MA model, the correlation between x(t) and x(t-n) is always zero for n > order of the MA model. This follows directly from the fact that the covariance between x(t) and x(t-n) is zero for MA models (something we infer from the example in the previous section). In an AR model, however, the correlation between x(t) and x(t-n) declines gradually as n becomes larger. This difference can be exploited to tell whether a series follows an AR or an MA model, and the correlation plot can give us the order of the MA model.

Answer 3

Holt-Winters Exponential Smoothing (HWES), also called the Triple Exponential Smoothing method, models the next time step as an exponentially weighted linear function of observations at prior time steps, taking trend and seasonality into account. The method is suitable for univariate time series with trend and/or seasonal components.
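A rough sketch of the additive form of triple exponential smoothing, in plain Python, may make the level, trend and seasonal updates concrete. The function name, the parameter names (`alpha`, `beta`, `gamma`, `season_len`) and the simple initialization from the first two seasons are assumptions made for illustration; library implementations typically initialize and optimize these differently.

```python
def holt_winters_additive(x, season_len, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    m = season_len
    if len(x) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Assumed initialization: level = mean of the first season, trend = average
    # per-step change between the first two seasons, seasonals = deviations
    # of the first season from the initial level.
    level = sum(x[:m]) / m
    trend = (sum(x[m:2 * m]) - sum(x[:m])) / (m * m)
    seasonals = [x[i] - level for i in range(m)]

    for t in range(m, len(x)):
        s = seasonals[t % m]
        prev_level = level
        # Level: smooth the deseasonalized observation.
        level = alpha * (x[t] - s) + (1 - alpha) * (level + trend)
        # Trend: smooth the change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal: smooth the deviation of the observation from the level.
        seasonals[t % m] = gamma * (x[t] - level) + (1 - gamma) * s

    n = len(x)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]

# Hypothetical quarterly series with a repeating pattern and no trend.
series = [10, 20, 30, 40] * 5
print(holt_winters_additive(series, season_len=4, alpha=0.5, beta=0.3,
                            gamma=0.2, horizon=4))
```

On this purely seasonal toy series the forecast simply repeats the seasonal pattern; with a trend present, the `h * trend` term extends the level forward.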

Answer 4

BIC vs. AIC: The concept of model complexity can be used to create measures that aid in model selection. A few measures explicitly deal with the trade-off between goodness of fit and model simplicity, for instance the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Both penalize the number of model parameters but reward goodness of fit on the training set, hence the best model is the one with the lowest AIC/BIC. BIC penalizes model complexity more strongly and hence favors models which are “more wrong” but simpler. While this allows model selection without a validation set, it can be strictly applied only to models which are linear in their parameters, even though it typically also works in more general cases, e.g. for generalized linear models such as logistic regression.

Answer 5

Yes, hence the SARIMA model.

Answer 6

ACF is a plot of the total correlation between a series and its lagged values. For instance, in the GDP problem, the GDP at time point t is x(t). We are interested in the correlation of x(t) with x(t-1), x(t-2) and so on.

Answer 7

A time series is said to be stationary if its statistical properties do not change over time. In other words, it has constant mean and variance, and the covariance between x(t) and x(t-n) depends only on the lag n, not on the time t. Ideally, we want a stationary time series for modelling. Of course, not all series are stationary, but we can apply different transformations to make them stationary.

Answer 8

Moving average, exponential smoothing, ARIMA.

Answer 9

Using the Total Correlation Chart (also known as the Auto-correlation Function, ACF). ACF is a plot of the total correlation between a series and its lagged values. In a moving average series of lag n, we will not get any correlation between x(t) and x(t-n-1). Hence, the total correlation chart cuts off at the nth lag, so it becomes simple to find the lag for an MA series. For an AR series this correlation will go down gradually without any cut-off value.
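To illustrate the cut-off versus gradual decay, here is a small standard-library sketch that simulates an MA(2) and an AR(1) series and computes their sample autocorrelations. The helper names, the coefficients (0.6, 0.4, 0.8), the series length and the seed are all assumptions chosen for the illustration.

```python
import random

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def simulate_ma2(n, theta1, theta2, seed=0):
    """x(t) = e(t) + theta1*e(t-1) + theta2*e(t-2), with Gaussian noise e."""
    rng = random.Random(seed)
    e = [rng.gauss(0, 1) for _ in range(n + 2)]
    return [e[t] + theta1 * e[t - 1] + theta2 * e[t - 2] for t in range(2, n + 2)]

def simulate_ar1(n, phi, seed=0):
    """x(t) = phi*x(t-1) + e(t), with Gaussian noise e."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x

ma_acf = acf(simulate_ma2(5000, 0.6, 0.4), 6)
ar_acf = acf(simulate_ar1(5000, 0.8), 6)

# For the MA(2) series the values beyond lag 2 should be close to zero;
# for the AR(1) series they should shrink gradually from lag to lag.
print([round(r, 2) for r in ma_acf])
print([round(r, 2) for r in ar_acf])
```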

Answer 10

An important reason is that autoregressive forecasting models are essentially linear regression models that use the lag(s) of the series itself as predictors. We know that linear regression works best if the predictors (X variables) are not correlated with each other. Stationarizing the series addresses this problem, since it removes persistent autocorrelation, thereby making the predictors (lags of the series) in the forecasting model nearly independent.

Answer 11

Clearly, the graph above has a cut-off on the ACF curve after the 2nd lag, which means this is most likely an MA(2) process.

Answer 12

Yes, giving double or triple exponential smoothing. Double exponential smoothing is used when there is a trend in the time series.

Answer 13

Detrending: Here, we simply remove the trend component from the time series. For instance, suppose the equation of my time series is: x(t) = (mean + trend * t) + error. We’ll simply remove the part in the parentheses and build a model for the rest.
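One possible way to carry out this detrending step is to estimate the part in parentheses with an ordinary least-squares line and subtract it. The sketch below does that in plain Python; the function name `detrend` and the sample series are assumptions for illustration, and a linear trend is assumed as in the equation above.

```python
def detrend(x):
    """Fit x(t) ~ intercept + slope * t by least squares and return
    (intercept, slope, residuals), where residuals is the detrended series."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    cov = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = x_mean - slope * t_mean
    residuals = [v - (intercept + slope * t) for t, v in enumerate(x)]
    return intercept, slope, residuals

# Hypothetical series: mean 2.0, trend 0.5 per step, plus a small wiggle.
wiggle = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.1, -0.1, 0.05]
series = [2.0 + 0.5 * t + w for t, w in enumerate(wiggle)]

intercept, slope, residuals = detrend(series)
print(round(intercept, 2), round(slope, 2))
print([round(r, 2) for r in residuals])  # what remains to be modelled
```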

Answer 14

Yes, because time series can usually be predicted with linear regression. Of course, linear models are easily explainable but not very accurate; for better accuracy, go with XGBoost or random forest.

Answer 15

The Akaike Information Criterion (AIC) lets you test how well your model fits the data set without over-fitting it. The AIC score rewards models that achieve a high goodness-of-fit score and penalizes them if they become overly complex. By itself, the AIC score is not of much use unless it is compared with the AIC score of a competing model. The model with the lower AIC score is expected to strike a superior balance between its ability to fit the data set and its ability to avoid over-fitting the data set. AIC = 2k - 2 ln(L), where k is the number of free parameters and L is the maximum value of the likelihood function of the model (a measure of the goodness of fit of the model).
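The formula can be checked with a few lines of Python. The sketch below also includes the standard BIC definition, k ln(n) - 2 ln(L), which is not stated on this card; the two candidate models and their log-likelihood values are hypothetical numbers chosen only to show the comparison.

```python
import math

def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L), with log_likelihood = ln(L)."""
    return 2 * k - 2 * log_likelihood

def bic(k, n, log_likelihood):
    """BIC = k ln(n) - 2 ln(L), where n is the number of observations."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical candidates fitted to the same 100 observations.
simple_model = {"k": 3, "log_likelihood": -120.0}
complex_model = {"k": 6, "log_likelihood": -118.5}

print(aic(**simple_model))   # 246.0
print(aic(**complex_model))  # 249.0
print(bic(n=100, **simple_model), bic(n=100, **complex_model))
```

Here the more complex model fits slightly better, but not by enough to offset its three extra parameters, so the simpler model has the lower AIC. The BIC gap is wider because its per-parameter penalty ln(n) exceeds 2 once n is larger than about 7.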

Answer 16

You may have noticed Dickey-Fuller in the title of the plot above. This is the statistical test that we run to determine whether a time series is stationary or not. Another very simple technique is to compute and plot a rolling mean.
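A rolling mean is straightforward to compute by hand. The following sketch, with a hypothetical `rolling_mean` helper and made-up data, shows the idea: if the rolling mean drifts steadily, the series is unlikely to have a constant mean. It is an informal visual check only and does not replace the Dickey-Fuller test.

```python
def rolling_mean(x, window):
    """Mean of each run of `window` consecutive values of x."""
    return [
        sum(x[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(x))
    ]

# Hypothetical examples: one series with a clear upward trend, one without.
trending = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11]
flat = [5, 6, 4, 5, 6, 4, 5, 6, 4, 5]

print(rolling_mean(trending, 4))  # rises steadily: mean is not constant
print(rolling_mean(flat, 4))      # stays close to 5
```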

Answer 17

The reason I took up this section first is that unless your time series is stationary, you cannot build a time series model. In cases where the stationarity criteria are violated, the first requisite is to stationarize the time series and then try stochastic models to predict it. There are multiple ways of bringing about this stationarity. Some of them are Detrending, Differencing e
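Differencing, one of the techniques named above, replaces each value with its change from an earlier value. A minimal sketch follows; the `difference` helper, its `lag` parameter and the sample data are assumptions for illustration.

```python
def difference(x, lag=1):
    """Return x(t) - x(t - lag) for every t where both values exist."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# Hypothetical series with a quadratic trend: differencing twice flattens it.
series = [1, 4, 9, 16, 25]
once = difference(series)
twice = difference(once)
print(once)   # [3, 5, 7, 9]
print(twice)  # [2, 2, 2]

# A lag equal to the season length removes a repeating seasonal pattern.
seasonal = [10, 20, 30, 40, 11, 21, 31, 41]
print(difference(seasonal, lag=4))  # [1, 1, 1, 1]
```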


(42 cards)