Time_series Flashcards

1
Q

Briefly describe the timeline for a business time series project.

A
2
Q

What are possible pitfalls when using common metrics to evaluate time-series data?

A

One pitfall is non-stationarity combined with strong autocorrelation. It is very easy to produce a forecast that looks close to the true series simply by shifting the observed values by one step. Predicting the difference between consecutive values is much harder than predicting the values themselves.

A forecast that just repeats the last value is usually called a persistence model. A simple autocorrelation plot can highlight this.
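
To make the pitfall concrete, here is a minimal Python sketch that scores a persistence forecast and compares a candidate model against it. The toy series, the `model_forecast` values, and the helper names are hypothetical and only serve as an illustration.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def persistence_forecast(series):
    """Predict each value with the previous observation (a one-step shift)."""
    return series[:-1]


# Hypothetical, strongly autocorrelated toy series.
series = [100, 102, 101, 105, 107, 106, 110, 113, 112, 115]
actual = series[1:]
naive = persistence_forecast(series)

# The shifted series looks accurate relative to the level of the data...
naive_mae = mae(actual, naive)
relative_error = naive_mae / (sum(actual) / len(actual))

# ...so a candidate model should be judged against that baseline.
# Hypothetical model output for the same time steps:
model_forecast = [101, 102, 103, 106, 107, 108, 111, 113, 114]
skill_ratio = mae(actual, model_forecast) / naive_mae  # below 1 beats persistence

print(f"persistence MAE: {naive_mae:.2f} ({relative_error:.1%} of the mean level)")
print(f"model MAE / persistence MAE: {skill_ratio:.2f}")
```

A ratio close to or above 1 would suggest the model has learned little beyond copying the last value.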

3
Q

What is the ARIMA formula? Explain it.

A
4
Q

What is BIC?

A

It is very similar to AIC.

The Bayesian Information Criterion (BIC) is defined as

BIC = k log(n) - 2 log(L(θ̂))

Here log is the natural logarithm; n is the sample size, that is, the number of observations or data points you are working with; k is the number of parameters your model estimates; θ is the set of all parameters; and L(θ̂) is the likelihood evaluated at the fitted parameter values θ̂.
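
As a small sketch of how the formula is used, the Python snippet below computes BIC for two hypothetical fitted models. The log-likelihood values and parameter counts are made up for illustration and do not come from a real fit.

```python
import math


def bic(log_likelihood, k, n):
    """BIC = k * ln(n) - 2 * ln(L), where log_likelihood is ln(L) at the fit."""
    return k * math.log(n) - 2.0 * log_likelihood


# Hypothetical fits on the same n = 100 observations.
bic_simple = bic(log_likelihood=-120.0, k=2, n=100)
bic_complex = bic(log_likelihood=-118.5, k=5, n=100)

# The model with the lower BIC is preferred.
best = "simple" if bic_simple < bic_complex else "complex"
print(f"simple: {bic_simple:.2f}, complex: {bic_complex:.2f}, preferred: {best}")
```

Here the complex model fits slightly better, but not by enough to pay for its three extra parameters.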

5
Q

What is the exponential smoothing model?

A

Exponential smoothing uses similar logic to a moving average, but this time a different, decreasing weight is assigned to each observation.

y_t = α x_t + (1 - α) y_{t-1}

Here, α (alpha) is a smoothing factor that takes values between 0 and 1. It determines how fast the weights decrease for older observations.
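
The recursion above can be sketched in a few lines of Python. Starting the smoothed series at the first observation is an assumption made here; other initialisations are possible.

```python
def exponential_smoothing(x, alpha):
    """Apply y_t = alpha * x_t + (1 - alpha) * y_{t-1} to a sequence x."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    smoothed = [x[0]]  # assumption: start from the first observation
    for value in x[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


observations = [10.0, 12.0, 11.0, 15.0]
smooth = exponential_smoothing(observations, alpha=0.5)
print(smooth)  # [10.0, 11.0, 11.0, 13.0]
```

A larger alpha makes the smoothed series follow recent observations more closely; a smaller alpha gives a smoother, slower-reacting curve.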

6
Q

Can we use regression instead of time series analysis for time series data?

A

Yes. Regression models can even perform better in some cases, for example in sales forecasting, because they let us take into account the many exogenous factors that have an impact on sales.

7
Q

What cross-validation technique would you use on a time series data set?

A

Instead of using k-fold cross-validation, you should be aware that a time series is not randomly distributed data: it is inherently ordered chronologically.

For time series data, you should use techniques like forward chaining, where you fit the model on past data and then evaluate it on forward-facing data.

fold 1: training[1], test[2]

fold 2: training[1 2], test[3]

fold 3: training[1 2 3], test[4]

fold 4: training[1 2 3 4], test[5]
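
The fold layout above can be generated with a short Python sketch. The function name `forward_chaining_splits` is hypothetical, and the blocks are numbered from 1 in chronological order.

```python
def forward_chaining_splits(n_blocks):
    """Yield (train_blocks, test_block) pairs in chronological order.

    Each fold trains on every block before the test block, so the model
    never sees data from the future.
    """
    for test_block in range(2, n_blocks + 1):
        yield list(range(1, test_block)), test_block


for fold, (train, test) in enumerate(forward_chaining_splits(5), start=1):
    print(f"fold {fold}: training{train}, test[{test}]")
```

In practice, scikit-learn's `TimeSeriesSplit` provides a similar expanding-window scheme on row indices.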

8
Q

What is differencing and what is it used for?

A
Differencing: This is the most commonly used technique to remove non-stationarity. Here we model the differences between consecutive terms rather than the terms themselves. For instance,

x(t) - x(t-1) = ARMA(p, q)

This differencing is called the Integration part of AR(I)MA. Now we have three parameters:

p: order of the AR part

d: order of differencing (the I part)

q: order of the MA part
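
A minimal Python sketch of differencing follows. The helper names and the toy series are hypothetical; applying the first difference d times corresponds to the d parameter.

```python
def difference(series, lag=1):
    """Return x(t) - x(t - lag) for every t where both terms exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def difference_d_times(series, d):
    """Apply first-order differencing d times (the 'I' order in ARIMA)."""
    for _ in range(d):
        series = difference(series)
    return series


series = [3, 5, 8, 12, 17]             # toy series with a growing trend
first = difference_d_times(series, 1)   # [2, 3, 4, 5]: still trending
second = difference_d_times(series, 2)  # [1, 1, 1]: constant
print(first, second)
```

Each round of differencing shortens the series by one observation, which is worth remembering on short data sets.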

9
Q

Has it been shown that modern methods such as RNNs, deep neural networks, or LSTMs perform better than classical methods?

A

Not really. Several papers have reported that complex methods have not yet surpassed classical methods.

10
Q
A
11
Q

What is autocorrelation?

A

Informally, autocorrelation is the similarity between observations as a function of the time lag between them.
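
As a rough sketch of that definition, the Python function below computes the sample autocorrelation at a given lag. The function name and the toy series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


# A series that alternates every step: neighbours are opposite,
# values two steps apart are identical.
alternating = [1, -1, 1, -1, 1, -1]
for lag in range(3):
    print(lag, round(autocorrelation(alternating, lag), 3))
```

Plotting these values against the lag gives the autocorrelation plot.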

12
Q

Is this an MA or AR model? What order should you use?

A

The blue line above shows values significantly different from zero. The PACF curve clearly cuts off after the 2nd lag, which means this is most likely an AR(2) process.

13
Q

Do you have to re-train time series forecasting more often than traditional problems?

A

Yes. The distribution the data are sampled from intrinsically changes: its mean and variance vary with time. So, unlike in other problems, you may have to retrain almost every time you make a prediction.

14
Q

How would you increase the accuracy of your model if the client needs it?

A
  • Try new models such as neural networks, LSTMs, or regression, and combine them with bagging or stacking
  • Ask for business insight: which data or markets are important? Are there outliers? What can they be correlated with? Promotions? Big events?
  • Explore more data
  • Have a clear, optimized loss function, being careful not to overfit
15
Q

What is important when meeting a client for a new project?

A

Understand the data: what data are available, and what could be made available? The maturity of the product to predict is very important.

Time scale, both for the project length and for the type of forecast needed. How far ahead in time do the predictions need to be made?

What aspect should we focus on? Sales, stocks, earnings?

Is it better to under-forecast or over-forecast?

How much variance can we tolerate? This will be related to the amount of time we have to develop the project.

For production: what is the scale of the data? How quickly and how often do we need a new forecast?

We can use business experience: what markets or external data are important in your experience? For example regulations, stock prices, or weather.

16
Q

Think about an example ACF plot

A
17
Q

What is vector autoregression and what is it used for?

A

The Vector Autoregression (VAR) method models the next step in each time series using an AR model. It is the generalization of AR to multiple parallel time series, i.e. multivariate time series.

The notation for the model involves specifying the order for the AR(p) model as parameters to a VAR function, e.g. VAR(p).

The method is suitable for multivariate time series without trend and seasonal components.

18
Q

What are the classical steps to forecast a time series?

A
19
Q

What important diagnostic information can we get from a model’s residuals?

A
20
Q

How can you stationarize a time series?

A

Detrending: you estimate the trend with a model such as linear regression and then remove it.

Differencing: you subtract a lagged version of the time series from the series itself.

21
Q

What is a SARIMA model and what do the letters stand for?

A

SARIMA combines simpler models into a more complex model that can handle time series exhibiting non-stationary properties and seasonality.

S: seasonality

AR: autoregressive model

I: integration order. The parameter d represents the number of differences required to make the series stationary.

MA: moving average model

22
Q

What is an autoregression model?

A

This is basically a regression of the time series onto itself. Here, we assume that the current value depends on its previous values with some lag. It takes a parameter p which represents the maximum lag. To find it, we look at the partial autocorrelation plot and identify the lag after which most lags are not significant.
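
As a minimal sketch of "regressing the series onto itself", the Python snippet below fits the simplest case, an AR(1) model with no intercept, by least squares. The zero-mean assumption, the function name, and the toy series are choices made here for illustration only.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + error.

    Assumes the series has (roughly) zero mean, so no intercept is fitted.
    """
    numerator = sum(curr * prev for prev, curr in zip(series[:-1], series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


# Toy series in which each value is exactly half of the previous one.
series = [8.0, 4.0, 2.0, 1.0, 0.5]
phi = fit_ar1(series)
next_value = phi * series[-1]  # one-step-ahead forecast
print(phi, next_value)
```

A model with maximum lag p would add the p most recent values as predictors in the same regression.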

23
Q

Are AR or MA models applicable to non-stationary series?

A

No. AR and MA models are not applicable to non-stationary series.

If you get a non-stationary series, you first need to stationarize it (by differencing or transformation) and then choose from the available time series models.

24
Q

What is peculiar about sales forecasting?

A
  • We need historical data covering a long time period to capture seasonality. However, we often do not have historical data for the target variable, for example when a new product is launched. In that case, if we have a sales time series for a similar product, we can expect the new product to have a similar sales pattern.
  • Sales data can have a lot of outliers and missing values. We must clean the outliers and interpolate the data before using a time series approach.
  • We need to take into account a lot of exogenous factors that have an impact on sales.
25
Q
A
26
Q

What metrics can be used to analyze time series?

A

Mean Absolute Error (MAE): It is the average of the absolute difference between the predicted values and observed values.

Root Mean Square Error (RMSE): It is the square root of the average of squared differences between the predicted values and observed values.

MAE is easier to understand and interpret but RMSE works well in situations where large errors are undesirable. This is because the errors are squared before they are averaged, thus penalizing large errors.
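
A short Python sketch of both metrics, using made-up observed and predicted values, shows how a single large error affects RMSE more than MAE.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average of the absolute differences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_squared_error(observed, predicted):
    """Square root of the average of the squared differences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


observed = [10, 12, 14, 16]
predicted = [11, 11, 14, 20]  # hypothetical forecast with one large miss

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

The last prediction misses by 4, which pulls RMSE well above MAE on this toy data.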

27
Q

What is the conceptual difference between an AR and an MA model (in terms of autocorrelation)?

A

The primary difference between an AR and an MA model lies in the correlation between values of the time series at different time points. For an MA model, the correlation between x(t) and x(t-n) is always zero when n is larger than the order of the model, because the covariance between x(t) and x(t-n) is zero. For an AR model, however, the correlation between x(t) and x(t-n) declines gradually as n becomes larger. We can exploit this difference to tell the two kinds of model apart, and the correlation plot can give us the order of an MA model.

28
Q

What is Holt-Winters Exponential Smoothing?

A

Holt-Winters Exponential Smoothing (HWES), also called the Triple Exponential Smoothing method, models the next time step as an exponentially weighted linear function of observations at prior time steps, taking trends and seasonality into account.

The method is suitable for univariate time series with trend and/or seasonal components.

29
Q

What can be used to compare different time series models if there is not enough data to split into train and test sets?

A

AIC and BIC

The concept of model complexity can be used to create measures that aid model selection. A few measures deal explicitly with the trade-off between goodness of fit and model simplicity, for instance the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Both penalize the number of model parameters but reward goodness of fit on the training set, hence the best model is the one with the lowest AIC/BIC. BIC penalizes model complexity more strongly and hence favors models that are “more wrong” but simpler. While this allows model selection without a validation set, it can be strictly applied only to models that are linear in their parameters, even though it typically also works in more general cases, e.g. for generalized linear models such as logistic regression.

30
Q

Can seasonality be incorporated in an ARMA model?

A

Yes, hence the SARIMA model.

31
Q

What is an ACF plot?

A

An ACF plot shows the total correlation between a series and its lagged versions. For instance, in a GDP problem, the GDP at time point t is x(t). We are interested in the correlation of x(t) with x(t-1), x(t-2), and so on.

32
Q

What is stationarity? Is it a desired property?

A

A time series is said to be stationary if its statistical properties do not change over time. In other words, it has constant mean and variance, and the covariance between two points depends only on the lag between them, not on time.

Ideally, we want to have a stationary time series for modelling. Of course, not all of them are stationary, but we can make different transformations to make them stationary.

33
Q

What are 3 simple models to model a time series?

A

moving average

exponential smoothing

ARIMA

34
Q

How can we figure out whether it is an AR or an MA process?

A

By using the total correlation chart (also known as the autocorrelation function, ACF). The ACF is a plot of the total correlation between the series and its lagged versions.

In a moving average series of lag n, we will not get any correlation between x(t) and x(t-n-1). Hence, the total correlation chart cuts off at the nth lag, so it becomes simple to find the lag for an MA series. For an AR series, this correlation will gradually go down without any cut-off value.
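
As a worked illustration of the cut-off, consider first-order processes driven by white noise. The symbols θ, φ, ε and ρ(k) are introduced here for the example and are not part of the original card; ρ(k) is the theoretical autocorrelation at lag k.

```latex
\begin{aligned}
\text{MA(1):}\quad x_t &= \varepsilon_t + \theta\,\varepsilon_{t-1}, &
\rho(1) &= \frac{\theta}{1+\theta^{2}}, \qquad \rho(k) = 0 \ \text{for } k > 1,\\[4pt]
\text{AR(1):}\quad x_t &= \phi\,x_{t-1} + \varepsilon_t,\ \ |\phi| < 1, &
\rho(k) &= \phi^{k} \ \text{for } k \ge 0.
\end{aligned}
```

With θ = 0.5, the MA(1) autocorrelation is 0.4 at lag 1 and exactly zero afterwards. With φ = 0.5, the AR(1) autocorrelations are 0.5, 0.25, 0.125, and so on, decaying gradually without a cut-off.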

35
Q

Why can autoregressive models not be used with non-stationary series?

A

An important reason is that autoregressive forecasting models are essentially linear regression models that use the lag(s) of the series itself as predictors.

We know that linear regression works best if the predictors (X variables) are not correlated with each other. Stationarizing the series addresses this problem because it removes any persistent autocorrelation, thereby making the predictors (the lags of the series) in the forecasting model nearly independent.

36
Q

Is this an AR or MA model? What order should you use?

A

Clearly, the graph above has a cut-off on the ACF curve after the 2nd lag, which means this is most likely an MA(2) process.

37
Q

Can you concatenate exponential smoothing?

A

Yes, which gives double or triple exponential smoothing.

Double exponential smoothing is used when there is a trend in the time series.
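
Below is a minimal Python sketch of one common formulation of double exponential smoothing, often called Holt's linear trend method. The card does not give a formula, so the update equations, the second smoothing factor `beta`, and the initialisation are assumptions made here for illustration.

```python
def double_exponential_smoothing(x, alpha, beta):
    """Return a one-step-ahead forecast using a smoothed level and trend.

    Assumed initialisation: level = first value, trend = first difference.
    """
    level, trend = x[0], x[1] - x[0]
    for value in x[1:]:
        previous_level = level
        # Smooth the level, letting the previous trend carry it forward.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Smooth the trend from the latest change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend


series = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy series with a perfectly linear trend
forecast = double_exponential_smoothing(series, alpha=0.5, beta=0.5)
print(forecast)  # 6.0
```

On a perfectly linear series the method continues the line, which simple exponential smoothing cannot do because it lags behind a trend.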

38
Q

What is de-trending?

A

Detrending: Here, we simply remove the trend component from the time series. For instance, suppose the equation of my time series is:

x(t) = (mean + trend * t) + error

We’ll simply remove the part in the parentheses and build model for the rest.
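
A rough Python sketch of this idea fits the `mean + trend * t` part by ordinary least squares and keeps the remainder. The helper names and the toy series are hypothetical, and time is assumed to run over t = 0, 1, 2, ...

```python
def fit_linear_trend(series):
    """Least-squares fit of x(t) = intercept + slope * t for t = 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    x_mean = sum(series) / n
    slope = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = x_mean - slope * t_mean
    return intercept, slope


def detrend(series):
    """Subtract the fitted line, leaving the part to be modelled."""
    intercept, slope = fit_linear_trend(series)
    return [x - (intercept + slope * t) for t, x in enumerate(series)]


series = [1.0, 3.5, 5.0, 7.5, 9.0]  # toy data: upward trend plus small wiggles
intercept, slope = fit_linear_trend(series)
residuals = detrend(series)
print(intercept, slope, residuals)
```

The residuals fluctuate around zero with no remaining slope, and a model is then built for them.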

39
Q

Can decision trees be used for time series forecasting?

A

Yes, because a time series forecast can usually be framed as a regression problem. Single decision trees are easy to explain but often not very accurate; for better accuracy, consider XGBoost or a random forest.

40
Q

What is AIC? What is the formula?

A

The Akaike Information Criterion (AIC) lets you test how well your model fits the data set without over-fitting it.

The AIC score rewards models that achieve a high goodness-of-fit score and penalizes them if they become overly complex.

By itself, the AIC score is not of much use unless it is compared with the AIC score of a competing model.

The model with the lower AIC score is expected to strike a superior balance between its ability to fit the data set and its ability to avoid over-fitting the data set.

AIC = 2k - 2 ln(L), where k is the number of free parameters and L is the maximum value of the model's likelihood (a measure of the model's goodness of fit).
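
As a sketch of how the formula might be applied, the Python snippet below derives ln(L) from model residuals and compares two models. It assumes Gaussian errors with the variance estimated from the residuals; the residual values and parameter counts are hypothetical.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximised log-likelihood ln(L), assuming i.i.d. Gaussian errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


# Hypothetical residuals from two models fitted to the same data.
simple_residuals = [1.0, -1.0, 1.0, -1.0]    # k = 2 free parameters
complex_residuals = [0.9, -0.9, 0.9, -0.9]   # k = 4 free parameters

aic_simple = aic(gaussian_log_likelihood(simple_residuals), k=2)
aic_complex = aic(gaussian_log_likelihood(complex_residuals), k=4)
print(f"simple: {aic_simple:.2f}, complex: {aic_complex:.2f}")
```

The complex model fits slightly better, but its two extra parameters cost more than the improvement, so the simple model gets the lower AIC.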

41
Q

How to test if a process is stationary?

A

You may have noticed Dickey-Fuller in the title of the plot above. This is the statistical test that we run to determine whether a time series is stationary.

Another very simple technique is to compute a rolling mean and check whether it changes over time.
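
The rolling-mean check can be sketched in Python as follows. The function name, the window size, and both toy series are hypothetical.

```python
def rolling_mean(series, window):
    """Mean of each consecutive block of `window` observations."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


trending = [1, 2, 3, 4, 5, 6]   # the mean drifts upward
steady = [2, 1, 2, 1, 2, 1]     # the mean stays put

print(rolling_mean(trending, window=2))  # [1.5, 2.5, 3.5, 4.5, 5.5]
print(rolling_mean(steady, window=2))    # [1.5, 1.5, 1.5, 1.5, 1.5]
```

A rolling mean that drifts suggests a non-stationary series, while a flat one is consistent with a constant mean. For a formal test, statsmodels provides an augmented Dickey-Fuller implementation as `adfuller`.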

42
Q

Why do I care about ‘stationarity’ of a time series?

A

The reason I took up this section first is that unless your time series is stationary, you cannot build a time series model. In cases where the stationarity criteria are violated, the first requirement is to stationarize the time series and then try stochastic models to predict it. There are multiple ways of bringing about this stationarity. Some of them are Detrending, Differencing e