Time Series Analysis Flashcards
Time Series Analysis
A quantitative forecasting method to predict future values
Uses numerical data obtained at regular time intervals
projections based on past and present observations
Components of time series analysis
trend, cycles, seasonal, and random factors
Cyclical vs Seasonal
Cyclical: upward and downward swings of varying, often multi-year, lengths
VS
Seasonal: regular patterns that repeat at fixed intervals within a year
Random Component
Erratic, nonsystematic, random, or residual fluctuations
Short duration and non-repeating
Due to nature or accidents
Univariate vs Multivariate Time Series Models
Univariate
Uses only one variable
Cannot use external data
Based only on relationships between past and present
Multivariate
Uses multiple variables
can use external data
based on relationships between past and present AND between variables
Time Series Decomposition
A technique to extract multiple types of variation from your dataset.
There are 3 important components in the temporal data of a time series - seasonality, trend, and noise (variation that cannot be explained by either season or trend)
It results in a graph of each component
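Assuming Python with NumPy and pandas, a minimal sketch of additive decomposition (in practice a library routine such as statsmodels' seasonal_decompose does this for you; the synthetic monthly data below is illustrative):

```python
import numpy as np
import pandas as pd

# Synthetic monthly data: linear trend + period-12 seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(48)
series = pd.Series(0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
                   + rng.normal(0, 1, 48))

period = 12
# Trend: centered moving average over one full seasonal period.
trend = series.rolling(window=period, center=True).mean()
# Seasonal: mean deviation from the trend at each position in the cycle.
seasonal = (series - trend).groupby(t % period).transform("mean")
# Noise: whatever neither trend nor season explains.
residual = series - trend - seasonal
```

Plotting `trend`, `seasonal`, and `residual` separately gives the usual one-graph-per-component view.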
Trend
a long-term upward or downward pattern in your data
Autocorrelation
the correlation between a time series’ current value and its past value. If a correlation exists, you can use present values to better predict future values
Positive autocorrelation - a high value now is likely to yield a high value in the future and vice versa
Negative autocorrelation - an inverse relationship. A high value implies a low value tomorrow and vice versa. Think about population control via competition in the wild.
ACF = autocorrelation function
PACF
An alternative to ACF. Rather than giving the autocorrelations, it gives you PARTIAL autocorrelations. It is partial because with each step back into the past, only the additional autocorrelation not already explained by shorter lags is listed.
ACF contains duplicate correlations when variability can be explained by multiple points in time.
For example, if the value of today is the same as the value of yesterday, but also the same as the day before yesterday, ACF would show 2 highly correlated steps. PACF would show only yesterday’s correlation and remove the ones further in the past
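As a sketch of both functions (statsmodels provides ready-made acf/pacf and plotting helpers; the NumPy version below uses the regression definition of partial autocorrelation, on illustrative AR(1) data):

```python
import numpy as np

def acf(x, nlags):
    # Correlation of the series with itself shifted back by k steps.
    x = np.asarray(x, float) - np.mean(x)
    denom = x @ x
    return np.array([x[: len(x) - k] @ x[k:] / denom
                     for k in range(nlags + 1)])

def pacf(x, nlags):
    # Partial autocorrelation at lag k = last coefficient of a
    # least-squares AR(k) fit: the correlation left at lag k after
    # the shorter lags have been accounted for.
    x = np.asarray(x, float) - np.mean(x)
    out = [1.0]
    for k in range(1, nlags + 1):
        X = np.column_stack([x[k - j - 1 : len(x) - j - 1]
                             for j in range(k)])
        coef, *_ = np.linalg.lstsq(X, x[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

# AR(1) data: today depends on yesterday, and only indirectly on the
# day before - so ACF decays slowly while PACF cuts off after lag 1.
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + rng.normal()
```

Here `acf(x, 2)` stays high at lags 1 and 2, while `pacf(x, 2)` is high at lag 1 and near zero at lag 2 - the duplicate-correlation removal described above.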
Stationarity
A time series that has no trend. Some time series models are not able to deal with trends. You can detect non-stationarity using the Dickey-Fuller test and you can remove non-stationarity using differencing
The mean and variance do not change over time!
Dickey Fuller Test
Also known as the ADF (Augmented Dickey-Fuller) test
p-value smaller than .05 - reject the null (non-stationarity) and accept the alternative (stationarity)
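statsmodels' adfuller is the standard implementation; the sketch below hand-rolls the simpler non-augmented Dickey-Fuller regression just to show the idea, on illustrative data:

```python
import numpy as np

def dickey_fuller_stat(x):
    # Non-augmented Dickey-Fuller regression:
    #   diff(x)[t] = a + g * x[t-1] + e
    # The statistic is the t-value of g; strongly negative values
    # argue against the unit-root (non-stationarity) null.
    x = np.asarray(x, float)
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])
    coef, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ coef
    s2 = resid @ resid / (len(dx) - 2)
    se_g = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se_g

rng = np.random.default_rng(0)
noise = rng.normal(size=500)
stationary = np.zeros(500)
for t in range(1, 500):
    stationary[t] = 0.5 * stationary[t - 1] + noise[t]
random_walk = np.cumsum(noise)  # non-stationary
```

At the 5% level, a statistic below roughly -2.86 rejects non-stationarity; the stationary series scores far below that, the random walk does not.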
Differencing
Removing the trend from your time series; the goal is to have only seasonal variation left.
This allows you to use models that can handle seasonality but not trend
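A minimal sketch with pandas, on an illustrative trending-plus-seasonal series:

```python
import numpy as np
import pandas as pd

t = np.arange(36)
# Trending series with a period-12 seasonal swing (illustrative data).
series = pd.Series(2.0 * t + 5 * np.sin(2 * np.pi * t / 12))

# First-order differencing: subtract each value from the next one.
# The linear trend collapses to a constant; the seasonal swing survives.
differenced = series.diff().dropna()
```

The differenced series hovers around the (now constant) slope of 2 while still oscillating seasonally, which is exactly the "seasonality but no trend" shape the card describes.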
One Step Time Series models
Designed to forecast only one step into the future
Can generate multistep forecasts by windowing over predictions
can be less performant for multistep forecasts
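The windowing idea can be sketched as feeding each one-step prediction back in as the input for the next step (the one-step model here is a made-up toy, not any particular fitted model):

```python
# Recursive multistep forecasting with a one-step model: each prediction
# becomes the input for the next step, so errors compound with horizon.
def forecast_recursive(last_value, one_step, horizon):
    predictions = []
    value = last_value
    for _ in range(horizon):
        value = one_step(value)
        predictions.append(value)
    return predictions

# Toy one-step model (hypothetical): tomorrow = 0.9 * today.
preds = forecast_recursive(100.0, lambda v: 0.9 * v, 3)
```

The error compounding is the reason one-step models can be less performant for multistep forecasts.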
Multistep Forecasts
designed to forecast multiple steps into the future
no need to window over predictions
more appropriate for multistep forecasts
Classical Time Series Models
These models are strongly based on temporal variation inside a time series and they work well with univariate time series
ARIMA Family falls in this category
ARIMA Family
Autoregression (AR) - Uses a regression model that explains a variable's future based on its past values. An AR(p) model uses the previous p values; AR(1) uses only the value of the previous step to predict the current value
Moving Average (MA) - Same in concept as AR - uses the past to predict the future. BUT the past values used are not the values of the variable itself; rather, it uses the prediction errors of the previous steps to predict the future
Autoregressive moving average (ARMA)
Combines both AR and MA into one. It uses both previous value and prediction errors from the past to predict the future.
Requires stationary data!! Must remove trend via differencing before using
Autoregressive Integrated Moving Average
ARIMA adds automatic differencing (the "Integrated" part) right into the model, so you can feed it non-stationary data
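In practice you would reach for a library implementation (e.g. statsmodels' ARIMA with its order=(p, d, q) argument); the sketch below hand-rolls an ARIMA(1,1,0) on illustrative data just to show where the differencing sits:

```python
import numpy as np

# Non-stationary series: a drifting random walk (illustrative data).
rng = np.random.default_rng(1)
x = np.cumsum(0.5 + rng.normal(0, 0.2, 200))

# "I" step: difference once so the AR part sees stationary data.
d = np.diff(x)

# AR(1) on the differences, fit by least squares with an intercept.
X = np.column_stack([np.ones(len(d) - 1), d[:-1]])
c, phi = np.linalg.lstsq(X, d[1:], rcond=None)[0]

# Forecast the next difference, then integrate back to the original scale.
next_diff = c + phi * d[-1]
forecast = x[-1] + next_diff
```

The final addition of `x[-1]` is the "integration" that undoes the differencing, which is what lets the model emit forecasts on the original, trending scale.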
Smoothing
Smoothing is a basic statistical technique that can be used to smooth out a time series. Time series patterns often have a lot of long-term variability but also short-term variability (noise)
Smoothing reduces the short term variability so you can see the long term trends more easily
Simple moving average
the simplest smoothing technique. It replaces each value with a local average of itself and a few values before and after it
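With pandas this is one rolling-mean call (the noisy line below is illustrative data):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
true_trend = np.linspace(0, 10, 101)
noisy = pd.Series(true_trend + rng.normal(0, 1, 101))

# window=5, center=True: each point becomes the average of itself,
# the two values before it, and the two values after it.
smoothed = noisy.rolling(window=5, center=True).mean()
```

The smoothed series sits much closer to the underlying trend than the raw one, at the cost of a few undefined values at both ends of the window.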
Simple Exponential smoothing
an adaptation of the moving average. But rather than taking a simple average, it takes a weighted average (a value that is further back counts less and a more recent value counts more)
The weighting is controlled by a smoothing parameter (alpha), which is often chosen subjectively
When trends are present (non-stationary) you should avoid using this technique.
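The recursion itself is tiny (pandas' `Series.ewm(alpha=..., adjust=False).mean()` computes the same thing); the alpha value below is illustrative:

```python
# Simple exponential smoothing: s[t] = alpha*x[t] + (1-alpha)*s[t-1].
# Weights on past values decay geometrically - recent points count more.
def simple_exp_smooth(values, alpha):
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

out = simple_exp_smooth([0.0, 0.0, 0.0, 1.0], alpha=0.3)
```

A sudden jump to 1.0 only moves the smoothed value to 0.3 - the new observation gets weight alpha and the accumulated past gets the rest, which is also why the smoother lags behind any trend.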
Double Exponential smoothing
You can use this for smoothing non-stationary data, because it also tracks a trend component
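A minimal sketch of Holt's linear method, one common form of double exponential smoothing (the alpha/beta values are illustrative):

```python
def double_exp_smooth(values, alpha, beta):
    # Holt's linear method: maintain a level and a trend estimate.
    # The trend term lets the smoother keep up with drifting data,
    # which simple exponential smoothing lags behind.
    level = values[0]
    trend = values[1] - values[0]
    smoothed = [level]
    for x in values[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        smoothed.append(level)
    return smoothed

# On perfectly linear data, the level tracks the series exactly.
out = double_exp_smooth([float(v) for v in range(1, 11)], 0.5, 0.5)
```

On this linear input the output reproduces the series point for point, which is exactly the trend-following behavior that plain exponential smoothing lacks.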