Time Series Flashcards
What is Time Series data?
An ordered sequence of observations, typically equally spaced in time, and possibly through space as well.
What is the general equation for statistical forecasting?
Time Series (Y) = Signal + Noise
What is the difference between Trend, Seasonal and Cyclical patterns?
1. Trend: the series naturally goes up or down over time.
2. Seasonal: something that happens with a consistent frequency and a definitive pattern (e.g., temperature over 24 hours).
3. Cyclical: a natural ebb and flow, but without a consistent frequency (e.g., the US economy).
What are two methods of Time Series decomposition and 3 common decomposition techniques to calculate pattern effects?
Decomposition methods:
1. Additive: Y = Trend + Seasonality + Error
2. Multiplicative: Y = Trend * Seasonality * Error, or equivalently log(Y) = log(T) + log(S) + log(E)
Decomposition techniques:
1. Classical Decomposition
2. X-12 ARIMA
3. STL (Seasonal and Trend decomposition using LOESS)
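The additive form can be illustrated with a rough sketch of classical decomposition: estimate the trend with a centered moving average, average the detrended values by position in the season, and treat what is left as error. The function names and the toy quarterly series below are illustrative assumptions, not part of the original notes.

```python
def centered_moving_average(y, period):
    """Centered moving average; None where the window does not fit."""
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight each.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def additive_decompose(y, period):
    """Split y into trend, seasonal and error so that Y = T + S + E."""
    trend = centered_moving_average(y, period)
    detrended = [None if tr is None else v - tr for v, tr in zip(y, trend)]
    # Average the detrended values for each position within the season.
    seasonal_means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == pos]
        seasonal_means.append(sum(vals) / len(vals))
    # Center the seasonal effects so they sum to zero over one season.
    centre = sum(seasonal_means) / period
    seasonal_means = [s - centre for s in seasonal_means]
    seasonal = [seasonal_means[i % period] for i in range(len(y))]
    error = [None if tr is None else v - tr - s
             for v, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, error


# Toy series (assumed): linear trend plus a repeating 4-period pattern, no noise.
pattern = [3.0, -1.0, -4.0, 2.0]
y = [10.0 + 0.5 * t + pattern[t % 4] for t in range(24)]
trend, seasonal, error = additive_decompose(y, period=4)
print("seasonal effects:", [round(s, 3) for s in seasonal[:4]])
```

For a multiplicative series, the same sketch could be applied to log(Y), since taking logs turns the product into a sum.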
What is autocorrelation?
Correlation of a variable with itself across time.
What is White Noise?
If all signal (pattern) has been accounted for, the errors will be independent (no pattern in the residuals). This is white noise. A white noise time series follows a Normal distribution with a mean of zero and a positive, constant variance, and all observations are independent once the pattern has been accounted for.
What is the test for White Noise?
Ljung-Box test.
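As a rough illustration, the Ljung-Box statistic can be computed by hand as Q = n(n+2) * sum over k = 1..m of r(k)^2 / (n - k), where r(k) is the sample autocorrelation at lag k. Under the null hypothesis of white noise, Q is approximately chi-square distributed, so a large Q is evidence against white noise. The simulated series, the seed and the choice of 10 lags are assumptions made for this sketch; when testing model residuals, the degrees of freedom are usually reduced by the number of fitted parameters.

```python
import random


def acf(y, k):
    """Sample autocorrelation of y at lag k."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(y, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(y)
    return n * (n + 2) * sum(acf(y, k) ** 2 / (n - k)
                             for k in range(1, max_lag + 1))


random.seed(0)  # arbitrary seed, only for reproducibility
noise = [random.gauss(0, 1) for _ in range(200)]
trended = [0.1 * t + e for t, e in enumerate(noise)]  # pattern left in the data

# Chi-square 95% critical value for 10 degrees of freedom.
CRITICAL_95_DF10 = 18.307

q_noise = ljung_box_q(noise, max_lag=10)
q_trended = ljung_box_q(trended, max_lag=10)
print(f"noise:   Q = {q_noise:.1f}")
print(f"trended: Q = {q_trended:.1f}, "
      f"reject white noise: {q_trended > CRITICAL_95_DF10}")
```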
What does LOESS stand for?
LOcal regrESSion. It builds a regression line on small sections of the graph and handles outliers very well because the local fits smooth over them.
In time series, what is a hold-out sample?
The hold-out sample is always taken from the end of the time series and typically does not go beyond 25% of the data. Ideally an entire season should be captured in the hold-out sample. Hold out whatever you have been asked to forecast: year, quarter, month, week, or day.
1. Create training and validation data sets.
2. Derive a set of candidate models.
3. Calculate the chosen accuracy statistic by forecasting the validation data.
4. Pick the model with the best accuracy.
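The four steps above can be sketched in a few lines. The toy quarterly series, the three simple candidate models and the use of MAPE as the accuracy statistic are all assumptions chosen for illustration; any set of candidate models and any accuracy statistic could be substituted.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a)
                     for a, f in zip(actual, forecast)) / len(actual)


# Toy quarterly series (assumed): yearly growth plus a fixed seasonal pattern.
pattern = [20.0, 35.0, 50.0, 30.0]
series = [100 + 2 * (t // 4) + pattern[t % 4] for t in range(40)]

# Step 1: hold out the last full season (4 of 40 points, i.e. 10% of the data).
horizon = 4
train, valid = series[:-horizon], series[-horizon:]

# Step 2: candidate models, each producing a forecast for the hold-out period.
candidates = {
    "mean": [sum(train) / len(train)] * horizon,
    "naive": [train[-1]] * horizon,
    "seasonal_naive": train[-horizon:],  # repeat the last observed season
}

# Step 3: accuracy statistic computed on the hold-out data only.
scores = {name: mape(valid, fc) for name, fc in candidates.items()}

# Step 4: pick the model with the best (lowest) error.
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: MAPE = {score:.2f}%")
print("best model:", best)
```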
What are the 4 model diagnostic statistics used to evaluate TS model accuracy?
Can also use Information Criterion methods: AIC, SBC.
Describe the difference between goodness of fit and accuracy?
Goodness of fit is calculated on the training data.
Accuracy is calculated on the hold-out data.
What are the characteristics of a good time series model?
1. Highly correlated with the actual series values.
2. Exhibits small forecast errors.
3. Captures the important features of the original time series.
What is the goal and statistical significance of Exponential Smoothing Models?
The goal is to predict one time period into the future. The math is focused on finding the best THETA, which is bounded between 0 and 1. The most recent data gets the most weight, with lower weight the further back in time you go. There is no statistical significance: the method was developed by mathematicians, not statisticians, so it has no statistical distribution in mind.
What are 3 Exponential Smoothing Models and what are they focused on?
1. Single
2. Linear/Holt (Trend)
3. Holt-Winters (Trend and Seasonality)
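To make the trend-focused variant concrete, here is a minimal sketch of Holt's linear method. It keeps a smoothed level and a smoothed trend and extrapolates them forward. The parameter names `alpha` and `beta`, their values, and the initialization (first observation as the level, first difference as the trend) are assumptions for this sketch rather than anything prescribed by the notes.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's linear exponential smoothing: smoothed level plus smoothed trend."""
    level = y[0]          # assumed initial level
    trend = y[1] - y[0]   # assumed initial trend
    for obs in y[1:]:
        prev_level = level
        # New level: blend the observation with the previous one-step forecast.
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # New trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # h-step-ahead forecasts continue the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# Toy series (assumed): a perfectly linear trend, so forecasts continue the line.
y = [5.0 + 2.0 * t for t in range(10)]
forecasts = holt_forecast(y, alpha=0.5, beta=0.3, horizon=3)
print([round(f, 2) for f in forecasts])
```

Holt-Winters adds a third smoothed component for seasonality in the same spirit.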
How is the optimal THETA found for ESMs?
The THETA that minimizes the sum of squared one-step-ahead forecast errors is chosen.
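A minimal sketch of that search for single exponential smoothing follows. It evaluates THETA on a grid strictly between 0 and 1 and keeps the value with the smallest sum of squared one-step-ahead errors. The toy data, the grid resolution and the choice to initialize the level at the first observation are assumptions; statistical software typically uses a numerical optimizer instead of a grid.

```python
def ses_sse(y, theta):
    """Run single exponential smoothing; return (SSE, forecast for next period)."""
    level = y[0]  # assumed initialization: start at the first observation
    sse = 0.0
    for obs in y[1:]:
        error = obs - level  # one-step-ahead forecast error
        sse += error ** 2
        # Weight theta on the newest observation, (1 - theta) on the old level.
        level = theta * obs + (1 - theta) * level
    return sse, level


# Toy data (assumed).
y = [12.0, 13.5, 12.8, 14.1, 15.0, 14.6, 15.9, 16.4, 15.8, 17.2]

grid = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
best_theta = min(grid, key=lambda th: ses_sse(y, th)[0])
best_sse, next_forecast = ses_sse(y, best_theta)
print(f"best THETA = {best_theta}, SSE = {best_sse:.3f}, "
      f"next-period forecast = {next_forecast:.2f}")
```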
What’s the difference between time series decomposition, ESM and ARIMA?
Time series decomposition removes noise, leaves the trend, and is used to explore the data.
ESM models one time period into the future for forecasting.
ARIMA is statistically based and allows patterns to reveal themselves through correlation and stationarity.
What correlation does Autocorrelation describe?
The correlation between Y(t) and Y(t-k): the same variable separated by k points in time.
What is implied if the first correlation function is significant?
2 consecutive points are correlated.
What does a positive or negative ACF(1) imply?
Positive: high today implies high tomorrow (continued trend).
Negative: high today implies low tomorrow (reversal of trend).
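The two cases can be checked numerically with a small sample autocorrelation function. The two toy series are assumptions made up for illustration: one drifts smoothly, the other flips between high and low every period.

```python
def acf(y, k):
    """Sample autocorrelation of y at lag k."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return num / denom


# High values tend to follow high values: ACF(1) comes out positive.
persistent = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6]
# High values are always followed by low values: ACF(1) comes out negative.
alternating = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1]

print(f"persistent  ACF(1) = {acf(persistent, 1):.3f}")
print(f"alternating ACF(1) = {acf(alternating, 1):.3f}")
```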
What’s the difference b/w the autocorrelation function (ACF), partial autocorrelation function (PACF) and the inverse autocorrelation function (IACF)?
ACF: the autocorrelation between two sets of observations through time, as a function of the lag k (time steps). rho(k) = Corr(Y(t), Y(t-k))
PACF: the correlation between two sets of observations separated by k points in time after adjusting for all the autocorrelations at the lags in between. The PACF is conditional: it tries to measure the direct relationship between the two sets of observations without the influence of the sets in between. It uses regression to identify relationships. phi(k) = Corr(Y(t), Y(t-k) | Y(t-1), Y(t-2), ..., Y(t-k+1))
IACF: overemphasizes seasonal effects, which makes it very helpful in identifying seasonal data. It is similar to the PACF but uses a different method of calculation based on linear algebra. It typically has the opposite sign to the PACF.
How is PACF like linear regression?
Run a multiple linear regression of Y(t) on its first k lags; the coefficient on lag k is the partial autocorrelation phi(k). Using regression isolates the effect of each lag.
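The regression view can be sketched directly: regress the series on its first k lags and read off the coefficient on lag k. The small linear solver, the centering of the series in place of an intercept, and the simulated AR(1) data (coefficient 0.7, arbitrary seed) are assumptions for this sketch; library implementations may use other estimators that give slightly different values.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def pacf_by_regression(y, k):
    """Coefficient on lag k when Y(t) is regressed on lags 1..k."""
    mean = sum(y) / len(y)
    z = [v - mean for v in y]  # centering stands in for an intercept
    rows = [[z[t - j] for j in range(1, k + 1)] for t in range(k, len(z))]
    target = z[k:]
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * tv for r, tv in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)[-1]


# Simulated AR(1) series (assumed): only lag 1 has a direct effect.
random.seed(1)  # arbitrary seed, only for reproducibility
y = [0.0]
for _ in range(1999):
    y.append(0.7 * y[-1] + random.gauss(0, 1))

for k in (1, 2, 3):
    print(f"PACF({k}) = {pacf_by_regression(y, k):.3f}")
```

For this series the partial autocorrelation should be close to 0.7 at lag 1 and close to zero at later lags, even though the ordinary autocorrelation decays slowly across several lags.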
What’s the relationship b/w PACF and ACF?
They are complementary. Using both, you can determine how far back the relationship really goes.
What are the naive models for regression and time series?
Regression: the mean of y.
Time Series: y(t-1), the value from the previous time period.
What is stationarity?
Like the independence assumption in regression, stationarity should exist in the background. A stationary time series has a constant mean and variance. A time series with a long-term trend or seasonality cannot be stationary because the mean of the series depends on the time at which the value is observed. There must be randomness around the mean; this is a condition of stationarity.
For seasonal data, the mean is 0. Is there stationarity?
No. There is no randomness to the mean, so a pattern exists and the series is not stationary.
What do all stationary models revert to?
The mean of the series.
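Mean reversion is easy to see with a stationary AR(1) model, used here purely as an assumed example. Its h-step-ahead forecast is mean + phi^h * (last value - mean), so with |phi| < 1 the gap to the mean shrinks at every step. The numbers below are made up for illustration.

```python
def ar1_forecasts(last_value, mean, phi, horizon):
    """h-step-ahead forecasts from a stationary AR(1) model (requires |phi| < 1)."""
    return [mean + phi ** h * (last_value - mean) for h in range(1, horizon + 1)]


# Assumed values: the series sits well above its mean at the forecast origin.
forecasts = ar1_forecasts(last_value=80.0, mean=50.0, phi=0.6, horizon=20)
for h in (1, 2, 5, 10, 20):
    print(f"h = {h:2d}: forecast = {forecasts[h - 1]:.3f}")
```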