w5 Flashcards
Forecasting
predicting future values from data recorded over time.
Cross-sectional
data observed at a single point in time; time is not the independent variable.
Time-series
past data is used to predict trends.
- usually done with a line chart to show trend
- regular intervals of time (quarters, months, years)
Horizontal Pattern
Mean is not changing.
Non-stationary time series
The level of the series jumps up or down, so the mean changes over time.
Trend Pattern
increasing or decreasing over time.
Seasonal Pattern
Pattern repeats at regular intervals; repeats across seasons, quarters, or years.
Trend Seasonal Pattern
Has a repeating pattern but also is increasing or decreasing over time.
Forecast error
difference between the actual value and the predicted value (actual minus predicted); like checking after the fact how well your prediction matched the actual value.
Four measures of forecast accuracy
- Mean forecast error
- Mean absolute error
- Mean square error
- Mean absolute percentage error
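A minimal Python sketch of the four measures, assuming the forecast error is defined as actual minus predicted. The function names and the sample numbers are made up for illustration.

```python
def forecast_errors(actual, predicted):
    """Per-period errors, defined here as actual minus predicted."""
    return [a - p for a, p in zip(actual, predicted)]


def mfe(actual, predicted):
    """Mean forecast error: signed errors, so positives and negatives can cancel."""
    errors = forecast_errors(actual, predicted)
    return sum(errors) / len(errors)


def mae(actual, predicted):
    """Mean absolute error: average size of the errors, ignoring sign."""
    errors = forecast_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)


def mse(actual, predicted):
    """Mean square error: squaring penalizes large errors more heavily."""
    errors = forecast_errors(actual, predicted)
    return sum(e ** 2 for e in errors) / len(errors)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    errors = forecast_errors(actual, predicted)
    return 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors)


# Made-up data: the errors are +10, -10, +10, -10.
actual = [100, 110, 120, 130]
predicted = [90, 120, 110, 140]

print(mfe(actual, predicted))   # 0.0, the signed errors cancel out
print(mae(actual, predicted))   # 10.0
print(mse(actual, predicted))   # 100.0
print(round(mape(actual, predicted), 2))  # 8.78
```

In this sample the MFE is zero even though every forecast is off by 10, which is why MFE alone is not a good overall measure.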
MFE pros and cons
pros: can be used to determine whether the model is overestimating (negative result) or underestimating (positive result).
cons: negative and positive numbers cancel each other out. Not a good overall measure.
MAE pros and cons
pros: solves MFE’s flaw because absolute values cannot cancel out. Less sensitive to outliers than MSE.
cons: expressed in the units of the data, so not comparable across series with different scales.
MSE pros and cons
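pros: most popular overall measure of model fit. Solves MFE’s flaw because squared errors cannot cancel out.
cons: expressed in squared units of the data, so not comparable across series with different scales. Easily biased by outliers.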
MAPE pros and cons
pros: expressed as a percentage, so comparable across models and across series with different scales.
cons: most difficult measure to calculate.
Non-moving Methods
Naive forecasting method & Average past values method
Naive forecasting method
simply takes the last actual value and uses it to predict the next value. Adapts to changes quickly, making it a good method for non-stationary patterns.
Average past values method
average all past actual values to predict the next value. Good for horizontal pattern.
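The two non-moving methods can be sketched in a few lines of Python. The function names and the sales figures below are made up for illustration.

```python
def naive_forecast(history):
    """Forecast the next value as the most recent actual value."""
    return history[-1]


def average_forecast(history):
    """Forecast the next value as the mean of all past actual values."""
    return sum(history) / len(history)


# Made-up series with a jump in the last period.
sales = [20, 22, 19, 21, 30]

print(naive_forecast(sales))    # 30, follows the jump immediately
print(average_forecast(sales))  # 22.4, barely moves after the jump
```

The naive forecast reacts to the jump right away, while the average of all past values stays close to the old level.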
Moving methods
Use most recent actual values to predict the next value. They include Moving average forecasting & Exponential Smoothing.
Moving Average Method
Takes the k most recent actual values, averages them, and uses that average to predict the next value. Lower k works best with frequent fluctuations, while higher k works best with fewer fluctuations.
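A minimal Python sketch of a k-period moving average forecast. The function name and the data are made up for illustration.

```python
def moving_average_forecast(history, k):
    """Forecast the next value as the mean of the k most recent actual values."""
    if k < 1 or k > len(history):
        raise ValueError("k must be between 1 and len(history)")
    window = history[-k:]  # the k most recent actual values
    return sum(window) / k


# Made-up series.
sales = [20, 22, 19, 21, 30]

print(moving_average_forecast(sales, 1))  # 30.0, same as the naive method
print(moving_average_forecast(sales, 3))  # mean of 19, 21, 30
print(moving_average_forecast(sales, 5))  # 22.4, same as averaging all past values
```

With k = 1 the method reduces to the naive forecast, and with k equal to the length of the series it reduces to the average of all past values.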
Exponential Smoothing Method
y_hat_(t+1) = a * y_t + (1 - a) * y_hat_t; uses a weighted average of the most recent actual value (y_t) and the most recent prediction (y_hat_t) to predict the next value. When there is higher variation, a should be higher so the forecast adapts faster. When there is less variation, a should be lower. 0 < a < 1.
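A minimal Python sketch of the standard smoothing recurrence: next forecast = a * (latest actual) + (1 - a) * (latest forecast). It assumes the first forecast is seeded with the first actual value, which is one common convention and not something these notes specify. The function name and data are made up.

```python
def exponential_smoothing_forecast(history, alpha):
    """Return the forecast for the period after the last value in history."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Assumption: the forecast for period 2 is seeded with the period 1 actual.
    forecast = history[0]
    for actual in history[1:]:
        # Blend the newest actual value with the forecast that was made for it.
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Made-up series.
sales = [20, 22, 19]

# Period 2 forecast: 20
# Period 3 forecast: 0.5 * 22 + 0.5 * 20 = 21
# Period 4 forecast: 0.5 * 19 + 0.5 * 21 = 20
print(exponential_smoothing_forecast(sales, 0.5))  # 20.0
```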
Causal Forecasting
uses time and a non-time independent variable to predict the dependent variable.
Linear Regression for Seasonal Trend Pattern
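y = B0 + B1*Q1 + B2*Q2 + B3*Q3 + B4*X + e; Q1, Q2, and Q3 are 0/1 categorical dummy variables for the first three quarters (the fourth quarter is the baseline), X carries the trend (typically the time period number), and e is the error term.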
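A rough Python sketch of fitting this kind of model by ordinary least squares, using only the standard library. It assumes quarterly data, that X is the period number t, and that the fourth quarter is the baseline. All function names and the noise-free synthetic data are made up for illustration; in practice a statistics package would do the fitting.

```python
def design_row(t):
    """Row [1, Q1, Q2, Q3, t] for 1-based period t; quarter 4 is the baseline."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(quarter == 1), float(quarter == 2),
            float(quarter == 3), float(t)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(y):
    """Least-squares coefficients [B0, B1, B2, B3, B4] via the normal equations."""
    rows = [design_row(t) for t in range(1, len(y) + 1)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coefs, t):
    """Predicted value for period t from fitted coefficients."""
    return sum(b * x for b, x in zip(coefs, design_row(t)))


# Synthetic, noise-free data built from made-up coefficients (two years).
true_coefs = [10.0, 5.0, -3.0, 2.0, 1.5]
y = [predict(true_coefs, t) for t in range(1, 9)]

coefs = fit(y)
print([round(c, 3) for c in coefs])  # recovers the made-up coefficients
print(round(predict(coefs, 9), 3))   # forecast for the next first quarter
```

Only three dummies are used for four quarters; adding a fourth would make the columns linearly dependent with the intercept.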
How are K and alpha determined?
computers try different values to minimize MSE. The k or alpha that gives the lowest MSE is best.
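A minimal Python sketch of such a search for alpha, assuming the goal is the lowest MSE over one-step-ahead forecasts and that the first forecast is seeded with the first actual value. The function names, candidate grid, and data are made up; the same idea applies to choosing k for a moving average.

```python
def one_step_forecasts(history, alpha):
    """Forecasts for periods 2..n, each made before seeing that period's actual."""
    forecasts = []
    forecast = history[0]  # assumption: seed with the first actual value
    for actual in history[1:]:
        forecasts.append(forecast)
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecasts


def mse_for_alpha(history, alpha):
    """Mean square error of the one-step-ahead forecasts for a given alpha."""
    forecasts = one_step_forecasts(history, alpha)
    errors = [a - f for a, f in zip(history[1:], forecasts)]
    return sum(e ** 2 for e in errors) / len(errors)


def best_alpha(history, candidates):
    """Return the candidate alpha with the lowest MSE."""
    return min(candidates, key=lambda a: mse_for_alpha(history, a))


# Made-up series and a coarse grid of candidate alphas: 0.1, 0.2, ..., 0.9.
sales = [20, 22, 19, 21, 30, 28, 31]
candidates = [i / 10 for i in range(1, 10)]

alpha = best_alpha(sales, candidates)
print(alpha, round(mse_for_alpha(sales, alpha), 3))
```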