Time-Series of Returns Flashcards

1
Q

What are time series returns / econometrics?

A

With time series econometrics, we study how returns develop over time. For example, if you see a good return today, does that mean the momentum will continue tomorrow, or will it mean-revert? It all comes down to: can you forecast returns ahead of time?

A time series of returns refers to a sequence of data points representing the returns of a financial asset, recorded at successive points in time, usually at regular intervals such as daily, monthly, or annually.

2
Q

Advantages of Time Series Models

A

–> use only information from the variable’s own past (in contrast to multivariate structural models)
–> attempt to capture empirically relevant features of the observed data that may stem from a variety of different (but unspecified) structural models.
–> can be useful when structural models are inappropriate (e.g., when the structural variables are observable only at a lower time frequency).

3
Q

Components of Time Series Models:

A

Trend: The long-term direction of the series. In financial markets, this might reflect sustained movements up or down in asset prices.
Seasonality: Regular and predictable changes that recur every calendar period, such as quarterly or annually.
Cyclic Variations: These are fluctuations occurring at irregular intervals, influenced by economic cycles, differing from seasonality.
Random or “Noise”: These are irregular or stochastic components that are unpredictable and cannot be systematized.

4
Q

The Rational Expectations (RE) Model

A

Implication of the EMH! Under the EMH, the stock price P_t already incorporates all relevant information (i.e., markets are informationally efficient and market participants hold rational expectations).
The security price at time t+1 is a rational expectation conditional on all available information at time t plus unanticipated shocks (ε):
P_(t+1) = E_t[P_(t+1)] + ε_(t+1),   with   E_t[P_(t+1) − E_t(P_(t+1))] = E_t[ε_(t+1)] = 0

An economic theory that assumes that individuals form their expectations for the future in a way that optimally utilizes all available information.
This means that on average, people’s forecasts of future economic variables (like financial returns) are accurate, and any errors are random and not systematically biased. In financial markets, this model implies that the current prices of assets fully reflect all publicly available information, and future price movements are only driven by new, unpredictable information, making the markets efficient.

The model’s implication that E_t (ε_(t+1) )=0 means that the forecast of P_(t+1) is unbiased (i.e., on average the forecast equals the realised price). Further, ε_(t+1) is assumed to be independent of any information known at time t (or earlier), which is known as the orthogonality principle

5
Q

White Noise Process

A

A white noise process ε_t has no discernible structure. A (zero-mean) white noise process satisfies:
1. E[ε_t] = 0 (zero mean)
2. Var[ε_t] = σ² < ∞ (finite, constant variance)
3. Cov[ε_t, ε_s] = 0 for s ≠ t (uncorrelated increments, zero autocovariances)

It’s a sequence of random variables where each variable has a mean of zero, constant finite variance, and no autocorrelation between any two different times. This means that each value in the sequence is random, does not depend on previous values, and is statistically uncorrelated with other values in the sequence. White noise serves as an important concept in time series analysis, providing a baseline model of purely random variations around a constant mean.

6
Q

Random Walk

A

The baseline model for the EMH.
A general random walk (with drift μ) is defined as y_t = μ + y_(t−1) + ε_t, where ε_t is white noise:
1. E[ε_t] = 0
2. Var(ε_t) = σ² < ∞
3. Cov[ε_t, ε_s] = 0 for s ≠ t
Note that y_t itself is not stationary: its variance grows with t.

A stochastic process where the future path of a variable, such as a financial asset’s price, is independent of its past path and evolves in a series of independent, random steps. This implies that changes in asset prices are unpredictable and devoid of any systematic patterns, making them essentially random. Consequently, under a random walk hypothesis, it is impossible to consistently predict future price movements based on historical data, reflecting the efficient market hypothesis.
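
A minimal sketch (Python, assuming only numpy; drift and volatility values are hypothetical) of what this implies: simulate a random walk with drift and check that its increments carry essentially no autocorrelation, i.e., past steps do not help predict the next one.

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, T = 0.0005, 0.01, 1000        # hypothetical drift and volatility per period
eps = rng.normal(0.0, sigma, T)          # white-noise shocks
y = np.cumsum(mu + eps)                  # random walk with drift: y_t = mu + y_(t-1) + eps_t

dy = np.diff(y)                          # increments = mu + eps, i.e., unpredictable
rho1 = np.corrcoef(dy[1:], dy[:-1])[0, 1]
print(f"first-order autocorrelation of increments: {rho1:.3f}")   # close to zero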

7
Q

Comparing RE and RW

A

The RW (random walk) model is an example of an RE (rational expectations) model. More specifically, the general RW is an RE model with
E_t[P_(t+1)] = μ + P_t
However, the general RW model is more restrictive than RE, as it assumes a constant drift μ (for time-series tests we do not need to impose any structure on what μ actually is, but you might have a compensation for risk in mind). Because the RW model is more restrictive, it makes testing easier.

8
Q

Stationarity

A

Stationarity means a time-series process does not wander off: its statistical properties remain constant over time, so the sample mean and covariance are roughly the same over different time intervals.

A process is weakly/covariance stationary if
1. E(y_t) = μ (mean is finite and constant across t, i.e., no trend)
2. Var(y_t) = σ² < ∞ (variance is finite and constant across t)
3. Cov(y_t, y_(t−s)) = γ_s for s ≠ 0 (autocovariance is finite and a function of s only, not of t)

9
Q

Comparison cross-sectional vs time-series models

A

A time-series model captures the correlation between y_t and its past values y_(t−s). This differs from cross-sectional OLS models, where the dependent variable y is explained by independent variable(s) x: cross-sectional models capture the correlation between y and x, whereas univariate time-series models capture the correlation between y_t and its own lagged value(s).

10
Q

Autocorrelation of a stationary process

A

Autocorrelations of a stationary process are defined as
ρ_s=γ_s/γ_0 ∈[-1,1]

We use the autocorrelation rather than the autocovariance γ_s = Cov(y_t, y_(t−s)) because the autocovariance is scale-dependent.

γ_0 = Cov(y_t, y_t) = Var(y_t) is simply the variance. The autocorrelations describe the short-run dynamics within the time series; in contrast, a trend describes the long-run dynamics.

11
Q

AR(p) process - Definition

A

An AR process is an autoregressive process.
An AR(1) process models the current value as
y_t = μ + ϕ y_(t−1) + ε_t,
where ε_t ~ WhiteNoise(0, σ²)

An AR process is a linear combination of the variable’s own past values plus a white-noise error. ϕ (phi) determines the persistence: the closer ϕ is to zero, the more the AR(1) resembles a noise process; the closer ϕ is to one, the more it wanders off.

When |ϕ|<1, the AR(1) process is stationary.

Special cases are:
When ϕ=1, the AR(1) is a random walk
When ϕ=0, the process is called AR(0)
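
A small illustrative simulation (Python/numpy, hypothetical parameter values) showing how ϕ governs persistence: for ϕ near zero the series looks like noise, for ϕ near one it becomes highly persistent.

import numpy as np

def simulate_ar1(phi, mu=0.0, sigma=1.0, T=2000, seed=0):
    # y_t = mu + phi * y_(t-1) + eps_t, with white-noise shocks eps_t
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = mu + phi * y[t - 1] + eps[t]
    return y

for phi in (0.1, 0.9):                      # low vs. high persistence
    y = simulate_ar1(phi)
    rho1 = np.corrcoef(y[1:], y[:-1])[0, 1]
    print(phi, round(rho1, 2))              # sample first-order autocorrelation is roughly phi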

12
Q

MA(q) process - Definition

A

An MA(q) process is a moving average process which models the current value as a linear combination of the current and lagged white-noise error terms. For q = 1:
y_t = μ + θ ε_(t−1) + ε_t, where ε_t ~ WhiteNoise(0, σ²).

12
Q

Stationarity for AR (p)

A

Start from the AR(1) and substitute repeatedly for the lagged value. T periods later:
y_(t+T) = ϕ^T y_t + μ(1 − ϕ^T)/(1 − ϕ) + ∑_(i=0)^(T−1) ϕ^i ε_(t+T−i)

From this equation, we can determine the impact of ϕ on y_(t+T):
If ϕ > 1, the impact increases exponentially in T (explosive case): no constant mean, not stationary.
If ϕ < −1, the absolute impact grows exponentially in T, but the sign oscillates between − and + (not stationary).
If ϕ = 1, the impact is permanent and independent of T, i.e., a random walk (not stationary).
If |ϕ| < 1, the impact decreases in T, i.e., the weakly stationary case.

13
Q

Stationarity for MA (q)

A

When ε_t is a white noise process with variance σ², the MA(q) process is stationary for all parameter values θ:
(1) E(y_t) = μ
(2) Var(y_t) = γ_0 = (1 + θ_1² + θ_2² + ⋯ + θ_q²) σ²
(3) γ_s = Cov(y_t, y_(t−s)) depends only on s (not on t), and γ_s = 0 for s > q

14
Q

The Wold Decomposition Theorem

A

Stationary AR(p) processes can be expressed as MA(∞) processes. As an example, we can restate an AR(1) process as an MA(∞) process. For this we apply a Taylor series expansion (expanding around zero).

This only works for stationary AR(1) processes; the same applies to AR(p) processes.

Representing an MA(q) as an AR(∞) only works if the process is invertible (this is not a stationarity condition); for the MA(1) this means |θ| < 1. In general, the condition under which an MA(q) is invertible is that the roots of θ(z) = 0 lie outside the unit circle.
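
As a reminder of how the restatement works, here is the repeated-substitution sketch for a stationary AR(1) with |ϕ| < 1, written in the notation used above:

y_t = μ + ϕ y_(t−1) + ε_t
    = μ + ϕ(μ + ϕ y_(t−2) + ε_(t−1)) + ε_t
    = μ(1 + ϕ + ϕ² + …) + ε_t + ϕ ε_(t−1) + ϕ² ε_(t−2) + …
    = μ/(1 − ϕ) + ∑_(j=0)^(∞) ϕ^j ε_(t−j)

which is an MA(∞) with geometrically decaying coefficients. The geometric series only converges when |ϕ| < 1, which is why the representation requires stationarity.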

15
Q

ARMA(p,q) process

A

The ARMA(p, q) model combines both AR(p) and MA(q) models, making it suitable for a wider range of data types, including data that exhibits both autoregressive and moving average characteristics:

y_t = μ + ∑_(i=1)^(p) ϕ_i y_(t−i) + ∑_(j=1)^(q) θ_j ε_(t−j) + ε_t,   ε_t ~ WhiteNoise(0, σ_ε²)

The restrictions on the parameters ϕ_1,…,ϕ_p and θ_1,…,θ_q which define a stationary process are roughly the same as for an AR(p) process. Estimation usually requires Maximum Likelihood.
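
If you want to simulate or sanity-check an ARMA process, statsmodels provides a helper; a sketch with hypothetical coefficients ϕ = 0.5 and θ = 0.3 (note that it expects lag-polynomial coefficients, i.e., [1, −ϕ] and [1, θ]):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

ar = np.array([1.0, -0.5])                 # 1 - 0.5 L
ma = np.array([1.0, 0.3])                  # 1 + 0.3 L
proc = ArmaProcess(ar, ma)
print("stationary:", proc.isstationary, "invertible:", proc.isinvertible)
y = proc.generate_sample(nsample=1000)     # simulated ARMA(1,1) sample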

16
Q

Box Jenkins approach

A
  1. Graph the data
  2. Identification of lag structure
  3. Estimation
  4. Diagnostic checking:
    Information Criteria: AIC and BIC
    Diagnostics: Residual Diagnostics; Stability
  5. Use the model (e.g. in forecasting)
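
A sketch of the approach in Python with statsmodels (the series y is assumed to exist, e.g. a pandas Series of returns; the order (1, 0, 1) is purely illustrative):

import statsmodels.api as sm
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox

# 2. Identification: inspect ACF and PACF of the observed series y
plot_acf(y, lags=20)
plot_pacf(y, lags=20)

# 3. Estimation (ARMA(1,1) on the levels, i.e. d = 0)
res = sm.tsa.ARIMA(y, order=(1, 0, 1)).fit()

# 4. Diagnostic checking: information criteria and residual autocorrelation
print(res.aic, res.bic)
print(acorr_ljungbox(res.resid, lags=[10]))

# 5. Use the model, e.g. forecast the next 5 observations
print(res.forecast(steps=5))
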
17
Q

PACF

A

PACF measures the correlation between observations at two points in time, accounting for the values of the observations at all shorter lags. It isolates the effect of the lag between those two points from the effects of all shorter lags.

Purpose: The PACF is used to identify the order of an autoregressive model in time series analysis. Specifically, it helps determine how many past points should directly affect the current value when all other influences are accounted for.

Examples of why it is needed: The autocorrelation function alone is not enough to find the right time-series model. If we suspect the DGP (data generating process) to be an AR process, testing for zero autocorrelation does not help to identify the AR process, as autocorrelations die out slowly at higher lags. Further, if the process is an ARMA process with both MA and AR components, the autocorrelations’ significance would not tell us much. We must analyse partial autocorrelations.

18
Q

ACF

A

The ACF measures the correlation between observations of a time series separated by various time lags. Essentially, it reflects how well the current value of the series is linearly related to its past values.

We standardise the autocovariance γ_s=Cov(y_t,y_(t-s) ) by the variance (γ_0=Cov(y_t,y_t )=Var(y_t )) to obtain the autocorrelations ρ_s=γ_s/γ_0 ∈[-1,1]

The autocorrelation function simply plots ρ(s) for various y_(t-s). We are interested in the autocorrelation function of stock returns because if ρ(s)’s are large, it indicates predictability of returns.

Purpose: It is used to identify the general pattern of serial correlation (e.g., cycling patterns not explained by the model) in the data, which can suggest types of models to fit, such as ARIMA models.

19
Q

ACF and PACF for model identification

A

AR(p): ACF decays slowly, PACF cuts off after lag p
MA(q): ACF cuts off after lag q, PACF decays slowly
ARMA(p, q): ACF and PACF both decay slowly.
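
These identification rules can also be checked numerically; a sketch (Python/statsmodels, series y assumed to exist):

from statsmodels.tsa.stattools import acf, pacf

acf_vals = acf(y, nlags=10)     # AR(p): decays slowly;   MA(q): ~0 beyond lag q
pacf_vals = pacf(y, nlags=10)   # AR(p): ~0 beyond lag p; MA(q): decays slowly
print(acf_vals.round(2))
print(pacf_vals.round(2))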

20
Q

Testing for informational efficiency

A

We assume constant expected returns. Then (log) prices follow a random walk with constant drift r̄, that is p_(t+1) = r̄ + p_t + ε_(t+1).
For (log) returns we have r_(t+1) = r̄ + ε_(t+1). Since we want to test whether past information helps predict returns, we model
r_(t+1) = r̄ + b′ Ω_t + ε_(t+1)
and test for b′ = 0. A random walk has no transitory components, so returns are not predictable.

For now, we focus on Ω_t = r_t (i.e., time-series models):
r_(t+1) = a + b r_t + ε_(t+1) = a + b(r̄ + ε_t) + ε_(t+1), where a = r̄ under the null of no predictability.
b ≠ 0 implies lagged returns forecast future returns, or equivalently Cov(ε_t, ε_(t+1)) ≠ 0, which violates informational efficiency when assuming constant expected returns. With correlated increments, the shocks are no longer white noise, so prices do not follow a random walk.

Notice that
b = Cov(r_(t+1), r_t) / Var(r_t).
As Var(r_t) = Var(r_(t+1)) under constant expected returns, b equals the first-order autocorrelation coefficient ρ_1 (if we find autocorrelation, then returns are not random but predictable).
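
The b = 0 test above is a simple OLS regression of returns on lagged returns; a sketch (Python/statsmodels, r assumed to be a numpy array of log returns):

import statsmodels.api as sm

X = sm.add_constant(r[:-1])              # r_t plus a constant
res = sm.OLS(r[1:], X).fit()             # r_(t+1) = a + b r_t + error
print(res.params)                        # [a, b]; b should be ~0 under informational efficiency
print(res.tvalues[1], res.pvalues[1])    # t-test of H0: b = 0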

21
Q

Ljung-Box-Statistic

A

The Ljung-Box statistic tests whether all autocorrelations up to a chosen lag are jointly zero. If the null is rejected, returns are not random but predictable, i.e., there is no informational efficiency.

Null hypothesis is that the data are independently distributed, implying that there are no correlations between successive observations (lags). This hypothesis states that the autocorrelations for all lag intervals within the specified range equal zero.

The alternative hypothesis is that there is at least one autocorrelation that is not zero, which suggests that the data exhibit serial correlation and are not random.
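
A sketch of the test in Python (statsmodels; r assumed to be the return series):

from statsmodels.stats.diagnostic import acorr_ljungbox

lb = acorr_ljungbox(r, lags=[5, 10, 20])   # joint test of zero autocorrelation up to each listed lag
print(lb)                                  # small p-values reject the null -> returns look predictable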

22
Q

Permanent vs. Transitory Shocks

A

If a shock has a permanent (one-for-one) effect on prices, in econometric terms we say stock prices are non-stationary. Permanent shocks have long-lasting effects on the level of the variable they impact. Once a permanent shock occurs, the variable does not return to its original path but instead moves to a new equilibrium level.

Transitory shocks have temporary effects that fade over time, allowing the variable to return to its original path or trend. A transitory component in prices implies predictability of returns. If prices are mean reverting, we can predict them. Transitory shocks are stationary.

23
Q

Unit Root

A

A unit root in a time series indicates non-stationarity, meaning the series lacks a constant mean, variance, and covariance over time. The series contains a random walk component.

The presence of a unit root suggests that shocks to the series have permanent effects, causing persistent changes and making it unpredictable over the long run. Testing for a unit root helps determine if a time series requires differencing to achieve stationarity, critical for accurate forecasting and analysis.

24
Q

Difference between unit root, random walk and a stationarity process

A

Stationarity vs. Non-Stationarity: While stationary processes return to a constant mean and have a constant variance, a series with a unit root or undergoing a random walk does not. Their future values depend heavily on their past values, and their variance grows indefinitely over time.
Response to Shocks: In stationary processes, effects of shocks are temporary and diminish over time. In contrast, in unit root processes and random walks, shocks have permanent effects that alter the series’ future path.
Modeling Implications: Detecting a unit root is crucial before modeling because traditional regression techniques assume stationarity. Non-stationary data, like that with a unit root or in a random walk, often requires differencing or other transformations to make the series stationary and suitable for analysis.

25
Q

Dickey Fuller Test

A

The primary objective of the Dickey-Fuller test is to test for the presence of a unit root in a time series. It tests the null hypothesis that a unit root is present against the alternative hypothesis that the time series is stationary (or trend-stationary).

Problems when testing for a unit root: the augmented Dickey-Fuller test is criticised for having little power to distinguish between the non-stationary case (ϕ = 1) and a near non-stationary case (e.g., ϕ = 0.95). This criticism seems misplaced, since no test is that accurate. Remember that the Dickey-Fuller test is a test for a unit root; it is not a stationarity test.
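
A sketch of the (augmented) Dickey-Fuller test in Python (statsmodels; p assumed to be the log-price series):

from statsmodels.tsa.stattools import adfuller

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(p, regression="c", autolag="AIC")
print(adf_stat, pvalue)   # a large p-value means we cannot reject the unit-root null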

26
Q

KPSS test

A

Unlike the Dickey-Fuller test, the KPSS test checks for the presence of stationarity in the series. The null hypothesis of the KPSS test is that the series is stationary around a deterministic trend (or level stationary if no trend is included in the test).
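
A sketch in Python (statsmodels; y assumed to be the series under test):

from statsmodels.tsa.stattools import kpss

stat, pvalue, nlags, crit = kpss(y, regression="c")   # "c": level-stationary null, "ct": trend-stationary null
print(stat, pvalue)                                   # a small p-value rejects the stationarity null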

27
Q

Testing for long run autocorrelation

A

(1) Positive feedback trading makes prices overreact to initially relevant information (fundamentals). This means prices are more volatile than justified by fundamentals, so there is excess volatility in markets with positive feedback traders.
(2) Informed traders correct mispricing (induced by positive feedback trading) by trading in the opposite direction. Short-term positive serial correlation in prices, long-term negative serial correlation (mean reversion) in prices.
(3) Positive feedback traders overreact to important firm events (e.g., changes in dividends). So there is a relation between returns and the P/D ratio (or dividend growth).

–> Short-term positive serial correlation (due to positive feedback trading), long-term negative serial correlation (mean reversion) in prices (due to price corrections through smart money).
We can test for negative long-horizon autocorrelation using the Fama-French regression (see the sketch below).
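
A rough sketch of such a long-horizon regression (Python; one-period log returns r assumed available; the horizon k is hypothetical, and with overlapping observations the standard errors would need a correction such as Newey-West):

import numpy as np
import statsmodels.api as sm

k = 36                                                  # hypothetical horizon, e.g. 36 months
rk = np.convolve(r, np.ones(k), mode="valid")           # overlapping k-period (log) returns
res = sm.OLS(rk[k:], sm.add_constant(rk[:-k])).fit()    # regress r(t, t+k) on r(t-k, t)
print(res.params[1])                                    # a negative slope at long horizons suggests mean reversion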

28
Q

ARMA in Disturbance

A

An ARMA(p, q) model in the disturbances modifies the basic linear regression model to allow for autocorrelation in the error terms. The key here is that ε_t, rather than being plain white noise, follows an ARMA process.
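
One way to estimate such a model in Python is regression with ARMA errors via SARIMAX (a sketch; y and the regressor matrix X are assumed to exist, and the ARMA(1,1) error order is illustrative):

import statsmodels.api as sm

# y_t = X_t * beta + u_t, where the disturbance u_t follows an ARMA(1,1) process
res = sm.tsa.SARIMAX(y, exog=X, order=(1, 0, 1)).fit(disp=False)
print(res.summary())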

29
Q

AIC Information Criteria

A

Not consistent but more efficient

AIC is a widely used measure of a statistical model’s quality. It deals with the trade-off between the goodness of fit of the model and the complexity of the model.

It is defined as: AIC=2k−2ln(L)
where: k is the number of parameters in the model. L is the likelihood of the model, which measures how well the model fits the data, derived from max likelihood estimation

30
Q

BIC Information Criteria

A

Strongly consistent but inefficient

BIC is similar to AIC but includes a stronger penalty for models with more parameters. This makes BIC more stringent about complexity and can be particularly effective in larger datasets.
It is defined as: BIC = k ln(n) − 2 ln(L)
where: n is the number of observations. k is the number of parameters in the model. L is the likelihood of the model, which measures how well the model fits the data.

Will asymptotically deliver correct model order
Varies more in its recommendation

31
Q

Which Model to choose based on AIC and BIC?

A

Select the model with the lowest AIC and BIC values. They indicate a better balance between model fit and complexity, thus reducing the risk of overfitting while effectively capturing the underlying data pattern.

AIC and BIC can suggest different models, especially if the sample size is large. In such cases, preference might be given to BIC for its stricter penalty on complexity, thereby reducing the risk of overfitting.
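
A sketch of order selection by information criteria (Python/statsmodels; y assumed to be a stationary series, maximum orders chosen arbitrarily):

import statsmodels.api as sm

crit = {}
for p in range(3):
    for q in range(3):
        res = sm.tsa.ARIMA(y, order=(p, 0, q)).fit()
        crit[(p, q)] = (res.aic, res.bic)

best_aic = min(crit, key=lambda pq: crit[pq][0])
best_bic = min(crit, key=lambda pq: crit[pq][1])
print("AIC picks", best_aic, "BIC picks", best_bic)   # BIC often picks the more parsimonious model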

32
Q

Static Forecast

A

Static Forecasts imply a sequence of one-step ahead forecasts where new realisations are included in the forecasts

This approach generates forecasts for each future time point, one at a time, always using the actual observed data up to that point to make each forecast.
For example, to predict Y_(t+1), you use the actual data up to time t. Then, to predict Y_(t+2), you again use the actual data up to time t+1 (including the actual Y_(t+1), if it’s available).

33
Q

Dynamic Forecast

A

Dynamic Forecasts calculate multi-step forecasts starting from the last period in the prediction sample

Instead of using new actual data points as they become available, the forecast for Yt+1 is used as part of the data to forecast Yt+2, and so on. The initial forecast uses actual data up to time t, but all subsequent forecasts are based purely on outputs of the model.

34
Q

Dynamic vs. Static Forecast

A

Static Forecasting: More aligned with scenarios where new real data is continuously fed into the forecasts. It is inherently more accurate for the immediate next steps because it uses the actual data as it becomes available, which corrects for any previous forecasting errors

Dynamic Forecasting: Useful for longer-term forecasting where the feedback from actual data cannot be incorporated immediately. It is particularly beneficial for strategic planning and analysis over an extended horizon.
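
In statsmodels, the distinction corresponds to the dynamic argument of predict; a sketch (res assumed to be a fitted ARMA/ARIMA results object, and the sample indices are illustrative):

static_fc  = res.predict(start=100, end=150, dynamic=False)  # one-step-ahead: uses actual data at each step
dynamic_fc = res.predict(start=100, end=150, dynamic=True)   # multi-step: feeds its own forecasts back in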

35
Q

Stage 4 Diagnostics - Residual Diagnostics

A

Residual diagnostics involve analyzing the residuals — the differences between the observed values and the values predicted by the model — to check if they behave like white noise.

Ljung-Box Test: A statistical test that checks the overall randomness based on a number of lags. This test determines if the group of autocorrelations of the residual series is different from zero.

Autocorrelation Check: Linear dependency (autocorrelation) in the residuals (meaning they are not white noise) would suggest that the chosen ARMA model is inadequate.
Utilizing the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) of the residuals to ensure there are no significant correlations at various lags.

36
Q

Stage 4 Diagnostics - Stability

A

Stability diagnostics check the constancy of model parameters over time and the stability of the underlying time series model itself.

The stationarity of an ARMA process depends on the AR(p) parameters ϕ.
The invertibility of an ARMA process depends on the MA(q) parameters θ.

The AR(p) process is weakly stationary / the MA(q) process is invertible if the roots of the characteristic equation (also called the “lag polynomial”) lie outside the unit circle. (Or equivalently: if the inverse roots of the lag polynomial all lie inside the unit circle.)

If not, the model is not stable and you might prefer a lower-order, stable model instead.
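
A sketch of the root check with numpy (hypothetical AR(2) coefficients; the same idea applies to the MA polynomial for invertibility):

import numpy as np

phi1, phi2 = 0.5, 0.3
# Roots of the lag polynomial 1 - phi1*z - phi2*z^2 must lie outside the unit circle.
roots = np.roots([-phi2, -phi1, 1.0])      # np.roots expects the highest power first
print(np.abs(roots), "stationary:", bool(all(np.abs(roots) > 1)))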