Learning with time series Flashcards
Problem definition
Given x(1), x(2), …, x(N) generated by a stationary stochastic process (SSP) x(t,s), compute:
- the mean
- the covariance function
- the spectral density
def correctness of a sample estimator
A sample estimator Q^N of a quantity Q is correct if
E[Q^N] = Q
def consistency of a sample estimator
A sample estimator Q^N of a quantity Q is consistent if
E[(Q^N - E[Q^N])^2] -> 0 as N -> +inf
Sample mean estimator
Given x(1), x(2), …, x(N) generated by a SSP x(t,s), the mean m = E[x(t,s)] is estimated as
m^N = 1/N * sum(t=1,N) x(t)
This estimator is correct, while its consistency depends on the process
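A minimal sketch of the sample mean, with a small simulation of its behaviour as N grows. The process used here (white Gaussian noise around a mean of 2.0) and the helper names `mean_estimates` and `variance` are assumptions made for illustration only; for this particular process the estimator is also consistent.

```python
import random

def sample_mean(x):
    """m^N = 1/N * sum(t=1,N) x(t)."""
    return sum(x) / len(x)

def variance(values):
    """Empirical variance of a list of numbers."""
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mean_estimates(n_samples, n_runs, true_mean=2.0, seed=0):
    """Sample mean of n_runs independent realizations of the assumed process."""
    rng = random.Random(seed)
    return [
        sample_mean([true_mean + rng.gauss(0.0, 1.0) for _ in range(n_samples)])
        for _ in range(n_runs)
    ]

# Same process, two data lengths: the estimates stay centred on the true mean
# (correctness) while their spread shrinks as N grows (consistency).
short_runs = mean_estimates(n_samples=10, n_runs=300)
long_runs = mean_estimates(n_samples=1000, n_runs=300)
print("N=10:   mean of estimates, variance:", sample_mean(short_runs), variance(short_runs))
print("N=1000: mean of estimates, variance:", sample_mean(long_runs), variance(long_runs))
```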
Sample covariance function estimator
Given x(1), x(2), …, x(N) generated by a SSP x(t,s) with zero mean, the covariance function
γ(τ) = E[x(t) * x(t-τ)] , |τ| < +inf
is estimated as
γ^N(τ) = 1/(N-τ) * sum(t=1, N-τ) x(t) * x(t+τ) , 0 <= τ <= N-1
or
γ^N'(τ) = 1/(N-|τ|) * sum(t=1, N-|τ|) x(t) * x(t+|τ|) , |τ| <= N-1
This estimator is correct and consistent
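A minimal sketch of the sample covariance estimator for a zero-mean data record, written with 0-based list indices. The function name `sample_covariance` and the toy data are hypothetical.

```python
def sample_covariance(x, tau):
    """1/(N-|tau|) * sum of x(t) * x(t+|tau|) over the N-|tau| available pairs."""
    n = len(x)
    lag = abs(tau)
    if lag > n - 1:
        raise ValueError("|tau| must be at most N-1")
    return sum(x[t] * x[t + lag] for t in range(n - lag)) / (n - lag)

# Toy zero-mean record (illustrative data only).
x = [1.0, -1.0, 1.0, -1.0]
for tau in range(-3, 4):
    print(tau, sample_covariance(x, tau))
```

Note that for large |τ| only a few products enter the average, so those lags are estimated from very little data.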
Sample spectral density
Given x(1), x(2), …, x(N) generated by a SSP x(t,s) with zero mean, the spectrum
Γ(ω)
is estimated as
Γ^N(ω) = sum(τ=-(N-1), (N-1)) γ^N(τ) * e^-jωτ
where two approximations are present: the first is the truncation of the sum to a finite number of terms, the second is the use of the estimated covariance function in place of the true one.
This estimator is asymptotically correct
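A minimal sketch of this estimator, evaluating the finite sum directly from the sample covariance with the 1/(N-|τ|) normalization. The function names and the toy data are hypothetical.

```python
import cmath
import math

def sample_covariance(x, tau):
    """1/(N-|tau|) * sum of x(t) * x(t+|tau|) over the available pairs."""
    n = len(x)
    lag = abs(tau)
    return sum(x[t] * x[t + lag] for t in range(n - lag)) / (n - lag)

def sample_spectrum(x, omega):
    """sum(tau=-(N-1), N-1) of sample_covariance(tau) * e^(-j*omega*tau)."""
    n = len(x)
    total = sum(
        sample_covariance(x, tau) * cmath.exp(-1j * omega * tau)
        for tau in range(-(n - 1), n)
    )
    # The covariance estimate is symmetric in tau, so the imaginary parts cancel.
    return total.real

x = [1.0, -1.0, 1.0, -1.0]  # illustrative zero-mean record
for omega in (0.0, math.pi / 2, math.pi):
    print(omega, sample_spectrum(x, omega))
```

Each frequency costs about N covariance evaluations of up to N products each, which is the computational burden the direct approach carries.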
Problems of the sample spectral density
1) It is not consistent -> high variance
2) High computational effort, since the sample covariance function needs to be computed first
Solution to the computational problem
Computation via FFT
Γ^N'(ω) = sum(τ=-(N-1), (N-1)) γ^N''(τ) * e^-jωτ
where
γ^N''(τ) = 1/N * sum(t=1, N-|τ|) x(t) * x(t+|τ|) , |τ| <= N-1
-> it can be proven that
Γ^N'(ω) = 1/N * | sum(t=1,N) x(t) * e^-jωt |^2
that is, the squared magnitude of the discrete FT of x(t) divided by N, which can be computed efficiently by the FFT algorithm.
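The identity between the covariance-based sum (with the 1/N normalization) and the squared magnitude of the discrete FT can be checked numerically. The sketch below uses a small textbook radix-2 FFT written for illustration (it assumes N is a power of two) and compares the two computations at the frequencies ω_k = 2πk/N. All function names and the data are hypothetical.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def periodogram_fft(x):
    """1/N * |DFT of x|^2 at omega_k = 2*pi*k/N.

    The FFT indexes time from 0 instead of 1; this only changes the phase,
    not the magnitude.
    """
    n = len(x)
    return [abs(c) ** 2 / n for c in fft(x)]

def covariance_1_over_n(x, tau):
    """1/N * sum of x(t) * x(t+|tau|) over the N-|tau| available pairs."""
    n = len(x)
    lag = abs(tau)
    return sum(x[t] * x[t + lag] for t in range(n - lag)) / n

def spectrum_from_covariance(x, omega):
    """Direct evaluation of the finite sum over tau."""
    n = len(x)
    total = sum(
        covariance_1_over_n(x, tau) * cmath.exp(-1j * omega * tau)
        for tau in range(-(n - 1), n)
    )
    return total.real

x = [0.5, -1.0, 2.0, 0.0, -0.5, 1.5, -2.0, 1.0]  # illustrative data, N = 8
n = len(x)
via_fft = periodogram_fft(x)
via_cov = [spectrum_from_covariance(x, 2 * cmath.pi * k / n) for k in range(n)]
max_gap = max(abs(a - b) for a, b in zip(via_fft, via_cov))
print("largest difference between the two computations:", max_gap)
```

In practice a library FFT routine would replace the hand-written `fft`; the point of the sketch is only that both routes give the same numbers while the FFT route never forms the covariance function.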
Solution to the consistency problem
- The data set is divided into 4 parts
- Γ^(N/4),i(ω) is the estimator of the spectrum using the i-th part of the data, i=1,2,3,4
- Γ^^N(ω) = 1/4 * sum(i=1,4) Γ^(N/4),i(ω)
- it can be proven that the variance of this estimator is reduced by a factor of 4
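A minimal sketch of the averaging scheme, with a small simulation of the variance reduction. It assumes each part is processed with the 1/N * |sum x(t) e^-jωt|^2 form of the estimator, and it uses white Gaussian noise as test data; both choices, along with the function names, are assumptions for illustration.

```python
import cmath
import random

def periodogram(x, omega):
    """1/N * |sum(t=1,N) x(t) * e^(-j*omega*t)|^2 at a single frequency."""
    n = len(x)
    dft = sum(x[t] * cmath.exp(-1j * omega * (t + 1)) for t in range(n))
    return abs(dft) ** 2 / n

def averaged_periodogram(x, omega, parts=4):
    """Average of the estimates computed on consecutive, non-overlapping parts."""
    size = len(x) // parts
    chunks = [x[i * size:(i + 1) * size] for i in range(parts)]
    return sum(periodogram(chunk, omega) for chunk in chunks) / parts

def variance(values):
    """Empirical variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(0)
omega = cmath.pi / 2
full, averaged = [], []
for _ in range(1000):
    x = [rng.gauss(0.0, 1.0) for _ in range(64)]  # assumed test process
    full.append(periodogram(x, omega))
    averaged.append(averaged_periodogram(x, omega, parts=4))

print("variance using all data at once:", variance(full))
print("variance averaging 4 parts:     ", variance(averaged))
```

The price of the lower variance is that each part contains only N/4 samples, so each individual estimate is built from a shorter record.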
Data pre-processing for time series: use
- The estimators are valid if data are taken from a SSP
- In case of trend and seasonality in the data, the process is not stationary
In these cases:
- estimate the trend and seasonality v^(t)
- remove the trend and seasonality:
x^(t)ssp = x(t) - v^(t)
- compute the estimates on the resulting stationary process
- add the trend and seasonality back to the estimated mean
Assumption: x(t) = x(t)ssp + v(t) where v(t) is a deterministic signal
Trend estimation and removal
- Linear trend: v(t) = k*t + q
k^ and q^ are estimated by solving a linear regression problem -> x^(t)ssp = x(t) - k^*t - q^
- Same approach can be extended to polynomial trends, considering:
v(t) = k*t + q + a*t^2 + b*t^3 + …
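A minimal sketch of linear trend estimation and removal, using the closed-form least-squares solution for a straight line with t = 1, 2, …, N. The function names and the toy data are hypothetical; a polynomial trend would need a general least-squares solver instead.

```python
def fit_linear_trend(x):
    """Least-squares estimates (k_hat, q_hat) for x(t) ~ k*t + q, t = 1..N."""
    n = len(x)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    k_hat = (
        sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
        / sum((ti - t_mean) ** 2 for ti in t)
    )
    q_hat = x_mean - k_hat * t_mean
    return k_hat, q_hat

def remove_linear_trend(x):
    """x(t) - k_hat*t - q_hat for t = 1..N."""
    k_hat, q_hat = fit_linear_trend(x)
    return [xi - k_hat * ti - q_hat for ti, xi in enumerate(x, start=1)]

# Toy data: a line with slope 0.5 and intercept 2 plus an alternating disturbance.
x = [0.5 * t + 2.0 + (1.0 if t % 2 else -1.0) for t in range(1, 11)]
print("k_hat, q_hat:", fit_linear_trend(x))
print("detrended:", remove_linear_trend(x))
```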
Seasonality estimation and removal
x(t) = x(t)ssp + s(t)
where
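s(t) = s(t+kT), with k ∈ Z and T the known period of the seasonality
The seasonal component is estimated by averaging the samples that are a whole number of periods apart:
s^(t) = 1/M * sum(h=0,M-1) x(t+hT)
with t = 1,2,…,T
and MT <= N
-> x^(t)ssp = x(t) - s^(t), with s^(t) repeated periodically for t > T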
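A minimal sketch of the seasonal averaging and removal, written with 0-based list indices and taking M as the number of complete periods in the record. The function names and the toy data are hypothetical.

```python
def estimate_seasonality(x, period):
    """s_hat(t) = 1/M * sum(h=0,M-1) x(t + h*T) for one period, with M = N // T."""
    m = len(x) // period
    if m == 0:
        raise ValueError("need at least one complete period of data")
    return [
        sum(x[t + h * period] for h in range(m)) / m
        for t in range(period)
    ]

def remove_seasonality(x, period):
    """x(t) - s_hat(t), with s_hat repeated periodically over the whole record."""
    s_hat = estimate_seasonality(x, period)
    return [xi - s_hat[t % period] for t, xi in enumerate(x)]

# Toy data: a pattern of period 4 repeated three times, plus two extra samples.
season = [1.0, -2.0, 0.5, 0.5]
x = season * 3 + [1.0, -2.0]
print("s_hat:", estimate_seasonality(x, period=4))
print("deseasonalized:", remove_seasonality(x, period=4))
```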