lecture 8 - time series data Flashcards
what is time series data?
x(t)
anything attached to a time stamp e.g. months, years, seconds
when studying time series data we are interested in?
- any changes in a specific variable over time
- obvious oscillations
- obvious trends
data and patterns can be considered as:
- random
- clustered
- cyclic
- chaotic
what are the different components of a time series?
- trend component
- periodic component
- random noise component
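A minimal sketch of how the three components can add up to one series. The slope, the 12-step cycle, the noise level and the random seed are all assumptions chosen for illustration, not values from the lecture.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative noise is repeatable
n = 24          # hypothetical number of time steps

trend = [0.5 * t for t in range(n)]                                  # steady rise
periodic = [2.0 * math.sin(2 * math.pi * t / 12) for t in range(n)]  # 12-step cycle
noise = [random.gauss(0, 0.3) for _ in range(n)]                     # unexplained variation

# x(t) = trend + periodic + random noise
series = [tr + p + e for tr, p, e in zip(trend, periodic, noise)]
```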
what is interpolation?
the estimation of an unknown quantity between two known quantities
types of interpolation
- linear
- nearest neighbour
- cubic
steps of linear interpolation
1 - identify the gap
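2 - find the difference between the values before and after the gap (after minus before)
3 - divide this by the number of missing values + 1 to get the step size
4 - add the step to the value before the gap, then keep adding it to each filled value
(if the data are descending the step is negative, so this subtracts)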
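The steps above can be sketched as a short function. The name `fill_gap` and the example values are hypothetical; a negative step covers the descending case.

```python
def fill_gap(before, after, n_missing):
    """Linearly interpolate n_missing values between two known values."""
    # Steps 2 and 3: difference across the gap, split into equal steps.
    step = (after - before) / (n_missing + 1)
    # Step 4: add the step repeatedly, starting from the value before the gap.
    return [before + step * k for k in range(1, n_missing + 1)]

print(fill_gap(10.0, 18.0, 3))  # [12.0, 14.0, 16.0]  (ascending)
print(fill_gap(9.0, 3.0, 2))    # [7.0, 5.0]          (descending)
```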
what is signal processing?
the alteration of a time series to extract specific components or to reduce noise
define noise
the amount of unexplained variation in a sample
reducing noise is also known as?
smoothing
methods of smoothing
- moving average
- Savitzky-Golay filtering
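A minimal sketch of the moving average, assuming a window of 3 points and made-up data. Each output value is the mean of one full window, so the result is shorter than the input.

```python
def moving_average(values, window=3):
    """Mean of each full window of consecutive points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

noisy = [2, 4, 3, 5, 4, 6, 5]   # hypothetical noisy upward series
print(moving_average(noisy))    # [3.0, 4.0, 4.0, 5.0, 5.0]
```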
root-mean-square
removes negatives
used to determine strength of a signal
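A small sketch of why squaring matters: a signal that swings evenly around zero has a plain mean of zero, while its root-mean-square still reports its strength. The example signal is made up.

```python
import math

def rms(values):
    """Square each value (removing negatives), average, then take the root."""
    return math.sqrt(sum(v * v for v in values) / len(values))

signal = [3, -3, 3, -3]
print(sum(signal) / len(signal))  # 0.0  (plain mean hides the signal)
print(rms(signal))                # 3.0
```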
define frequency
number of cycles (or, for sampling frequency, data points) within a period of time (generally 1 second)
what is a wavelength?
time or distance between 2 identical points on consecutive cycles e.g. peak to peak
what is amplitude?
height of the wave (from zero)
also known as power or force
what is phase?
refers to the position within a cycle; the phase difference is how far the peaks and troughs of two signals are offset in time
what does "in phase" mean?
the peaks and troughs of 2 datasets line up in time (no offset)
what does anti-phase mean?
the peak in one signal is directly above the trough in another
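A rough sketch of the two cases using sine waves sampled at 8 points per cycle. The helper `wave`, the amplitude of 2.0 and the sampling are assumptions for illustration; the point is that the comparison is about timing.

```python
import math

def wave(n_points, amplitude=1.0, phase=0.0):
    """One full sine cycle sampled at n_points, shifted by `phase` radians."""
    return [amplitude * math.sin(2 * math.pi * i / n_points + phase)
            for i in range(n_points)]

a = wave(8)
b = wave(8, amplitude=2.0)   # in phase with a: peak at the same time step
c = wave(8, phase=math.pi)   # anti-phase: trough where a has its peak

print(a.index(max(a)), b.index(max(b)), c.index(min(c)))  # 2 2 2
```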
Savitzky-Golay filtering is better than moving average for:
- preserving the periodic component
- where there is a lot of high frequency noise on low frequency signals
Savitzky-Golay filtering uses?
polynomial least squares
what does polynomial mean?
an expression with multiple terms, each a constant multiplied by a power of the variable e.g. a + bx + cx^2
polynomial least squares performs
a curved fit - not linear
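A minimal sketch of why the curved fit helps, assuming a 5-point window and a quadratic fit, for which the Savitzky-Golay weights are the standard (-3, 12, 17, 12, -3) / 35. The function names and the test peak are hypothetical. On a curved peak the filter returns the original heights, while a 5-point moving average flattens them.

```python
SG5 = [-3, 12, 17, 12, -3]  # 5-point quadratic Savitzky-Golay weights (divide by 35)

def savgol5(values):
    """Weighted sum over each full 5-point window (local quadratic fit)."""
    return [sum(c * v for c, v in zip(SG5, values[i:i + 5])) / 35
            for i in range(len(values) - 4)]

def moving_average5(values):
    """Plain mean over each full 5-point window, for comparison."""
    return [sum(values[i:i + 5]) / 5 for i in range(len(values) - 4)]

peak = [-(x * x) for x in range(-3, 4)]   # [-9, -4, -1, 0, -1, -4, -9]
print(savgol5(peak))          # [-1.0, 0.0, -1.0]   peak height kept
print(moving_average5(peak))  # [-3.0, -2.0, -3.0]  peak flattened
```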
time series regression
y = a + bx + u
where x = the timestamp and u = the error (unexplained) term
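A sketch of fitting this line by ordinary least squares, which is an assumption about the fitting method. The data and the name `fit_line` are made up; u is taken to be what is left over after the line is removed.

```python
def fit_line(x, y):
    """Ordinary least squares estimates of a (intercept) and b (slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b

t = [0, 1, 2, 3, 4]              # timestamps (x)
y = [1.0, 3.1, 4.9, 7.0, 9.0]    # hypothetical observations
a, b = fit_line(t, y)

# u: the leftover part of each observation once the fitted line is removed
u = [yi - (a + b * ti) for ti, yi in zip(t, y)]
print(round(a, 2), round(b, 2))  # 1.02 1.99
```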
how do you assess normality?
use a Q-Q plot (Quantile-Quantile plot)
compares the quantiles of a variable against theoretical quantiles of one with a normal distribution
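A rough sketch of the numbers behind a Q-Q plot, without the plotting itself. The plotting positions (i + 0.5) / n are one common convention and an assumption here, as are the sample values. If the sample is close to normal, the pairs fall near a straight line.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pairs of (theoretical normal quantile, observed quantile)."""
    ordered = sorted(sample)
    n = len(ordered)
    # Normal distribution with the same mean and spread as the sample.
    normal = NormalDist(mean(ordered), stdev(ordered))
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

sample = [4.1, 5.0, 5.2, 4.8, 6.0]   # hypothetical measurements
for theo, obs in qq_points(sample):
    print(round(theo, 2), obs)
```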
what is a quantile?
a cut point that subdivides a sorted dataset into groups, each expected to contain the same number of values e.g. quartiles give 4 groups
sample autocorrelation:
detects oscillations
assumes that the periodic component is stable
correlates the series against an identical copy of itself which is shifted by a range of lags
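A minimal sketch of sample autocorrelation on a made-up series that repeats every 4 steps. The function name is hypothetical. The value is strongly positive when the lag matches the cycle length and strongly negative at half a cycle.

```python
def autocorrelation(values, lag):
    """Correlation of a series with a copy of itself shifted by `lag` steps."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values)
    covariance = sum((values[i] - m) * (values[i + lag] - m)
                     for i in range(n - lag))
    return covariance / variance

series = [0, 1, 0, -1] * 5            # stable cycle of 4 steps
print(autocorrelation(series, 2))     # -0.9  (half a cycle: peaks meet troughs)
print(autocorrelation(series, 4))     # 0.8   (full cycle: peaks meet peaks)
```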
cross-correlation determines?
links between two different series that are offset (lagged) in time
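A simplified sketch of the idea: slide one series against the other and keep the lag where they match best. The score used here is a plain average of products, without removing the mean or normalising, which is an assumption made to keep it short. The data and the name `best_lag` are made up.

```python
def best_lag(x, y, max_lag):
    """Lag (in steps) at which y, shifted back, lines up best with x."""
    def score(lag):
        pairs = list(zip(x[:len(x) - lag], y[lag:]))
        return sum(a * b for a, b in pairs) / len(pairs)
    return max(range(max_lag + 1), key=score)

x = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]   # same pulse, two steps later
print(best_lag(x, y, 4))             # 2
```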
the aim is to remove the noise component to reveal?
- trend component
- periodic component
why do we smooth data?
- to reduce noise
- to find the trend component
- to find the periodic component