Part 1 : Data Acquisition and Characteristics Flashcards
Analogue to Digital conversion involves
Sampling and Quantisation.
Sampling
Measuring the momentary value of an analogue signal many times a second so as to convert the signal to digital form.
Quantisation
The process of mapping a large set of input values to a (countable) smaller set.
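The two steps can be illustrated with a minimal sketch. The function names, the 1 Hz sine input, the 8 Hz sampling rate and the 5 quantisation levels are all assumptions chosen for illustration, not values from these cards.

```python
import math

def sample(signal, rate_hz, duration_s):
    """Sampling: read the signal's value at evenly spaced instants."""
    n = int(rate_hz * duration_s)
    return [signal(k / rate_hz) for k in range(n)]

def quantise(value, levels, lo=-1.0, hi=1.0):
    """Quantisation: map a value in [lo, hi] to the nearest of `levels` evenly spaced values."""
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range inputs
    return lo + index * step

# Hypothetical input: a 1 Hz sine wave, sampled 8 times a second for 1 second.
samples = sample(lambda t: math.sin(2 * math.pi * t), rate_hz=8, duration_s=1.0)

# With 5 levels over [-1, 1] the only possible outputs are -1, -0.5, 0, 0.5 and 1.
digital = [quantise(s, levels=5) for s in samples]
```

Every continuous sample ends up on one of five allowed values, which is the "smaller set" the Quantisation card refers to.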
Nyquist-Shannon Sampling Theorem
If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.
Nyquist-Shannon Sampling Theorem (Layman's)
If the highest frequency in the signal is f(max), the sampling rate must be at least 2f(max).
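A small worked sketch of the rule follows. The helper names are hypothetical, and the second function uses the standard frequency-folding formula to show what happens to a pure tone when the rate is too low; the cards themselves do not cover that case.

```python
def min_sampling_rate(f_max_hz):
    """Lowest sampling rate allowed by the layman's rule: 2 * f(max)."""
    return 2 * f_max_hz

def apparent_frequency(f_hz, rate_hz):
    """Frequency at which a pure tone at f_hz appears after sampling at rate_hz.

    Assumption for illustration: the tone folds to the nearest multiple of the
    sampling rate, so anything above rate_hz / 2 shows up at a lower frequency.
    """
    return abs(f_hz - rate_hz * round(f_hz / rate_hz))

# A signal whose highest frequency is 20 kHz needs at least 40 kHz.
rate_needed = min_sampling_rate(20_000)

# A 7 Hz tone sampled at only 10 Hz (less than 2 * 7) is indistinguishable from 3 Hz.
alias = apparent_frequency(7, 10)
```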
A valid distance measure D(a,b) has the properties
- Non-negative: D(a,b) ≥ 0
- Reflexive: D(a,a) = 0
- Symmetric: D(a,b) = D(b,a)
- Satisfies the triangle inequality: D(a,c) ≤ D(a,b) + D(b,c)
Minkowski distance of order p (p-norm distance) is defined as
D(x,y) = (Σ|x(i) - y(i)|^p)^(1/p)
When p=1, 1-norm distance, Minkowski
(aka Manhattan)
D(x,y) = Σ|x(i) - y(i)|
When p=2, 2-norm distance, Minkowski
(aka Euclidean)
D(x,y) = ((x-y)^T(x-y))^(1/2)
When p=∞, ∞-norm distance, Minkowski
(aka Chebyshev)
D(x,y) = max(|x1-y1|, |x2-y2|,…,|xn-yn|)
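The three special cases can be computed with one function. This is a minimal sketch; the function name and the example points (0,0) and (3,4) are chosen here for illustration.

```python
import math

def minkowski(x, y, p):
    """p-norm distance between two equal-length, non-empty sequences of numbers."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)  # p = infinity: largest single coordinate difference
    return sum(d ** p for d in diffs) ** (1 / p)

x, y = (0, 0), (3, 4)
manhattan = minkowski(x, y, 1)         # |3| + |4|
euclidean = minkowski(x, y, 2)         # sqrt(3^2 + 4^2)
chebyshev = minkowski(x, y, math.inf)  # max(|3|, |4|)
```

For these two points the distances shrink as p grows: 7 for p=1, 5 for p=2 and 4 for p=∞.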
Time series
Successive measurements made over a time interval
(Numerical Time Series) P-norm distances
- Can only compare time series of the same length
- Are very sensitive to signal transformations:
- Shifting
- Uniform Amplitude Scaling
- Non-Uniform Amplitude Scaling
- Uniform Time Scaling
Dynamic Time Warping (Berndt and Clifford, 1994)
- Replaces the Euclidean one-to-one alignment with a many-to-one alignment
- Recognises similar shapes, even in the presence of shifting and/or scaling
- X = (x0,…,xn), Y = (y0,…,ym) and Rest(X) = (x1,…,xn)
- DTW(X,Y) = D(x0,y0) + min{DTW(X, Rest(Y)), DTW(Rest(X), Y), DTW(Rest(X), Rest(Y))}
- Solved efficiently using dynamic programming by building an n×m distance matrix
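The dynamic-programming idea can be sketched as follows. This version fills the table over prefixes rather than peeling elements off the front as the recursion does; the two formulations give the same value. The function name, the absolute-difference point distance and the example series are assumptions made for illustration.

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Dynamic time warping distance between two non-empty numeric sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the DTW distance between the first i items of x
    # and the first j items of y; row 0 and column 0 are padding.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(x[i - 1], y[j - 1]) + min(
                cost[i - 1][j],      # x advances, y[j-1] is matched again
                cost[i][j - 1],      # y advances, x[i-1] is matched again
                cost[i - 1][j - 1],  # both advance (one-to-one step)
            )
    return cost[n][m]

# Different lengths are fine: the repeated 2 is absorbed by a many-to-one match.
stretched = dtw([1, 2, 3], [1, 2, 2, 3])

# A series shifted up by one: the 1-norm distance would be 3, DTW gives 2.
shifted = dtw([1, 2, 3], [2, 3, 4])
```

The table has one cell per pair of points, so the running time grows with the product of the two lengths.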
(Symbolic Distance) Distance measures on text can be
- Syntactic
- Semantic
Syntactic
- Defined over symbolic data; the simplest measures require data of the same length
- Such measures count the number of substitutions required to change one string/number into another
Syntactic e.g. Hamming Distance
Returns the number of positions at which two equal-length sequences mismatch; the maximum is the sequence length.
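A minimal sketch of the mismatch count; the function name and the example strings are chosen here for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of the same length")
    return sum(1 for s, t in zip(a, b) if s != t)

# Positions 3, 4 and 5 differ (r/t, o/h, l/r); the other four match.
d = hamming("karolin", "kathrin")

# Every position differs, so the distance equals the length.
d_max = hamming("000", "111")
```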
Syntactic e.g. Edit Distance
Measures the minimum number of ‘operations’ required to transform one sequence to another
Syntactic e.g. Edit Distance - Operations
- Insertion
- Substitution
- Deletion
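One common way to compute this is the Levenshtein dynamic-programming scheme sketched below. It assumes every operation costs 1, which the cards do not state, and the function name is chosen here for illustration.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b.

    Assumption: each operation has unit cost (Levenshtein distance).
    """
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion of ca
                curr[j - 1] + 1,           # insertion of cb
                prev[j - 1] + (ca != cb),  # substitution, or free match
            ))
        prev = curr
    return prev[-1]

# kitten -> sitten -> sittin -> sitting: two substitutions and one insertion.
d = edit_distance("kitten", "sitting")
```

Unlike Hamming distance, this works on sequences of different lengths because insertions and deletions are allowed.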