03 - Speech Signal Representations Flashcards
What is the ‘Zero-Crossing Rate’?
It is the rate at which a signal changes sign, that is, how often it crosses the zero level (the horizontal axis) per unit of time or per frame.
When is it useful to check the Zero-Crossing Rate?
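The Zero-Crossing Rate gives a cheap, rough indication of the frequency content of a signal: a high rate suggests high-frequency or noise-like content, while a low rate suggests low-frequency content. In speech analysis it is often used to help distinguish voiced segments (low rate) from unvoiced segments (high rate).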
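A minimal sketch of the idea, assuming the rate is counted as the fraction of adjacent sample pairs whose signs differ (other normalisations, such as crossings per second, are also common). The function name `zero_crossing_rate` is illustrative, not taken from the course material.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in `frame` whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)  # sign change between neighbours
    )
    return crossings / (len(frame) - 1)


# Two sinusoids over 100 samples: 2 cycles versus 20 cycles.
low_freq = [math.sin(2 * math.pi * 2 * n / 100) for n in range(100)]
high_freq = [math.sin(2 * math.pi * 20 * n / 100) for n in range(100)]
```

Comparing `zero_crossing_rate(high_freq)` with `zero_crossing_rate(low_freq)` shows the higher-frequency signal crossing zero more often.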
What does the ‘Autocorrelation’ measure?
The similarity of a signal to a time-delayed copy of itself, as a function of the delay (lag).
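A small sketch of this measure, assuming the plain (unnormalised) sum-of-products definition over a finite signal. The name `autocorrelation` and the parameter `max_lag` are illustrative.

```python
import math


def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[lag] = sum over n of x[n] * x[n + lag]."""
    n = len(x)
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


# A sinusoid with a period of 20 samples.
periodic = [math.sin(2 * math.pi * n / 20) for n in range(200)]
r = autocorrelation(periodic, 30)
# Strongest lag away from zero; for a periodic signal this sits at the period.
best_lag = max(range(10, 31), key=lambda lag: r[lag])
```

This peak at the period is why the autocorrelation is commonly used for pitch estimation in speech.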
What does the term ‘Windowing’ refer to?
Answer from research:
Windowing is the process of multiplying a frame by a window function so that the amplitude of the signal is gradually reduced towards the edges of the frame. It is applied to short frames so that each one can be analysed as if it were quasi-stationary.
From the slides (I partially disagree with this):
It is the process of splitting the input signal into temporal segments within which the signal can be considered quasi-stationary. That is, the segments are not truly stationary, but for the purposes of the analysis they can be treated as such.
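Both views of windowing can be sketched together: first the signal is split into frames, then each frame is multiplied by a tapering function. This is an illustrative sketch using a symmetric Hamming window; the helper names `hamming`, `frame_signal` and `apply_window` are assumptions, not names from the slides.

```python
import math


def hamming(length):
    """Symmetric Hamming window; assumes length >= 2."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def frame_signal(x, frame_len, hop):
    """Split x into frames of frame_len samples taken every hop samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [
        x[start:start + frame_len]
        for start in range(0, len(x) - frame_len + 1, hop)
    ]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


frames = frame_signal(list(range(10)), frame_len=4, hop=2)
windowed = [apply_window(f, hamming(4)) for f in frames]
```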
What does the term ‘Frame’ refer to?
It is a short, contiguous segment of a signal that is isolated for analysis and processing.
What is the Overlap-add algorithm?
It is an algorithm that describes the process of recombining overlapping frames of a signal after some processing.
What are the steps of the Overlap-add algorithm (think LAB 2)?
- The input signal is split into frames of a fixed frame length, taken at a fixed interval (the hop length)
- We compute an output frame from each input frame
- We apply a window to each output frame (for example Hamming)
- We recombine the overlapping output frames by adding them together at their original positions
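The steps above can be sketched as follows. This is not the LAB 2 code: the per-frame processing is a placeholder (`process`, identity by default), and dividing by the summed window at the end is an added assumption so that the windowing itself does not change the signal level.

```python
import math


def hamming(length):
    """Symmetric Hamming window; assumes length >= 2."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def overlap_add(x, frame_len, hop, process=lambda frame: frame):
    """Frame x, process each frame, window the result and add it back."""
    window = hamming(frame_len)
    out = [0.0] * len(x)
    norm = [0.0] * len(x)  # summed window weights per sample
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]        # step 1: take a frame
        processed = process(frame)                # step 2: compute the output
        for n in range(frame_len):
            out[start + n] += processed[n] * window[n]  # steps 3 and 4
            norm[start + n] += window[n]
    # Assumption: normalise by the accumulated window (not part of the card).
    return [o / w if w > 0 else 0.0 for o, w in zip(out, norm)]


signal = [math.sin(0.1 * n) for n in range(64)]
rebuilt = overlap_add(signal, frame_len=16, hop=8)
```

With the identity as `process`, `rebuilt` matches `signal` up to rounding, which is a handy sanity check before plugging in real processing.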
Why do we use the Overlap-add algorithm?
It allows efficient computation of the convolution of a long signal with a filter: the signal is broken into shorter segments, each segment is convolved separately, and the overlapping results are added together.
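A sketch of block-wise convolution under simple assumptions: the blocks are convolved directly here for clarity, whereas practical implementations usually convolve each block with an FFT. The names `convolve` and `ola_convolve` are illustrative.

```python
def convolve(x, h):
    """Direct linear convolution of two non-empty sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def ola_convolve(x, h, block_len):
    """Convolve x with h block by block, adding the overlapping tails."""
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        segment = convolve(block, h)  # len(block) + len(h) - 1 samples
        for n, value in enumerate(segment):
            y[start + n] += value     # the tail overlaps the next block
    return y


x = [float(n) for n in range(1, 11)]
h = [1.0, -1.0, 0.5]
direct = convolve(x, h)
blockwise = ola_convolve(x, h, block_len=4)
```

`blockwise` equals `direct` up to rounding, because convolution is linear and each block contributes independently.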
What is the Short-Time Fourier Transform?
The short-time Fourier transform is a type of Fourier analysis used to determine the frequency content of a signal over short, fixed-length time intervals.
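A minimal sketch, assuming a Hann window and a direct (slow) DFT so that no external library is needed; real code would use an FFT. The names `dft_half` and `stft` are illustrative.

```python
import cmath
import math


def dft_half(frame):
    """DFT bins 0..N/2 of a real frame, computed directly in O(N^2)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(x, frame_len, hop):
    """Return one list of complex DFT bins per windowed frame."""
    # Periodic Hann window (an assumption; other windows work as well).
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        segment = [x[start + n] * window[n] for n in range(frame_len)]
        frames.append(dft_half(segment))
    return frames


# A sinusoid with exactly 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spectrum_frames = stft(tone, frame_len=32, hop=16)
```

Each entry of `spectrum_frames` describes the frequency content of one short interval; for this tone the largest magnitude sits in bin 4 of every frame.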
What is a ‘Spectrogram’?
A spectrogram is a visual representation of the frequency content of a signal over time. It displays the intensity of each frequency as it changes over time, typically as a heat map or color map. The x-axis is time and the y-axis is frequency (Hz).
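The values behind such a plot are usually the magnitudes of a short-time Fourier transform on a decibel scale. A sketch of that conversion, assuming the input is a list of frames of complex bins; the name `magnitude_db` and the `floor` parameter are illustrative, and the plotting itself is left out.

```python
import math


def magnitude_db(stft_frames, floor=1e-10):
    """Convert complex STFT bins to magnitudes in dB.

    `floor` avoids taking the logarithm of zero for silent bins.
    """
    return [
        [20.0 * math.log10(max(abs(value), floor)) for value in frame]
        for frame in stft_frames
    ]


# Two tiny frames of two bins each, just to show the mapping.
example = magnitude_db([[1 + 0j, 0j], [10 + 0j, 0.1 + 0j]])
```

In a spectrogram, each inner list becomes one column along the time axis and each position within it one row along the frequency axis.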
What is a Source-Filter Model?
In speech signal processing, the source-filter model is a mathematical model that represents the speech signal as the output of a sound source (the excitation) passed through a linear acoustic filter (the vocal tract).
What is a ‘Second Order All-Pole Filter’?
It is a system whose transfer function has two poles and no zeros (apart from possible zeros at the origin), so the numerator is a constant.
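As a sketch, such a filter can be written as the difference equation y[n] = x[n] - a1*y[n-1] - a2*y[n-2], which corresponds to H(z) = 1 / (1 + a1*z^-1 + a2*z^-2). The sign convention and the helper names `all_pole_2` and `resonator_coeffs` are assumptions made for this example.

```python
import math


def all_pole_2(x, a1, a2):
    """Filter x with H(z) = 1 / (1 + a1 z^-1 + a2 z^-2)."""
    y = []
    y1 = y2 = 0.0  # previous two output samples
    for sample in x:
        out = sample - a1 * y1 - a2 * y2
        y.append(out)
        y2, y1 = y1, out
    return y


def resonator_coeffs(freq_hz, radius, fs):
    """Coefficients for a complex-conjugate pole pair at radius * exp(+-j*theta)."""
    theta = 2 * math.pi * freq_hz / fs
    return -2 * radius * math.cos(theta), radius * radius


# Impulse response of a resonance near 500 Hz at an 8 kHz sampling rate.
a1, a2 = resonator_coeffs(500.0, 0.95, 8000.0)
impulse_response = all_pole_2([1.0] + [0.0] * 49, a1, a2)
```

A pole radius below 1 keeps the filter stable; the closer it is to 1, the sharper the resonance, which is how a single formant is often modelled.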
What is ‘Linear Prediction’?
Linear prediction is a method that tries to predict a signal sample using a linear combination of the signal’s past samples.
What is the ‘residue’ of Linear Prediction?
It is the prediction error, sometimes referred to as the residuals in other models.
What does Linear Prediction Optimization refer to?
The minimization of the energy of the prediction error.
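The three cards above can be tied together in one sketch: estimate the coefficients that minimise the prediction-error energy, then compute the residual. This assumes the autocorrelation method solved with the Levinson-Durbin recursion, which is one common choice and not necessarily the one used in the course. All function names are illustrative.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of a finite signal."""
    return [
        sum(x[i] * x[i + k] for i in range(len(x) - k))
        for k in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve for a[1..p] in x_hat[n] = sum_k a[k] * x[n-k].

    Returns the coefficients and the minimised prediction-error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


def residual(x, coeffs):
    """Prediction error e[n] = x[n] - sum_k coeffs[k-1] * x[n-k]."""
    return [
        x[n] - sum(c * x[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        for n in range(len(x))
    ]


# Test signal: impulse response of x[n] = 1.2 x[n-1] - 0.6 x[n-2] + impulse.
x = []
for n in range(200):
    prev1 = x[n - 1] if n >= 1 else 0.0
    prev2 = x[n - 2] if n >= 2 else 0.0
    x.append(1.2 * prev1 - 0.6 * prev2 + (1.0 if n == 0 else 0.0))

coeffs, error_energy = levinson_durbin(autocorr(x, 2), 2)
e = residual(x, coeffs)
```

For this signal the estimated `coeffs` come out close to 1.2 and -0.6, and the residual is essentially the original impulse, illustrating how the residual approximates the excitation in the source-filter view.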