signal processing - final exam Flashcards
power spectra
shows individual freqs & their amps
used to identify harmonics or component freqs in sound
spectrograms
visual representation of the spectrum of freqs in a sound signal as it varies w/ time
wideband spectrograms
good for viewing formants
good temporal resolution but less freq detail
narrowband spectrograms
good for viewing harmonics
detailed freq info but less temporal precision
Nyquist frequency
highest freq that can be captured w/ a given sampling rate
1/2 the sampling rate
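A quick worked example of the rule above. The function names and the 44100 Hz sampling rate are assumptions chosen for illustration only.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency that can be captured: half the sampling rate."""
    return sampling_rate_hz / 2


def can_be_captured(freq_hz, sampling_rate_hz):
    """True if the frequency lies below the Nyquist frequency."""
    return freq_hz < nyquist_frequency(sampling_rate_hz)


print(nyquist_frequency(44100))       # 22050.0
print(can_be_captured(20000, 44100))  # True
print(can_be_captured(30000, 44100))  # False
```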
fast Fourier transform (FFT)
calculates the spectrum of one window of a sound wave
finds the freqs & amps of the simple waves that make up the complex wave
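A minimal sketch of what the transform reports for one window. To stay dependency-free it uses a plain (slow) discrete Fourier transform, which gives the same spectrum an FFT would, just less efficiently. The test wave, sampling rate, and function name are assumptions for illustration.

```python
import cmath
import math


def dft_amplitudes(window):
    """Amplitude of each frequency component in one window (naive DFT)."""
    n = len(window)
    amps = []
    for k in range(n // 2 + 1):
        total = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        # Bins other than 0 Hz and Nyquist share their energy with a mirror bin.
        scale = 1 if (k == 0 or 2 * k == n) else 2
        amps.append(scale * abs(total) / n)
    return amps


sr = 800  # sampling rate in Hz (assumed)
n = 80    # window length in samples, so bins are sr / n = 10 Hz apart

# Complex wave = 100 Hz simple wave (amp 1.0) + 200 Hz simple wave (amp 0.5)
wave = [1.0 * math.sin(2 * math.pi * 100 * t / sr)
        + 0.5 * math.sin(2 * math.pi * 200 * t / sr)
        for t in range(n)]

amps = dft_amplitudes(wave)
peaks = [(k * sr / n, round(a, 3)) for k, a in enumerate(amps) if a > 0.01]
print(peaks)  # [(100.0, 1.0), (200.0, 0.5)]
```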
short vs long windows FFT
short - lowest freq you can get is high because only a short cycle (= high freq) fits completely in the window
wideband
long - a lower freq fits in the window, so the lowest freq you can get is lower
narrowband
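The trade-off can be put in numbers: the lowest non-zero freq an FFT can report, which is also the spacing between its freq bins, is the sampling rate divided by the window length in samples. The sketch below assumes a 16000 Hz sampling rate and 5 ms / 40 ms windows purely as examples; the function names are illustrative.

```python
def window_samples(sampling_rate_hz, window_ms):
    """Number of samples in a window of the given duration."""
    return round(sampling_rate_hz * window_ms / 1000)


def bin_spacing_hz(sampling_rate_hz, n_samples):
    """Lowest non-zero freq (= spacing between freq bins) for this window."""
    return sampling_rate_hz / n_samples


sr = 16000  # assumed sampling rate
for ms in (5, 40):
    n = window_samples(sr, ms)
    print(ms, "ms window:", n, "samples,", bin_spacing_hz(sr, n), "Hz per bin")
# 5 ms window: 80 samples, 200.0 Hz per bin   (short = wideband)
# 40 ms window: 640 samples, 25.0 Hz per bin  (long = narrowband)
```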
root mean square (RMS)
used for measuring intensity - samples are both negative & positive, so squaring them first keeps them from canceling each other out & averaging to 0
- square each sample in the window
- take the mean
- take the square root of the mean (to undo squaring)
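The three steps above, sketched directly in code. The sample values are made up to show why a plain average fails.

```python
import math


def rms(window):
    squares = [s * s for s in window]   # square each sample in the window
    mean = sum(squares) / len(squares)  # take the mean
    return math.sqrt(mean)              # take the square root of the mean


samples = [3, -3, 3, -3]
print(sum(samples) / len(samples))  # 0.0  (positives & negatives cancel)
print(rms(samples))                 # 3.0
```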
RMS tracking
over time, in each window
as window size increases –> amp trace becomes smoother
BUT temporal accuracy reduced = lose ability to track sudden changes
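A small sketch of the window-size trade-off, assuming non-overlapping windows and a made-up signal (silence, a sudden burst, silence). Real analysis tools often use overlapping windows; that detail is left out here.

```python
import math


def rms(window):
    return math.sqrt(sum(s * s for s in window) / len(window))


def rms_track(signal, window_size):
    """RMS of each consecutive, non-overlapping window."""
    last_start = len(signal) - window_size
    return [rms(signal[i:i + window_size])
            for i in range(0, last_start + 1, window_size)]


# 8 samples of silence, an 8-sample burst at amplitude 1, 8 samples of silence
signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8

print(rms_track(signal, 4))
# [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  -> burst onset & offset clearly visible

print([round(v, 3) for v in rms_track(signal, 12)])
# [0.577, 0.577]  -> smoother trace, but the sudden change is lost
```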
autocorrelation
used to measure pitch
- pick some interval of the speech signal
- copy it
- shift the copy over sample by sample to see how well it correlates w/ the og wave
- after some predetermined # of shifts, stop & see which correlation is best
best autocorrelation
always lag 0 - where the copy lines up exactly w/ the og wave (no shift yet)
choose the next best correlation at a lag of 1 cycle = pitch period
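A minimal sketch of the procedure, assuming a synthetic 200 Hz wave sampled at 8000 Hz (so one pitch period is 40 samples). The function names, interval length, and lag range are illustrative choices, not a standard API.

```python
import math


def correlation_at_lag(signal, interval_len, lag):
    """Correlate the first interval_len samples w/ a copy shifted by lag."""
    total = sum(signal[i] * signal[i + lag] for i in range(interval_len))
    return total / interval_len


def pitch_period(signal, interval_len, min_lag, max_lag):
    """Lag (in samples) w/ the best correlation; min_lag > 0 skips lag 0."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: correlation_at_lag(signal, interval_len, lag))


sr = 8000
f0 = 200  # true pitch, so the true period is sr / f0 = 40 samples
signal = [math.sin(2 * math.pi * f0 * t / sr)
          + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / sr)
          for t in range(400)]

period = pitch_period(signal, interval_len=200, min_lag=20, max_lag=60)
print(period, sr / period)  # 40 200.0
```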
pitch doubling
when 1 cycle of the waveform has 2 halves that look roughly the same
or if the longest lag tested is < 1 full pitch period
period incorrectly estimated as half the real period
so pitch is double the real pitch
pitch halving
when the shortest lag is after the end of the 1st pitch period (too long)
you skip the lag 0 (good) but also skip a lag of 1 pitch period (bad)
next best correlation is at a lag of 2 pitch periods
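Both errors (pitch doubling & pitch halving) can be reproduced by changing only the range of lags that gets searched. The sketch below assumes a synthetic 200 Hz wave whose two half-cycles look roughly the same (weak fundamental, strong 2nd harmonic); all names and numbers are illustrative.

```python
import math


def correlation_at_lag(signal, interval_len, lag):
    total = sum(signal[i] * signal[i + lag] for i in range(interval_len))
    return total / interval_len


def estimate_pitch(signal, sr, interval_len, min_lag, max_lag):
    """Pitch in Hz from the best-correlating lag inside [min_lag, max_lag]."""
    best = max(range(min_lag, max_lag + 1),
               key=lambda lag: correlation_at_lag(signal, interval_len, lag))
    return sr / best


sr = 8000
f0 = 200  # true pitch; true period = 40 samples
signal = [0.3 * math.sin(2 * math.pi * f0 * t / sr)
          + 1.0 * math.sin(2 * math.pi * 2 * f0 * t / sr)
          for t in range(400)]

# Lag range includes 1 full period (40 samples): correct pitch
print(estimate_pitch(signal, sr, 200, 30, 50))   # 200.0
# Longest lag (30) < 1 period: best match is half a period = pitch doubling
print(estimate_pitch(signal, sr, 200, 10, 30))   # 400.0
# Shortest lag (50) > 1 period: best match is 2 periods = pitch halving
print(estimate_pitch(signal, sr, 200, 50, 100))  # 100.0
```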
linear predictive coding (LPC)
gives a smoothed spectrum (the overall envelope, w/o the individual harmonic peaks)
separates the source from the filter (factors out harmonics, leaving only resonances)
the algorithm tries to find a filter, represented by a set of numbers (coefficients), that best describes how the harmonics were filtered
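A heavily simplified sketch of the idea, not a full LPC implementation: it fits only 2 coefficients (enough for one resonance), whereas real speech analysis uses a higher order to cover several formants. The source is an assumed impulse train at 100 Hz (rich in harmonics) passed through an assumed single resonance at 1000 Hz; the coefficients are then estimated from the output alone. All names and values are illustrative.

```python
import math


def autocorr(frame, lag):
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def lpc_order2(frame):
    """Least-squares fit of x[n] ~ c1 * x[n-1] + c2 * x[n-2]."""
    r0, r1, r2 = (autocorr(frame, k) for k in range(3))
    det = r0 * r0 - r1 * r1
    c1 = r1 * (r0 - r2) / det
    c2 = (r0 * r2 - r1 * r1) / det
    return c1, c2


def resonance_hz(c1, c2, sr):
    """Center freq of the resonance described by the 2 coefficients."""
    radius = math.sqrt(-c2)
    angle = math.acos(c1 / (2 * radius))
    return angle * sr / (2 * math.pi)


sr = 8000
true_freq, bandwidth = 1000, 300  # the filter (assumed values)
radius = math.exp(-math.pi * bandwidth / sr)
c1_true = 2 * radius * math.cos(2 * math.pi * true_freq / sr)
c2_true = -radius ** 2

# Source: impulse train w/ a 100 Hz pitch (one pulse every 80 samples)
source = [1.0 if t % 80 == 0 else 0.0 for t in range(800)]

# Pass the source through the filter to get the "speech" signal
speech = []
for t, x in enumerate(source):
    y1 = speech[t - 1] if t >= 1 else 0.0
    y2 = speech[t - 2] if t >= 2 else 0.0
    speech.append(x + c1_true * y1 + c2_true * y2)

c1, c2 = lpc_order2(speech)
print(round(resonance_hz(c1, c2, sr)))  # close to 1000, the resonance
```

The estimate lands on the filter's resonance rather than on the 100 Hz spacing of the harmonics, which is the sense in which the source is factored out.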
FFT vs LPC
FFTs show harmonics, but don’t tell you specifically about formants
LPCs show resonances but not individual harmonics