Speech Acoustic Measurement & Analysis Flashcards
Sound Spectrograph
Developed during WWII in the course of seeking methods to encode and decode messages
Able to display formants as a continuous function of time
Spectrum Analysis
The spectrum analyzer performs its analysis by moving a fixed-width analysis band, or filter, across the entire frequency range.
▪ If the analysis band is swept continuously across the entire frequency range, it will provide overall voltage outputs as a continuous function of frequency
Schematic diagram showing how the spectrograph performs spectral analysis by sweeping an analysis band of fixed width (e.g., 300 Hz) across the frequency range of interest, and recording the average voltages from the analysis band as a continuous function of frequency
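The band-sweeping idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the original instrument used an analog filter and voltages, whereas this sketch averages DFT magnitudes inside a sliding 300 Hz band. The function names, the 100 Hz step, and the synthetic two-tone test signal are all hypothetical choices made for the example.

```python
import cmath
import math

def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies, magnitudes) for the non-negative DFT bins."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc) / n)
    return freqs, mags

def sweep_band(freqs, mags, bandwidth_hz=300.0, step_hz=100.0):
    """Slide a fixed-width band across the spectrum.

    Returns (band center in Hz, average magnitude inside the band) pairs,
    standing in for the average voltage recorded from the analysis band.
    """
    outputs = []
    center = bandwidth_hz / 2
    while center + bandwidth_hz / 2 <= freqs[-1]:
        lo, hi = center - bandwidth_hz / 2, center + bandwidth_hz / 2
        in_band = [m for f, m in zip(freqs, mags) if lo <= f < hi]
        outputs.append((center, sum(in_band) / len(in_band) if in_band else 0.0))
        center += step_hz
    return outputs

# Assumed test signal: a 500 Hz tone plus a weaker 1500 Hz tone.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 500 * t / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 1500 * t / sample_rate)
          for t in range(400)]

freqs, mags = magnitude_spectrum(signal, sample_rate)
band_outputs = sweep_band(freqs, mags)
strongest_center = max(band_outputs, key=lambda pair: pair[1])[0]
```

Bands that overlap 500 Hz give the largest output, bands that overlap 1500 Hz give a smaller one, and bands containing neither tone give essentially zero, which is the "output as a function of frequency" the card describes.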
The Original Sound Spectrograph
The invention of the spectrograph allowed for the study of the time-varying acoustic results of articulatory processes
▪ If articulator movements are changing as a function of time, and therefore changing vocal tract configuration as a function of time, the changes are reflected in formant transitions—formant frequencies that change over time
The time-varying patterns of electromagnetic strength are submitted to a spectrum analyzer in the form of time-varying voltages, where voltage is proportional to sound intensity (greater voltage = greater intensity) and the speed with which the voltage changes is proportional to frequency (faster voltage changes [shorter periods] = higher frequency)
The energy in the spectrum is sampled using an analysis band, or filter, that has a bandwidth of 300 Hz that is swept continuously across the entire frequency range of interest.
Because the voltage output from the analysis band is available for all frequencies and at every point in time, the spectrograph creates a total picture of the
Digital Spectrograms
Produced almost instantaneously after an utterance has been recorded.
▪ The principles previously discussed for the original spectrograph are basically the same for digital spectrograms; the frequency and amplitude analyses are performed by moving a digital filter from low to high frequencies
▪ The output is a digital magnitude that is proportional to amplitude as a function of frequency.
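As a rough illustration of the digital version, the sketch below cuts a recording into short frames and, for each frame, computes a magnitude at each frequency from low to high. It assumes a plain DFT with a Hann window in place of whatever digital filter a given analysis program actually uses. The frame length, hop size, function names, and the synthetic signal are hypothetical choices for the example.

```python
import cmath
import math

def frame_magnitudes(frame):
    """Magnitude at each DFT bin of one frame, ordered low to high frequency."""
    n = len(frame)
    # Hann window tapers the frame edges (an assumption, not from the cards).
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2 + 1)]

def spectrogram(samples, frame_len=80, hop=40):
    """One row of magnitudes per time frame (rows = time, columns = frequency)."""
    return [frame_magnitudes(samples[start:start + frame_len])
            for start in range(0, len(samples) - frame_len + 1, hop)]

def tone(freq_hz, n_samples, sample_rate):
    """Synthetic sine tone used only as test input."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

# Assumed test signal: 500 Hz for 50 ms, then 2000 Hz for 50 ms.
sample_rate = 8000
frame_len = 80
signal = tone(500, 400, sample_rate) + tone(2000, 400, sample_rate)

rows = spectrogram(signal, frame_len=frame_len, hop=40)
# Frequency of the strongest bin in each frame.
peak_hz = [row.index(max(row)) * sample_rate / frame_len for row in rows]
```

Reading `peak_hz` from first to last frame shows the energy moving from 500 Hz to 2000 Hz, the same left-to-right change a spectrogram displays.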
Interpretation of Spectrograms
Important features of the spectrographic display include:
the x-, y-, and z-axes
glottal pulses
formant frequencies
silent intervals
stop bursts
aperiodic intervals.
The spectrogram shows a series of chunks, or segments, as the pattern is inspected from left to right.
The chunks, or segments, are important because they often correspond roughly to speech sounds.
Axes
X-axis: time
Y-axis: frequency
Z-axis: intensity (third dimension of the spectrogram)
coded by the darkness of the pattern at any point on the spectrographic display
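The darkness coding of the z-axis can be sketched as a mapping from magnitude to a gray level. This is a minimal illustration, not how any particular spectrograph or program does it: the decibel scaling, the 50 dB dynamic range, the 0 to 255 gray scale, and the function name are all assumptions made for the example.

```python
import math

def darkness_levels(magnitudes, dynamic_range_db=50.0):
    """Map magnitudes to gray levels: 0 = black (most intense), 255 = white."""
    peak = max(magnitudes)
    levels = []
    for m in magnitudes:
        if m <= 0 or peak <= 0:
            db = -dynamic_range_db  # treat silence as the floor of the range
        else:
            # Level relative to the strongest point, clipped at the floor.
            db = max(20 * math.log10(m / peak), -dynamic_range_db)
        darkness = 1 + db / dynamic_range_db  # 1 = darkest, 0 = lightest
        levels.append(round(255 * (1 - darkness)))
    return levels

# Strongest point, a point 20 dB weaker, and silence.
levels = darkness_levels([1.0, 0.1, 0.0])
```

The strongest point comes out black, weaker points come out progressively lighter, and silence comes out white.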
Glottal Pulses
Dark vertical lines that are an acoustic result of vocal fold vibration
Each individual line reflects a single glottal pulse–a point of excitation, when the vocal folds close quickly at the end of a glottal cycle and create a pressure wave whose spectrum is shaped by the vocal tract filter.
Formant Frequencies
The dark bands seen in the patterns with regularly spaced glottal pulses. This pattern is seen for any speech sound produced with a relatively open vocal tract and voicing, including vowels, diphthongs, and semivowels (/l/, /w/, /ɹ/, /j/)
▪ Nasals are voiced and radiate sound
The formant frequencies change as a function of time in connected speech. The constant movement of the formants during speech production reflects the constant change in the configuration of the vocal tract.
Silent Intervals and Stop Bursts
Appears as a brief blank spot, or gap, on the spectrogram
▫ Because the vocal tract is, in theory, completely sealed during the closure, acoustic energy should not be radiated from the vocal tract
▪ If intensity is scaled on a spectrogram as the darkness of the trace, a white or nearly white segment indicates a
For voiced stops, there is a small amount of periodic energy on the baseline due to vocal fold vibration during vocal tract closure
▪ Vocal fold vibration during a closure interval causes vibration of the walls of the vocal tract
▪ This energy is seen only in the lowest frequencies of the spectrogram because the walls of the vocal tract vibrate only at the lowest frequencies of vocal fold vibration, filtering out the higher source harmonics
Pauses vs Stop Closure
Pauses in speech are typically at least 150 ms and voiceless stop closure intervals are typically less than 120 ms.
▪ Many scientists have adopted a criterion of 200 ms: silent intervals of 200 ms or greater are identified as pauses; those shorter than 200 ms are subject to further evaluation
▫ This criterion is less reliable in certain speech disorders
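The 200 ms criterion amounts to a simple threshold rule, sketched below. The function name and the label strings are hypothetical. The threshold is left as a parameter because, as noted above, the criterion is less reliable in certain speech disorders.

```python
def classify_silent_interval(duration_ms, pause_criterion_ms=200.0):
    """Label a silent interval using a duration criterion (default 200 ms)."""
    if duration_ms >= pause_criterion_ms:
        return "pause"
    # Shorter gaps may be stop closures, but duration alone does not decide it.
    return "needs further evaluation"

# Example durations in milliseconds.
labels = [classify_silent_interval(d) for d in (90, 200, 350)]
```

A 90 ms gap is flagged for further evaluation, since it may be a stop closure, while gaps of 200 ms and 350 ms are labeled as pauses.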