Speech Recognition Flashcards
what is acoustic phonetics?
study of the physical properties of speech
what is sound?
a vibration that propagates as an acoustic wave
(described by how its characteristics are perceived, e.g. frequency as pitch and amplitude as loudness)
what is frequency?
the number of complete cycles a sound wave makes per second (one cycle runs from the highest point, through the lowest point, and back to the highest), measured in hertz (Hz).
what is amplitude?
the height of the wave
the taller the wave, the louder the sound
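A minimal sketch of the two cards above, using only the Python standard library. The function names, the 8000 Hz sample rate, and the 100 Hz test tone are illustrative assumptions, not part of the original notes. It builds a pure tone and then recovers its frequency (cycles per second) and its amplitude (peak height).

```python
import math

def sine_wave(freq_hz, amplitude, duration_s=1.0, sample_rate=8000):
    """Sample a pure tone: `amplitude` sets the height, `freq_hz` the cycles per second."""
    n_samples = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def count_cycles(samples):
    """Count upward zero crossings; each complete cycle contributes exactly one."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a <= 0 < b)

def peak_amplitude(samples):
    """Height of the wave: the largest distance from zero."""
    return max(abs(s) for s in samples)

# One second of a hypothetical 100 Hz tone with amplitude 0.5.
tone = sine_wave(freq_hz=100, amplitude=0.5)
cycles_per_second = count_cycles(tone)
height = peak_amplitude(tone)
```

Because the tone lasts one second, the cycle count equals the frequency in Hz. Raising `amplitude` makes the wave taller without changing the cycle count.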
what is a sound spectrogram?
a visual representation of the spectrum of frequencies in a sound as it changes over time
axes of a sound spectrogram?
frequency of sound on vertical axis, time on horizontal axis, intensity shown by darkness
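A rough sketch of how the three dimensions of a spectrogram can be computed, assuming a short-time Fourier transform with a Hann window. The frame size, hop size, sample rate, and test tone are illustrative assumptions, and the plain-Python DFT is chosen for readability rather than speed. Each frame is one step along the time axis, each bin is one step along the frequency axis, and the magnitude is what a plot would show as darkness.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (per frequency bin) for each time frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
                    for n, s in enumerate(frame)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):  # bins from 0 Hz up to half the sample rate
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(windowed))
            magnitudes.append(abs(acc))  # intensity at this time and frequency
        frames.append(magnitudes)
    return frames

SAMPLE_RATE = 8000  # assumed for the example
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]
spec = spectrogram(tone)

# Most intense bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
```

For this steady 1000 Hz tone, the most intense bin stays at the same frequency in every frame, which would appear as a single horizontal dark band.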
sound spectrogram formants?
dark bands (i.e., the frequencies with the most intensity)
▪ Steady state formant (stays the same over time)
▪ Formant transitions (change over time)
problems posed for speech recognition?
- lack of invariance
- speaker variability
- segmentation problem
what is lack of invariance?
no one-to-one correspondence between acoustic speech cues and the sounds we perceive
what is the speaker variability problem?
the production of speech sounds differs across people and across occasions
what is the segmentation problem?
people typically do not leave breaks between words when speaking, so word boundaries are not marked in the speech signal.
what is categorical perception?
we do not discriminate between sounds within a phonemic category
- ex: we classify each speech sound as one phoneme or another, not as something in between
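A toy sketch of the idea above. The cue scale, the boundary value of 30, and the category labels "A" and "B" are all hypothetical, chosen only to illustrate the pattern. Two sounds that differ by the same physical amount are treated as the same when they fall inside one category and as different when they straddle the boundary.

```python
BOUNDARY = 30.0  # hypothetical category boundary on an arbitrary acoustic cue scale

def categorize(cue_value):
    """Map a continuous cue value onto one of two phoneme categories."""
    return "A" if cue_value < BOUNDARY else "B"

def heard_as_different(cue_1, cue_2):
    """In this simplified model, two sounds are told apart only if their categories differ."""
    return categorize(cue_1) != categorize(cue_2)

# Both pairs differ by 10 units on the cue scale.
within_category = heard_as_different(10.0, 20.0)   # both fall in category "A"
across_boundary = heard_as_different(25.0, 35.0)   # one on each side of the boundary
```

This all-or-none model is a simplification: it ignores any residual ability to hear within-category differences.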
modularity (revisited) and categorical perception?
- Some people have taken categorical perception as evidence for a speech perception module
- but chinchillas also show categorical perception, which suggests it is not specific to a human speech module
what are some speech segmentation strategies?
- possible word constraint: tendency to segment speech so that each segment is a possible word
- Bilingual speakers tend to use strategies that are consistent with their dominant language
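A simplified sketch of the possible word constraint listed above. The toy lexicon and its made-up phoneme-like spellings are hypothetical, and "possible word" is approximated here as "appears in the lexicon", which is narrower than the real constraint. The function keeps only those segmentations of an unbroken stream in which every segment is a word, and rejects any parse that would leave a non-word residue.

```python
# Hypothetical lexicon with made-up phoneme-like spellings (an assumption for illustration).
LEXICON = {"ai", "ais", "krim", "skrim"}  # roughly: "I", "ice", "cream", "scream"

def segmentations(stream, lexicon):
    """Return every way to split `stream` so that each segment is in `lexicon`."""
    if not stream:
        return [[]]  # one valid parse of the empty stream: no segments
    parses = []
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in lexicon:
            # Only continue when the rest can also be fully segmented.
            for rest in segmentations(stream[end:], lexicon):
                parses.append([head] + rest)
    return parses

parses = segmentations("aiskrim", LEXICON)      # stream with no breaks between words
no_parse = segmentations("aisk", LEXICON)       # any split leaves a non-word residue
```

The first stream yields two parses (comparable to "I scream" and "ice cream"), which also illustrates why segmentation is a problem in the first place. The second yields none.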
how does context affect speech recognition? when are people better at identifying words?
- people are better at identifying words when presented in sentences than when presented in isolation
- speech recognition involves both bottom-up processing (driven by the speech signal) and top-down processing (driven by context and knowledge)
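One way to picture the combination of bottom-up and top-down processing is as a simple Bayesian update. This is an illustrative assumption, not a model named in these notes, and the words and probabilities below are invented. The acoustic evidence supplies a likelihood for each candidate word, the sentence context supplies a prior, and the two are multiplied and normalized.

```python
def recognize(acoustic_likelihood, context_prior):
    """Combine bottom-up evidence with top-down expectations into word probabilities."""
    scores = {word: acoustic_likelihood[word] * context_prior.get(word, 0.0)
              for word in acoustic_likelihood}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no candidate word is supported by both signal and context")
    return {word: score / total for word, score in scores.items()}

# Hypothetical ambiguous signal: equally consistent with either word.
acoustics = {"pear": 0.5, "bear": 0.5}

# In isolation there is no context, so both words are equally expected.
in_isolation = recognize(acoustics, {"pear": 0.5, "bear": 0.5})

# In a sentence about eating fruit (invented prior), context favors "pear".
in_sentence = recognize(acoustics, {"pear": 0.9, "bear": 0.1})
```

With the same ambiguous signal, the word is a coin flip in isolation but is resolved by the sentence context, in line with the finding on the card above.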