Speech Recognition Flashcards
Segmentation is one problem listeners face with a speech signal. What is it?
Dividing the speech input into phonemes (units of sound) and words
What is coarticulation?
The speech signal is variable because the pronunciation of a phoneme depends on the speaker’s pronunciation of the preceding and following phonemes
What are individual differences in speech signals due to?
Differences in speakers’ sex, dialect and speaking rate
How many phonemes do speakers typically produce per second?
Around 10, so listeners must identify what is being said very rapidly. Non-native speakers often produce many speech errors, which adds to the difficulty
What is energetic masking?
The speech signal is hard to perceive because of distracting sounds, e.g. other speakers
What are the 3 cue categories Mattys et al. identified in the hierarchical approach to segmentation?
- Lexical (e.g. syntax, word knowledge) -> used if listening conditions are good
- Segmental (e.g. coarticulation)
- Metrical prosody (e.g. word stress) -> used only if other cues cannot be used
How do listeners cope with speaker variability?
Listeners use speaker characteristics, e.g. an American accent, to form a speaker model. This speaker model then influences how listeners interpret the speech signal
add in info from graph