Lecture 6 Flashcards
Describe the main steps in text-to-speech synthesis
- Text-to-phoneme conversion: converting text into strings of phoneme symbols
- Speech synthesis: the phoneme symbols computed from the text in the previous stage are used to compute a digital waveform. The waveform then undergoes digital-to-analogue conversion, which turns it into an electrical current that generates sound from a loudspeaker
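The two stages above can be sketched in a few lines of Python. This is a toy illustration only: the lexicon `LEXICON`, the per-phoneme frequencies in `PHONEME_HZ`, the sample rate and the function names are all invented placeholders, and each phoneme is rendered as a plain sine tone rather than real speech. The final digital-to-analogue step happens in hardware and is not shown.

```python
import math

# Hypothetical toy lexicon: real systems use large pronunciation dictionaries
# plus letter-to-sound rules. These entries are illustrative only.
LEXICON = {
    "see": ["s", "iy"],
    "me": ["m", "iy"],
}

# Placeholder frequency per phoneme (Hz); invented values, not acoustic data.
PHONEME_HZ = {"s": 400.0, "iy": 270.0, "m": 200.0}

SAMPLE_RATE = 8000  # samples per second (assumed)


def text_to_phonemes(text):
    """Stage 1: convert text into a list of phoneme symbols."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])  # raises KeyError for unknown words
    return phonemes


def phonemes_to_waveform(phonemes, duration_s=0.1):
    """Stage 2: compute a digital waveform (samples in the range -1 to 1)."""
    samples_per_phoneme = int(SAMPLE_RATE * duration_s)
    waveform = []
    for p in phonemes:
        freq = PHONEME_HZ[p]
        for n in range(samples_per_phoneme):
            waveform.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return waveform


phonemes = text_to_phonemes("see me")
waveform = phonemes_to_waveform(phonemes)
print(phonemes)
print(len(waveform))
```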
Compare formant speech synthesis (FSS) with concatenation speech synthesis (CSS)
FSS: two types of phoneme sound-source excitation are used, voice pulses for phonation and noise for aperiodic fricative sounds.
CSS: involves storing, concatenating (linking together), and smoothing sections of prerecorded speech.
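As a rough sketch of the difference, the Python below generates the two FSS excitation sources and, separately, joins two stored sections with smoothing as in CSS. The function names, the pulse period and the linear cross-fade are assumptions chosen for illustration; the formant filters that would shape the FSS sources into speech sounds are omitted, and a linear cross-fade is only one possible way to smooth a join.

```python
import random


def voiced_pulses(n_samples, period):
    """FSS source 1: periodic pulse train for phonated (voiced) sounds."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def fricative_noise(n_samples, seed=0):
    """FSS source 2: aperiodic noise for fricative sounds."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def concatenate_smoothed(a, b, overlap):
    """CSS: join two stored sections, cross-fading over `overlap` samples."""
    if overlap <= 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must be between 1 and the shorter section length")
    head = a[:len(a) - overlap]   # part of a before the join
    tail = b[overlap:]            # part of b after the join
    blend = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of section b rises across the join
        blend.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + blend + tail


print(voiced_pulses(8, 4))
print(concatenate_smoothed([1.0] * 4, [0.0] * 4, 2))
```

Note that the CSS function needs the recorded sections themselves as input, which is where the memory cost of concatenation synthesis comes from, whereas the FSS sources are computed from a couple of parameters.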
Identify the main problems faced by automatic speech recognition systems
- Segmenting the speech input into words and phonemes
- Variability in the speech signal associated with coarticulation, talker differences, speaking rate, dialect, sentence prosody, and background noise level
Describe main steps in automatic speech recognition
- A-to-D conversion converts voltage levels of the analogue signal picked up by the microphone into digital values
- Acoustic processing (codes digital signal and provides the spectral pattern)
- Phonetic features are then extracted
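The first two steps can be sketched as follows, assuming an 8-bit converter and a naive discrete Fourier transform as the acoustic processing. Both choices, along with the function names and the test signal, are assumptions made for illustration; real recognisers use much faster and more elaborate analyses, and the phonetic feature extraction step is not shown.

```python
import cmath
import math


def a_to_d(voltages, bits=8, v_max=1.0):
    """Quantise analogue voltage levels into signed integer values."""
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    digital = []
    for v in voltages:
        v = max(-v_max, min(v_max, v))  # clip to the converter's input range
        digital.append(round(v / v_max * levels))
    return digital


def magnitude_spectrum(samples):
    """Naive DFT: magnitude of each frequency bin from 0 up to N/2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(total))
    return spectrum


# Stand-in for the microphone signal: a sine wave with 5 cycles in 64 samples.
N = 64
analogue = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

digital = a_to_d(analogue)              # A-to-D conversion
spectrum = magnitude_spectrum(digital)  # spectral pattern
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
print(peak_bin)
```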
Main ways speech recognition systems may be categorised
- Discrete (speaker must pause between words) vs continuous (no gaps necessary)
- Vocabulary size: small (200 words), often for specific tasks; large (1,000 words); very large (30,000 words), useful for dictation purposes
- Speaker-dependent systems: require the gender of the person using the software to be set, and the software to be trained on or adapted to the user's voice
Identify two distinct clinical applications of speech synthesis and identify factors associated with the technology that need to be considered when they are clinically applied
- Reading instruction: potential to help children with reading difficulties (hearing the text whilst looking at the words has a positive effect on reading ability)
- Communication aids: can provide a means of oral communication for those who cannot speak
Identify two distinct clinical applications of speech recognition, and identify factors associated with the technology that need to be considered when they are clinically applied
- Dyslexia: narratives can be produced orally, so cognitive resources can be applied to meaning and self-expression rather than to spelling and writing letters
- People with physical disabilities that affect standard keyboard control
Advantages/ Disadvantages of formant speech synthesis
Advantages: relatively high efficiency; computational demands are not excessive; achieves fast output rates.
Disadvantages: speech sounds unnatural and robotic; expensive.
Advantages/ Disadvantages of concatenation speech synthesis
Advantages: tends to sound more natural and intelligible, shorter development time and in turn relatively less expensive
Disadvantages: expensive in terms of memory storage, limited variation in voice quality