Speech Production Flashcards
What is articulation?
A change in vocal tract shape. To produce a sound, you position the tongue, lips and jaw in a particular way; different positions make the vocal tract produce different sounds
What does modulation of the oropharynx change?
Formant frequency patterns
What is speech?
A series of syllables - building blocks of words
What are syllables?
Consonant plus vowel (CV). Speech = CVCVCVCV
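The CVCV idea can be sketched in a few lines of Python. This is only an illustration: it treats written letters as stand-ins for speech sounds (an assumption, since spelling is not phonetics), and the helper names `cv_pattern` and `split_cv` are made up for this example.

```python
VOWELS = set("aeiou")  # assumption: letters stand in for vowel sounds


def cv_pattern(word):
    """Label each letter as C (consonant) or V (vowel)."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def split_cv(word):
    """Split a strictly CVCV... word into consonant+vowel syllables."""
    pattern = cv_pattern(word)
    if pattern != "CV" * (len(word) // 2):
        raise ValueError(f"{word!r} is not a plain CVCV sequence: {pattern}")
    return [word[i:i + 2] for i in range(0, len(word), 2)]


print(cv_pattern("banana"))  # CVCVCV
print(split_cv("banana"))    # ['ba', 'na', 'na']
```

Words with consonant clusters or final consonants (for example "strict") are rejected by this sketch, which shows that the CV rule on the card is a simplification.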
What are vowels characterised by?
an open configuration of the vocal tract so that there is no build up of air pressure above the glottis
What are consonants characterised by?
constriction or closure at one or more points along the vocal tract
What does the human supralaryngeal vocal tract consist of?
Alveolar ridge
Tongue tip, blade and body
Larynx
Hard palate
Soft palate
Uvula
Pharynx
Tongue root
Epiglottis - closes the larynx when you swallow
How do we perceive vowels?
Different vowels have different distributions of energy among the formants - different peaks excite different areas of the basilar membrane
Why is the ear important?
It is a sound analyser which can detect formant frequencies by the different amounts of excitation at different places along the basilar membrane
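As a rough sketch of the "sound analyser" idea, the Python below builds two toy vowels from pure tones and measures how strongly each of a set of probe frequencies is excited, loosely like different places along the basilar membrane. Everything here is an assumption for illustration: the sample rate, the probe spacing, the function names, and the peak frequencies (round numbers, not measurements). A real vowel is far richer than two sinusoids.

```python
import math

SAMPLE_RATE = 8000   # Hz (assumed)
NUM_SAMPLES = 400    # 50 ms of signal


def synthesize(peaks):
    """Toy vowel: a sum of sinusoids given as (frequency_hz, amplitude)."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f, a in peaks)
        for t in range(NUM_SAMPLES)
    ]


def excitation(signal, probe_hz):
    """Strength of the signal's component at probe_hz (one DFT-style bin)."""
    re = sum(s * math.cos(2 * math.pi * probe_hz * t / SAMPLE_RATE)
             for t, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * probe_hz * t / SAMPLE_RATE)
             for t, s in enumerate(signal))
    return math.hypot(re, im) / len(signal)


def strongest_places(signal, probes, count=2):
    """Return the `count` probe frequencies with the most excitation."""
    ranked = sorted(probes, key=lambda f: excitation(signal, f), reverse=True)
    return sorted(ranked[:count])


PROBES = range(100, 4000, 100)  # hypothetical "places", 100 Hz apart

# Illustrative peak positions only, labelled by the vowel they loosely resemble.
toy_ee = synthesize([(300, 1.0), (2300, 0.5)])
toy_ah = synthesize([(700, 1.0), (1100, 0.5)])

print(strongest_places(toy_ee, PROBES))  # [300, 2300]
print(strongest_places(toy_ah, PROBES))  # [700, 1100]
```

The two toy vowels excite different probes most strongly, which is the sense in which the pattern of excitation identifies the vowel.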
Why are consonants important?
They are distinguished by the place and the manner of articulation - where and how they are produced
Place of articulation
Bilabial - lips
Labio-dental - lips and teeth
Alveolar - tongue and alveolar ridge
Palatal - tongue and hard palate
Velar - tongue and velum
Manner of articulation
Different vibrations occur at different points, producing different consonants
stop - involves interruption of air flow and vibration inside the larynx
nasal - involves vibration inside the nasal cavity
fricative, approximant and affricate - involve secondary vibration in contact with the articulators
What are the types of consonants?
Voiced or voiceless
voiceless - long interruption of vibration (t, p, k sound different)
voiced - continuous (sound the same)
Is speech a series of discrete syllables?
No, we can’t segment speech
we can’t cut and paste the phonetic elements of speech to make words and sentences
Frequency resolution problem
Speech contains 20 to 30 meaningful sound segments/second
but we can only identify 7-9 separate segments/second, and above 20 segments/second we hear a tone instead of separate sounds
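The rates on this card are easier to compare as time per segment. The short Python sketch below does that arithmetic; the function name `segment_duration_ms` is made up for this example, and the rates are the ones quoted on the card.

```python
def segment_duration_ms(segments_per_second):
    """Average time available per segment, in milliseconds."""
    return 1000.0 / segments_per_second


rates = {
    "speech, lower bound": 20,
    "speech, upper bound": 30,
    "identifiable, lower bound": 7,
    "identifiable, upper bound": 9,
}

for label, rate in rates.items():
    print(f"{label}: {rate}/s = {segment_duration_ms(rate):.1f} ms per segment")
```

Speech at 20 to 30 segments/second leaves roughly 33 to 50 ms per segment, while identifying 7 to 9 separate segments/second corresponds to roughly 111 to 143 ms each.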
Why is speech a continuous phenomenon?
There are no gaps between words
We can’t see where words stop and start - as we speak, there is a smooth change from one speech sound to the next
What causes us to hear the consonants w and g?
The formant transitions = changing formant pattern at the start of each syllable
What is co-articulation?
The articulation of two or more speech sounds together, so that one influences the other
no single gesture is responsible for a single sound
the articulatory gestures characteristic of each isolated sound are never attained in isolation but melded together in a composite characteristic of the syllable
Different transitions - same consonant
Different formant transitions code d depending on whether it comes in front of i or a:
dee vs da - you can’t paste the formant transition from the d of di onto a to generate da
Same noise - different consonant
The same noise codes for p in front of i but for k in front of a
Why does speech not have invariant acoustic targets?
The acoustic realisation of the consonant changes with the vowel. Even when only the s is produced, you know which vowel is following it because of how it was articulated
vowel is encoded inside the consonant - this is due to co-articulation
Advantages of coarticulation
Information about different segments is spread across time
you know what is coming next because of the type of consonant you have already produced
spreading information across time makes it easier to transmit information at a fast rate
Disadvantages of coarticulation
For perception or machine recognition, there are no constant acoustic targets in speech:
the same consonant can be represented as different sounds in different contexts
the same sound can be heard as different consonants in different contexts
lack of invariance
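Lack of invariance can be pictured as a lookup that needs both the acoustic cue and the vowel context. The Python sketch below encodes only the two examples from these cards (d before i or a, and the noise heard as p before i but k before a). The cue labels such as `burst_1` and the function names are hypothetical placeholders, not real acoustic measurements.

```python
# (acoustic cue, following vowel) -> consonant heard.
# Cue labels are hypothetical names for the sounds described on the cards.
HEARD = {
    ("burst_1", "i"): "p",        # same noise ...
    ("burst_1", "a"): "k",        # ... different consonant
    ("transition_1", "i"): "d",   # different transitions ...
    ("transition_2", "a"): "d",   # ... same consonant
}


def decode(cue, vowel):
    """Consonant heard for a cue in a given vowel context (None if unknown)."""
    return HEARD.get((cue, vowel))


def consonants_for_cue(cue):
    """All consonants that one and the same cue can signal."""
    return {cons for (c, _), cons in HEARD.items() if c == cue}


def cues_for_consonant(consonant):
    """All cues that can signal one and the same consonant."""
    return {c for (c, _), cons in HEARD.items() if cons == consonant}


print(decode("burst_1", "i"), decode("burst_1", "a"))  # p k
print(sorted(consonants_for_cue("burst_1")))           # ['k', 'p']
print(sorted(cues_for_consonant("d")))                 # ['transition_1', 'transition_2']
```

Neither the cue alone nor the consonant alone is enough to recover the other, which is why a recogniser that looks for one fixed acoustic target per consonant fails.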