Part 1: Acoustic Phonetics Data Flashcards
Used by many disciplines
Linguists
Speech communication specialists (speech synthesis & recognition systems)
Speech production scientists
Speech perception scientists
Speech-language pathologists
“əhVd” frame is commonly used
V=vowel
/h/ has minimal influence on the vocal tract gestures required for a following vowel
/d/ provides a “natural” ending to the syllable
Accommodates the production of lax vowels such as /ɪ/, /ε/, and /ʊ/
Vowel Formants
a single vowel can be associated with a wide range of F1-F2 values depending on resonance patterns of tubes of different lengths, and age- and sex-related differences in vocal tract length
In cases of overlapping formant frequencies, the identity of the speaker, including age, sex, and dialect, allows listeners to link a specific formant pattern to a vowel category
Vowel Quadrilaterals for men, women, and children
Vowel quadrilaterals for men, women, and children move from the lower left to upper right part of the graph, respectively.
Vocal tract becomes progressively shorter
Vowel space appears to be larger for children compared with men, and larger for women compared with men
the vowel quadrilateral for one group of speakers cannot be perfectly fit to the quadrilateral for a different group by moving it to the new location
Vowel Spaces
Acoustic vowel space for corner vowels has clinical application as an index of speech motor integrity
E.g., smaller in persons with dysarthria
Size of the acoustic vowel space is correlated with speech intelligibility or perceptual measures of articulatory precision
Can be made to expand and contract with different speaking styles
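The size of the acoustic vowel space can be sketched as the area of the polygon formed by the corner vowels in the F1-F2 plane. The example below is a minimal illustration only: the choice of four corner vowels (/i/, /æ/, /ɑ/, /u/), the function name, and the round formant values are all assumptions, not data from these notes.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    points: list of (F1, F2) pairs in Hz, ordered around the perimeter.
    Returns the area in Hz^2.
    """
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical corner-vowel formants (F1, F2) in Hz, listed in perimeter order.
corner_vowels = {
    "i": (300, 2300),
    "æ": (700, 1800),
    "ɑ": (750, 1100),
    "u": (300, 900),
}

area = polygon_area(list(corner_vowels.values()))
print(area)  # 442500.0 (Hz^2)
```

A smaller area for the same set of vowels would correspond to a more compressed vowel space, which is the pattern described above for dysarthria.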
Interpretations of the correlation between Vowel Spaces & Speech Intelligibility
Size (area) of the vowel space may be an index of articulatory mobility and speech motor control
Larger vowel space areas increase the acoustic difference between closely related vowels, such as /i/ versus /ɪ/ or /u/ versus /ʊ/
Reduced vowel space may be an independent component of a speech intelligibility deficit (separate from motor control)
Branches of Comparative Acoustic Phonetics
Acoustic characteristics of similar speech sounds in two or more languages or in two or more dialects of the same language.
The effect of native language (or dialect) phonetics on the acoustic characteristics of speech sounds in a second language
Acoustic characteristics of vowels have been a major focus for both branches
Within-speaker variability in formant frequencies
vowel formant frequencies vary with a number of factors
speaking rate
syllable stress
speaking style
phonetic context
Articulatory Undershoot
Lindblom (1963) coined the term to describe vowel production in connected speech that was not the most extreme configuration associated with the sound
the shorter the vowel duration, the greater the undershoot
increased speaking rate, reduced stress, and a casual speaking style are all associated with shorter vowel durations
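The link between vowel duration and undershoot can be illustrated with a toy model in which a formant moves exponentially from a consonant-related starting value toward the vowel target. This is a sketch under stated assumptions, not Lindblom's fitted equation: the function names, the rate constant, and the formant values are hypothetical.

```python
import math


def realized_formant(target_hz, start_hz, duration_ms, rate_per_ms=0.02):
    """Formant value reached after duration_ms, approaching target_hz
    exponentially from start_hz (toy model; rate_per_ms is an assumption)."""
    return target_hz + (start_hz - target_hz) * math.exp(-rate_per_ms * duration_ms)


def undershoot(target_hz, start_hz, duration_ms, rate_per_ms=0.02):
    """Distance in Hz by which the realized formant falls short of the target."""
    reached = realized_formant(target_hz, start_hz, duration_ms, rate_per_ms)
    return abs(target_hz - reached)


# Hypothetical F2 target of 2300 Hz approached from 1800 Hz.
for duration in (60, 120, 200):
    print(duration, round(undershoot(2300, 1800, duration), 1))
```

In this sketch the printed undershoot shrinks as duration grows, matching the statement that shorter vowels show greater undershoot.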
Summary of Vowel Formant Frequencies
Sex and age impact the formant frequencies of vowels because they are related to differences in vocal tract size and length.
In general, the longer and larger the human vocal tract, the lower the formant frequencies for all vowels
explains the large range of formant frequencies across the population
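The inverse relation between vocal tract length and formant frequencies can be sketched with the standard quarter-wavelength model of a uniform tube closed at one end, where F_n = (2n - 1)c / 4L. This is a simplification (a real vocal tract is not a uniform tube), and the speed of sound and the two tube lengths below are assumed round values chosen for illustration.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed approximate speed of sound in the vocal tract


def tube_formants(length_cm, n_formants=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other:
    F_n = (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]


# Hypothetical tube lengths: a longer and a shorter vocal tract.
for label, length_cm in [("longer tract (17.5 cm)", 17.5), ("shorter tract (14.0 cm)", 14.0)]:
    print(label, tube_formants(length_cm))
```

Under these assumptions the 17.5 cm tube gives 500, 1500, and 2500 Hz and the 14.0 cm tube gives 625, 1875, and 3125 Hz, so every resonance rises as the tube shortens.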
Even when vocal tract length/size factors are held constant, “target” formant frequencies for a given vowel may vary for several reasons
Dialect, vocal tract length, phonetic context, syllable stress, speaking rate, speaking style
Vowel Durations
Studied extensively due to application to:
speech synthesis
machine recognition of speech
description and possibly diagnosis of certain speech disorders
Extrinsic Factors Affecting Vowel Durations
Consonant voicing
Vowels are typically longer when surrounded by voiced consonants
Stress
Vowels in emphasized syllables have greater duration
Speaking rate
Slower rates = longer vowel durations, and vice versa
Utterance position
Phrase-final or utterance-final lengthening
Speaking style
E.g., “Clear speech”
Diphthongs
The six diphthongs in American English include:
/ɑɪ/ (“guys”)
/ɔɪ/ (“boys”)
/ɑʊ/ (“doubt”)
/eɪ/ (“bays”)
/oʊ/ (“goes”)
*/ju/ (“beauty”)
*not considered a diphthong in many phonetics textbooks, but has properties similar to the other diphthongs.
Diphthongs: Two connected vowels or a unique phoneme?
absence of steady states in diphthongs is a potential complication in classifying diphthongs as a sequence of two vowels
each of the diphthongs has an identifiable transitional segment (as seen on spectrograms)
Diphthong Duration
Diphthongs are generally longer than monophthong vowels in equivalent environments and speaking conditions
Nasal articulations are described acoustically in two categories
Nasal murmur
Nasalization
Three factors allow fairly accurate identification by human listeners or statistical classification
murmur offset
vowel onset (murmur + transition piece)
transition piece
Nasalization
Involves complex acoustics resulting from the mix of oral and nasal tract formants with antiresonances originating in the sinus cavities
Fourier spectrum of vowels
A1—P1 relative amplitude difference
Nasalance
Measure of the acoustic energy radiating from the nares
Nasometry values are often computed for extended passages.
One with no nasal consonants
One loaded with nasals to elicit high nasalance values
May use a passage with a mix of obstruents and nasals to estimate nasalance for the phonetics of typical utterances.
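Nasalance is conventionally expressed as the nasal acoustic energy as a percentage of the combined nasal and oral energy. The sketch below illustrates that ratio and a passage-level average. It is a simplification: the function names, the per-frame energy values, and the choice to average frame scores are assumptions, and a commercial nasometer applies its own filtering and averaging.

```python
def nasalance_percent(nasal_energy, oral_energy):
    """Nasalance (%) = nasal / (nasal + oral) * 100."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("combined nasal and oral energy must be positive")
    return 100.0 * nasal_energy / total


def passage_nasalance(frames):
    """Mean nasalance (%) over (nasal, oral) energy pairs.

    Averaging per-frame scores is an assumption made for this sketch.
    """
    scores = [nasalance_percent(nasal, oral) for nasal, oral in frames]
    return sum(scores) / len(scores)


# Hypothetical per-frame (nasal, oral) energies for two passages.
no_nasals_passage = [(1.0, 9.0), (2.0, 8.0), (1.0, 9.0)]
nasal_loaded_passage = [(6.0, 4.0), (7.0, 3.0), (5.0, 5.0)]

print(passage_nasalance(no_nasals_passage))
print(passage_nasalance(nasal_loaded_passage))
```

The single number returned per passage is the kind of global value that nasometry reports for an extended passage.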
Advantages of Nasometry
speed of obtaining a value
automatic nature of the measurement
tendency for perceptual estimates of nasality to increase as nasalance increases
large number of published articles on nasalance values in various populations
Disadvantages of Nasometry
global nature of the measure
usually averaged across entire passages and yields a single number per passage
tendency for perceptual estimates of nasality to vary imperfectly with nasalance values
difficulty of knowing how variation in nasalance values relates to the specific nature of velopharyngeal dysfunction
Semivowels
/w/, /ɹ/, /l/, and /j/
require movement to and away from a vocal tract constriction tighter than that for vowels, but not as much as for obstruents
All are produced with a vocal tract open to the atmosphere, and are considered vocalics.
/w/ and /j/ are also referred to as glides, /ɹ/ and /l/ as liquids
constriction interval and transition acoustics provide the acoustic information necessary to distinguish among these sounds.
Constriction Interval
interval of relatively “flat” formants
assumed to correspond to the part of semivowel articulation when the vocal tract is most constricted
formant pattern like those of vowels
Formant Transitions
pattern of formant transitions into and out of the constriction intervals also distinguishes among the semivowels
Important characteristics (see 11-9)
the specific formants that have large transitions into and out of the constriction interval
the direction (rising versus falling) of the transitions
Semivowel Acoustics and Speech Development
semivowel errors are frequent during phonological development and in speech delay
E.g., /w/ for /ɹ/, /w/ for /l/, and /j/ for /l/
Need to determine if the issue is due to articulatory control needed to differentiate the sounds or distinguishing the perceptual representations
the acoustics of a [w] in a [w] for /ɹ/ error (or any other substitution error) are often not like the acoustics of normally articulated [w]
the error [w] differs from a correct [w] in having acoustic characteristics that fall more or less between those of the error sound and the correct (target) sound
This shows that the child hears the difference but has difficulty with articulation
A distinction is made by the child but may be too subtle for human listeners to perceive.
Even if listeners do hear a subtle distinction they may place it in a “comfortable” phoneme category
Semivowel Durations
Challenging to segment semivowels from adjacent vowels
When constriction intervals can be segmented from the surrounding transitions they have durations of 30 to 70 ms, with the majority of values toward the lower end of this range
Combined duration of the transition and constriction intervals of semivowels may be brief (as short as 100 ms)
Suggests rapid, complex articulatory gestures occurring in a short amount of time; this may explain, in part, why children master the contrasts of these sounds relatively late in the overall scheme of phonological development
Fricatives
Characterized by an interval of aperiodic energy whose spectrum and overall amplitude depend on place of articulation and, in some cases, voicing status.
In English, fricatives are categorized as sibilants (/s, z, ʃ, ʒ/), nonsibilants (/f, v, θ, ð/), and the glottal fricative /h/
Sibilants are more intense and have better-defined spectra than nonsibilants.
Sibilants have more easily identified spectral peaks and concentrations of spectral energy
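The idea of a well-defined spectral peak and a concentration of spectral energy can be illustrated with three simple summary measures of a spectrum: the peak frequency, the centroid (first spectral moment), and how far the peak stands above the mean level. This is a sketch only. The function names and the two toy spectra are hypothetical and are not measurements of real fricatives.

```python
def spectral_peak(spectrum):
    """Frequency (Hz) of the largest component; spectrum is a list of (freq_hz, power)."""
    return max(spectrum, key=lambda pair: pair[1])[0]


def spectral_centroid(spectrum):
    """Power-weighted mean frequency (first spectral moment) in Hz."""
    total_power = sum(power for _, power in spectrum)
    return sum(freq * power for freq, power in spectrum) / total_power


def peak_prominence(spectrum):
    """Ratio of the peak power to the mean power; 1.0 means a flat spectrum."""
    powers = [power for _, power in spectrum]
    return max(powers) / (sum(powers) / len(powers))


# Hypothetical toy spectra: one with a clear peak, one flat and diffuse.
peaked_spectrum = [(2000, 1.0), (4000, 8.0), (6000, 1.0)]
flat_spectrum = [(2000, 1.0), (4000, 1.0), (6000, 1.0)]

print(spectral_peak(peaked_spectrum), spectral_centroid(peaked_spectrum))
print(peak_prominence(peaked_spectrum), peak_prominence(flat_spectrum))
```

In this toy comparison the peaked spectrum has a higher prominence value than the flat one, which is the sense in which sibilant spectra are described as better defined.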