final Flashcards
categorical perception
The tendency of listeners to perceive speech sounds that vary along a continuum according to the phonemic categories of their native language
identification test
A perceptual test in which stimuli are presented one at a time for the listener to label
discrimination test
A type of test in which stimuli are presented in ordered groups and the listener determines similarities and differences among the stimuli
Perception of speech
“A specialized aspect of a general human ability to seek and recognize acoustic patterns.”
The redundancy of acoustic cues
Perception of speech sounds that overlap and are not discrete
Use of context to decode the acoustic message: use of the acoustic information in neighboring sounds
Acoustic cues for vowels
Most perceptually salient
High intensity with prominent resonances
Longer in duration than consonants
In general, use of F1 and F2 to identify each vowel
- A sufficient cue for back vowels: a single formant at a frequency intermediate between F1 and F2
- F3 is more important for the perception of front vowels than that of back vowels
Variation in steady-state formant frequencies due to:
• Various vocal tract sizes
• Effect of context: e.g., /ɑ/ in far vs. stop
• Effect of rate of articulation
− Neutralization of vowels with increased speaking rate
− Constant changes in articulation at normal conversational rate
Use of patterns rather than the actual values of formant frequencies
Use of formant transitions
• More accurate identification of vowels heard with consonant-to-vowel (CV) or vowel-to-consonant (VC) transitions than of vowels heard in isolation
• Use of information from F3 and from f0
acoustic cues for glides/liquids
Perception of Semivowels
Differentiated by more rapid formant transitions compared to vowels
F1 and F2 to perceive glides
F2 to distinguish /w/ from /j/
F1, F2 and F3 to perceive liquids
F3 to distinguish /r/ from /l/
Figure 10.2, 10.3
acoustic cues for nasals
Perception of nasal manner
• A weakening of intensity in the upper formants due to antiresonances
• A resonance added below 500 Hz (often 200-300 Hz): nasal murmur
• The nasalization of vowels
Perception of place of articulation by formant transitions
• Frequency and duration of the transitions from and to the nasal: /m/ < /n/ < /ŋ/
• Most variable in frequency for /ŋ/
acoustic cues for stops
Lengthening the formant transition causes stops (< 40 ms) to be perceived as glides (40 or 50 ms) and glides to be perceived as a sequence of vowels (> 150 ms): Figure 10.4
e.g., /bɛ/ and /gɛ/ → /wɛ/ and /jɛ/ → /uɛ/ and /iɛ/
The frequency of the most intense portion of the burst + F2 transition to or from a neighboring vowel
- High-frequency bursts + vowels: perceived as /t/
- Low-frequency bursts + vowels: perceived as /p/
- Bursts slightly above the frequency of F2 + vowels: perceived as /k/
Rapid F2 transition only (in C + /a/ context): Figure 10.6
• Rising: perceived as labial /p, b/
• Slightly falling: perceived as alveolar /t, d/
• Sharply falling: perceived as velar /k, g/
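As a study aid, the transition-duration boundaries on this card (stops under 40 ms, glides from about 40 or 50 ms, vowel sequences above 150 ms) can be written as a small lookup. This is only a sketch: the function name is hypothetical, the choice of 40 ms rather than 50 ms as the glide boundary is an assumption, and real perception does not switch at hard cutoffs.

```python
# Boundaries taken from the card; treating them as hard cutoffs is an assumption.
GLIDE_MIN_MS = 40    # the card says "40 or 50 ms"; 40 is chosen here
VOWEL_SEQ_MIN_MS = 150


def manner_from_transition_ms(duration_ms: float) -> str:
    """Return the manner class suggested by a formant-transition duration."""
    if duration_ms < 0:
        raise ValueError("duration must be non-negative")
    if duration_ms < GLIDE_MIN_MS:
        return "stop"            # e.g., /bɛ/, /gɛ/
    if duration_ms <= VOWEL_SEQ_MIN_MS:
        return "glide"           # e.g., /wɛ/, /jɛ/
    return "vowel sequence"      # e.g., /uɛ/, /iɛ/


if __name__ == "__main__":
    for ms in (30, 60, 200):
        print(ms, "ms ->", manner_from_transition_ms(ms))
```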
place cues of stops
Locus frequency (F2 transition pointing to a particular locus)
• /b/: 720 Hz
• /d/: 1,800 Hz
• /g/: 3,000 Hz
F3 also serves as a cue
Figure 10.7
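One way to rehearse the locus values is to pick the stop whose locus lies closest to the frequency an F2 transition points to. The nearest-locus rule and the names below are assumptions made for illustration; the card only lists the three locus frequencies.

```python
# Locus frequencies from the card, in Hz.
F2_LOCUS_HZ = {"/b/": 720, "/d/": 1800, "/g/": 3000}


def nearest_place(f2_locus_hz: float) -> str:
    """Return the stop whose F2 locus is closest to the given frequency.

    Choosing the nearest locus is an illustrative assumption, not a
    perceptual model stated on the card.
    """
    return min(F2_LOCUS_HZ, key=lambda stop: abs(F2_LOCUS_HZ[stop] - f2_locus_hz))


if __name__ == "__main__":
    for hz in (800, 1700, 2900):
        print(hz, "Hz ->", nearest_place(hz))
```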
place cues for fricatives
Place cues
• 1. Spectral (frequency) differences
− The sibilants (/s, z, ʃ, ʒ/) with relatively steep, high-frequency spectral peaks
− The nonsibilants (/f, v, θ, ð/) with relatively flat spectra
− Spectral peaks around 4 kHz for /s, z/ vs. 2.5 kHz for /ʃ, ʒ/
− /f, v/ vs. /θ, ð/: not as reliably differentiated
• 2. Intensity (amplitude) differences
− The sibilants (/s, z, ʃ, ʒ/) with high-intensity levels
− The nonsibilants (/f, v, θ, ð/) with low-intensity levels
• 3. Formant transitions (F2 and F3)
− Less important for fricatives than for stops
− More important for perception of /f, v/ or /θ, ð/ than sibilants /s, z/ or /ʃ, ʒ/ (noise portion as a sufficient cue)
voice cues for fricatives
Voicing cues
• The presence or absence of phonation
• The duration of frication relative to the duration of the preceding vowel
− e.g., use (/jus/ vs. /juz/)
− /s/ is perceived when the frication duration is longer
• Reduced intensity for voiced fricatives
voicing cues for stops
Voicing Cues of Stops
The presence or absence of voice bar during stop closure
The presence or absence of aspiration
Voice onset time (VOT)
F1 cutback (the delay of F1 onset relative to F2 onset): Figure 10.8
• F1 cutback of 30 ms or more: perceived as voiceless
• VOT and F1 cutback are greater for velar stops than for labial and alveolar stops
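The 30 ms F1-cutback boundary can be rehearsed with a one-line rule. This is a minimal sketch with a hypothetical function name. The card only states that a cutback of 30 ms or more is heard as voiceless, so labeling shorter cutbacks "voiced" is an assumption, and place-dependent differences (greater values for velars) are ignored.

```python
F1_CUTBACK_VOICELESS_MS = 30  # boundary from the card


def voicing_from_f1_cutback(cutback_ms: float) -> str:
    """Label a stop by its F1 cutback (delay of F1 onset relative to F2 onset)."""
    if cutback_ms < 0:
        raise ValueError("cutback must be non-negative")
    # Assumption: anything under the boundary is treated as voiced.
    return "voiceless" if cutback_ms >= F1_CUTBACK_VOICELESS_MS else "voiced"


if __name__ == "__main__":
    for ms in (10, 30, 60):
        print(ms, "ms ->", voicing_from_f1_cutback(ms))
```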
locus frequency for stops
Locus frequency (F2 transition pointing to a particular locus)
• /b/: 720 Hz
• /d/: 1,800 Hz
• /g/: 3,000 Hz
F3 also serves as a cue
Figure 10.7
Figure 10.9
See the book
Figure 10.10
See the book