Final Exam Flashcards
What is the acoustic correlate to a stop’s vocal tract closure?
Stop gap
What are the four articulation cues for stop consonant manner?
Vocal tract closure
Release of the closure
Rapid Articulatory Movements
Rapid opening/closing gestures
What is the acoustic correlate to the stop’s release of the closure?
Stop burst
What is the acoustic correlate to the stop’s rapid articulatory movements?
Relatively fast formant transitions (mostly F1)
What is the acoustic correlate to the stop’s rapid opening/closing gestures?
Rapid rise/fall in intensity
What will an FFT of a voiceless stop gap look like as compared to a stop gap with voicing?
(2)
The voiceless will be blank
The voiced will show evidence of voicing
Are voiced stops aspirated?
No
What are the acoustic cues to stop consonant place of articulation?
(3)
Energy peak in the burst spectrum (unless final and unreleased) - intensity
F2 transitions
Sometimes VOT duration
What is the defining characteristic of labial stop spectra? Where is most of its energy?
Downward slope
Under 600 Hz
What is the defining characteristic of alveolar stop spectra? Where is most of its energy?
Rising slope
Around 3000-4000 Hz
What is the defining characteristic of velar stop spectra?
2
Narrow spectral peaks
F2 is linked to the F2 of the following vowel. It’s usually a few hundred Hz higher
What is Lisker’s rule?
That every acoustic movement has some value as an acoustic cue
Why does HL make it hard to hear stop bursts?
The quick transitions
What sort of F2 transition is found for /b/?
Rising
What sort of F2 transition is found in /d/?
Somewhat flat - there is variation
What sort of F2 transition is found in /g/?
Falling
What classic lack of invariance is found in stop consonants?
/d/
It may be interpreted as either /b/, /d/, or /g/
When is invariance upheld in stop consonants?
Between /b/ and /g/ - they never get mixed up
What are the four cues to stop voicing in INITIAL position?
VOT
F1 starting position
F1 changes
Voicing during stop gap
What are the four cues to stop voicing in MEDIAL position?
Voicing during stop gap
Duration of stop gap
Length of preceding vowel
F1 transition (if voiced)
What are the four cues to stop voicing in FINAL position?
Voicing during stop gap
Duration of stop gap
Length of preceding vowel
F1 falls (if voiced)
What is the average VOT for /b/?
1 msec
What is the average VOT for /d/?
5 msec
What is the average VOT for /g/?
21 msec
What is the average VOT for /p/?
58 msec
What is the average VOT for /t/?
70 msec
What is the average VOT for /k/?
80 msec
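The six VOT averages above can be put side by side in a short sketch. This is only an illustration built from the card values: the boundary used in `guess_voicing` is a hypothetical cutoff (the midpoint between the longest voiced and shortest voiceless average), not a value given in the course material.

```python
# Average voice onset times in msec, copied from the cards above.
AVERAGE_VOT_MS = {"b": 1, "d": 5, "g": 21, "p": 58, "t": 70, "k": 80}
VOICED = ("b", "d", "g")
VOICELESS = ("p", "t", "k")


def mean_vot(stops):
    """Mean of the card averages for the given stops."""
    return sum(AVERAGE_VOT_MS[s] for s in stops) / len(stops)


# Hypothetical boundary for illustration only: halfway between the longest
# voiced average (/g/) and the shortest voiceless average (/p/).
BOUNDARY_MS = (max(AVERAGE_VOT_MS[s] for s in VOICED)
               + min(AVERAGE_VOT_MS[s] for s in VOICELESS)) / 2


def guess_voicing(vot_ms, boundary_ms=BOUNDARY_MS):
    """Label a VOT value as voiced or voiceless using the assumed boundary."""
    return "voiceless" if vot_ms >= boundary_ms else "voiced"


if __name__ == "__main__":
    print(mean_vot(VOICED), mean_vot(VOICELESS), BOUNDARY_MS)
    print(guess_voicing(5), guess_voicing(70))
```

Within each voicing class the card values also grow from labial to alveolar to velar, which fits the earlier card listing VOT duration as an occasional place cue.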
When does F1 start lower: for voiced or voiceless initial stops?
Voiced
What is the difference in stop gap length between medial and final stops?
(2)
In medial position, voiceless stops have longer stop gaps
In final position, voiceless stops have shorter stop gaps
How does the length of the preceding vowel change when it is followed by a voiceless stop versus a voiced one?
Preceding vowels are shorter if followed by a voiceless stop
What happens to F1 at the end of the vocalic portion of a voiced stop in final position?
It falls
For stops in any position, if F1 does not change, it is most likely a ________ stop.
Voiceless
What is the major voicing cue for fricatives?
2
Voiceless = aperiodic
Voiced = aperiodic + periodic
What do the formants look like in voiced fricatives?
They are flat
What is the spectral peak for /s/?
Around 4500-8000 Hz
What is the spectral peak for /ʃ/?
Around 2500-4500 Hz
Do labiodental, interdental, and glottal fricatives have narrow spectra?
No
What lack of invariance problem is found in fricatives?
Male & female productions of /s/ & /ʃ/ are vastly different
What are the three major place cues for fricatives?
3
Spectra
Amplitude
Formant transitions
Which fricatives tend to have greater amplitude?
3
Stridents
/s/ & /z/
/ʃ/ & /ʒ/
If you lower the amplitude of /s/, what will you perceive?
/θ/
When are formant transitions particularly helpful?
When distinguishing between /f/ and /θ/
What is the formant transition change between /f/ & /θ/?
2
/f/ has a rising F2
/θ/ has a steady-state/constant formant
What is the manner cue for affricates?
Stop burst followed by a sharply rising fricative
What are the two manner cues between fricatives & affricates?
Rise time
Steady state duration
What is the difference in rise time between fricatives & affricates?
Fricatives = 76 msec
Affricate = 33 msec
(2:1 ratio)
What is the difference in steady state duration between fricatives & affricates?
Fricatives = 100 msec
Affricates = 48 msec
(2:1 ratio)
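As a quick check on the "2:1" shorthand, the ratios can be computed directly from the card values. The dictionary and function names below are made up for this sketch.

```python
# Durations in msec, copied from the cards above.
RISE_TIME_MS = {"fricative": 76, "affricate": 33}
STEADY_STATE_MS = {"fricative": 100, "affricate": 48}


def fricative_to_affricate_ratio(durations_ms):
    """How many times longer the fricative value is than the affricate value."""
    return durations_ms["fricative"] / durations_ms["affricate"]


if __name__ == "__main__":
    print(round(fricative_to_affricate_ratio(RISE_TIME_MS), 2))
    print(round(fricative_to_affricate_ratio(STEADY_STATE_MS), 2))
```

Both ratios come out a little above 2, so "2:1" is a rounded figure rather than an exact one.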
What are the manner cues for liquids & glides?
2
Shape of formant transitions
Length of formant transitions
What distinguishes the length of formant transitions between stops and glides?
Stops have shorter transitions due to more rapid articulatory movement
What are the place cues for glides?
F2 transitions
How do F2 transitions differ between /j/ & /w/?
2
/j/ has a high F2 that falls
/w/ has a low F2 that rises slightly
What are the place cues for liquids?
F3 transitions
How does F3 differ between /l/ & /r/?
/l/ has a steady-state F3
/r/ has sharply rising F3
What are the manner cues for nasals?
5
Nasal murmur (= nasal resonance = nasal formant)
Voicing
Low intensity
Steady state formants
Low frequency resonance
What are the place cues for nasals?
3
Formant transitions
/m/ rises slightly
/n/ & /ŋ/ fall slightly
Are nasals easy to distinguish from one another?
No
What is Wilson’s Rule?
The pulse rate must be 3-5 times the highest frequency you want to resolve
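Wilson’s Rule as stated on this card is a simple multiplication, sketched below. The function name and the 300 Hz example are illustrative assumptions, not values from the course.

```python
def wilson_pulse_rate_range(highest_frequency_hz):
    """Return the (minimum, maximum) pulse rate in pulses per second,
    taking the card's rule of 3 to 5 times the highest frequency to resolve."""
    if highest_frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return (3 * highest_frequency_hz, 5 * highest_frequency_hz)


if __name__ == "__main__":
    # Hypothetical example: resolving frequencies up to 300 Hz.
    print(wilson_pulse_rate_range(300))
```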
What information do envelope cues give us?
4
Segmentation of syllables & phonemes
Manner of articulation
Strong vs. weak fricatives
Minimal vowel information
What are the six envelope cues?
Stop
Weak Fricatives
Strong Fricatives
Semi-Vowels
Nasal
Vowels
What information do periodicity cues give us?
2
Fricative Manner
Voicing
What information do fine temporal cues give us?
Frequency of F1
If a patient knows 126,000 words and can extract 6 envelope features, then this patient can narrow down an utterance into _____ possible word options.
2.4
Is overall consonant manner relatively well-defined?
Yes
Is overall consonant place relatively well-defined?
Not for stops, semivowels, & nasals
What is the McGurk Effect?
That what you see will affect the consonant you perceive
What are the three roles of vision in speech?
Directs attention to the signal and away from background noise
Provides segmental information that is redundant to acoustic information
Provides segmental information which complements acoustic information (info masked by noise)
How do visual contributions direct attention to the signal and away from background noise?
(3)
Knowing speaker reduces possible acoustic patterns
Helps binaural localization
Lets listener know when the intensity is part of signal or part of noise
What did Sumby & Pollack discover?
Adding a face to a signal in noise is equivalent to a 15 dB improvement in the SNR
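To get a feel for the size of a 15 dB change, the standard decibel-to-power-ratio conversion can be applied. The sketch below only does that arithmetic and says nothing about how the original study measured the benefit.

```python
def db_to_power_ratio(db):
    """Convert a level difference in dB to a power ratio: 10 ** (dB / 10)."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    # A 15 dB SNR improvement corresponds to roughly a 32-fold power ratio.
    print(round(db_to_power_ratio(15), 1))
```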
How many optical categories are there? How many are necessary?
9
6
We have a success rate of ___% when lip reading alone.
35%
We have a success rate of ___% when pitch is added to lip reading.
59%
What envelope is helpful in distinguishing Group 1 (/p/, /t/, & /k/) from other sounds?
Burst envelope
What envelope is helpful in distinguishing Group 2 (/b/, /d/, /g/, /v/, /ð/, /z/, & /ʒ/) from other sounds?
Voicing envelope
What envelope is helpful in distinguishing Group 3 (/f/, /θ/, /s/, & /ʃ/) from other sounds?
Aperiodic-ness
What envelope cues are useful in distinguishing Group 4 (/m/, /n/, /r/, /l/, & /j/) from other sounds?
(2)
Voicing envelope
Amplitude envelope
What is an enveme?
The speech cues given by the envelope
What is a viseme?
The speech cues given visually
If all viseme and enveme information were received by a patient, then ___% of consonant information would be transmitted.
95%
What do lip movements activate?
The primary and secondary auditory areas in the superior temporal cortex
Lip movements can begin ____ msec before the auditory signal.
They activate auditory areas of the cortex _____ auditory stimulation.
100
Before
What is activated when we hear the voices of familiar people?
Fusiform face region
What are two top down effects in speech recognition?
Perceptual restoration
Cohort model
What is perceptual restoration?
A phoneme can be removed and replaced by noise but we still can “hear” it
(/s/ in legislators removed and replaced with a cough)
What is the cohort model?
2
Our lexicon is activated and narrowed in real time with each speech segment we receive
Works like the Google search box
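The narrowing described in the cohort model can be sketched as prefix filtering. This is a toy illustration only: the six-word lexicon is hypothetical, and letters stand in for speech segments, which real models would represent as phonemes.

```python
# Hypothetical mini-lexicon; spelling stands in for phonemic segments.
LEXICON = ["leg", "legal", "legend", "legislator", "lesson", "letter"]


def cohort_after_each_segment(segments, lexicon=LEXICON):
    """Return the surviving candidate words after each incoming segment."""
    cohort = list(lexicon)
    heard = ""
    history = []
    for segment in segments:
        heard += segment
        # Drop every candidate that no longer matches what has been heard.
        cohort = [word for word in cohort if word.startswith(heard)]
        history.append((heard, list(cohort)))
    return history


if __name__ == "__main__":
    for heard, cohort in cohort_after_each_segment("legi"):
        print(heard, cohort)
```

In this toy lexicon the input "legi" leaves a single candidate, which is the point at which the word becomes lexically unique.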
When do most words in English become lexically unique?
At the end of the word
What is delayed commitment?
Waiting until the maximum amount of information is received before deciding on meaning
What are two temporal processes in word recognition?
“Left to right” activation of cohorts and strategies to delete words in the activated lexicon (as they become “bad fits”)
Delay of decision allowing for the retrograde effects on perceptual decisions
When might we need delayed commitment?
3
/f/ vs. /θ/
nasals
Poor audio signals
What is the Metrical Segmentation Strategy?
3
We use the pattern of strong & weak syllables to identify word boundaries
Strong syllables are treated as potential word onsets
What are Lexically-Driven Segmentation strategies?
2
Pragmatic, semantic, & syntactic context
Lexical knowledge
When might we use Lexically-Driven Segmentation strategies?
In optimal situations
What are Sublexically-Driven Segmentation strategies?
2
Phonotactics, allophones, & coarticulation
Prosody
When might we use Sublexically-Driven Segmentation strategies?
(3)
With poor contextual information
With poor lexical information
With poor segmental information
The auditory cortex is found in ______ but it does not end there.
Heschl’s Gyrus
After the auditory cortex, the speech signal gets sent through what two streams?
Dorsal
Ventral
What is the Dorsal Stream?
3
Acoustic Phonetic Speech Codes ->
Auditory-Motor Interface ->
Articulatory-Based Speech Codes
What is the Ventral Stream?
2
Acoustic-Phonetic Speech Codes ->
Sound-Meaning Interface
Which part of the brain processes the Acoustic-Phonetic Speech Codes? Is it bilateral or unilateral?
Superior Temporal Gyrus
Bilateral
Which part of the brain processes the Auditory-Motor Interface? Is it bilateral or unilateral?
Sylvian Fissure - Parietal-Temporal Boundary
Unilateral - Left
Which part of the brain processes the Articulatory-Based Speech Codes? Is it bilateral or unilateral?
(2 + 1)
Posterior Inferior Frontal Gyrus
Dorsal Premotor Cortex (sensorimotor strip)
Unilateral - Left
Which part of the brain processes the Sound-Meaning Interface? Is it bilateral or unilateral?
Posterior Inferior Temporal Lobe
Unilateral - Left
Is speech processed in multiple areas of the brain?
Yes
Is speech processed bilaterally?
No - it’s mostly on the left
What are the six parts of a cochlear implant?
Microphone
Signal Processor (Chip)
Transmitter
Batteries
Receiver
Electrodes
What is the formula for Hz?
1000 msec / period in msec
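The formula above translates directly into a one-line function. The function name and the example periods are illustrative.

```python
def frequency_hz(period_ms):
    """Frequency in Hz from a period in msec: 1000 msec / period in msec."""
    if period_ms <= 0:
        raise ValueError("period must be positive")
    return 1000.0 / period_ms


if __name__ == "__main__":
    # A 10 msec period corresponds to 100 Hz; a 4 msec period to 250 Hz.
    print(frequency_hz(10), frequency_hz(4))
```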
What are the 6 tense English vowels?
/i/
/e/
/ɑ/
/ɔ/
/o/
/u/
Speech is what three things?
Visual
Auditory
Tactile