10 - Speech Perception Flashcards
What is Articulatory Phonetics?
The study of how speech sounds are produced by the structures of the mouth and vocal tract
Vowels have ______ air flow.
Unobstructed
Consonants have ______ airflow.
Obstructed
What is Acoustic Phonetics?
The study of the acoustic aspects of speech sounds
What are Acoustic Aspects of sound?
The physical properties (frequency, duration, intensity, etc.)
Why do we use Spectrograms?
To study the acoustics of speech
What is Parallel transmission?
We perceive phonemes as distinct sounds, but in the actual signal there is no clear break between them
(Phonemes are encoded continuously, at the same time)
What is the Segmentation Problem?
It’s acoustically hard to tell where words begin and end yet we have no problem perceiving the words
Other languages may sound fast to our ears, but in our own language we have no trouble understanding words even when people speak extremely quickly
What is the Lack of Invariance Problem?
There is no one-to-one correspondence between the acoustic cues and the phonemes perceived
One phoneme may have many different acoustic cues.
What is the psychological definition of a phoneme?
A category of sounds that we perceive to be the same sound.
Why is there variation in phoneme production?
Coarticulation
Variability between speakers
Variability within speakers
What is Coarticulation?
Overlapping articulation of phonemes
How we say a word is affected by what comes before and after it
Why is there variability between speakers?
Gender
Pitch
Accent
Speed
Age
Why is there variability within speakers?
People are sloppy speakers
They often say things slightly differently
What experiment did Pollack & Pickett do in 1964?
Cut up actual conversation (continuous speech)
Either played words in context or cut them out of context
Are words in context easy to understand?
Yes
Are words cut out of continuous speech easy to understand?
No
How do we perceive the difference between two tones? How do we label this?
As tones get closer and closer in pitch, we begin to lose our ability to discriminate between them and they begin to sound the same
Perception of non-speech is continuous
What is VOT?
Voice Onset Time
Time between the release of the consonant and the start of voicing
Which has a higher VOT: voiced stops or voiceless stops?
Voiceless
Is the perception of speech continuous?
No
There is no place where voiced and voiceless sounds blend and sound the same
What do we call the place where we all perceive the change from voiced to voiceless?
Phonemic Boundary
Our perception of consonants is _______.
Categorical
What is Categorical Perception?
We perceive a consonant as belonging to either one category or the other. The two categories never sound the same
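A minimal sketch of what categorical perception predicts, assuming an idealized listener who can tell two stops apart only when they receive different labels. The boundary value and the function names are illustrative assumptions, not taken from these cards.

```python
# Illustrative phonemic boundary in milliseconds (an assumed value).
BOUNDARY_MS = 25

def label(vot_ms, boundary=BOUNDARY_MS):
    """Assign a category to a stop consonant from its voice onset time."""
    return "voiceless" if vot_ms >= boundary else "voiced"

def sounds_different(vot_a, vot_b):
    """Idealized listener: two stops differ only if their labels differ."""
    return label(vot_a) != label(vot_b)

# The same 20 ms step is heard differently depending on where it falls.
within_voiced = sounds_different(0, 20)       # both below the boundary
across_boundary = sounds_different(20, 40)    # straddles the boundary
within_voiceless = sounds_different(40, 60)   # both above the boundary
```

Equal physical steps in VOT are only noticed here when they cross the boundary, which is the pattern that separates categorical from continuous perception.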
Why do vowels not show categorical perception?
Vowels occur over a longer timespan
We don’t need to identify them as quickly
What is the Motor Theory of Speech Perception?
We use our knowledge of production to understand speech
We can “feel” the movements of someone else’s speech so we know what they are saying
Perception is based on articulatory information, not just the signal
What does the Motor Theory of Speech Perception address?
The lack of invariance problem
What is the McGurk Effect?
If you hear /ba/, but see /ga/, you will perceive /da/
Your perception is a compromise between what is heard and what is seen
What could explain the McGurk Effect?
Motor Theory of Speech Perception
Top Down Processing
What does successful speech perception depend on?
Bottom-up processing of acoustic info
Top-down processing by using context, semantics, and syntactic info
What are some Top Down Effects in speech processing?
6
Context Effect
Illusions
Phonemic Restoration
Verbal Transformation Effect
Sinewave Speech
Backward Speech
What is the Context Effect?
Sentences were presented in extreme noise
The more sense the sentence made, the better it was heard
“Accidents kill motorists on the highway” / “Accidents carry honey between the house” / “Around accidents country honey the school”
Who came up with the Context Effect? When?
Miller & Isard
1963
What was the Illusions Experiment?
Nonsense audio stimulus was presented through a low-pass filter
Participants turned this into something they could understand due to top down processing
“Pooh kluss free soub eatwull size” => “Two plus three should equal five”
Who came up with the Illusions Experiment? When?
Miller
1956
What is Phonemic Restoration?
People still hear/perceive the missing phoneme (/s/) even when they are told that it is missing
Who came up with Phonemic Restoration? When?
Warren
1970
What stimuli did Warren use in his Phonemic Restoration experiment?
“The state governors met with their respective legi(cough)latures convening in their capital city”
"It was found that the *eel..."
"...was on the axle": heard "wheel"
"...was on the shoe": heard "heel"
"...was on the orange": heard "peel"
"...was on the table": heard "meal"
When do we often experience phonemic restoration in our everyday lives?
On the telephone
A lot of /s/ sounds are cut off but our brains fill them in
Is phonemic restoration easier when the phoneme is obscured by silence or by noise?
Noise
What is the Verbal Transformation Effect?
When you hear words over and over again, they can start to sound different
Farewell -> Welfare
Ace -> Say
What is Sinewave Speech?
If we replace speech with tones that match the basic formants, we will hear words in the non-speech signals
Who came up with Sinewave Speech?
Remez
What is Backward Speech?
Speech signal is split into various segments then these are reversed individually
As the segments get shorter, it becomes easier to resolve the backward speech
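The manipulation described above can be sketched as follows, assuming the speech signal is a plain list of samples and the segments have a fixed length. The function name `locally_reverse` is hypothetical.

```python
def locally_reverse(samples, segment_len):
    """Split the signal into fixed-length segments and reverse each one in place."""
    out = []
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        out.extend(reversed(segment))
    return out

signal = [1, 2, 3, 4, 5, 6, 7]
short_segments = locally_reverse(signal, 1)   # identical to the original
medium_segments = locally_reverse(signal, 3)  # [3, 2, 1, 6, 5, 4, 7]
one_segment = locally_reverse(signal, 7)      # fully reversed
```

With shorter segments the samples stay closer to their original positions, which fits the observation that shorter segments are easier to understand.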
What are three common Top-Down Effects experienced in everyday speech?
Hearing messages in songs played backwards
Understanding foreign accents
Children mishearing things (What are electrical votes?)
What is the Cohort Model?
1ST STAGE: you select a wide cohort based on phonetic info (bottom up)
“Jerry saw a d…”
Cohort = deck, deal, dog, etc.
2ND STAGE: you narrow the cohort based on more info and other variables (e.g., frequency of occurrence)
“Jerry saw a do….”
Cohort = dog, dock, doll, etc.
3RD STAGE: You fit the item into the context
“Jerry saw a dog barking in the park”
No better options
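The three stages above can be sketched as a simple filter over a word list. This is a rough illustration only: the toy lexicon is made up, spelling stands in for phonetic input, and the context check is a hypothetical placeholder for real semantic and syntactic processing.

```python
# Toy lexicon (hypothetical); spelling stands in for the phonetic signal.
LEXICON = ["deck", "deal", "dog", "dock", "doll", "cat"]

def cohort(heard_so_far, lexicon=LEXICON):
    """Stages 1 and 2: keep the words consistent with the input heard so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]

def fit_to_context(candidates, fits_context):
    """Stage 3: keep only the candidates that fit the sentence context."""
    return [word for word in candidates if fits_context(word)]

stage1 = cohort("d")    # wide cohort after "Jerry saw a d..."
stage2 = cohort("do")   # narrowed cohort after "Jerry saw a do..."
# Placeholder context check: only "dog" can be "barking in the park".
stage3 = fit_to_context(stage2, lambda word: word == "dog")
```

The sketch leaves out word frequency, which the second stage also uses to rank candidates.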
Who came up with the Cohort Model? When?
Marslen-Wilson
1987, 1990
What do we call the point where the word becomes unique?
Recognition point
Chrysan -> chrysanthemum
Eleph -> elephant
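A minimal sketch of finding where a word becomes unique, assuming a small made-up lexicon with no duplicate entries and using spelling as a stand-in for sound. The function name `recognition_point` is hypothetical, and the result depends entirely on which competitors the lexicon contains.

```python
def recognition_point(word, lexicon):
    """Return the shortest prefix of `word` that no other lexicon word shares."""
    for end in range(1, len(word) + 1):
        prefix = word[:end]
        matches = [w for w in lexicon if w.startswith(prefix)]
        if matches == [word]:
            return prefix
    return None  # the word never becomes unique in this lexicon

# Toy lexicon (hypothetical competitors for "elephant").
toy_lexicon = ["elephant", "elegant", "elevator", "element"]
point = recognition_point("elephant", toy_lexicon)
```

In this toy lexicon the word becomes unique at "elep"; a larger lexicon with more competitors would push the point later in the word.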
Do you need to hear the whole word for it to be recognizable?
No
How fast does the Cohort Model occur?
Extremely quickly. We are not aware that it is happening
What online resource works like the Cohort Model?
Google search
What is the TRACE Model?
A connectionist model of speech perception
Words are represented across different levels: words, phonemes, features
These levels interact with each other
You hear a sound then you sort through the features -> the phonemes -> find the word
Bottom-up with some top-down added in
Who came up with the TRACE Model?
McClelland & Elman
1986
What is the TRACE Model also called?
Parallel Distributed Processing Model (PDP)
What has trouble with speech perception? Why?
Computers, which also have problems with accents and speech errors
The lack of invariance problem causes trouble for computers
What do prosodic factors affect?
The overall utterance meaning
What are prosodic factors also called?
Suprasegmentals
What are prosodic factors?
Stress
Intonation
Tone
Rate or Length
Pausing
What is Stress?
The emphasis given to syllables and words (longer, louder, higher in pitch)
Can distinguish words (reJECT vs. REject; PRESent vs. preSENT)
Can words be hard to understand if we put the stress on the wrong syllable?
Yes
What is Intonation?
The use of pitch over phrases (Got the keys? vs. Got the keys!)
Usually rising for yes/no questions but falling for wh- questions
Misunderstandings in emails often occur due to the lack of intonation, and texting is even worse than email
What is Tone?
Use of pitch over words
Not as prominent in English but found in tonal languages, such as Chinese
What is Rate or Length in Speech?
Speed of speech
Can change the meaning of the word itself in some languages (Spanish: pero (but) vs. perro (dog))
Can alter meaning somewhat in English (excellent vs. eeexceleeent)
What is Pausing?
“coffee cake and honey” vs. “coffee, cake, and honey”
“106” vs. “100 and 6”
Why do artificial speech and “cut-up speech” sound strange?
It often lacks proper prosody
Phone numbers, auto reminder calls, voice menus
What are 3 Theories of Speech Perception?
Motor Theory of Speech Perception
Cohort Model
TRACE Model