Topic 6 - Speech Perception Flashcards by Jasmine Dimmock

Acoustic Signal

Produced by air that is pushed up from the lungs through the vocal cords and into the vocal tract

Vowels

Vowels are produced by vibration of the vocal cords and changes in the shape of the vocal tract by moving the articulators

These changes in shape cause changes in the resonant frequency of the vocal tract and produce peaks in pressure at a number of frequencies called formants

Each vowel has a characteristic series of ‘formants’ (resonant frequencies)

Articulators

structures such as tongue, lips, teeth, jaw and soft palate

Formants

Resonant Frequencies

The first formant has the lowest frequency, the second has the next lowest, and so on.

Sound/Speech spectrograms

Sound (speech) spectrograms plot frequency against time, with intensity shown by darkness, which makes changes in the frequency and intensity of speech easier to see

Consonants

Produced by a constriction of the vocal tract and air flow around articulators

Phoneme

smallest unit of speech that changes the meaning of a word

English has 47 phonemes, including 13 major vowel sounds and 24 major consonant sounds

Variability problem

There is no simple correspondence between the acoustic signal and individual phonemes

Variability comes from a phoneme’s context

Acoustic signals that vary can be perceived categorically, i.e., as ‘the same’ sound (for example, the /b/ in ‘bat’ and ‘bite’)

Coarticulation

overlap between articulation of neighbouring phonemes also causes variation

Variability between different speakers

Speakers differ in pitch, accent, speed in speaking, and pronunciation

This variable acoustic signal must be transformed into familiar words

Because of perceptual constancy, people perceive speech easily in spite of these variability problems

Categorical perception

This occurs when a wide range of acoustic cues results in the perception of a limited number of sound categories

Voice onset time (VOT)

time delay between when a sound starts and when voicing begins (when the vocal cords begin to vibrate)

VOT experiment Eimas & Corbit

VOT for /da/ is 17 ms (short) and VOT for /ta/ is 91 ms (long)

Computers were used to create stimuli with a range of VOTs from long to short

Listeners do not hear the incremental changes; instead, they hear a sudden change from /da/ to /ta/ at the phonetic boundary

Thus, we experience perceptual constancy for the phonemes within a given range of VOT
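
The pattern on this card can be sketched in a few lines of Python. This is an illustration only: the 17 ms and 91 ms values come from the card, but the boundary location (`ASSUMED_BOUNDARY_MS`, set here to the midpoint as a placeholder), the number of steps, and the function names are assumptions made for the example, not values reported by Eimas & Corbit.

```python
DA_VOT_MS = 17  # short VOT, from the card
TA_VOT_MS = 91  # long VOT, from the card

# ASSUMPTION: the card does not say where the phonetic boundary falls.
# The midpoint is used purely as a placeholder.
ASSUMED_BOUNDARY_MS = (DA_VOT_MS + TA_VOT_MS) / 2


def perceived_phoneme(vot_ms, boundary_ms=ASSUMED_BOUNDARY_MS):
    """Map a continuous VOT value onto one of two categories."""
    return "/da/" if vot_ms < boundary_ms else "/ta/"


def make_continuum(start_ms, stop_ms, steps):
    """Evenly spaced VOT values from start_ms to stop_ms inclusive."""
    step = (stop_ms - start_ms) / (steps - 1)
    return [start_ms + i * step for i in range(steps)]


continuum = make_continuum(DA_VOT_MS, TA_VOT_MS, 9)
labels = [perceived_phoneme(vot) for vot in continuum]

# The stimulus changes gradually, but the label changes only once.
switches = sum(1 for a, b in zip(labels, labels[1:]) if a != b)

for vot, label in zip(continuum, labels):
    print(f"{vot:6.2f} ms -> {label}")
print("category switches:", switches)
```

Every stimulus on one side of the boundary gets the same label, which is the sense in which listeners show perceptual constancy within a range of VOTs.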

The McGurk Effect

Visual stimulus shows a speaker saying “ga-ga.”

Auditory stimulus has a speaker saying “ba-ba.”

Observer watching and listening hears “da-da”, which is the midpoint between “ga” and “ba.”

Observer with eyes closed will hear “ba.”

Physiological link between vision and speech

Calvert et al. showed that the same brain areas are activated for lip reading and speech perception.

FFA activation

von Kriegstein et al. showed that the fusiform face area (FFA) is activated when listeners hear familiar voices

This shows a link between perceiving faces and voices.

Experiment by Rubin et al.

meaning and phoneme

Short words (sin, bat, and leg) and short nonwords (jum, baf, and teg) were presented to listeners.

The task was to press a button as quickly as possible when they heard a target phoneme

On average, listeners were faster with words (580 ms) than with nonwords (631 ms)

Experiment by Warren

meaning and phoneme

Listeners heard a sentence that had a phoneme covered by a cough

The task was to state where in the sentence the cough occurred.

Listeners could not correctly identify the position of the cough, and they did not notice that a phoneme was missing. This is called the phonemic restoration effect

This did not happen for non-word sentences

Experiment by Miller and Isard

meaning and phoneme

Stimuli were three types of sentences:

- Normal grammatical sentences 
- Anomalous sentences (made no sense) that were grammatical
- Ungrammatical strings of words

Listeners were to shadow (repeat aloud) the sentences as they heard them through headphones

Results showed that listeners were:

- 89% accurate with normal sentences
- 79% accurate with anomalous sentences
- 56% accurate with ungrammatical word strings

Segmentation Problem

there are no physical breaks in the continuous acoustic signal

Top-down processing, including knowledge a listener has about a language, affects perception of the incoming speech stimulus

Segmentation is affected by context, meaning, and our knowledge of word structure

Speech Segmentation

the perception of individual words in a conversation

Word Structure, Segmentation

Transitional probabilities - the chance that one sound will follow another in a language

Statistical learning - the process of learning transitional probabilities and other language characteristics

Statistical Learning experiment by Saffran et al.

Learning phase - infants heard nonsense words in two-minute strings of continuous sound that contained transitional probabilities

Nonsense words were in random order within the string

If infants use transitional probabilities, they should recognize the words as units even though the string of words had no breaks
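
A minimal sketch of why transitional probabilities could mark word boundaries in an unbroken stream. It estimates the chance that syllable Y follows syllable X as count(XY) / count(X). The three nonsense words, the stream length, and the function names are hypothetical and chosen for illustration; they are not the stimuli or the analysis from Saffran et al.

```python
import random
from collections import Counter

# HYPOTHETICAL three-syllable nonsense words (not the study's stimuli)
WORDS = [("ta", "mo", "ki"), ("ve", "lu", "sa"), ("no", "gi", "fe")]


def make_stream(n_words, seed=0):
    """Concatenate words in random order with no pauses between them."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_words):
        # Avoid immediate repeats so that word order is the only variable
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(word)
        previous = word
    return stream


def transitional_probabilities(syllables):
    """Return {(x, y): estimated probability that y follows x}."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


stream = make_stream(300)
tp = transitional_probabilities(stream)

word_final = {word[-1] for word in WORDS}
within = {pair: p for pair, p in tp.items() if pair[0] not in word_final}
across = {pair: p for pair, p in tp.items() if pair[0] in word_final}

print("within-word:", within)
print("across-word:", across)
```

In this toy stream each syllable inside a word is always followed by the same syllable, so within-word probabilities are 1.0, while the syllable that ends a word can be followed by the start of more than one word, so across-word probabilities are lower. A learner tracking these statistics could treat the dips as likely word boundaries.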

Indexical characteristics

characteristics of the speaker’s voice such as age, gender, emotional state, level of seriousness, etc

Speaker characteristics experiment by Palmeri et al.

Listeners were to indicate when a word was new in a sequence of words. Results showed that they were much faster when the same speaker was used for all the words than when a different speaker was used for each word

Broca's aphasia

Individuals have damage in Broca’s area in the frontal lobe

Their speech is laboured and stilted, with short sentences, but they understand others

Wernicke's aphasia

Individuals have damage in Wernicke’s area in the temporal lobe

They speak fluently, but the content is disorganized and not meaningful

They also have difficulty understanding others, and word deafness may occur in extreme cases

Brain damage

Some patients with brain damage can discriminate words but are unable to discriminate syllables (and vice versa)

Brain scans found:

- A “voice area” in the STS that is activated more by voices than by other sounds
- A ventral stream for recognizing speech and a dorsal stream that links the acoustic signal to movements for producing speech (called the dual stream model of speech perception)

Experience dependent plasticity

Before age one, human infants can tell the difference between the speech sounds used in all languages

The brain becomes “tuned” to respond best to speech sounds that are in the environment

The ability to differentiate other sounds disappears when there is no reinforcement from the environment

(30 cards)