Speech Recognition Flashcards by Chloe Lappen

what is acoustic phonetics?

study of the physical properties of speech

what is sound?

a vibration that propagates as an acoustic wave through a medium such as air
(in psychology: the reception and perception of such waves)

what is frequency?

the number of complete cycles a sound wave makes per second, measured in hertz (Hz).

what is amplitude?

height of the wave
taller the wave = louder the sound
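
A minimal sketch of how frequency and amplitude shape a pure tone, assuming a simple sine wave. The function names, sample rate, and default duration are illustrative choices, not part of the deck.

```python
import math

def period_s(frequency_hz):
    """Duration of one complete cycle, in seconds."""
    return 1.0 / frequency_hz

def sine_wave(frequency_hz, amplitude, duration_s=0.01, sample_rate=8000):
    """Samples of a pure tone: amplitude * sin(2 * pi * frequency * time)."""
    n_samples = round(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(n_samples)
    ]

quiet = sine_wave(frequency_hz=200, amplitude=0.5)
loud = sine_wave(frequency_hz=200, amplitude=2.0)

# Same frequency, so both tones cycle equally often;
# only the height of the wave (and so the loudness) differs.
print(period_s(200), max(quiet), max(loud))
```

Doubling the frequency halves the period, while changing the amplitude leaves the period untouched.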

what is a sound spectrogram?

a visual representation of the spectrum of frequencies in a sound as it varies over time

axes of a sound spectrogram?

frequency of sound on vertical axis, time on horizontal axis, intensity shown by darkness
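
A rough sketch of how a spectrogram's time and frequency axes could be computed, assuming a naive discrete Fourier transform over short, overlapping frames with no windowing. The frame size, hop, sample rate, and test tones are illustrative assumptions; a real analysis would normally use a signal-processing library.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_SIZE = 64     # samples per analysis frame (assumed)
HOP = 32            # samples between frame starts (assumed)

def tone(frequency_hz, n_samples):
    """A pure sine tone, used here as stand-in input."""
    return [math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def magnitude_spectrum(frame):
    """Naive DFT: intensity at each frequency bin from 0 up to half the sample rate."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectrogram(samples):
    """One column per time frame (horizontal axis); each column holds
    intensity per frequency bin (vertical axis)."""
    return [magnitude_spectrum(samples[start:start + FRAME_SIZE])
            for start in range(0, len(samples) - FRAME_SIZE + 1, HOP)]

def peak_frequency_hz(column):
    """Frequency of the most intense bin, i.e. the darkest point in that column."""
    k = max(range(len(column)), key=column.__getitem__)
    return k * SAMPLE_RATE / FRAME_SIZE

# A 500 Hz tone followed by a 1500 Hz tone.
signal = tone(500, 256) + tone(1500, 256)
spec = spectrogram(signal)
print(peak_frequency_hz(spec[0]), peak_frequency_hz(spec[-1]))
```

The darkest band sits at 500 Hz in the first column and at 1500 Hz in the last, which is the kind of change over time a spectrogram makes visible.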

sound spectrogram formants?

dark bands (i.e., frequencies with the most intensity)
▪ Steady-state formant (stays the same over time)
▪ Formant transitions (change over time)

problems posed for speech recognition?

lack of invariance
problem in speaker variability
segmentation problem

what is lack of invariance?

no one-to-one correspondence between speech cues and perception

what is the problem of speaker variability?

People differ in how they produce speech sounds, both across people and across occasions

what is a segmentation problem?

people typically do not leave breaks between words when speaking.

what is categorical perception?

We do not readily discriminate between sounds that fall within the same phonemic category
- ex: we classify each speech sound as one phoneme or another, not as something in between

modularity (revisited) and categorical perception?

Some people have taken categorical perception as evidence for a speech perception module
however, chinchillas also show categorical perception, which argues against a module specific to human speech

what are some speech segmentation strategies?

possible word constraint: tendency to segment speech so that each segment is a possible word
Bilingual speakers tend to use strategies that are consistent with their dominant language
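
A toy sketch of the possible word constraint, assuming (as a simplification) that a stretch of speech can only be a word if it contains a vowel. The letter-based check, the function names, and the nonsense carrier strings are hypothetical illustrations, not taken from the deck.

```python
VOWELS = set("aeiou")

def is_possible_word(chunk):
    """Assumed criterion: a chunk could be a word only if it contains a vowel."""
    return any(ch in VOWELS for ch in chunk)

def segmentation_allowed(carrier, target):
    """True if pulling `target` out of `carrier` leaves only residues
    that could themselves be words."""
    start = carrier.find(target)
    if start == -1:
        return False
    residues = [carrier[:start], carrier[start + len(target):]]
    return all(is_possible_word(r) for r in residues if r)

# "vuff" + "apple": the leftover "vuff" could be a word, so this split is acceptable.
print(segmentation_allowed("vuffapple", "apple"))
# "f" + "apple": the leftover "f" could not be a word, so this split is disfavored.
print(segmentation_allowed("fapple", "apple"))
```

The check works on spelling only to keep the sketch short; the constraint itself concerns speech sounds.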

what does context and speech recognition involve? when are people better at identifying words?

people are better at identifying words when presented in sentences than when presented in isolation
speech recognition involves bottom-up and top-down processing

what are the two views of context and speech recognition?

Autonomous view: context has effect after lexical access
Interactionist view: allows context to affect earlier (lower) levels of processing

context and speech recognition examples?

When shadowing, Ps have a tendency to correct speech errors (e.g., Marslen-Wilson & Welsh)
Phonemic restoration effect (e.g., Warren)
Semantic & syntactic factors in speech perception (e.g., Miller & Isard)

when do Ps tend to correct speech errors while shadowing?

Ps are more likely to correct an error when:
- the context is highly predictable (role of semantic & syntactic factors)
- the presented phoneme differs from the target phoneme by fewer distinctive features (role of bottom-up processing, e.g., gar vs. car)

what is the phonemic restoration effect?

the illusion that a phoneme deleted from a string of speech is actually there.
(ex: a cough replaces a phoneme, but listeners still report hearing the phoneme)

what are semantic & syntactic factors in speech perception?

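Miller & Isard: Ps shadow word strings in varying degrees of background noise
- 3 types of word strings:
* grammatical
* anomalous (grammatical but meaningless)
* ungrammatical
- shadowing was most accurate for grammatical strings and least accurate for ungrammatical strings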

what is prosody?

tune and rhythm of speech

Prosodic factors in speech recognition?

Stress
speech rate
characteristics of individual speakers

what is the McGurk effect?

Hear /ba/
See /ga/
Both: perceive /da/
*** Shows the importance of both visual & auditory information for speech perception

what are the two models of speech recognition?

cohort
TRACE

what is cohort?

Spoken word recognition occurs in stages:
1. Access stage: set up initial cohort, strictly bottom-up process
2. Selection stage: words are eliminated from cohort until 1 item is left
3. Integration stage: the selected item is integrated into the representation of the utterance

what is the selection stage based on?

- additional phonemic information
- context of spoken sentence (early version of model)

what part of the word do we pay more attention to?

the beginning - supported through selection stage

what is the TRACE model?

take all of the various sources of information found in speech and integrate them to identify single words.

what are connectionist models composed of?

- they contain a system of interconnected nodes
- they have excitatory (facilitatory) and inhibitory connections between nodes
- processing is massively parallel
- they have both top-down and bottom-up processing
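
The properties above can be sketched as a very small network. This is an illustrative toy, not the TRACE model itself: it has only phoneme and word nodes, and the words, input strengths, and connection weights are all assumed values chosen for the example.

```python
# Each word node is linked to the phoneme nodes it contains.
WORDS = {"cat": ("c", "a", "t"), "cap": ("c", "a", "p")}

# Bottom-up evidence from the signal: the final sound is ambiguous,
# leaning slightly toward "t" (assumed values).
BOTTOM_UP_INPUT = {"c": 1.0, "a": 1.0, "t": 0.6, "p": 0.4}

EXCITE = 0.3    # excitatory connection, phoneme -> word (bottom-up)
INHIBIT = 0.5   # inhibitory connection between rival words
FEEDBACK = 0.2  # excitatory connection, word -> phoneme (top-down)
RATE = 0.2      # how far each node moves toward its net input per step

def clamp(x):
    """Keep activations in the range 0..1."""
    return max(0.0, min(1.0, x))

def step(phonemes, words):
    """Update every node at once from the previous state (parallel processing)."""
    new_words = {}
    for w, segments in WORDS.items():
        support = EXCITE * sum(phonemes[p] for p in segments)
        rivalry = INHIBIT * sum(a for other, a in words.items() if other != w)
        new_words[w] = clamp(words[w] + RATE * (support - rivalry - words[w]))
    new_phonemes = {}
    for p, evidence in BOTTOM_UP_INPUT.items():
        feedback = FEEDBACK * sum(words[w] for w, segs in WORDS.items() if p in segs)
        new_phonemes[p] = clamp(phonemes[p] + RATE * (evidence + feedback - phonemes[p]))
    return new_phonemes, new_words

def run(n_steps=50):
    """Start all nodes at rest and let activation spread for n_steps."""
    phonemes = {p: 0.0 for p in BOTTOM_UP_INPUT}
    words = {w: 0.0 for w in WORDS}
    for _ in range(n_steps):
        phonemes, words = step(phonemes, words)
    return phonemes, words

phonemes, words = run()
print(words)
print(phonemes)
```

In this sketch the better-supported word ends up more active than its rival, and top-down feedback lifts the "t" node above the level its bottom-up input alone would give it.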

what is lexical access?

the retrieval of words from the mental lexicon, both in recognition and in production.

what is the TRACE model composed of?

- connectionist model
- 3 levels in the network: word, phoneme, and feature
- incorporates top-down effects on the activation of features

what is lexicon?

the vocabulary of a person, language, or branch of knowledge.

examples of lexicon?

"No-hitter," "go-ahead run," and "Baltimore chop" are part of the baseball lexicon.

example of prosody?

"Yeah, that was a great movie," can mean that the speaker liked the movie or the exact opposite, depending on the speaker's intonation.

what is coarticulation (lack of invariance)?

process of articulating more than one phoneme at a time

what is segmentation?

the process of dividing the speech signal into component words

example of segmentation?

when we sound out the word dog, we separate it into its three sounds: /d/-/o/-/g/

cohort model example?

comparing crocodile and dial: crocodile reaches its recognition point at the /d/, where no other word in the cohort still matches, well before the word ends; dial cannot be uniquely identified until its final /l/.
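
A minimal sketch of how a cohort shrinks as each sound arrives, using a tiny made-up lexicon. The rough one-symbol-per-sound transcriptions and the function names are assumptions for illustration only, and the sketch covers only bottom-up elimination, not sentence context.

```python
# Hypothetical mini-lexicon with rough transcriptions ("@" stands for a schwa).
LEXICON = {
    "crocodile": ("k", "r", "o", "k", "@", "d", "ai", "l"),
    "crockery": ("k", "r", "o", "k", "@", "r", "i"),
    "crop": ("k", "r", "o", "p"),
    "dial": ("d", "ai", "@", "l"),
    "diet": ("d", "ai", "@", "t"),
    "dime": ("d", "ai", "m"),
}

def cohort_after(heard, lexicon):
    """Words whose opening sounds match everything heard so far."""
    heard = tuple(heard)
    return sorted(w for w, segs in lexicon.items() if segs[:len(heard)] == heard)

def recognition_point(word, lexicon):
    """(number of sounds heard, last sound) at which `word` is the only
    candidate left in the cohort, or None if that never happens."""
    segments = lexicon[word]
    for i in range(1, len(segments) + 1):
        if cohort_after(segments[:i], lexicon) == [word]:
            return i, segments[i - 1]
    return None

print(cohort_after(("k", "r", "o", "k"), LEXICON))  # ['crockery', 'crocodile']
print(recognition_point("crocodile", LEXICON))      # (6, 'd')
print(recognition_point("dial", LEXICON))           # (4, 'l')
```

With this lexicon, crocodile is singled out two sounds before it ends, while dial stays ambiguous with diet until its last sound.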

(38 cards)