Speech Recognition Flashcards

1
Q

what is acoustic phonetics?

A

the study of the physical properties of speech sounds

2
Q

what is sound?

A

a vibration that propagates as an acoustic wave through a medium such as air
(we describe it by how we perceive its characteristics, e.g., pitch and loudness)

3
Q

what is frequency?

A

the number of complete cycles a sound wave makes per second (from the highest point to the lowest point and back again), measured in hertz (Hz).
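
As a rough illustration of frequency as cycles per second, the sketch below samples a pure tone and estimates its frequency by counting how often the wave crosses zero going upward (once per cycle). The function names, the 440 Hz tone, and the 8000 Hz sample rate are illustrative assumptions, not part of the card.

```python
import math

def sine_wave(freq_hz, sample_rate, duration_s, amplitude=1.0):
    """Sample a pure tone: amplitude sets the wave height, freq_hz the cycles per second."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

def estimate_frequency(samples, sample_rate):
    """Count upward zero crossings (one per cycle) and divide by the duration in seconds."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings / (len(samples) / sample_rate)

# Hypothetical 440 Hz tone, one second long.
tone = sine_wave(440, 8000, 1.0)
estimate = estimate_frequency(tone, 8000)
print(estimate)  # close to 440 cycles per second
```

Counting crossings only works for a clean single tone; real speech mixes many frequencies at once.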

4
Q

what is amplitude?

A

the height of the wave
the taller the wave, the louder the sound

5
Q

what is a sound spectrogram?

A

a visual representation of the spectrum of frequencies in a sound as it changes over time

6
Q

axes of a sound spectrogram?

A

frequency of sound on vertical axis, time on horizontal axis, intensity shown by darkness
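
The three dimensions above can be sketched as a grid of numbers: rows for frequency, columns for time, and each value for intensity. The code below is a minimal sketch using a naive discrete Fourier transform from the standard library; the function name, the frame size, and the 200 Hz test tone are assumptions chosen for illustration, and real tools use faster, windowed transforms.

```python
import cmath
import math

def spectrogram(samples, frame_size):
    """Return a grid indexed [frequency_bin][time_frame] of magnitudes.

    Rows correspond to the vertical (frequency) axis, columns to the
    horizontal (time) axis, and each value to the intensity (darkness).
    """
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    n_bins = frame_size // 2 + 1
    grid = [[0.0] * len(frames) for _ in range(n_bins)]
    for t, frame in enumerate(frames):
        for k in range(n_bins):
            # Naive discrete Fourier transform for frequency bin k.
            total = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, x in enumerate(frame))
            grid[k][t] = abs(total)
    return grid

# Hypothetical input: a 200 Hz tone sampled at 1600 Hz for a quarter second.
sample_rate = 1600
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
grid = spectrogram(tone, frame_size=64)
bin_width_hz = sample_rate / 64  # each row covers 25 Hz
loudest_bin = max(range(len(grid)), key=lambda k: grid[k][0])
print(loudest_bin * bin_width_hz)  # prints 200.0
```

Plotting the grid with darker cells for larger values would give the familiar picture, with the tone showing up as one dark horizontal band.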

7
Q

sound spectrogram formants?

A

formants are the dark bands (i.e., the frequencies with the most intensity)
▪ Steady-state formant (stays the same over time)
▪ Formant transition (changes over time)

8
Q

problems posed for speech recognition?

A
  • lack of invariance
  • speaker variability
  • segmentation problem
9
Q

what is lack of invariance?

A

no one-to-one correspondence between speech cues and perception

10
Q

what is the problem of speaker variability?

A

People differ in how they produce speech sounds, both across speakers and across occasions

11
Q

what is a segmentation problem?

A

people typically do not leave breaks between words when speaking.

12
Q

what is categorical perception?

A

We do not discriminate between sounds that fall within the same phonemic category
- ex: we classify each speech sound as one phoneme or another, not as something in between

13
Q

how does categorical perception relate to modularity?

A
  • Some people have taken categorical perception as evidence for a speech perception module
  • but chinchillas also show categorical perception, which suggests it is not unique to a human speech module
14
Q

what are some speech segmentation strategies?

A
  • possible word constraint: tendency to segment speech so that each segment is a possible word
  • Bilingual speakers tend to use strategies that are consistent with their dominant language
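
The possible word constraint can be sketched as a search that only accepts splits in which every piece is a word. This is a simplified illustration: it treats "possible word" as "word in a toy lexicon" and uses letters as a stand-in for speech sounds. The function name, the lexicon, and the input are all hypothetical.

```python
def segmentations(stream, lexicon):
    """Return every way to split `stream` so that each segment is in `lexicon`."""
    if not stream:
        return [[]]
    results = []
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in lexicon:
            # Keep this split only if the remainder can also be fully segmented.
            for rest in segmentations(stream[end:], lexicon):
                results.append([head] + rest)
    return results

# Toy lexicon and unbroken input, both hypothetical.
lexicon = {"sea", "she", "shell", "seashell"}
print(segmentations("seashell", lexicon))
# prints [['sea', 'shell'], ['seashell']]
```

The split "sea" + "she" + "ll" is rejected because the leftover "ll" is not a possible word, which is the behaviour the constraint describes.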
15
Q

how does context affect speech recognition? when are people better at identifying words?

A
  • people are better at identifying words when presented in sentences than when presented in isolation
  • speech recognition involves bottom-up and top-down processing
16
Q

what are the two views of context and speech recognition?

A
  • Autonomous view: context has effect after lexical access
  • Interactionist view: allows context to affect earlier (lower) levels of processing
17
Q

context and speech recognition examples?

A
  • When shadowing, Ps have a tendency to correct speech errors (e.g., Marslen-Wilson & Welsh)
  • Phonemic restoration effect (e.g., Warren)
  • Semantic & syntactic factors in speech perception (e.g., Miller & Isard)
18
Q

when are Ps more likely to correct speech errors while shadowing?

A

Ps are more likely to correct an error when:
- the context is highly predictable (role of semantic & syntactic factors)
- the presented phoneme differs from the target phoneme by fewer distinctive features (role of bottom-up processing, e.g., gar vs. car)

19
Q

what is the phonemic restoration effect?

A

the illusion that a phoneme deleted from a string of speech is actually there.
(ex: a cough replacing one of the sounds in a word)
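
One way to picture how word knowledge could fill in a missing sound is a lookup that treats the masked position as a wildcard. This is only a toy sketch, not a claim about how listeners do it; the function name, the `*` mask, and the three-word lexicon are hypothetical, and letters stand in for phonemes.

```python
def restore(heard, lexicon, mask="*"):
    """Return lexicon words that match `heard`, treating `mask` as any one sound."""
    return [word for word in lexicon
            if len(word) == len(heard)
            and all(h == mask or h == c for h, c in zip(heard, word))]

# Hypothetical lexicon; the "*" marks where a cough covered a sound.
lexicon = ["legislature", "literature", "legislator"]
print(restore("le*islature", lexicon))  # prints ['legislature']
```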

20
Q

what are semantic & syntactic factors in speech perception?

A

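Miller & Isard had Ps shadow word strings in varying degrees of background noise
- 3 types of word strings:
* grammatical (syntactically correct and meaningful)
* anomalous (syntactically correct but meaningless)
* ungrammatical
- shadowing was most accurate for grammatical strings, then anomalous, then ungrammatical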

21
Q

what is prosody?

A

tune and rhythm of speech

22
Q

Prosodic factors in speech recognition?

A
  • Stress
  • speech rate
  • characteristics of individual speakers
23
Q

what is the McGurk effect?

A
  • Hear /ba/
  • See /ga/
  • Both together: perceive /da/
    *** Shows the importance of both visual & auditory information for speech perception
24
Q

what are the two models of speech recognition?

A
  • cohort
  • TRACE
25
Q

what is the cohort model?

A

Spoken word recognition occurs in stages:
1. Access stage: set up initial cohort, strictly bottom-up process
2. Selection stage: words are eliminated from cohort until 1 item is left
3. Integration stage: the selected item is integrated into the representation of the utterance
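
The stages above can be sketched as a loop that narrows a candidate list one sound at a time. This is a minimal sketch under stated assumptions: letters stand in for phonemes, the lexicon is a hypothetical four-word list, and the integration stage is only marked by a comment.

```python
def recognize(phonemes, lexicon):
    """Narrow a cohort sound by sound; return (word, sounds_heard) or (None, n)."""
    # Access stage: the first sound alone sets up the initial cohort (bottom-up).
    cohort = [w for w in lexicon if w[:1] == phonemes[:1]]
    for heard in range(1, len(phonemes) + 1):
        # Selection stage: drop candidates inconsistent with the input so far.
        cohort = [w for w in cohort if w[:heard] == phonemes[:heard]]
        if len(cohort) == 1:
            # Integration stage would follow here; this sketch returns the winner.
            return cohort[0], heard
    return None, len(phonemes)

# Hypothetical lexicon; letters are a stand-in for phonemes.
lexicon = ["elephant", "elegant", "element", "eleven"]
print(recognize("elephant", lexicon))  # prints ('elephant', 4)
```

Here the word is picked out after four sounds, before it has been heard in full, because no other candidate is left in the cohort at that point.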

26
Q

what is the selection stage based on?

A
  • additional phonemic information
  • context of spoken sentence (early version of model)
27
Q

what part of the word do we pay more attention to?

A

the beginning
- supported by the selection stage of the cohort model

28
Q

what is the TRACE model?

A

a model that takes all of the various sources of information found in speech and integrates them to identify single words.

29
Q

what are connectionist models composed of?

A
  • they contain a system of interconnected nodes
  • they have excitatory (facilitatory) and inhibitory connections between nodes
  • processing is massively parallel
  • they have both top-down and bottom-up processing
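
A very small network can show two of these properties: excitatory input to each node and inhibitory connections between rival nodes, with all nodes updated in parallel. This is a toy sketch, not the TRACE implementation; the function name, the parameter values, and the two word nodes are assumptions, and top-down connections are left out to keep it short.

```python
def settle(bottom_up, steps=20, excite=0.2, inhibit=0.2, decay=0.1):
    """Run a tiny interactive-activation network and return the final activations.

    bottom_up maps each node name to its input support (0..1). Each node is
    excited by its own input and inhibited by the activation of its rivals.
    """
    act = {name: 0.0 for name in bottom_up}
    for _ in range(steps):
        new_act = {}
        for name, support in bottom_up.items():
            rivals = sum(a for other, a in act.items() if other != name)
            change = excite * support - inhibit * rivals - decay * act[name]
            # All nodes update in parallel from the previous step's activations.
            new_act[name] = min(1.0, max(0.0, act[name] + change))
        act = new_act
    return act

# Hypothetical input: the signal supports "car" slightly more than "gar".
final = settle({"car": 0.9, "gar": 0.6})
print(final)
```

With these assumed settings the better-supported node ends up more active and suppresses its rival, which is the kind of competition that inhibitory connections produce.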
30
Q

what is lexical access?

A

the retrieval of words from the mental lexicon, both in recognition and in production.

31
Q

what is the TRACE model composed of?

A
  • connectionist model
  • 3 levels in the network: word, phoneme, and feature
  • incorporates top-down effects on the activation of features
32
Q

what is a lexicon?

A

the vocabulary of a person, language, or branch of knowledge.

33
Q

examples of a lexicon?

A

“No-hitter,” “go-ahead run,” and “Baltimore chop” are part of the baseball lexicon.

34
Q

example of prosody?

A

“Yeah, that was a great movie,” can mean that the speaker liked the movie or the exact opposite, depending on the speaker’s intonation.

35
Q

what is coarticulation (lack of invariance)?

A

process of articulating more than one phoneme at a time

36
Q

what is segmentation?

A

the process of dividing the speech signal into component words

37
Q

example of segmentation?

A

when we sound out the word dog, we separate it into its three sounds: /d/-/o/-/g/

38
Q

cohort model example?

A

when discriminating between "crocodile" and "dial", the recognition point for "crocodile" comes at the /d/, which is much earlier in the word than the /l/ sound at the end of "dial".