01 - Speech Production, Perception, Phonetics Flashcards

1
Q

What is an end-to-end ML model?

A

An ML model designed to map the input directly to the output, without relying on multiple intermediate stages or components.

2
Q

What are the ethics for speech technologies?

A

Don’t record, don’t clone, consider

3
Q

What is the main use of the respiratory system?

A

Breathing, and controlling the air pressure used in speech production.

4
Q

What is the main use of the phonatory system?

A

It is a system of throat valves and protective cartilage (the larynx) that produces the sound source which excites the oral and nasal cavities (i.e. the human voice).

5
Q

What is the main use of the articulatory system?

A

It uses the upper vocal tract to excite the oral cavity and shape the spectral content of the voice. In other words, it is how we articulate and pronounce different sounds.

6
Q

How are vowel sounds produced?

A

By vibrating the vocal cords with no obstruction of the airflow.

7
Q

What is a node in a standing wave?

A

The place of no displacement.

8
Q

What is an antinode in a standing wave?

A

The place of most displacement.

9
Q

If the first resonance (formant) frequency is 500 Hz (F1 = 500), what are F2, F3 and F4?

A

1500 Hz (F1 × 3), 2500 Hz (F1 × 5) and 3500 Hz (F1 × 7)
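
The odd multiples in this answer match the resonances of a uniform tube that is closed at one end and open at the other, where the n-th resonance is (2n - 1) times the first. The sketch below assumes that simplified tube model; the function names, the 0.175 m tube length and the 350 m/s speed of sound are illustrative values, not taken from the cards.

```python
def formants_from_f1(f1_hz, count=4):
    """Odd-multiple resonances of a closed-open uniform tube: F_n = (2n - 1) * F1."""
    return [(2 * n - 1) * f1_hz for n in range(1, count + 1)]


def first_resonance_hz(tube_length_m, speed_of_sound_m_s=350.0):
    """Quarter-wavelength resonance of the same tube model: F1 = c / (4 * L)."""
    return speed_of_sound_m_s / (4.0 * tube_length_m)


print(formants_from_f1(500))                # [500, 1500, 2500, 3500]
print(round(first_resonance_hz(0.175), 1))  # 500.0 for the assumed 17.5 cm tube
```

Real vowels deviate from these values because the vocal tract is not a uniform tube; the model only explains why the neutral-tract resonances are spaced at odd multiples.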

10
Q

What is a consonant sound?

A

A sound produced by obstructing or restricting the air flow in the vocal tract.

11
Q

What is decibel (dB) measuring?

A

It measures the ratio between two values of power on a logarithmic scale.
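
As a small worked example, a power ratio can be converted to decibels with 10 · log10(P / P_ref). The function name and the sample values below are illustrative only.

```python
import math


def power_ratio_to_db(power, reference_power):
    """Express the ratio of two power values in decibels."""
    return 10.0 * math.log10(power / reference_power)


print(round(power_ratio_to_db(100.0, 1.0), 2))  # 20.0  (100x the power)
print(round(power_ratio_to_db(2.0, 1.0), 2))    # 3.01  (doubling the power)
print(round(power_ratio_to_db(0.5, 1.0), 2))    # -3.01 (halving the power)
```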

12
Q

What is the Mel Frequency Scale?

A

It is a perceptually motivated frequency scale based on the human auditory system’s response to sound.
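
The cards do not give a formula for the mel scale. The sketch below uses one widely used approximation, mel = 2595 · log10(1 + f / 700); other variants exist, so treat the constants as an assumption rather than the definition used in this class.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) approximation."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same approximation."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps in Hz shrink on the mel scale as frequency rises.
for frequency in (100, 1000, 4000, 8000):
    print(frequency, round(hz_to_mel(frequency), 1))
```

Under this approximation 1000 Hz lands at roughly 1000 mel, and the scale becomes increasingly compressed above that point.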

13
Q

What is a phone and what is a phoneme?

A

A phone is a unit of sound produced by the human vocal apparatus. A phoneme is a unit of sound that distinguishes a word from another (in a given language).

14
Q

What does the IPA stand for?

A

It stands for the International Phonetic Alphabet, NOT to be confused with the NATO phonetic alphabet (Alfa, Bravo, Charlie, etc.).

15
Q

Two ASCII-based phonetic alphabets are used alongside the IPA in this class. They are?

A

ARPAbet = General American English
SAMPA = European Portuguese, British English, and General American English
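
To illustrate how an ASCII alphabet such as ARPAbet relates to the IPA, here is a minimal sketch that converts a few ARPAbet symbols to IPA. The table is a small hand-picked subset, the function name is hypothetical, and mapping R to a plain "r" follows the broad transcriptions used elsewhere in these cards.

```python
# Partial ARPAbet-to-IPA table (illustrative subset, not the full alphabet).
ARPABET_TO_IPA = {
    "D": "d", "K": "k", "R": "r", "S": "s", "T": "t", "W": "w",
    "SH": "ʃ", "AA": "ɑ", "IY": "i", "UW": "u",
}


def arpabet_to_ipa(transcription):
    """Convert a space-separated ARPAbet string to IPA, ignoring stress digits."""
    symbols = []
    for token in transcription.split():
        base = token.rstrip("012")  # strip stress markers such as AA1
        symbols.append(ARPABET_TO_IPA[base])
    return "".join(symbols)


print(arpabet_to_ipa("D AA1 R K"))  # dɑrk
print(arpabet_to_ipa("W AA1 SH"))   # wɑʃ
print(arpabet_to_ipa("S UW1 T"))    # sut
```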

16
Q

Transcribe the following words as pronounced in British English, using the IPA: dark, suit, greasy, wash, water

A
  1. dark [dɑːk]
  2. suit [sjuːt]
  3. greasy [ˈɡriːzi]
  4. wash [wɒʃ]
  5. water [ˈwɔːtə]
17
Q

Transcribe the following words as pronounced in American English, using the IPA: dark, suit, greasy, wash, water

A
  1. dark [dɑrk]
  2. suit [sut]
  3. greasy [ˈɡrisi]
  4. wash [wɑʃ]
  5. water [ˈwɔtər]
18
Q

We classify speech sounds along 3 dimensions. They are?

A

Voicing, manner of articulation and place of articulation

19
Q

What is the role of ‘voicing’ in classification of speech sounds?

A

A sound can be voiced or voiceless (unvoiced):
Voiced sounds:
- “Zoo” - The ‘z’ sound in “zoo” is voiced.
- “Bag” - The ‘b’ sound in “bag” is voiced.

Voiceless sounds:
- “Cat” - The ‘c’ sound (/k/) in “cat” is voiceless.
- “Pop” - The ‘p’ sound in “pop” is voiceless.

In each example, the voiced sounds involve vocal cord vibration, while the voiceless sounds do not.

20
Q

What is the role of ‘manner of articulation’ in classification of speech sounds?

A

Consonants:
  1. Stop: Consonants produced by completely obstructing the airflow and then releasing it abruptly. Stops are also known as plosives. Examples include /p/, /b/, /t/, /d/, /k/, and /g/.
  2. Fricative: Consonants produced by creating a narrow opening in the vocal tract, causing the airflow to pass through with friction and resulting in a continuous sound. Examples include /f/, /v/, /s/, /z/, /ʃ/ (as in “she”), and /ʒ/ (as in “measure”).
  3. Lateral: Consonants produced by creating a partial closure in the vocal tract, allowing the airflow to pass along the sides of the tongue. The /l/ sound in English is a lateral consonant.
  4. Nasal: Consonants produced by closing the oral cavity and lowering the velum, allowing the sound to pass through the nasal cavity. Examples include /m/, /n/, and /ŋ/ (as in “sing”).
  5. Semi-vowel: Also known as a glide, semi-vowels function as consonants but have vowel-like qualities. They are produced with a relatively quick and smooth movement of the articulatory organs. Examples include /w/ (as in “well”) and /j/ (as in “yes”).

Vowels (in general):
Vowels are speech sounds produced with an open vocal tract, allowing the airflow to pass freely without significant constriction. They are characterized by the position of the tongue and lips. Vowels are described based on several dimensions:

  1. Height: Describes the vertical position of the tongue in the mouth. It can be classified as high (/i/, /u/), mid (/e/, /o/), or low (/a/).
  2. Backness: Describes the horizontal position of the highest point of the tongue. It can be classified as front (/i/, /e/), central (/ə/), or back (/u/, /o/).
  3. Roundness: Describes whether the lips are rounded or unrounded during vowel production. It can be classified as rounded (/u/, /o/) or unrounded (/i/, /e/).
21
Q

What is the role of ‘place of articulation’ in classification of speech sounds?

A

Consonants:
  1. Labial: Consonants produced using the lips. Examples include /p/ and /b/ (bilabial), /f/ and /v/ (labiodental).
  2. Dental: Consonants produced with the tongue tip placed against or near the upper teeth. An example is the “th” sound as in “thin” or “that” (/θ/ and /ð/).
  3. Alveolar: Consonants produced with the tongue tip or blade against the alveolar ridge, located behind the upper front teeth. Examples include /t/, /d/, /s/, /z/, /n/, and /l/.
  4. Palatal: Consonants produced with the tongue touching or approaching the hard palate. Examples include /j/ (as in “yes”); the “sh” sound (/ʃ/) in “she” is often grouped here, although it is more precisely palato-alveolar.
  5. Velar: Consonants produced with the back of the tongue against the soft part of the roof of the mouth (velum). Examples include /k/, /g/, and /ŋ/ (as in “sing”).
  6. Uvular: Consonants produced with the back of the tongue against the uvula. Examples occur in languages like French and Arabic, such as the French “r” sound in “Paris” (/ʁ/).
  7. Glottal: Consonants produced at the level of the glottis, the opening between the vocal cords. An example is the glottal stop (/ʔ/) as in the Cockney English pronunciation of “butter.”

Vowels:
  1. Front/Central/Back: Describes the position of the highest point of the tongue during vowel production. Front vowels (/i/, /e/, /ɛ/) are produced with the front of the tongue raised, central vowels (/ə/) with the tongue in a neutral position, and back vowels (/u/, /o/, /ɔ/) with the back of the tongue raised.
  2. Close/Mid/Open: Describes the degree of openness of the mouth during vowel production. Close vowels (/i/, /u/) are produced with a relatively small degree of mouth opening, mid vowels (/e/, /o/) with a moderate degree, and open vowels (/a/, /ɑ/) with a wide degree of opening.
  3. Roundness: Describes whether the lips are rounded or unrounded during vowel production. Rounded vowels (/u/, /o/, /ɔ/) involve lip rounding, while unrounded vowels (/i/, /e/, /ɑ/) do not.

These concepts help linguists classify and describe the articulatory features of speech sounds, contributing to our understanding of the phonetic and phonological systems of languages.
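
The three classification dimensions (voicing, manner and place) can be sketched as a small feature table. The selection of consonants, the class name and the helper functions below are illustrative; the place labels follow the coarse categories used in these cards (for example, /f/ and /v/ are filed under "labial").

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    voiced: bool
    manner: str
    place: str


# Hand-picked subset of the English consonants mentioned in the cards.
CONSONANTS = {
    "p": Consonant(False, "stop", "labial"),
    "b": Consonant(True, "stop", "labial"),
    "t": Consonant(False, "stop", "alveolar"),
    "d": Consonant(True, "stop", "alveolar"),
    "k": Consonant(False, "stop", "velar"),
    "g": Consonant(True, "stop", "velar"),
    "f": Consonant(False, "fricative", "labial"),
    "v": Consonant(True, "fricative", "labial"),
    "θ": Consonant(False, "fricative", "dental"),
    "ð": Consonant(True, "fricative", "dental"),
    "s": Consonant(False, "fricative", "alveolar"),
    "z": Consonant(True, "fricative", "alveolar"),
    "m": Consonant(True, "nasal", "labial"),
    "n": Consonant(True, "nasal", "alveolar"),
    "ŋ": Consonant(True, "nasal", "velar"),
    "l": Consonant(True, "lateral", "alveolar"),
}


def describe(symbol: str) -> str:
    """Return a 'voicing place manner' label for a consonant in the table."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "voiceless"
    return f"{voicing} {c.place} {c.manner}"


def differing_features(a: str, b: str) -> list[str]:
    """List the feature names on which two consonants differ."""
    first, second = CONSONANTS[a], CONSONANTS[b]
    return [name for name in Consonant._fields
            if getattr(first, name) != getattr(second, name)]


print(describe("v"))                 # voiced labial fricative
print(differing_features("p", "b"))  # ['voiced']
print(differing_features("s", "k"))  # ['manner', 'place']
```

Pairs such as /p/ and /b/ differ only in voicing, which is why the three dimensions together are enough to tell these consonants apart.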

22
Q

What is the place of articulation of the consonant ‘v’?
1. labial
2. dental
3. palatal
4. velar

A
  1. labial

Think like this:
Labial for Lips
Dental for Teeth
Palatal for PalaceROOF (the roof of the mouth)
Velar for Valley (furthest back in the mouth)

23
Q

What are the 3 elements of ‘prosody’?

A

Rhythm (timing), stress and intonation

24
Q

What is the role of prosodic boundaries?

A

They separate prosodic units and indicate the rhythm of a sentence. For example: ‘Mary | and John | went to the cinema ||.’
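
As a rough illustration of the boundary notation in this example, the sketch below splits an annotated sentence into its prosodic units. It assumes only that "|" and "||" mark boundaries, as in the example; the function name is hypothetical and the sketch does not distinguish the two boundary strengths.

```python
import re


def split_prosodic_units(annotated_sentence):
    """Split a sentence on '|' / '||' boundary markers into prosodic units."""
    parts = re.split(r"\|+", annotated_sentence)
    # Drop surrounding spaces and the trailing full stop, then discard empty pieces.
    units = [part.strip(" .") for part in parts]
    return [unit for unit in units if unit]


print(split_prosodic_units("Mary | and John | went to the cinema ||."))
# ['Mary', 'and John', 'went to the cinema']
```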

25
Q

Summary question: Spoken language communication is the?

A

Exchange of information through words and sounds.

26
Q

Summary question: Speech production is the?

A

Generation of speech sounds with the human body.

27
Q

Summary question: Speech perception is the?

A

Making sense of spoken language (understanding).

28
Q

Summary question: Phonetics is the?

A

Classification of speech sounds.

29
Q

Summary question: Prosody is the?

A

Rhythm, stress and intonation of speech.