01 - Speech Production, Perception, Phonetics Flashcards

Question 1

Q

What is an end-to-end ML model?

Answer

A

An ML model designed to map the input directly to the output, without relying on multiple separate stages or components.

Question 2

Q

What are the ethics for speech technologies?

Answer

A

Don’t record, don’t clone, consider

Question 3

Q

What is the main use of the respiratory system?

Answer

A

Breathing, and controlling the air pressure used in speech production.

Question 4

Q

What is the main use of the phonatory system?

Answer

A

It is a system of throat valves and protective cartilage that stimulates the oral and nasal cavities to produce sound (i.e., the human voice).

Question 5

Q

What is the main use of the articulatory system?

Answer

A

It uses the upper vocal tract to stimulate the oral cavity and shape the spectral content of the voice. In other words, it is how we articulate and pronounce different sounds.

Question 6

Q

How are vowel sounds produced?

Answer

A

By vibrating the vocal cords with no obstruction of the airflow.

Question 7

Q

What is a node in a standing wave?

Answer

A

A point of zero displacement.

Question 8

Q

What is an antinode in a standing wave?

Answer

A

A point of maximum displacement.

Question 9

Q

If the first resonant frequency is 500 Hz (F1 = 500 Hz), what are F2, F3 and F4?

Answer

A

1500 Hz (F1 × 3), 2500 Hz (F1 × 5) and 3500 Hz (F1 × 7)
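
The odd-multiple pattern matches the resonances of a uniform tube closed at one end (the glottis) and open at the other (the lips), a common idealisation of the vocal tract for a neutral vowel. A minimal sketch, assuming that quarter-wavelength model, F_n = (2n - 1) * c / (4 * L). The tube length (0.17 m) and speed of sound (340 m/s) are assumed values that are not given in the card.

```python
def tube_formants(length_m=0.17, speed_of_sound=340.0, count=4):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound / (4 * length_m) for n in range(1, count + 1)]


# With the assumed length and speed of sound, F1 comes out at about 500 Hz.
formants = tube_formants()
print([round(f) for f in formants])  # [500, 1500, 2500, 3500]
```

A shorter tube raises all resonances proportionally, which is one reason shorter vocal tracts tend to have higher formants.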

Question 10

Q

What is a consonant sound?

Answer

A

A sound produced by obstructing or restricting the air flow in the vocal tract.

Question 11

Q

What does the decibel (dB) measure?

Answer

A

It measures the ratio between two power values on a logarithmic scale.
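
As a worked illustration, the level in dB is 10 * log10(P / P_ref) for powers. For amplitudes such as sound pressure, the factor is 20 instead, because power is proportional to amplitude squared. The function names below are illustrative only.

```python
import math


def power_ratio_to_db(power, reference_power):
    """Level in dB of `power` relative to `reference_power`."""
    return 10.0 * math.log10(power / reference_power)


def amplitude_ratio_to_db(amplitude, reference_amplitude):
    """Same level computed from amplitudes (power scales with amplitude squared)."""
    return 20.0 * math.log10(amplitude / reference_amplitude)


print(power_ratio_to_db(100.0, 1.0))          # a 100x power ratio is 20 dB
print(amplitude_ratio_to_db(10.0, 1.0))       # a 10x amplitude ratio is also 20 dB
print(round(power_ratio_to_db(2.0, 1.0), 2))  # doubling the power adds about 3 dB
```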

Question 12

Q

What is the Mel Frequency Scale?

Answer

A

It is a perceptually motivated frequency scale based on the human auditory system’s response to sound.
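
Several formulas for the mel scale are in use. A minimal sketch, assuming one commonly used variant, mel = 2595 * log10(1 + f / 700). The card itself does not say which formula the class uses.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert Hz to mel using the assumed 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps in Hz become smaller steps in mel as frequency rises.
for f in (500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f)))
```

Under this formula, 1000 Hz maps to roughly 1000 mel. The scale is close to linear below that and approximately logarithmic above it.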

Question 13

Q

What is a phone and what is a phoneme?

Answer

A

A phone is a unit of sound produced by the human vocal apparatus. A phoneme is a unit of sound that distinguishes a word from another (in a given language).

Question 14

Q

What does the IPA stand for?

Answer

A

It stands for the International Phonetic Alphabet. NOT to be confused with the NATO Phonetic Alphabet (which is Alfa, Bravo, Charlie, etc.).

Question 15

Q

In this class, two ASCII-based phonetic alphabets derived from the IPA are used. They are?

Answer

A

ARPAbet = General American English
SAMPA = European Portuguese, British English, and General American English

Question 16

Q

Transcribe the following words as pronounced in British English, using the IPA: dark, suit, greasy, wash, water

Answer

A

dark [dɑːk]
suit [sjuːt]
greasy [ˈɡriːzi]
wash [wɒʃ]
water [ˈwɔːtə]

Question 17

Q

Transcribe the following words as pronounced in American English, using the IPA: dark, suit, greasy, wash, water

Answer

A

dark [dɑrk]
suit [sut]
greasy [ˈɡrisi]
wash [wɑʃ]
water [ˈwɔtər]
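
The same five words can be written in ARPAbet (the General American alphabet from Question 15) and mapped back to IPA. This is a rough sketch: the table is partial and covers only the symbols these words need, the ARPAbet spellings are assumed (stress digits omitted), and "ER" is simplified to "ər" to match the transcription above.

```python
# Partial, illustrative ARPAbet-to-IPA table (not a complete mapping).
ARPABET_TO_IPA = {
    "D": "d", "K": "k", "S": "s", "T": "t", "G": "ɡ", "R": "r", "W": "w",
    "SH": "ʃ",
    "AA": "ɑ", "UW": "u", "IY": "i", "AO": "ɔ",
    "ER": "ər",  # simplification; often written as one r-coloured vowel
}

# Assumed ARPAbet spellings of the five words, without stress digits.
WORDS = {
    "dark": "D AA R K",
    "suit": "S UW T",
    "greasy": "G R IY S IY",
    "wash": "W AA SH",
    "water": "W AO T ER",
}


def arpabet_to_ipa(phones):
    """Convert a space-separated ARPAbet string to an IPA string."""
    return "".join(ARPABET_TO_IPA[p] for p in phones.split())


for word, phones in WORDS.items():
    print(word, "[" + arpabet_to_ipa(phones) + "]")
```

Because stress is not encoded here, the output lacks the stress mark (ˈ) shown for "greasy" and "water".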

Question 18

Q

We classify speech sounds along 3 dimensions. They are?

Answer

A

Voicing, manner of articulation and place of articulation

Question 19

Q

What is the role of ‘voicing’ in the classification of speech sounds?

Answer

A

A sound can be voiced or voiceless (unvoiced):
Voiced sounds:
- “Zoo” - The ‘z’ sound in “zoo” is voiced.
- “Bag” - The ‘b’ sound in “bag” is voiced.

Voiceless sounds:
- “Cat” - The ‘c’ sound (/k/) in “cat” is voiceless.
- “Pop” - The ‘p’ sound in “pop” is voiceless.

In each example, the voiced sounds involve vocal cord vibration, while the voiceless sounds do not.

Question 20

Q

What is the role of ‘manner of articulation’ in the classification of speech sounds?

Answer

A

Consonants:
1. Stop: Consonants produced by completely obstructing the airflow and then releasing it abruptly. Stops are also known as plosives. Examples include /p/, /b/, /t/, /d/, /k/, and /g/.
2. Fricative: Consonants produced by creating a narrow opening in the vocal tract, causing the airflow to pass through with friction and resulting in a continuous sound. Examples include /f/, /v/, /s/, /z/, /ʃ/ (as in “she”), and /ʒ/ (as in “measure”).
3. Lateral: Consonants produced by creating a partial closure in the vocal tract, allowing the airflow to pass along the sides of the tongue. The /l/ sound in English is a lateral consonant.
4. Nasal: Consonants produced by lowering the velum, allowing the sound to pass through the nasal cavity. Examples include /m/, /n/, and /ŋ/ (as in “sing”).
5. Semi-vowel: Also known as a glide, semi-vowels function as consonants but have vowel-like qualities. They are produced with a relatively quick and smooth movement of the articulatory organs. Examples include /w/ (as in “well”) and /j/ (as in “yes”).

Vowels (in general):
Vowels are speech sounds produced with an open vocal tract, allowing the airflow to pass freely without significant constriction. They are characterized by the position of the tongue and lips. Vowels are described based on several dimensions:

Height: Describes the vertical position of the tongue in the mouth. It can be classified as high (/i/, /u/), mid (/e/, /o/), or low (/a/).
Backness: Describes the horizontal position of the highest point of the tongue. It can be classified as front (/i/, /e/), central (/ə/), or back (/u/, /o/).
Roundness: Describes whether the lips are rounded or unrounded during vowel production. It can be classified as rounded (/u/, /o/) or unrounded (/i/, /e/).

Question 21

Q

What is the role of ‘place of articulation’ in the classification of speech sounds?

Answer

A

Consonants:
1. Labial: Consonants produced using the lips. Examples include /p/ and /b/ (bilabial), /f/ and /v/ (labiodental).
2. Dental: Consonants produced with the tongue tip placed against or near the upper teeth. An example is the “th” sound as in “thin” or “that” (/θ/ and /ð/).
3. Alveolar: Consonants produced with the tongue tip or blade against the alveolar ridge, located behind the upper front teeth. Examples include /t/, /d/, /s/, /z/, /n/, and /l/.
4. Palatal: Consonants produced with the tongue touching or approaching the hard palate. Examples include /j/ (as in “yes”) and, in this coarse grouping, the “sh” sound (/ʃ/) in “she”, which is more precisely palato-alveolar.
5. Velar: Consonants produced with the back of the tongue against the soft part of the roof of the mouth (velum). Examples include /k/, /g/, and /ŋ/ (as in “sing”).
6. Uvular: Consonants produced with the back of the tongue against the uvula. These occur in languages like French and Arabic; an example is the French “r” sound in “Paris” (/ʁ/).
7. Glottal: Consonants produced at the level of the glottis, the opening between the vocal cords. An example is the glottal stop (/ʔ/) as in the Cockney English pronunciation of “butter.”

Vowels:
1. Front/Central/Back: Describes the position of the highest point of the tongue during vowel production. Front vowels (/i/, /e/, /ɛ/) are produced with the front of the tongue raised, central vowels (/ə/) with the tongue in a neutral position, and back vowels (/u/, /o/, /ɔ/) with the back of the tongue raised.
2. Close/Mid/Open: Describes the degree of openness of the mouth during vowel production. Close vowels (/i/, /u/) are produced with a relatively small degree of mouth opening, mid vowels (/e/, /o/) with a moderate degree, and open vowels (/a/, /ɑ/) with a wide degree of opening.
3. Roundness: Describes whether the lips are rounded or unrounded during vowel production. Rounded vowels (/u/, /o/, /ɔ/) involve lip rounding, while unrounded vowels (/i/, /e/, /ɑ/) do not.

These concepts help linguists classify and describe the articulatory features of speech sounds, contributing to our understanding of the phonetic and phonological systems of languages.
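
Voicing, place and manner can be combined into a small feature table. The sketch below is illustrative only: it covers a handful of the consonants listed above, uses the coarse place labels from these cards, and all names (Consonant, CONSONANTS, differing_features) are made up for the example.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    voiced: bool
    place: str
    manner: str


# A few consonants from the cards, described by their three features.
CONSONANTS = {
    "p": Consonant(False, "labial", "stop"),
    "b": Consonant(True, "labial", "stop"),
    "f": Consonant(False, "labial", "fricative"),
    "v": Consonant(True, "labial", "fricative"),
    "t": Consonant(False, "alveolar", "stop"),
    "d": Consonant(True, "alveolar", "stop"),
    "s": Consonant(False, "alveolar", "fricative"),
    "z": Consonant(True, "alveolar", "fricative"),
    "k": Consonant(False, "velar", "stop"),
    "g": Consonant(True, "velar", "stop"),
    "m": Consonant(True, "labial", "nasal"),
    "n": Consonant(True, "alveolar", "nasal"),
    "l": Consonant(True, "alveolar", "lateral"),
}


def differing_features(a, b):
    """Names of the features on which two consonants differ."""
    first, second = CONSONANTS[a], CONSONANTS[b]
    return [name for name in Consonant._fields
            if getattr(first, name) != getattr(second, name)]


print(differing_features("p", "b"))  # ['voiced']
print(differing_features("s", "k"))  # ['place', 'manner']
```

Pairs such as /p/ and /b/ differ only in voicing, which is why "pop" and "bob" sound so similar.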

Question 22

Q

What is the place of articulation of the consonant ‘v’?
1. labial
2. dental
3. palatal
4. velar

Answer

A

labial

Think like this:
Labial for Lips
Dental for Teeth
Palatal for Palace ROOF (the roof of the mouth)
Velar for Valley (furthest back in the mouth)

Question 23

Q

What are the 3 elements of ‘prosody’?

Answer

A

Rhythm (timing), stress and intonation

Question 24

Q

What is the role of prosodic boundaries?

Answer

A

They separate prosodic units and convey the rhythm of a sentence. For example: ‘Mary | and John | went to the cinema ||.’
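
A small sketch of how the annotated example could be split into its prosodic units. It treats ‘|’ and ‘||’ alike as boundaries, which is a simplification. The card does not define the difference between the two markers, and the function name is made up for the example.

```python
import re


def split_prosodic_units(annotated):
    """Split a sentence on '|' or '||' markers and return the non-empty units."""
    pieces = re.split(r"\|+", annotated)
    # Drop surrounding spaces and the sentence-final full stop.
    units = [piece.strip(" .") for piece in pieces]
    return [unit for unit in units if unit]


print(split_prosodic_units("Mary | and John | went to the cinema ||."))
# ['Mary', 'and John', 'went to the cinema']
```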

Question 25

Q

Summary question: Spoken language communication is the?

Answer

A

Exchange of information using words and sounds.

Question 26

Q

Summary question: Speech production is the?

Answer

A

Generation of speech sounds with the human body

Question 27

Q

Summary question: Speech perception is the?

Answer

A

Process of making sense of spoken language (understanding).

Question 28

Q

Summary question: Phonetics is the?

Answer

A

Classification of speech sounds.

Question 29

Q

Summary question: Prosody is the?

Answer

A

Rhythm, stress and intonation of speech.