Definitions Flashcards

Question 1

Q

Define the notion of spontaneous speech and mention the names of seven sources for collecting spontaneous speech corpora.

Answer

A

Spontaneous speech refers to any naturally occurring discourse in a social context in which the participants freely choose their words (vocabulary) and syntax.

It can be collected from the following sources:
- interviews
- dialogues/face-to-face conversations
- self-talk
- storytelling
- jokes, the retelling of dreams, and other types of spontaneous speech
- sports reporting in radio/television broadcasts
- movie and theatre language

Question 2

Q

Explain what speech science is. What fields are related to speech science?

Answer

A

Speech science is the name of a general field of scientific research that investigates human speech in terms of the physical processes involved in its production and perception; hence it is divided into the study of speech production and the study of speech perception.
The fields related to speech science are the following: child language acquisition, teaching speech, the study of pathological speech and speech disorders, speech synthesis, speech processing, speech technology, speech recognition, voice identification, speech-to-speech translation. All these fields benefit from the contributions of acoustic phonetics.

Question 3

Q

Explain what we study in acoustic phonetics.

Answer

A

In acoustic phonetics we study how speech sounds are formed acoustically, in terms of the acoustic manifestation of the linguistic features occurring in speech, and we investigate their relationship from the perspectives of their production and perception mechanisms.

Question 4

Q

Explain what is meant by the term speech corpus.

Answer

A

A speech corpus is a database of speech consisting of speech audio files, usually with corresponding written transcriptions.
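As a rough illustration of this definition, a corpus can be modelled as a list of entries, each pairing an audio file with an optional transcription. The class name, field names, file names, and transcriptions below are all hypothetical and only serve the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CorpusEntry:
    """One item of a speech corpus (hypothetical layout for illustration)."""
    audio_path: str                      # path to the speech audio file
    transcription: Optional[str] = None  # written transcription, if available
    speech_type: str = "spontaneous"     # "spontaneous" or "read-aloud"


# Hypothetical entries; the file names and texts are invented examples.
corpus = [
    CorpusEntry("interview_01.wav", "so where did you grow up", "spontaneous"),
    CorpusEntry("wordlist_01.wav", "heed hid head had", "read-aloud"),
    CorpusEntry("story_01.wav"),  # audio recorded but not yet transcribed
]

# "Usually with corresponding transcriptions": some entries may lack one.
transcribed = [entry for entry in corpus if entry.transcription is not None]
print(len(transcribed), "of", len(corpus), "entries are transcribed")
```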

Question 5

Q

Explain the differences between read aloud speech and spontaneous speech.

Answer

A

Read-aloud speech involves a more carefully articulated performance of the speech sounds and prosody, during which the speaker tends to be, or has to be, more formal. It therefore lacks the naturalness of spontaneous speech in terms of the natural articulation of speech sounds and prosody. Naturalness in speech includes the speech rate variations of the utterance, the energy intensity variations of sounds throughout the utterance, and the hesitations, pauses, and micro-pauses made by the speaker. Read-aloud speech also lacks the context of situation in which natural spoken language is physically realized, as well as the speech style.

Question 6

Q

What are the sources of the corpora of read aloud speech?

Answer

A

1) News broadcasts.
2) Excerpts from books, such as story reading. News broadcasts and book excerpts are better suited to the supra-segmental analysis of speech than to the analysis of vowels and consonants.
3) Word lists: isolated words and sentences read aloud in a phonetic laboratory. This type of corpus is the most preferred one for the acoustic analysis of the vowels and consonants of a language.

Question 7

Q

What is the objective of conducting a speech waveform analysis? What is required for a speech waveform analysis?

Answer

A

We conduct a waveform analysis to recognize certain elements/acoustic features of the physical composition of the spoken segment under analysis.
A waveform analysis requires: (1) the acoustic signal of the segment, recorded using a microphone, (2) a digital computer, and (3) speech acoustic software to derive the waveform from the signal and display it on the computer screen.
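The third requirement, software that derives the waveform from the recorded signal, can be sketched with Python's standard `wave` module. Since no real recording is available here, the sketch first writes a synthetic 200 Hz tone to an in-memory WAV file as a stand-in for a microphone recording, then reads the amplitude samples back. The helper names, the sample rate, and the tone are assumptions, and the reader assumes mono 16-bit audio.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def write_wav(samples, sample_rate=SAMPLE_RATE):
    """Store 16-bit mono samples in an in-memory WAV file and return its bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)   # mono
        writer.setsampwidth(2)   # 2 bytes = 16-bit samples
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def read_waveform(wav_bytes):
    """Return (sample_rate, amplitude samples) from mono 16-bit WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        rate = reader.getframerate()
        count = reader.getnframes()
        raw = reader.readframes(count)
    return rate, list(struct.unpack("<%dh" % count, raw))


# Stand-in for a microphone recording: 0.1 s of a 200 Hz tone.
tone = [round(10000 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE))
        for t in range(1600)]

rate, waveform = read_waveform(write_wav(tone))
duration_s = len(waveform) / rate
peak = max(abs(s) for s in waveform)
print(f"{duration_s} s, peak amplitude {peak}")
```

The list `waveform` holds the amplitude-over-time values that speech software plots as the waveform display.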

Question 8

Q

Explain what is meant by the term voice source.

Answer

A

The term voice source of energy refers to the energy that is produced by the periodic pulsing of the vocal folds. This happens when the airflow passes from the lungs into the mouth while the vocal folds are vibrating.

Question 9

Q

Explain what is meant by the term noise source.

Answer

A

The term noise source of energy refers to the energy of the airflow that is not pulsed by the vibrations of the vocal folds. The noise source involves random/irregular vibrations of the airflow.

Question 10

Q

Explain what is meant by the term transient noise.

Answer

A

This noise is produced in the vocal tract when there is an abrupt release of stopped air in the vocal tract. This noise is also called the noise burst. It is produced and perceived only in released stop consonants.

Question 11

Q

Explain what is meant by the term aspiration noise.

Answer

A

This type of turbulence noise (of energy) is produced when the airflow is rapidly disturbed/modulated at the glottis.
Aspiration noise is usually produced and perceived in the released aspirated stop consonants of English.

Question 12

Q

Explain what is meant by the term spectrographic analysis. What acoustic parameters are studied in a spectrographic analysis?

Answer

A

The technique of analysing sounds and sequences of sounds through their spectrograms is known as spectrographic analysis. The acoustic parameters studied are (1) the frequency composition, (2) the intensity, and (3) the time dimension of the sound resonance.
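The three parameters can be made concrete with a bare-bones spectrogram: the signal is cut into short frames (time), each frame is decomposed into frequency bins with a discrete Fourier transform (frequency), and the magnitude of each bin stands for intensity. This is only a minimal sketch, with no windowing or overlap, and the test signal (a 500 Hz tone followed by a 2000 Hz tone), the sample rate, and the frame length are assumptions.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)
FRAME_LEN = 80      # samples per frame = 10 ms, so bins are 100 Hz apart


def dft_magnitudes(frame):
    """Magnitude of each frequency bin from 0 Hz up to the Nyquist frequency."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len):
    """One row of bin magnitudes per non-overlapping frame."""
    return [
        dft_magnitudes(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def tone(freq_hz, n_samples):
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n_samples)]


# Synthetic test signal: 50 ms at 500 Hz, then 50 ms at 2000 Hz.
signal = tone(500, 400) + tone(2000, 400)
frames = spectrogram(signal, FRAME_LEN)

# Frequency of the strongest bin in each frame.
peak_hz = [
    max(range(len(mags)), key=mags.__getitem__) * SAMPLE_RATE / FRAME_LEN
    for mags in frames
]
print(peak_hz)
```

Each row of `frames` corresponds to one vertical slice of a spectrogram, and larger magnitudes correspond to darker regions.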

Question 13

Q

Explain what is meant by the term turbulence noise.

Answer

A

This type of noise (of energy) is produced when the airflow passes rapidly through a narrow constriction in the oral cavity. This friction noise is the sound source of all fricative consonants, whether they are phonetically voiced or voiceless.

Question 14

Q

What acoustic information can be provided in a waveform analysis? (Mention only four. Do not explain.)

Answer

A

Voicing feature in vowels and consonants
The voicing features of stop consonants occurring in syllable-initial position in English words: positive VOT and zero VOT
Occlusion in stop and affricate consonants.
Stop burst of energy
Friction noise in voiceless fricative consonants and voiceless affricates: aperiodic waveform
Display of the energy intensity of sound segments in general.
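The positive/zero VOT distinction mentioned in the list can be sketched numerically. Voice onset time (VOT) is the interval between the stop burst and the onset of voicing, both of which can be located in a waveform. The function names, the 10 ms tolerance used to treat a short lag as "zero", and the example time stamps are assumptions made for illustration, not measured values.

```python
def voice_onset_time_ms(burst_time_s, voicing_onset_s):
    """VOT in milliseconds: voicing onset minus burst release."""
    return (voicing_onset_s - burst_time_s) * 1000.0


def classify_vot(vot_ms, tolerance_ms=10.0):
    """Label a VOT value; tolerance_ms is an assumed threshold for this sketch."""
    if vot_ms > tolerance_ms:
        return "positive VOT"   # voicing starts clearly after the burst
    if vot_ms >= -tolerance_ms:
        return "zero VOT"       # voicing starts at about the same time as the burst
    return "negative VOT"       # voicing starts before the burst


# Hypothetical time stamps (in seconds) read off two waveforms.
aspirated_like = voice_onset_time_ms(burst_time_s=0.100, voicing_onset_s=0.165)
unaspirated_like = voice_onset_time_ms(burst_time_s=0.100, voicing_onset_s=0.105)

print(classify_vot(aspirated_like))
print(classify_vot(unaspirated_like))
```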

Question 15

Q

What acoustic information is not observed in a speech waveform?

Answer

A

The information regarding (1) the frequency range of energy distribution and energy concentration in consonants, (2) the formants of vowels and sonorant consonants, (3) the formant transitions into or out of adjacent vowels, (4) the anti-formants in nasal and nasalized sounds, etc.

Question 16

Q

What is the origin of the periodicity of a periodic speech waveform?

Answer

A

The origin of the periodicity of a complex periodic wave is the periodicity of the vibrations of the vocal folds, which is due to the periodic opening and closing movements of the glottis. These regular movements cause a train of successive vocal air pulses through which the airflow from the pharynx is propagated upward into the mouth in order to be filtered there and then radiated out. All vowels and voiced consonants have a periodic waveform if they are not whispered or not contextually devoiced.
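One common way to check a waveform for this periodicity is autocorrelation: a periodic signal correlates strongly with a copy of itself delayed by one period, whereas noise does not. The sketch below is an illustration under stated assumptions, not a description of any particular speech tool. The "voiced" signal is a synthetic sum of five harmonics of 100 Hz, the "noise" is seeded Gaussian noise, and the 75 to 400 Hz search range is an assumed pitch range.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz (assumed)


def normalized_autocorr(x, lag):
    """Correlation between the signal and a copy of itself delayed by lag samples."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    den = math.sqrt(sum(x[i] ** 2 for i in range(n))
                    * sum(x[i + lag] ** 2 for i in range(n)))
    return num / den if den else 0.0


def best_period(x, sample_rate, f_min=75.0, f_max=400.0):
    """Return (strength, lag) of the strongest repetition within the pitch range."""
    shortest = int(sample_rate / f_max)
    longest = int(sample_rate / f_min)
    return max((normalized_autocorr(x, lag), lag)
               for lag in range(shortest, longest + 1))


# Synthetic voiced-like signal: five harmonics of a 100 Hz fundamental.
voiced = [sum(math.sin(2 * math.pi * k * 100 * t / SAMPLE_RATE) / k
              for k in range(1, 6))
          for t in range(400)]

# Synthetic aperiodic signal: seeded Gaussian noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

voiced_strength, voiced_lag = best_period(voiced, SAMPLE_RATE)
noise_strength, _ = best_period(noise, SAMPLE_RATE)

print("estimated f0:", SAMPLE_RATE / voiced_lag, "Hz")
print("periodicity strength (voiced vs noise):", voiced_strength, noise_strength)
```

The lag of the strongest repetition corresponds to one cycle of vocal fold vibration, so the sample rate divided by that lag estimates the fundamental frequency.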

Question 17

Q

Explain the mechanical and acoustic mechanisms of producing fricative consonants.

Answer

A

Fricatives as a class of consonants include those speech sounds of which the production involves the turbulence of the airflow in the vocal tract. In producing a fricative consonant, the airflow is forced through a very narrow channel causing a particular noise called the frication noise (also called the friction noise, the fricative noise, or the turbulence noise). This special mechanism is called frication. The narrow channel is formed through a constriction of the vocal tract. In the spectrogram of a fricative sound, the frication noise has a ‘salt and pepper’ appearance.

Question 18

Q

Explain the terms strident and non-strident fricatives.

Answer

A

Traditionally, fricatives have been divided into sibilants [s, z, ʃ, ʒ] and non-sibilants [f, v, θ, ð, h]. Some phoneticians use the terms stridents and non-stridents instead of sibilants and non-sibilants. The term strident, meaning ‘loud or harsh’, is associated with the idea of increased loudness. The increased loudness is correlated with greater noise intensity, which indicates more overall acoustic energy.

Question 19

Q

Explain the acoustic difference(s) between the sibilant and non-sibilant fricatives of English in terms of their waveforms and spectrograms.

Answer

A

1. The amplitude level of frication noise energy: this acoustic parameter is of great importance to the distinction between the two types of fricatives. On the basis of noise amplitude/intensity, sibilant fricatives are acoustically high-energy fricatives and non-sibilant fricatives are low-energy fricatives, as can be observed in their waveforms and spectrograms: sibilant fricatives have larger waveforms and darker spectrograms.
2. The duration of frication noise: sibilant fricatives normally have a longer frication noise.
3. Formant transitions: sibilant fricatives differ from non-sibilant fricatives also in terms of the transitions into and out of the adjacent vowels.
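The amplitude difference described above can be quantified as a level difference in decibels between the root-mean-square (RMS) amplitudes of two noise segments. The sketch below uses seeded synthetic noise as a stand-in for frication noise. The amplitude scales 0.5 and 0.05 are arbitrary values chosen to make the contrast visible, not measurements of real fricatives.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(a, b):
    """How many decibels segment a is above segment b."""
    return 20.0 * math.log10(rms(a) / rms(b))


rng = random.Random(0)

# Synthetic stand-ins: a high-energy and a low-energy noise segment.
sibilant_like = [0.5 * rng.gauss(0.0, 1.0) for _ in range(2000)]
non_sibilant_like = [0.05 * rng.gauss(0.0, 1.0) for _ in range(2000)]

difference = level_difference_db(sibilant_like, non_sibilant_like)
print(f"level difference: {difference:.1f} dB")
```

A larger waveform corresponds to a higher RMS value, and a tenfold amplitude ratio corresponds to about 20 dB.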

Question 20

Q

Explain the spectrographic difference between the alveolar sibilants [s, z] and the post-alveolar sibilants [ʃ, ʒ] of English.

Answer

A

The frequency regions of concentration of noise energy (in the form of dark bands or patches) for the alveolar [s, z] are higher than for the post-alveolar [ʃ, ʒ].

Question 21

Q

Explain the difference between phonetically voiced and voiceless fricatives from the perspective of their sound source(s) of energy (source-filter theoretical perspective).

Answer

A

The voiceless fricative consonants [f, θ, s, ʃ, h] are phonetically voiceless. This means that they do not have a voice source of energy (i.e., no vibration of the vocal folds). They have only frication noise as their sound source of energy. Thus, their waveforms are aperiodic, and their spectrograms are not striated and lack a voice bar, as the vocal folds are not vibrating during their production. The voiced fricatives [v, ð, z, ʒ] have a simultaneous voice source (voicing energy, because the vocal folds are vibrating during production) and a frication noise source. The waveforms of these voiced fricatives tend to be partially, considerably, or entirely periodic depending on the phonetic context.

Question 22

Q

What are the acoustic features of the frication noise energy in fricatives?

Answer

A

The acoustic features of the frication noise energy in fricatives:

(1) Quasi-random distribution of the frication noise energy in fricative spectrograms: the frication noise of a fricative has a quasi-random distribution of energy that is displayed in a spectrogram with a ‘salt and pepper’ appearance having a scribbly pattern.
(2) Duration of the noise generation in fricatives as opposed to affricates and stops: the frication noise of fricatives is longer than the frication noise of affricates.
(3) Gradual onset of noise energy: fricatives have a longer rise time than affricates.

Question 23

Q

Explain the mechanical and acoustic mechanisms of producing the affricate consonant sounds [ʧ] and [ʤ].

Answer

A

The production of affricate consonants mechanically involves a stop-like closure (i.e., complete closure in the vocal tract) and a fricative-like release (i.e., gradual release) of the airflow through the mouth. Hence their spectrograms somewhat resemble those of both the stops and the fricatives. The two mechanisms of stop-like closure and fricative-like release of the airflow involved in producing affricates are indicated by means of the IPA phonetic symbols ʧ and ʤ. Each of these symbols represents a single phonetic-phonological unit and indicates a sequence of two mechanical/acoustic mechanisms (and not a sequence of two sounds) because an affricate sound shares the features of both the stop sound (its spectrogram displays an interval of closure) and the fricative sound.

Question 24

Q

Explain the acoustic differences between the affricates and fricatives of English.

Answer

A

(1) Affricates have a stop-like closure; hence their waveforms and spectrograms show a silence gap, during which the airflow is blocked. Fricatives do not have this articulatory feature.
(2) The frication noise in affricates is shorter than that of fricatives.
(3) Affricates have a shorter rise-time than fricatives. The term rise time refers to “the rate of increase in noise intensity”. That is, the noise component reaches its maximum amplitude value more rapidly in affricates than in fricatives.
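The rise-time contrast in point (3) can be sketched on two intensity envelopes. In this sketch, rise time is measured as the time the envelope takes to climb from 10% to 90% of its peak, which is one common convention and an assumption here, since the text does not specify a measurement procedure. The two envelopes are invented toy contours at 5 ms per frame, not measured data.

```python
def rise_time_ms(envelope, frame_ms, low=0.1, high=0.9):
    """Time taken to climb from low*peak to high*peak of the envelope."""
    peak = max(envelope)
    start = next(i for i, v in enumerate(envelope) if v >= low * peak)
    end = next(i for i, v in enumerate(envelope) if v >= high * peak)
    return (end - start) * frame_ms


FRAME_MS = 5.0  # assumed spacing between envelope values

# Toy noise-intensity contours (hypothetical values, normalised to a peak of 1.0).
affricate_like = [0.0, 0.05, 0.6, 1.0, 0.95, 0.7, 0.3, 0.05]
fricative_like = [0.0, 0.05, 0.15, 0.3, 0.45, 0.6, 0.75, 0.88, 1.0, 0.95, 0.7, 0.3]

affricate_rise = rise_time_ms(affricate_like, FRAME_MS)
fricative_rise = rise_time_ms(fricative_like, FRAME_MS)
print(affricate_rise, "ms vs", fricative_rise, "ms")
```

The abrupt onset of the affricate-like contour gives the shorter rise time, and the gradual onset of the fricative-like contour gives the longer one.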

Question 25

Q

Explain the acoustic difference between the spectrograms of voiced and voiceless affricates.

Answer

A

The production of the voiceless affricate [ʧ] involves only the noise source, whereas that of the voiced affricate [ʤ] involves simultaneous use of the voice source and the noise source. (1) The waveform of [ʧ] is aperiodic, whereas the waveform of [ʤ] is usually periodic. (2) The waveform of [ʧ] is larger than that of [ʤ]. (3) The noise component tends to be slightly longer in [ʧ]. (4) Moreover, the spectrogram of [ʧ] is darker and not striated. (5) The spectrogram of the voiced affricate [ʤ] shows vertical striations and a voice bar; the voiceless [ʧ] lacks these two features.

Question 26

Q

Explain the mechanical and acoustic mechanisms of producing nasal consonants (NCs).

Answer

A

Mechanical mechanism: Nasal consonants (NCs) are produced through a combination of tongue or lip movement to create a closure point in the vocal tract, temporarily stopping airflow, and the lowering of the velum to allow air to flow through the nose. This coupling of the oral and nasal passages distinguishes NCs from oral stop consonants.
Acoustic mechanism: The coupling of oral and nasal passages prevents pressure build-up in the mouth, eliminating a noise burst in NC production. It also lowers the amplitude of NCs, making their formants weaker compared to vowels. The resulting sound, known as a nasal murmur, involves acoustic energy radiating through the nose.

Question 27

Q

Explain the term nasal murmur.

Answer

A

A “nasal murmur” can be defined as the sound generated when two air passages are coupled, leading to the radiation of acoustic energy through the nose.

Question 28

Q

What are the acoustic features that cue the place of occlusion in NCs?

Answer

A

The place distinctions can be identified through the formant transitions into and out of adjacent vowels, particularly the F2 transition. The formant transitions constitute the principal cue for the place of occlusion (i.e., closure location) in these consonants. The oral and nasal stops have similar place cues: the formant transitions of the nasal bilabial [m] and the oral bilabials [p, b] are the same. We can also observe the same transition patterns for the nasal alveolar [n] and the oral alveolar stops [t, d] on one hand, and the nasal velar [ŋ] and the oral velar stops [k, g] on the other.

Question 29

Q

What is the acoustic feature that cues voicing energy in NCs?

Answer

A

Since the airflow is pulsed by the vibrations of the vocal folds before passing through the nose, the NCs of English are (normally) phonetically voiced. The voicing energy can be observed from their periodic waveforms, from the presence of a voice bar near the bottom of their spectrograms, or from the vertical striations in their spectrograms. The sound source of the NCs is the voice source. These sonorant consonants lack the noise source; hence their spectrograms display a formant pattern. This spectrographic characteristic is shared by the other sonorant sounds, including the approximants and the vowels.

Question 30

Q

What are the acoustic features that cue the nasal manner of articulation?

Answer

A

1. Nasal murmur: The nasal murmur is an important manner cue for distinguishing the NCs from the other consonants. It displays a concentration of energy in the low frequency range: an intense low-frequency first formant, called N1.
2. Substantial damping of energy: The coupling of the two air passages strongly damps the energy intensity of the produced sound. Damping means the reduction of energy. We can observe the acoustic result of damping in the following events: (a) In a V-NC transition (that is to say, at the connection point between a vowel and an NC) we observe an abrupt change from the high level of energy for the F2 of the V to a zero or quite low level of energy for the NC. Such an abrupt change in the level of energy may be observed in an NC-V transition, too. This is because of strong damping in the 800-2000 Hz region of the NC, where the F2 is normally invisible in spectrograms. Damping introduces anti-resonances into the vocal tract, which are called anti-formants or zeros. The anti-formants absorb the acoustic energy in the vocal tract and consequently reduce the energy amplitude of the NC. The anti-formants appear in the form of horizontal white streaks in the spectrograms of NCs. The presence of horizontal white streaks in the 800-2000 Hz region of the spectrograms of NCs, as well as above the 3500 Hz region, indicates that there is very little or no energy in those frequency regions.
3. Co-articulatory nasalization: The co-articulatory nasalization of the adjacent vowel serves as an important acoustico-perceptual cue to the nasal manner.