Final Flashcards
phonetics
the study of sounds made by human speech
6 approaches to study phonetics
- perception
- production
- developmental
- instrumental
- cultural
- historical
perception (approach to study phonetics)
auditory and visual input
ex: transcription of speech
production (approach to study phonetics)
respiratory, phonatory, articulatory, cognitive
ex: anatomy and physiology of speech organs
developmental (approach to study phonetics)
speech acquisition
ex: speech in infancy
instrumental (approach to study phonetics)
acoustical
ex: technology of speech analysis
cultural (approach to study phonetics)
sociolinguistics
ex: dialects
historical (approach to study phonetics)
evolution of speech and language
graphemes
written symbols
phonemes
speech sounds
isomorphism
a one-to-one correspondence between the members of two sets;
no isomorphism between graphemes and phonemes
articulators
parts of the vocal tract that contribute to the production of consonants and vowels
3 systems of speech production
- respiratory system
- laryngeal system
- supra-laryngeal system
respiratory system
power source for speech
lungs, trachea, bronchial tubes, rib cage, diaphragm
laryngeal system
sound source for speech
when vocal folds are closed it protects the trachea
(vocal folds, larynx)
supra-laryngeal system (vocal tract)
sound filter for speech, articulator system
nasal cavity, oral cavity, pharyngeal cavity, tongue
5 parameters for describing consonant production
- phonation
- place of articulation
- nasality
- secondary articulators
- manner of articulation
phonation
not an articulatory parameter
are the vocal folds vibrating? do we have phonation?
-if yes, consonant is voiced
-if no, consonant is unvoiced
vocal folds are apart (abducted)
no phonation
vocal folds are together (adducted)
phonation
place of articulation
where do the articulators touch or come closest to touching?
bilabial, labiodental, interdental, alveolar, alveopalatal, palatal, velar
bilabial
upper and lower lips touch
“p” “b” “m”
labiodental
upper teeth and lower lip touch
“f” “v”
interdental
upper teeth, lower teeth, and tongue touch
“θ” “ð”
alveolar
alveolar ridge and tongue touch
“t” “d” “n”
alveopalatal
back of alveolar ridge and blade of tongue touch
“ʃ” “dʒ” “tʃ”
palatal
hard palate and front of tongue touch
“j”
velar
velum and back of tongue touch
“g” “k”
nasality
what is the status of the velo-pharyngeal port?
- if velum is up (and port is closed), the sound is oral–air comes out of mouth
- if velum is down (and port is open), the sound is nasal–air comes out of nose
secondary articulators
tongue has to be used
how is the tongue positioned?
-if sides of the tongue are curled down, the sound is lateral (“l”); if not, it is central
-if the tip of the tongue is curled up and back, the sound is retroflex (“r”)
lateral
tongue tip touches alveolar ridge; tongue edges are curled down, allows air to flow along the sides of the tongue
manner of articulation
how close to touching are the articulators?
stops, flaps, taps, trills, fricatives, affricates, approximants
stops
stoppage in airflow; airflow is obstructed
oral stop
airways out of the oral and nasal cavities are completely blocked
“b” “p” “t” “d” “k” “g”
nasal stop
airway is completely blocked in mouth and air comes out of nose
“m” “n”
glottal stop
air is stopped underneath vocal folds, then released
“ʔ” and coughs
flap
tongue tip quickly taps the alveolar ridge
“ɾ” as in bu“tt”er
fricative
there is a narrow opening between articulators, creates turbulence or hissing sound
- sibilants: “s” “z”
- non-sibilants: “v” “f” “θ” “ð”
affricate
stop with a fricative release
“tʃ” “dʒ”
approximants
articulators are coming close together, but not as close as fricatives, to create sound
“w” “l” “r” “j”
degrees of occlusion
full occlusion = stop
less occlusion = fricative
least occlusion = approximant
3 parameters for vowel articulation
- jaw height
- tongue frontness-backness
- lip shape
jaw height
- if jaw is raised, it is called a close or high vowel
- if jaw is dropped, it is called an open or low vowel
tongue frontness-backness
- if tongue is advanced, the vowel is called a front vowel
- if tongue is retracted, the vowel is called a back vowel
multisyllabic word (ə vs. ʌ)
- stressed syllable, use wedge [ʌ]
- unstressed syllable, use schwa [ə]
monosyllabic word (ə vs. ʌ)
- if the word is an open-set word, use wedge [ʌ]
- if the word is a closed-set word, use schwa [ə]
open-set word
noun, verb, adjective, adverb
closed-set word
conjunction, article, preposition, pronoun
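The wedge-versus-schwa rules on the cards above can be sketched as a small decision function. This is only an illustration of the card rules: the function name and its parameters are hypothetical, and the part-of-speech labels are the ones listed on the open-set and closed-set cards.

```python
# Word classes copied from the open-set and closed-set cards.
OPEN_SET = {"noun", "verb", "adjective", "adverb"}
CLOSED_SET = {"conjunction", "article", "preposition", "pronoun"}


def central_vowel_symbol(syllable_count, stressed=None, part_of_speech=None):
    """Return wedge or schwa for a mid-central vowel, following the card rules.

    Hypothetical helper: multisyllabic words are decided by stress,
    monosyllabic words by word class.
    """
    if syllable_count >= 2:
        if stressed is None:
            raise ValueError("multisyllabic words need a stress value")
        return "ʌ" if stressed else "ə"
    if part_of_speech in OPEN_SET:
        return "ʌ"
    if part_of_speech in CLOSED_SET:
        return "ə"
    raise ValueError(f"unknown part of speech: {part_of_speech!r}")


# A monosyllabic noun takes wedge; a monosyllabic article takes schwa.
noun_symbol = central_vowel_symbol(1, part_of_speech="noun")
article_symbol = central_vowel_symbol(1, part_of_speech="article")
```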
diphthong
vowel sounds with a dynamic articulation that changes during the production of the vowel
prosodics
the suprasegmental characteristics of speech
speech segments
individual speech sounds; consonants and vowels
suprasegmental characteristics
the qualities we give the segments of speech when we organize them into meaningful speech
2 sources of suprasegmental (prosodic) qualities
- stress
- intonation
stress
in words consisting of two or more syllables, one syllable will typically carry more stress than the rest
occurs at word level
intonation
at the level of the phrase; meaning can be conveyed by changes in intonation and/or stress
3 elements of speech production for intonation/stress
- loudness (intensity)
- pitch (rate of vocal fold vibration)
- length (duration)
yes/no questions
rising intonation
open-ended questions
falling intonation
diacritics
special symbol used to distinguish different qualities of a given sound or group of sounds
suprasegmental or articulatory qualities
laryngeal cartilages
cricoid, thyroid, arytenoids
thyroid cartilage
- articulates with cricoid
- consists of 2 plates that join at thyroid angle
- unpaired
- rocks and glides
cricoid cartilage
- sits on 1st (top) tracheal ring
- unpaired
- has 4 facets (surfaces)
arytenoid cartilage
- articulates with cricoid
- glides and rotates
- sits on top of facets of cricoid
- paired
extrinsic laryngeal muscles
muscles that attach the larynx to other structures outside of the laryngeal structure
strap and mandibular
strap muscle
extrinsic muscle
lowers larynx; attaches larynx to sternum
mandibular muscle
extrinsic muscle
raises larynx
intrinsic laryngeal muscles
muscles that attach parts of the larynx to each other
cricothyroid muscle, thyroarytenoid muscle, LCA, PCA
crico-thyroid muscle
pulls thyroid cartilage down
thyro-arytenoid muscle
forms vocal folds
lateral crico-arytenoid muscle (LCA)
adductor–closes the vocal folds
posterior crico-arytenoid muscle (PCA)
abductor–opens the vocal folds
only abductor
pulls arytenoids in when contracted
oblique and transverse arytenoid muscle
connects two arytenoids together–adductor
- oblique runs like an “X”
- transverse runs like horizontal parallel lines
conditions for vocal fold vibration
- vocal folds must be properly positioned
- there must be sufficient airflow
2 conditions for proper position
vocal folds must be (1) adducted and
(2) sufficiently tense for phonation to occur
active muscular control
used to adduct and abduct the vocal folds
subglottal pressure
pressure below the adducted vocal folds
the open-and-close cycle of phonation is a result of _____________ and ____________
- aerodynamic forces
- elastic recoil forces
how are the vocal folds positioned during quiet breathing?
abducted
how are the vocal folds positioned during glottal frication?
partially abducted
how are the vocal folds positioned during breathy phonation?
partially adducted
how are the vocal folds positioned during normal phonation?
adducted
how are the vocal folds positioned during creaky phonation?
hyper-adducted
pitch
the aural perception of the rate of vocal fold vibration
fundamental frequency (F0)
- rate of vocal fold vibration
- # of open-close cycles per second, measured in Hz
- lowest or 1st harmonic
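Since F0 is the number of open-close cycles per second, it can be worked out from a cycle count and a duration. The sketch below assumes the cycles have already been counted (for example, from a waveform); the function names and the sample numbers are hypothetical.

```python
def fundamental_frequency_hz(cycle_count, duration_s):
    """F0 = open-close cycles per second."""
    return cycle_count / duration_s


def period_ms(f0_hz):
    """Length of one open-close cycle in milliseconds."""
    return 1000.0 / f0_hz


# Illustrative numbers only: 100 cycles counted in half a second.
f0 = fundamental_frequency_hz(100, 0.5)
cycle_length = period_ms(f0)
```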
characteristic F0
a function of vocal fold mass
-larger (heavier) vocal folds will vibrate slower than smaller (lighter) vocal folds
cross-speaker difference
caused by individual differences in pitch
[ ̥]
devoiced
[ ̤]
breathy voice
[ ̰]
creaky voice
voice onset time (VOT)
the time at which voicing begins with respect to the release of stop closure
− VOT (negative VOT)
- voicing has begun before the release of stop closure
- we are producing a voiced stop, which we perceive as [b, d, or g]
- if [b, d, or g] occur between two voiced sounds
0 VOT
- voicing begins at release of stop closure
- we are producing an unvoiced stop, which we may perceive as [b, d, or g] or [p, t, or k] depending on where in the utterance it occurs
- if [b, d, or g] occur at the beginning of an utterance or phrase
- if [p, t, or k] occur after an “s”
+ VOT
- voicing begins after the release of stop closure
- we are producing an unvoiced, aspirated stop, which we may perceive as [pʰ, tʰ, or kʰ]
- if [pʰ, tʰ, or kʰ] occur at the beginning of a syllable
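The three VOT categories can be sketched as a classifier over measured times. This is a rough illustration, not a measurement procedure: the function names are hypothetical, and the 5 ms tolerance around zero is an assumption added because real measurements rarely land on exactly 0.

```python
def vot_ms(release_time_ms, voicing_onset_ms):
    """VOT = when voicing begins, relative to the release of stop closure."""
    return voicing_onset_ms - release_time_ms


def classify_vot(vot, tolerance_ms=5.0):
    """Map a VOT value (ms) to the three categories on the cards.

    tolerance_ms is an assumed margin for treating small values as zero VOT.
    """
    if vot < -tolerance_ms:
        return "negative VOT: voiced stop"
    if vot > tolerance_ms:
        return "positive VOT: unvoiced, aspirated stop"
    return "zero VOT: unvoiced stop"


# Illustrative values: voicing starts 60 ms after the release.
category = classify_vot(vot_ms(release_time_ms=100.0, voicing_onset_ms=160.0))
```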
2 principles of speech production
- maximum perceptual distinctiveness
- maximum ease of production
physical properties of sound
frequency, amplitude, and duration
elastic medium
any medium in which the molecules, if disturbed, tend to return to their normal or resting state
necessary for sound
speed of sound (in air)
about 1147 ft/s, or 350 m/s @ STP
standard temperature and pressure (STP)
for AIR:
- 30 °C
- pressure at sea level
- 0% humidity
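As a rough check on the speed-of-sound card, the standard dry-air approximation c ≈ 331.3 · sqrt(1 + T/273.15) m/s (T in °C) can be evaluated at the temperature given above. The formula is a textbook approximation brought in for illustration, not something stated on the cards, and the function names are hypothetical.

```python
import math


def speed_of_sound_m_per_s(temp_c):
    """Approximate speed of sound in dry air at temp_c degrees Celsius."""
    return 331.3 * math.sqrt(1.0 + temp_c / 273.15)


def metres_to_feet(metres):
    """Convert metres to feet (1 ft = 0.3048 m)."""
    return metres / 0.3048


# At 30 °C this comes out close to the 350 m/s quoted on the card.
speed_at_30c = speed_of_sound_m_per_s(30.0)
speed_at_30c_ft = metres_to_feet(speed_at_30c)
```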
frequency
rate of vibration; for speech, how fast your vocal folds vibrate
- perceptually: pitch & overtones
- measured in Hz
amplitude
sound pressure or intensity
- perceptually: loudness
- measured in dB
duration
- perceptually: length of time a sound is sustained
- reported in seconds or milliseconds
sound wave
movement of air molecules
silence
air molecules at rest; the absence of sound
context for sound–>silence is important for speech production
aperiodic sounds
sound waves do not repeat
simple aperiodic sounds
- singularities or transients
- non speech: snap
- speech: burst release (clicks, plosives, oral stops)
complex aperiodic sounds
- ongoing, non-repeating
- non speech: noise (“shhh”), static
- speech: frication (fricatives, aspiration, breathiness)
periodic sounds
sound waves repeat
simple periodic sound
ongoing and repeating, composed of only ONE frequency
- non speech: pure tone (synthesized)
- speech: none
complex periodic sound
ongoing and repeating, composed of multiple frequencies
- non speech: sounds with pitch (mosquito buzz, bird song, cricket legs)
- speech: vocal fold vibration and overtones (voiced sounds and filtered sound qualities)
combination of sound types in speech
since phonation and articulation are independent, the sounds produced in the laryngeal system and those produced in the supralaryngeal system can occur in various combinations
(ex: VOT)
unvoiced stops [p, t, k]
silence
simple aperiodic
unvoiced, aspirated stops [pʰ, tʰ, kʰ]
silence
simple aperiodic
complex aperiodic
voiced stops [b, d, g]
complex periodic
simple aperiodic
voiced fricatives [v, ð, z, ʒ]
complex periodic
complex aperiodic
voiced affricate [dʒ]
complex periodic
simple aperiodic
complex aperiodic
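The combinations on the cards above follow a pattern that can be sketched as a lookup by voicing, manner, and aspiration. The function and its parameters are hypothetical. The results for voiced and unvoiced stops, aspirated stops, voiced fricatives, and the voiced affricate match the cards; the unvoiced fricative and unvoiced affricate cases are extrapolations from the same pattern.

```python
def sound_types(voiced, manner, aspirated=False):
    """Return the sequence of sound types for a consonant.

    manner is one of "stop", "fricative", "affricate" (hypothetical labels).
    """
    parts = []
    if manner in ("stop", "affricate"):
        # Closure: voicing continues through it if voiced, otherwise silence.
        parts.append("complex periodic" if voiced else "silence")
        # Burst release is a transient.
        parts.append("simple aperiodic")
        # Aspiration or the fricative release adds ongoing noise.
        if aspirated or manner == "affricate":
            parts.append("complex aperiodic")
    elif manner == "fricative":
        if voiced:
            parts.append("complex periodic")
        parts.append("complex aperiodic")
    else:
        raise ValueError(f"unsupported manner: {manner!r}")
    return parts


voiced_affricate = sound_types(voiced=True, manner="affricate")
```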
metrics of sound measurement
waveform, spectrum, spectrogram, pitch contour, energy contour
waveform
a mathematical representation of alternating pulses of compressed and rarefied air
shown as a graph of amplitude (ordinate) and time (abscissa)
amplitude of waveform
correlates to loudness and can be measured as SPL or intensity level
time scale of waveform
measured in seconds or milliseconds
spectrum
a graph showing the amplitude of each component frequency of a complex periodic sound
a graph of pure tones, each having a different frequency and amplitude
-ordinate: amplitude
-abscissa: frequency
harmonics
the component pure tones of a complex periodic sound, at whole-number multiples of F0
octave
doubling of frequency
how is a spectrum calculated?
by a Fourier analysis
Fourier analysis
a mathematical formula that provides a way to tease apart the elements of a complex periodic sound (the frequencies of the component pure tones and their amplitudes)
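To make the idea concrete, the sketch below builds a complex periodic sound from three pure tones and then uses a plain discrete Fourier transform to recover their frequencies and amplitudes. Everything here is illustrative: the sample rate, tone frequencies, amplitudes, and function name are assumptions, and real speech analysis software uses faster and more refined methods.

```python
import cmath
import math


def dft_spectrum(samples, sample_rate):
    """Return (frequency_hz, amplitude) pairs from a plain discrete Fourier transform."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Bins other than 0 Hz and Nyquist hold half the amplitude, so double them.
        scale = 1 if k == 0 or 2 * k == n else 2
        spectrum.append((k * sample_rate / n, scale * abs(total) / n))
    return spectrum


SAMPLE_RATE = 1000   # samples per second (hypothetical)
N_SAMPLES = 100      # 0.1 s of signal, so frequency bins are 10 Hz apart
PURE_TONES = [(100, 1.0), (200, 0.5), (300, 0.25)]  # (frequency in Hz, amplitude)

# Complex periodic sound = sum of the pure tones.
samples = [
    sum(amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for freq, amp in PURE_TONES)
    for i in range(N_SAMPLES)
]

# Keep only the bins with noticeable energy: these are the component tones.
peaks = [(f, a) for f, a in dft_spectrum(samples, SAMPLE_RATE) if a > 0.01]
```

The peaks list is the spectrum in miniature: one entry per component pure tone, with frequency on the abscissa and amplitude on the ordinate.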
amplitude of the series of harmonics decreases at the rate of _____
12 dB/octave
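Because an octave is a doubling of frequency, the 12 dB/octave rate can be turned into a level for each harmonic: harmonic n sits log2(n) octaves above F0. The sketch below assumes an idealized source with a perfectly even rolloff; the function name and the 100 Hz example are hypothetical.

```python
import math


def harmonic_levels(f0_hz, count, rolloff_db_per_octave=12.0):
    """Return (frequency_hz, level_db relative to the 1st harmonic) for each harmonic."""
    levels = []
    for n in range(1, count + 1):
        octaves_above_f0 = math.log2(n)  # doubling the frequency = one octave
        levels.append((n * f0_hz, -rolloff_db_per_octave * octaves_above_f0))
    return levels


# With F0 = 100 Hz, the 2nd harmonic is one octave up and the 4th is two octaves up.
levels = harmonic_levels(100.0, 4)
```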
3 systems of speech production
- respiratory system: power source
- phonatory system: sound source
- articulatory system: sound filter
source-filter theory
explains how the 3 systems of speech production work together to produce speech sounds
power source
the pulmonic (respiratory) system produces the controlled expiration that powers speech; it also provides the subglottal pressure that causes the vocal folds to vibrate and produce phonation
sound source
powered by airflow from the respiratory system
the laryngeal system produces the vibrations that serve as the basis for voiced speech sounds
sound filter
the supralaryngeal system can open and close to let air out in greater and lesser quantities, producing vowels and consonants
occlusion
filtering
alteration of the sound
formants
resonances that arise in the vocal tract as sound flows through and reverberates or resonates within its various cavities
NOT produced by the sound source
determined by vocal tract shape
harmonics-amplitude relationship
harmonics with frequencies closest to the formant frequencies will be amplified as they pass through the vocal tract
what is significant about the first 3-4 formants?
our ears are very sensitive to the first few formants because we use them to distinguish the different speech sounds
LPC curve
formula that uses the amplitude peaks of a sound’s spectrum and puts the predicted formant frequencies in a curve above
shows the formants of the vocal tract
what does a spectrum show us?
all the details of phonation and articulation, but no time dimension–> it is a sample of a sound at one instant in time
what information does a waveform provide?
allows us to interpret or calculate the sound source (periodic/aperiodic, burst/frication, voicing) and allows us to see how amplitude changes over time
tells us nothing about formants
spectrogram
a graph that represents time (abscissa), frequency (ordinate), and amplitude as a function of darkness on a grayscale
wideband spectrogram
highlights formants and tells us about articulation
vertical lines
narrowband spectrogram
highlights harmonics and tells us about phonation
horizontal lines
what does a wideband spectrogram show?
articulation, glottal pulses, formants (allows us to see how articulation is changing), harmonics (if speaker is shrill), and fundamental frequency
what does a narrowband spectrogram show?
phonation, formants (less clear), harmonics (see how phonation is changing), fundamental frequency
pitch contour
close-up view of fundamental frequency (F0) over time
frequency scale is 0-350 Hz
women can have a maximum pitch contour of 250 Hz
what can we tell from an energy contour?
which sounds are loudest or longest, stressed syllable
acoustic correlates of vowels
- F1 and jaw height have an inverse relationship: the lower F1 is, the higher the jaw is
- higher F2 = more fronted tongue, lower F2 = more backed tongue
- low F3 = possible rhoticity [ɚ]
- lip rounding lowers formant frequencies
vowel space chart
- F1 is on ordinate, F2 is on abscissa
- both scales are in reverse order (high-to-low) to represent the tongue positions more intuitively
“schwa trick”
to determine formant frequency ranges for any given speaker, map that person’s mid-central vowel and use it as a frame of reference for the rest of the vowels
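A minimal sketch of the schwa trick, using the F1 and F2 correlates from the vowel cards: a vowel is described by comparing its formants with the same speaker's schwa. The function name is hypothetical and the formant values in the example are illustrative, not measurements.

```python
def describe_vowel(f1_hz, f2_hz, schwa_f1_hz, schwa_f2_hz):
    """Describe jaw height and tongue frontness relative to the speaker's schwa.

    Lower F1 than schwa = higher jaw; higher F2 than schwa = more fronted tongue.
    """
    if f1_hz < schwa_f1_hz:
        height = "high"
    elif f1_hz > schwa_f1_hz:
        height = "low"
    else:
        height = "mid"

    if f2_hz > schwa_f2_hz:
        frontness = "front"
    elif f2_hz < schwa_f2_hz:
        frontness = "back"
    else:
        frontness = "central"
    return height, frontness


# Illustrative values only: a vowel with low F1 and high F2 for this speaker.
description = describe_vowel(300.0, 2300.0, schwa_f1_hz=500.0, schwa_f2_hz=1500.0)
```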
What are the average schwa values for adults?
F1 = 500 Hz, F2 = 1500 Hz, F3 = 2500 Hz
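These round numbers can be reproduced with a common textbook simplification: treat the vocal tract for schwa as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n − 1) · c / (4L). The sketch assumes a tube length of 17.5 cm (a figure often quoted for an adult male vocal tract) and the 350 m/s speed of sound used elsewhere in these cards; both values and the function name are assumptions for illustration.

```python
def tube_formants(length_m=0.175, speed_m_per_s=350.0, count=3):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * speed_m_per_s / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Comes out at approximately 500, 1500, and 2500 Hz for the assumed tube.
schwa_formants = tube_formants()
```

A shorter tube gives higher formants, which is one reason the schwa values differ from speaker to speaker.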
acoustic correlates of consonants: manner of voiced stops
- closure: abrupt drop in amplitude on waveform, sudden loss of sound tracings in frequencies above F0 on spectrogram
- complex periodic sound during stop gap on waveform and voicing bar in spectrogram
- release of closure with soft-to-moderate burst
acoustic correlates of consonants: place of voiced stops
(formant transitions–especially F2 & F3–provide clues to place of articulation)
- F2 dips down from vowel for labial closure [b]
- F2 is level for tongue-front closure [d]
- F2 rises and F3 dips down (velar pinch) for dorsal closure [g]
acoustic correlates of consonants: manner of unvoiced, aspirated stops
- closure: abrupt drop in amplitude on waveform, sudden loss of all sound tracings on spectrogram
- no sound during stop gap–no voicing bar
- release of closure with moderate-to-loud burst
- aspiration following burst
acoustic correlates of consonants: place of unvoiced, aspirated stops
- formant transitions vary depending on which vowels and/or consonants are adjacent
- aspiration may obscure formant transitions at onset of vowel
acoustic correlates of consonants: manner of voiced and unvoiced fricatives
- more amplitude than stops
- waveform will show complex, aperiodic sound; spectrogram will show scratchy noise tracings
- voiced fricatives will show periodicity on waveform, and both voicing bar and glottal pulses on spectrogram
- voiced fricatives tend to be longer
acoustic correlates of consonants: place of voiced and unvoiced fricatives
- most energy is in the higher frequency ranges
- [s, z] and [ʃ, ʒ] tend to have louder frication (sibilants)
acoustic correlates of consonants: voiced and unvoiced affricates
-stop gap –> alveopalatal fricative offset
acoustic correlates of consonants: nasal stops
- abrupt onset and offset, but without a burst
- voicing bar + 1st nasal formant (N1) = nasal murmur
- more formants than oral sounds (often low intensity and not visible)
- antiformants (space between formants)
acoustic correlates of consonants: approximants (glides and liquids)
- [w] has low F1 & F2
- [r] has significant drop in F3 (rhoticity)
- [l] may show “step” transition
- [j] has low F1 & high F2
phonology
the study of the distinctive sounds and characteristic patterns of a spoken language
minimal pairs
two words, with different meanings, that sound identical except for ONE sound
broad transcription
a transcription that does not show a lot of detail, usually just the main symbols
narrow transcription
transcription that captures pronunciation in great detail using diacritics
impressionistic transcription
transcription of speech that is unknown to you (i.e., another language)
systematic transcription
a transcription that knowingly represents the regularities of a language’s unique phonology
ex: “dogs” –> [dɔgz]
phonemic transcription
a broad transcription of the underlying phonemes of a word; as it might be said in exaggerated citation-form speech, with no syllable carrying more stress than any other
phonetic transcription
transcription of the actual pronunciation of a word, the way it was said
phonological patterns at the segmental level
- VOT
- glottal substitution
- nasal and lateral plosion
- flapping: neutralization of d/t –> ɾ
- velarization of nasals
- nasalization of vowels in nasal contexts
- rhoticization of neutral vowels
- vowel lengthening
- vowel reduction
homorganic stop
sounds made in the same place of articulation
ex: [p] [b] & [m]
suprasegmental phonological patterns
- intonation (pitch & energy) contours
- stress patterns
- tonic syllables
nasal and lateral plosion
a stop followed by a homorganic nasal or lateral may be released through the nose (nasal plosion) or over the sides of the tongue (lateral plosion)
ex: [sænd] vs [sæ.dn̩]
velarization of nasals
nasals that precede velars become velar
ex: /rænk/ –> [ræŋk]
rhoticization of neutral vowels
when “r-coloring” occurs with a neutral vowel, the entire vowel becomes rhoticized
vowel reduction
vowels in unstressed syllables immediately adjacent to stressed syllables may be reduced
ex: demonstrate, [ɑ] –> [ə]
stress patterns
English marks the stressed syllable in a word or phrase by syllable lengthening, syllable loudness, and/or rising pitch (F0)