Unit 4 Flashcards
Consonant production misc.
A consonant is produced with a constricted or closed vocal tract
For almost all consonants except nasals and semivowels, the sound that distinguishes one place of articulation from another is aperiodic noise. As we change the place where the constriction occurs and the amount of constriction, that aperiodic noise takes on different characteristics
The voiceless sound is generated by shoving air through a small opening
Resonant consonants are periodic and extremely vowel like
Only vowels and some semivowels can normally serve as the nucleus of the syllable
A syllable generally cannot be made up of just an obstruent consonant
Some consonants (for example, nasals) can sometimes be a nucleus
Acoustic features of consonants
Creating a relatively closed vocal tract by constricting the tract somewhere along the tube
Air going from the large part of the tube is shoved into a small opening, where it speeds up. As it comes out the other side and hits the pressure there, the air particles start to spin randomly (called eddies). The swirling is called turbulence (there is no periodicity to it)
The aperiodic sound is like noise (shhhh); the characteristics of the noise depend on several factors:
- Place where the constriction occurs in the vocal tract
- Degree of constriction (air pushed through a small pinhole or flowing like air under a door): t, k (complete constriction); sh, f (incomplete constriction)
- Duration, how long the air is being pushed through the constriction
- Voicing overlay: turning the voice off and on gives us a combination of voicing and noise lying on top of each other
- Rise time: how rapidly you go from 0 amplitude to the maximum amplitude of the consonant. Plosives (p, t, b) have short rise times, fricatives (sh) have very long rise times, and affricates are in the middle
- Formant transitions: cue which consonant is being produced
Semivowels are not obstruents and are produced a little differently than obstruent (pressure) consonants
Nasals and semi-vowels are part consonants and part vowels
The pressure consonants are primarily responsible for letting the listener figure out what is being said
If consonants are lost or distorted, you lose meaning and intelligibility
Stop Gap
A stop gap occurs in initial position, when you just start to create the p; you can also have a stop gap in the middle of a word (buttercup) and at the end
In order to produce the plosive you have to completely obstruct the flow of air
Stop Gaps in connected speech
Depending on which consonant you're producing, your speaking rate, and the sound that follows it, the stop gap can vary from 50 to 150 msec
If it goes longer than 150 msec, it will sound distorted
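As a rough sketch of the 50-150 msec range above, the check could be written in Python. The function name and the labels it returns are hypothetical, chosen only for illustration; the two limits come straight from these notes.

```python
# Limits taken from the notes: stop gaps in connected speech run about 50-150 msec.
STOP_GAP_MIN_MS = 50
STOP_GAP_MAX_MS = 150


def describe_stop_gap(duration_ms: float) -> str:
    """Label a measured stop gap (hypothetical helper, illustrative labels)."""
    if duration_ms > STOP_GAP_MAX_MS:
        return "too long: likely to sound distorted"
    if duration_ms < STOP_GAP_MIN_MS:
        return "shorter than the typical range"
    return "typical stop gap"


print(describe_stop_gap(100))
print(describe_stop_gap(200))
```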
You can create pulses from buildups of pressure
There is a voice bar for raggedy because g is voiced and we leave the voicing on; in buttercup we turn the voicing off
We always take the path of least resistance
We voice a voiceless sound if it precedes a vowel?
Stops: noise burst
Just anterior to the point of constriction we create eddies. Because of the pressure that has built up, the eddies get big enough that you can actually hear them. It is a sudden explosion of pressure as the air escapes the constriction.
It can range from 5 to 40 msec.
On average, the duration of plosives is about 10 msec and the rise time to maximum amplitude is about 10 msec
The duration and rise time allow us to determine if it is a stop
The burst frequencies are determined by where the stop is produced in the oral cavity.
There's more energy between 500 and 1500 Hz for bilabials
If we push the tongue back into the alveolar position, the primary energy is above 4000 Hz
If we go all the way back to velar, the spectrum comes down a little bit; the energy is between 1500 and 4000 Hz
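The three energy bands above can be sketched as a small lookup in Python. This is only an illustration of the notes, not a real classifier: the function name is hypothetical, and how the shared edges at 1500 Hz and 4000 Hz are assigned is an assumption, since the notes do not say.

```python
def burst_place(peak_hz: float) -> str:
    """Guess place of articulation from where the burst energy is concentrated.

    Bands come from the notes; assigning 1500 Hz to bilabial and 4000 Hz
    to velar is an assumption made only so the bands do not overlap.
    """
    if 500 <= peak_hz <= 1500:
        return "bilabial"
    if 1500 < peak_hz <= 4000:
        return "velar"
    if peak_hz > 4000:
        return "alveolar"
    return "unclassified"  # below 500 Hz is not covered by the notes


for hz in (1000, 2500, 5000):
    print(hz, burst_place(hz))
```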
Stops: Voice onset time
Now we are going to take that plosive burst and combine it with a following vowel
Because the vowel is voiced, this is called voice onset time (VOT): the time between the plosive burst and the onset of voicing for the following vowel. It is a critical feature in the speech of all languages and a primary acoustic cue for whether the plosive is voiced or voiceless. If the delay is 40-80 msec long, you'll hear it as p and call it a voiceless plosive. However, if the time is very short, between -10 and +20 msec, you'll hear it as voiced b. You can get -10 by beginning voicing before you release. The slower you speak, the longer the VOTs will be.
VOTs develop with age and the timing becomes more precise.
Children are able to perceive these acoustic timing relationships, which are important in understanding the development of speech and language.
Children and adults who stutter have longer VOT
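A minimal Python sketch of the VOT ranges in these notes: VOT is computed as the voicing onset time minus the burst time, so a negative value means voicing started before the release. The function names and the "ambiguous" label for values outside both ranges are assumptions for illustration.

```python
def voice_onset_time(burst_ms: float, voicing_onset_ms: float) -> float:
    """VOT = time of voicing onset minus time of the plosive burst (msec)."""
    return voicing_onset_ms - burst_ms


def classify_vot(vot_ms: float) -> str:
    """Apply the ranges from the notes: -10 to +20 voiced, 40 to 80 voiceless."""
    if -10 <= vot_ms <= 20:
        return "voiced"
    if 40 <= vot_ms <= 80:
        return "voiceless"
    return "ambiguous"  # assumption: the notes do not label values outside both ranges


# Burst at 100 msec, voicing starts at 160 msec: a 60 msec VOT.
print(classify_vot(voice_onset_time(100, 160)))
# Voicing starts 5 msec before the burst: a negative VOT.
print(classify_vot(voice_onset_time(100, 95)))
```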
When the lips are closed, the time spent building up pressure is the stop gap
Maximum amplitude falls within a specific frequency range (from about 500 Hz for bilabials)
When you hear the maximum amplitude and the 10 msec duration, your ear will know it's a stop
You're training your ear to hear energy above 4,000 Hz
Which plosives will a client not hear with a hearing loss above 3,000 Hz: /tit/ vs /k/ /p/ [t and d have more energy above 4,000 Hz]
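The hearing loss question can be worked through with the burst bands from earlier in these notes (bilabial 500-1500 Hz, velar 1500-4000 Hz, alveolar above 4000 Hz). The Python sketch below is a deliberately simplified model: it assumes everything above the cutoff is inaudible, and the function name and labels are hypothetical.

```python
import math

# Burst energy bands from the notes (Hz); alveolar has no stated upper limit.
BURST_BANDS_HZ = {
    "bilabial": (500, 1500),
    "velar": (1500, 4000),
    "alveolar": (4000, math.inf),
}


def bursts_affected(loss_above_hz: float) -> dict:
    """Say how much of each burst band lies above the hearing loss cutoff."""
    result = {}
    for place, (low, high) in BURST_BANDS_HZ.items():
        if low >= loss_above_hz:
            result[place] = "fully"       # whole band is above the cutoff
        elif high > loss_above_hz:
            result[place] = "partly"      # band straddles the cutoff
        else:
            result[place] = "unaffected"  # whole band is below the cutoff
    return result


print(bursts_affected(3000))
```

Under these assumptions the alveolar bursts (t, d) are the ones lost entirely, which matches the bracketed hint in the notes.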
Can distinguish between p and b because of their different voice onset times. Voiceless: 40-80 msec
Categorical perception of VOT and consonants
Brain processes and categorizes consonants and vowels differently
Consonants are processed categorically, but their acoustic boundaries have limits: near a boundary you can blend from one category to another
Vowels are processed continuously
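One way to picture categorical processing is to step VOT in equal increments and watch the label. In the Python sketch below the 30 msec boundary is an assumption (the midpoint of the gap between the voiced and voiceless ranges in these notes), and the function name is hypothetical.

```python
def identify(vot_ms: float, boundary_ms: float = 30.0) -> str:
    """Categorical labeling: everything on one side of the boundary sounds the same.

    The 30 msec boundary is an assumed value, not one given in the notes.
    """
    return "b" if vot_ms < boundary_ms else "p"


# Equal 10 msec acoustic steps along a VOT continuum.
continuum_ms = [0, 10, 20, 30, 40, 50, 60]
labels = [identify(v) for v in continuum_ms]

# Count how many times the perceived label changes along the continuum.
changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)

print(labels)
print(changes)
```

Six equal acoustic steps produce only one change in the label, which is the sense in which consonants fall into categories rather than shifting gradually.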
Stops: Aspiration during voice onset time
Description: Audible release of air between noise burst and following vowel.
VOT for aspirated consonant typically longer. Sounds “breathy”.
Voiced-voiceless cue
Formant Transitions and Consonant Perception: stops
Once the vocal folds start vibrating, you will see the formants bend
Formant transitions provide a great deal of information about what the vowel and consonant are
The formant 1 transition gives your brain information about the manner of production (stop). The formant always starts low and usually curves up. More constriction: F1 starts at a lower point and always goes up
Formant 2 gives information about place (for example, whether it's bilabial): the starting point for F2 always points to the spectral energy of that consonant. The blue circle reflects where the peak energy would be. For d, F2 has to go down, and the degree of bend depends on how far it has to move
Primary acoustic cue for identifying the place of articulation of an unreleased plosive consonant
- Formant 2 bends toward the spectral energy where the consonant would be
Primary acoustic cue for the manner of articulation of a syllable-initial or word-initial stop
- Formant 1 starts from a lower level and moves up to position; for a terminal consonant, formant 1 goes from high to low, pointing down toward 0 Hz
Formants of plosives
Formants 2 and 3 always “bend” toward (or away from) the primary consonant energy to the respective formant positions for the following vowel.
F1 always bends from (CV) or towards (VC) 0 Hz
Stops: formant transitions in VC structure
The second formant will point toward where the spectral energy of the following consonant is going to be
We hear "pip" even without the last p because the formants are transitioning out
Sound is dynamic
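The bending rules for stops can be sketched in Python as direction checks. Only the direction of each transition is modeled, the function names are hypothetical, and the frequencies in the example calls (4500 Hz and 1000 Hz for consonant energy, 1200 Hz and 2000 Hz for vowel F2) are illustrative values, not measurements from these notes.

```python
def f1_direction(structure: str) -> str:
    """F1 bends up from 0 Hz in CV and down toward 0 Hz in VC."""
    if structure == "CV":
        return "rising"
    if structure == "VC":
        return "falling"
    raise ValueError("structure must be 'CV' or 'VC'")


def f2_direction(structure: str, consonant_energy_hz: float, vowel_f2_hz: float) -> str:
    """F2 runs between the consonant's spectral energy and the vowel's F2."""
    if structure == "CV":
        start, end = consonant_energy_hz, vowel_f2_hz
    elif structure == "VC":
        start, end = vowel_f2_hz, consonant_energy_hz
    else:
        raise ValueError("structure must be 'CV' or 'VC'")
    if end > start:
        return "rising"
    if end < start:
        return "falling"
    return "flat"


# d (energy above 4000 Hz) before a vowel with a lower F2: F2 has to go down.
print(f2_direction("CV", 4500, 1200))
# A bilabial (energy around 1000 Hz) after a vowel with a higher F2.
print(f2_direction("VC", 1000, 2000))
```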
Fricatives
Where you constrict the airflow in the oral cavity changes the length of the tube and determines where the primary spectral energy is for that fricative
Manner of production: produced by creating aperiodic random airflow
If you hold the airflow for about 130 msec, you get a fricative-type sound
Clusters: when you put it with another vowel or sound, coarticulation makes it much shorter, about 50 msec
Phrase final: 200 msec
Duration average: 130 msec. Clusters ("flow"): 50 msec. Phrase final ("bath"): 200 msec. Rise time approx. 76 msec.
Fricatives have to be in this range. Lengthening isn't a primary marker, but the fricative has to be at least 50 msec or it will sound like something else
If you change the length of the tube or the placement of the constriction, you are manipulating spectral energy
Strident fricatives have energy concentrated in a smaller frequency range; in diffuse fricatives the energy is spread out over a wider band
Spectral energy for English fricatives
More concentration of energy in the higher frequency bands
Voiceless th is a small amplifier, so our stridents (s, z, sh) are the ones produced with more precision, greater power, and concentrated energy, forced through a small opening
Know where the energy is concentrated for bilabial, velar, and alveolar (s is 4,000 Hz)
Affricates
The manner of articulation is identified by duration; affricates (ch, dg) run between 75 and 130 msec
Placement and spectral characteristics for affricates are the same; the difference is duration and rise time (between plosive and fricative)
Spectral Energy = similar to "sh" (>2 kHz)
Duration = 75-130 msec
Rise Time = 33 msec (stops: 10 msec; fricatives: 76 msec)
Transitions 75-150 msec (stops = 50-75 msec)
Spectrograms = look like fricatives, but shorter
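Because the three manners differ in rise time (about 10 msec for stops, 33 msec for affricates, 76 msec for fricatives), a toy Python sketch can pick the manner whose typical rise time is closest to a measured one. Nearest-value matching is an assumption made for illustration, not a method from these notes, and the function name is hypothetical.

```python
# Typical rise times from the notes (msec).
RISE_TIME_MS = {"stop": 10, "affricate": 33, "fricative": 76}


def manner_from_rise_time(rise_ms: float) -> str:
    """Return the manner whose typical rise time is nearest the measured value."""
    return min(RISE_TIME_MS, key=lambda manner: abs(RISE_TIME_MS[manner] - rise_ms))


for rise in (12, 30, 70):
    print(rise, manner_from_rise_time(rise))
```

In practice duration would be checked as well, since the notes treat duration and rise time together as the cues for manner.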
Nasals
Production
Occluded oral cavity
Split air flow and sound = anti-resonances
Voiced continuants
Consonants (degree of constriction)
Syllable nucleus = yes
Weak formants
Anti-resonances=weak formants
Nasal cavity damping = weak formants
The nasal and oral cavities both serve to resonate sound; you have to block the mouth exit for nasal sounds
You divide the sound into two columns. Instead of having just resonances, dividing the airstream between the oral and nasal cavities also gives us antiresonances, which suck the energy out of the resonances and suppress certain harmonics
A nasal by its nature is a voiced continuant
The difference between ng and k is that the soft palate is dropped for ng
All the bones are lined with mucous membrane, so you're going to get antiresonances: take the amplitude of the resonances in the nasal cavity and then decrease it?
Also, because of the nasal cavity and its absorption characteristics, nasals have weaker formants and wider bandwidths
The lowest formant is extremely low, below 500 Hz, and is what we hear as the murmur
Formant 1: very weak
Formants 2 and 3: the F2 transition will distinguish m, n, ng
Nasal emission: audible escape of air through the nasal cavity
Vowel coloring: where one sound takes on the characteristics of another sound as you are coarticulating. Nasals have a strong impact on the nasal characteristics of vowels: the vowels that come before nasals take on a nasal coloring