Week 10 pt. 1: Speech perception Flashcards
Vocal tract path
- Trachea = the air that produces the sound first passes through the trachea (or windpipe)
- Larynx = next into the larynx (voice box)… vocal folds within larynx open & close to change pitch of the sound… however, parts of the vocal tract higher up are necessary for consonant sounds
- Pharynx = next into pharynx
- Uvula = closing off the nasal cavity with the uvula & soft palate prevents sound from going up through the nose, affecting the quality of the sound
- Mouth and nose… Where teeth, tongue, lips & uvula can affect the sound signal
speech is…
very complex & variable over time… many different moving parts come together to produce the intricacies of this signal… the air is provided through exhalation from the lungs
air passage…
- changes in density of air are caused by vibrations of the vocal folds/cords
- vocal folds… lie horizontally along the larynx & vibrate between an open & closed position to produce sound (the rate at which they vibrate determines the pitch; see the sketch after this list)
- air then goes through spaces & past articulators… these function together to shape the sound wave (oral cavity, soft palate, tongue, lips & teeth)
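Not from the lecture - a minimal Python sketch of the vibration-rate → pitch link: estimating the fundamental frequency of a signal by autocorrelation. The 120 Hz test tone & the pitch search range are made-up illustration values.

```python
import numpy as np

sr = 16000
t = np.arange(0, 0.5, 1 / sr)
# stand-in for a voiced sound: "vocal folds" vibrating 120 times per second
signal = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)

def estimate_f0(x, sr, fmin=60, fmax=400):
    """Find the lag with the strongest self-similarity within a plausible pitch range."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # autocorrelation at lags >= 0
    lo, hi = int(sr / fmax), int(sr / fmin)
    best_lag = lo + np.argmax(ac[lo:hi])
    return sr / best_lag

print(estimate_f0(signal, sr))  # ~120 Hz: faster vocal-fold vibration = higher pitch
```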
Cocktail party effect
- listening to speech of one person but attention is distracted by hearing someone say your name from across the room
- this suggests that our attentional mechanism does not screen out all perceptual inputs even when we are focused on one conversation
vowels
- produced by unrestricted airflow through the pharynx and the mouth & by vibrations of the vocal cords
- changing the shape of the mouth makes different vowel sounds
- each vowel sound has a characteristic pattern of harmonics…
in vowels, the harmonics that are the highest amplitude are known as…
- formants (peaks in the sound distributions)
- formants are the frequency bands with higher amplitudes among the harmonics of a vowel sound
- each vowel sound has a specific pattern of formants
- the fundamental frequency can be the same across vowels… vowels are only distinguishable by their formant frequencies (see the sketch below)
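Not from the lecture - a minimal Python sketch of "formants = the highest-amplitude harmonics": a toy vowel is built from harmonics of a 120 Hz fundamental, boosting the harmonics near two assumed formant centres (~700 & ~1200 Hz, roughly /a/-like). All the numbers are illustrative.

```python
import numpy as np
from scipy.signal import find_peaks

sr = 16000
t = np.arange(0, 0.5, 1 / sr)
f0 = 120.0                    # fundamental frequency (same for every vowel)
formants = [700.0, 1200.0]    # assumed formant centres for this toy vowel
bandwidth = 120.0             # assumed formant bandwidth

# sum harmonics of f0, amplifying the ones that fall near a formant
signal = np.zeros_like(t)
for k in range(1, 60):
    f = k * f0
    gain = sum(np.exp(-((f - fc) / bandwidth) ** 2) for fc in formants)
    signal += gain * np.sin(2 * np.pi * f * t)

# the high-amplitude peaks in the spectrum mark the formants
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / sr)
peaks, _ = find_peaks(spectrum, height=spectrum.max() * 0.5)
print(freqs[peaks].round())   # strong harmonics cluster near 700 & 1200 Hz
```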
What can you do with the location of the first and second vowel formants?
- scientists can plot a talker's vowel space using the first 2 vowel formants across speech sounds (sketched below)
- the axes are these 2 formant frequencies and each circle shows the bounds of a given vowel sound
- there is very little overlap between the representations & this highlights the articulatory system's ability to function very well
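Not from the lecture - a sketch of such a vowel-space plot in Python. The formant values are rough textbook averages in the spirit of Peterson & Barney's classic measurements, not data from this course.

```python
import matplotlib.pyplot as plt

# rough average (F1, F2) values in Hz for a few English vowels - illustrative only
vowels = {
    "i (heed)":  (270, 2290),
    "ɛ (head)":  (530, 1840),
    "æ (had)":   (660, 1720),
    "a (hot)":   (730, 1090),
    "u (who'd)": (300, 870),
}

fig, ax = plt.subplots()
for label, (f1, f2) in vowels.items():
    ax.scatter(f2, f1)
    ax.annotate(label, (f2, f1))

# phonetics convention: F2 on x, F1 on y, both axes reversed,
# so /i/ sits top-left & /a/ sits low, echoing tongue position in the mouth
ax.invert_xaxis()
ax.invert_yaxis()
ax.set_xlabel("F2 (Hz)")
ax.set_ylabel("F1 (Hz)")
ax.set_title("Vowel space: each point is one vowel's first two formants")
plt.show()
```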
Consonants
- produced by restricting the airflow in one place or another along the way up from the larynx
3 physical features important in determining the sound of a consonant…
- place of articulation… the point along the vocal tract at which air is constricted (e.g. tongue, lips, teeth, hard/soft palate)
- manner of articulation… how the airflow is restricted (e.g. lips pushed together/tongue at front/back of mouth etc.)
- voicing… whether the vocal cords are vibrating or not (‘b’ voiced consonant & ‘p’ unvoiced consonant)
mouth shape & consonant sound…
- p/b sound = passage of air stopped briefly by pressing the lips together (soft palate raised to block the nose), then letting out a burst of air
- t/d sound = air stopped briefly by putting tongue against alveolar ridge behind teeth
- f/v sound = produced by placing upper teeth against lower lip
- k/g sound = air stopped by pressing the back of the tongue up against the soft palate (these features are sketched as a lookup table below)
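Not from the lecture - the three features above, encoded as a small Python lookup table. The place/manner labels are standard phonetics terms, but the encoding itself is just an illustration.

```python
# each consonant = (place of articulation, manner of articulation, voicing)
consonants = {
    "p": ("bilabial",    "stop",      "unvoiced"),
    "b": ("bilabial",    "stop",      "voiced"),
    "t": ("alveolar",    "stop",      "unvoiced"),
    "d": ("alveolar",    "stop",      "voiced"),
    "f": ("labiodental", "fricative", "unvoiced"),
    "v": ("labiodental", "fricative", "voiced"),
    "k": ("velar",       "stop",      "unvoiced"),
    "g": ("velar",       "stop",      "voiced"),
}

def differing_features(c1, c2):
    """Return which of the three features distinguish two consonants."""
    names = ("place", "manner", "voicing")
    return [n for n, a, b in zip(names, consonants[c1], consonants[c2]) if a != b]

print(differing_features("t", "d"))  # ['voicing'] - same place & manner
print(differing_features("p", "k"))  # ['place']   - both unvoiced stops
```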
Voicing onset time
- within a subset of consonants like t/d… the place & manner of articulation are the same for both
- so, to distinguish them we need an additional cue - voicing onset time
- the time at which the syllable transitions from the consonant sound to the vowel sound in a syllable like 'ta' or 'da' (voicing starts later in 'ta' cuz t is not a voiced consonant)
Phonemes
- basic units of sound in human language
- International Phonetic Alphabet = catalogues all phonemes used in human languages
- English has 15 vowel sounds and 24 consonant sounds
Coarticulation
- the shape of the vocal tract when producing a specific consonant differs depending on what sound is going to follow it (‘ba’ vs ‘bo’)
- this is coarticulation
- the frequency content of syllables changes as a function of the vowel sound in sound spectrograms (see the sketch after this list)
- we can feel these differences in our mouths & tongues when we say them
- as perceivers we do not hear the differences in the sounds & this is an example of auditory constancy
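Not from the lecture - a sketch of the kind of spectrogram comparison described above. The filenames 'ba.wav' & 'bo.wav' are hypothetical mono recordings of the two syllables.

```python
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

# hypothetical recordings: same consonant /b/, different following vowel
for fname in ("ba.wav", "bo.wav"):
    sr, samples = wavfile.read(fname)
    f, t, Sxx = spectrogram(samples, fs=sr, nperseg=512)

    plt.figure()
    plt.pcolormesh(t, f, Sxx, shading="gouraud")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.title(f"{fname}: same /b/, different frequency content before the vowel")
plt.show()
```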
Categorical perception
- refers to our perception of a range of different acoustic stimuli as identical syllables, up to a point at which our perception suddenly shifts & we perceive another
- we do not hear variation in the sound, we only hear one phoneme then the other
- categorical perception translates a wide range of voicing onset times into one phoneme (see the sketch below)
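Not from the lecture - a toy Python model of categorical perception along a VOT continuum. The ~30 ms boundary & the steepness are assumed values, chosen to mimic the abrupt /da/→/ta/ shift described above.

```python
import math

boundary_ms = 30.0   # assumed category boundary
steepness = 0.5      # assumed steepness of the identification curve

def prob_hear_ta(vot_ms):
    """Probability of reporting 'ta' for a stimulus with the given voicing onset time."""
    return 1.0 / (1.0 + math.exp(-steepness * (vot_ms - boundary_ms)))

for vot in range(0, 70, 10):
    p = prob_hear_ta(vot)
    print(f"VOT {vot:2d} ms -> P('ta') = {p:.2f}, heard as /{'ta' if p > 0.5 else 'da'}/")
# the stimulus varies smoothly, but the report jumps from /da/ to /ta/ near 30 ms
```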
Word boundaries
- one of the biggest challenges to speech perception & language learning is determining word boundaries in speech
- within a stream of speech, definitive word boundaries do not exist
- it may be very clear to us where boundaries belong, but only in our own language… this has little to do with the actual physical parameters of the speech sound itself
- top-down word segmentation
how are the word boundaries learned?
- thru exposure & attention to environment
- much learning happens very early in life
- motherese may support language learning… adults emphasize frequency contours & word boundaries when speaking to infants which might help them learn it
discerning word boundaries
- syllable transitions occurring within words are encountered more often than those occurring between words in everyday language
- learning theories suggest we are able to pick up on these differences in regularity & learn that low-likelihood transition points likely mark the boundaries between words (see the sketch below)
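Not from the lecture - a minimal Python sketch of this statistical-learning idea, in the spirit of Saffran-style experiments. The three-syllable "words" & the 0.5 cut-off are made up for illustration.

```python
import random
from collections import Counter

# toy vocabulary: within-word transitions always recur, between-word ones vary
words = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]

random.seed(0)
stream = []                                # unbroken stream, no pauses between words
for _ in range(300):
    stream.extend(random.choice(words))

pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])

def transitional_probability(s1, s2):
    """P(next syllable = s2 | current syllable = s1), estimated from the stream."""
    return pair_counts[(s1, s2)] / first_counts[s1]

# segment: insert a boundary wherever the transition is unusually unlikely
threshold = 0.5                            # assumed cut-off for this toy example
segmented, word = [], [stream[0]]
for s1, s2 in zip(stream, stream[1:]):
    if transitional_probability(s1, s2) < threshold:
        segmented.append("".join(word))
        word = []
    word.append(s2)
segmented.append("".join(word))
print(segmented[:6])   # recovers bidaku / padoti / golabu as units
```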
Phonemic restoration effect
- top-down processing of what one expects to hear overrides input from the cochlea
- very strong subjective effect
- the context clearly indicates the missing word in a sentence & the person actually hears the sound being said even tho it's not
- works even when the context for the missing sound occurs after the missing sound… so it must occur at a pre-attentive stage of speech perception processing
- areas in the auditory cortex & the prefrontal cortex are involved
General mechanism theories of speech perception
- speech no different than any other sound & we use same mechanisms
- the only thing that makes speech special is its importance to us, which is learned
Special-mechanism theories of speech perception
- because of the importance of language to humans, special mechanisms have evolved that are specific to speech & not used in other sound processing
- evidence comes from e.g. the McGurk effect
development of phoneme perception
- babies are born with the ability to discriminate the phonemes of all languages
- by the time they are 10 months old they start showing perceptual narrowing… they hone in on regularly experienced phonemes while their ability to discriminate unfamiliar phonemes simultaneously diminishes
audio-visual speech benefits
- our perception of speech is dramatically affected by the combination of auditory & visual speech cues
- auditory & visual cues combine in our brains to generate the best estimate possible of our environment
- articulators make different movements to make different sounds & this provides useful info when trying to disambiguate unclear signals (e.g. speech in noisy room)
- there is a lot of ambiguity when we only have sound info (a reliability-weighted combination is sketched below)
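Not from the lecture - one standard model of "best estimate" cue combination (reliability-weighted averaging) sketched in Python. The estimates & variances are invented numbers on an abstract perceptual dimension.

```python
def combine(est_a, var_a, est_v, var_v):
    """Weight each cue by its reliability (inverse variance) and average them."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)
    w_v = 1 - w_a
    combined = w_a * est_a + w_v * est_v
    combined_var = 1 / (1 / var_a + 1 / var_v)   # combined estimate is more precise
    return combined, combined_var

# in a noisy room the auditory cue is unreliable (large variance),
# so the visual cue (lip movements) dominates the combined percept
print(combine(est_a=0.2, var_a=4.0, est_v=0.8, var_v=1.0))
# -> (0.68, 0.8): closer to the visual estimate & more precise than either cue alone
```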
McGurk…
- the clearest example of how visual cues influence the perception of speech comes from an experiment reported by McGurk & MacDonald
- video of someone saying sounds like 'ba', 'da' and 'the' but the audio doesn't always match what the speaker's mouth is acc saying
- but participants perceive a sound that differs from both what they acc hear & the phoneme the mouth they're watching is making
speech in the brain…
- largely isolated to left hemisphere
- Broca's area, Wernicke's area & angular gyrus