Lecture 23 Flashcards
Describe formants and sound spectrograms
- The formant with the lowest frequency is called the first formant (F1), the second formant (F2) is the next highest, etc.
- These can be visualized using sound spectrograms
Describe consonants
Consonants are produced by
constrictions of the vocal tract
Describe formant transitions
rapid changes
in frequency preceding or following
consonants
Describe phonemes
any of the perceptually distinct units of sound in a specified
language that distinguish one word from another, for example p, b, d, and t in
the English words pad, pat, bad, and bat.
Describe morphemes
a meaningful morphological unit of a language that cannot be
further divided (e.g. the word ‘dog’ cannot be broken down any more than it
already is)
Describe lack of invariance
While words are ‘built’ by putting together phonemes in different
combinations, the acoustic signal produced for any given phoneme is variable
Why is the lack of invariance important?
As with other perceptual constancies (e.g. colour constancy, size constancy, etc.), our perceptual systems can still recognize differing acoustic signals as representing the same phoneme
Describe coarticulation
- The sounds produced by a single phoneme can be different depending on what phoneme
comes before and after it
Describe categorical perception
- Categorical perception occurs
with speech, given that a wide
range of acoustic cues results in
the perception of a limited
number of sound categories
Describe voice onset time
the delay between
when a speech sound begins and when the vocal cords start vibrating
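A minimal sketch of how categorical perception could apply to voice onset time: a continuous range of VOT values is mapped onto just two perceived categories. The function name `perceived_category` and the 30 ms boundary are illustrative assumptions, not values from the lecture.

```python
def perceived_category(vot_ms, boundary_ms=30.0):
    """Map a continuous VOT (in ms) onto one of two phoneme categories.

    boundary_ms is a hypothetical category boundary chosen for illustration.
    """
    return "b" if vot_ms < boundary_ms else "p"


# Many different VOT values collapse into only two perceived categories.
for vot in (0, 10, 20, 40, 60, 80):
    print(vot, perceived_category(vot))
```

In this sketch, VOTs of 0, 10, and 20 ms are all heard as the same category even though they differ acoustically, while 20 and 40 ms fall on opposite sides of the assumed boundary.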
Describe the McGurk effect
- Speech perception can be influenced by multimodal integration
- Visual information is one such influence (and can be referred to as ‘audiovisual speech perception’)
- The McGurk effect was introduced in our very first lecture and involves visual input changing speech perception
Describe the Kriegstein et al. study
- Kriegstein et al. (2005) presented participants with stimuli using both familiar and unfamiliar voices, while using fMRI
- The superior temporal sulcus (STS) was found to be activated for all speech stimuli (consistent with prior work associating it with speech perception), though familiar (but not unfamiliar) voices also activated the fusiform face area (FFA)
- Provides a physiological basis for a link between speech perception and facial processing
Describe phoneme perception
- Phonemes are more easily perceived when they appear in a meaningful context
- Represents an influence of top-down perception
- Rubin et al. (1976): Participants recognized phonemes more quickly when presented as part of real words, as compared to ‘nonsense’ words (e.g. bat vs. baf)
Describe the phonemic restoration effect
- Missing phonemes can also be ‘filled in’ based on expectations (the phonemic restoration effect)
- Warren (1970): embedded a cough in a recording of a sentence and asked participants to report: 1: where in the sentence the cough occurred, and 2: where any phonemes were missing
- Participants could not accurately place where the cough occurred, nor identify which phoneme had been removed
Describe the Miller and Isard study
- Miller and Isard (1963) asked participants to ‘shadow’ (listen with
headphones and repeat aloud what is heard) three kinds of sentences:
1. Grammatically correct sentences
e.g. gadgets simplify work around the house
2. Anomalous sentences that correctly follow grammatical rules but do not
make sense
e.g. gadgets kill passengers from the eyes
3. Ungrammatical strings of words
e.g. between gadgets highways passengers the steal
Describe the segmentation problem
- The segmentation problem refers to the fact that, because there are no physical
breaks in the continuous acoustic signal, speech segmentation (perceiving
individual words) can be a challenge
Describe transitional probability
The chance that one sound will follow another in a language
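As a rough illustration, a transitional probability can be estimated from a stream of syllables as the number of times syllable A is followed by syllable B, divided by the number of times A occurs with any successor. The sketch below assumes a made-up stream built from two hypothetical ‘words’ (pre-tty and ba-by); the function name and data are illustrative only.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last item).
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical stream: the made-up words "pre-tty" and "ba-by" in varying order.
stream = ["pre", "tty", "ba", "by", "pre", "tty", "pre", "tty",
          "ba", "by", "ba", "by", "pre", "tty"]
tp = transitional_probabilities(stream)

print(tp[("pre", "tty")])  # within-word transition
print(tp[("tty", "ba")])   # across-word transition
```

In this toy stream the within-word transitions (pre to tty, ba to by) have higher probabilities than transitions that cross a word boundary (e.g. tty to ba), which is the kind of cue a listener could use to find word boundaries.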
How do we learn a language?
- A (very) general example of this would be that English speakers expect at least one vowel to occur after every few consonants or so
- We learn these associations implicitly, through statistical learning
Describe the Davis study
- Davis et al. (2005) used noise-vocoded speech (a method to add noise to an acoustic signal) and asked participants to identify the words they were perceiving in each sentence
- Accuracy was close to 0% for the first sentence and gradually increased across sentence number
- Additional information provided by the preceding sentences provides some context and can lead to pop-out effects, in which related words in later sentences are easier to identify
- Speaks to the role of top-down processing in perceiving language
Describe motor theory
- Although it has largely fallen ‘out of fashion’, Liberman et al. (1963, 1967) proposed a motor theory of speech perception
- This was partly developed as a response to the lack of invariance (or, in simpler terms, the presence of variability!) associated with phonemes contained within the acoustic signal
- Remember that mouth movements involve changing the configuration of your articulators (e.g. tongue, lips, etc.), which modifies the shape of the vocal tract and therefore changes its resonance properties (which affect frequencies, etc.); this produces the different sounds of speech when air is pushed through it
What is important about model networks?
- Note that some models consider networks supporting production and
comprehension as separate (diagram on the left), others as a unified
network (diagram on the right) (Schomers & Pulvermuller, 2016)
Describe Broca’s aphasia
Broca’s aphasia results from damage to Broca’s area (in the frontal lobe)
Describe Wernicke’s aphasia
results from damage in Wernicke’s area (in the temporal lobe)
What is word deafness?
The inability to recognize words
Describe the voice area
A ‘voice area’ in the superior temporal sulcus (STS) has been identified that
is activated more strongly by voices than other sounds
Describe voice cells
‘Voice cells’ in the temporal lobe of monkeys have also been found which
respond more strongly to recordings of monkey calls than to calls of other
animals (Perrodin et al., 2011)
Describe phonetic features
- Some of the neurons identified in Mesgarani et al. (2014) also seem tuned to selectively respond to more general phonetic features, such as:
- Manner of articulation: what you actually do with your articulators (how you move your tongue, lips, etc. when pronouncing certain phonemes)
- Place of articulation: where in your mouth the articulators are manipulated (back of the throat, front of mouth close to teeth, etc.)
Describe the dual stream model of speech perception
- The dual stream model of speech perception proposes that the ventral and dorsal pathways are involved with identifying sounds of speech and representing movements associated with speech sounds, respectively
Describe the Eimas study
- Eimas et al. (1971): tested habituation in infants with different VOTs using sucking (rather than looking) time
- After habituating to a baseline phoneme, the infants dishabituate to one kind of change (that adults would perceive as being a different phoneme) but not another (that adults would not perceive as being a different phoneme)
Describe the social gating hypothesis
The social gating hypothesis proposes that
our brain ‘gates’ specific mechanisms that are
important/required for speaking particular
languages