Speech comprehension Flashcards
How does the brain process what we hear?
As phonemes which combine to form words and then sentences
Phonemes don’t always correspond directly to graphemes when reading
What is the first stage of speech comprehension?
Auditory analysis - the listener processes sound waves to identify phonemes
What are the next two stages of speech comprehension?
Phonemes accessed
Phonemes are combined to form words, and sentences are then accessed beyond this
What is meant by phoneme perception being “categorical”?
We hear subtly different sounds as the same phoneme and cannot distinguish between them, e.g. acoustically different variations of the “B” sound (as in buh, bee) are all heard as the same phoneme
There is the same kind of perceptual boundary between phoneme categories as there is between subtle shades of the same colour
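A minimal sketch of categorical perception, assuming a single acoustic continuum running from a “B”-like sound to a “P”-like sound. The boundary value, the units and both function names are made up for illustration; they are not measured values.

```python
# Hypothetical category boundary on an illustrative acoustic continuum
# (arbitrary units; not an empirical value).
BOUNDARY = 25.0


def perceived_phoneme(value):
    """Map a continuous acoustic value onto one of two phoneme categories."""
    return "b" if value < BOUNDARY else "p"


def heard_as_different(value_a, value_b):
    """Under strict categorical perception, two sounds are told apart
    only when they fall into different phoneme categories."""
    return perceived_phoneme(value_a) != perceived_phoneme(value_b)


# Both pairs are the same physical distance apart (15 units).
within_category = heard_as_different(5.0, 20.0)    # both heard as "b"
across_boundary = heard_as_different(20.0, 35.0)   # "b" versus "p"
```

The two pairs differ by the same physical amount, but only the pair that straddles the boundary is heard as two different sounds.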
What does the McGurk effect demonstrate?
Our phoneme perception can be overridden by simultaneous visual input, i.e. sound is not the only contributor to our hearing experience; there is rapid integration across multiple modalities
Why is speech a complicated perceptual experience?
It is a rapid and relatively uninterrupted flow of sound at up to 10 phonemes a second
Listeners have to be able to extract the sequence of informative lexical items from this stream (like fishing)
How do we perceive unfamiliar languages?
Perceived as being faster even though the speech rate is the same
We have to identify the individual words as well as work out how they fit within the context (like playing a game without knowing the rules), so we need to separate out each word from the continuous flow
In what ways can we make the processing of an unfamiliar language a bit easier?
Speaking more slowly
Varying pitch to alter emphasis
Enunciating more carefully
What is meant by a speech segmentation error?
We hear identical sounds, and where we place the segmentation boundary alters the word meaning –> can cause misunderstanding
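The ambiguity can be sketched as a search over all the ways one sound stream divides into known words. This is an illustrative toy, not a model from the literature: the function name, the rough one-character-per-phoneme transcriptions and the four-word lexicon are all assumptions.

```python
def segmentations(stream, lexicon):
    """Return every way to split `stream` into consecutive lexicon entries."""
    if not stream:
        return [[]]  # one way to segment nothing: no words
    results = []
    for end in range(1, len(stream) + 1):
        chunk = stream[:end]
        if chunk in lexicon:
            # Segment the remainder of the stream and prepend this word.
            for rest in segmentations(stream[end:], lexicon):
                results.append([lexicon[chunk]] + rest)
    return results


# Hypothetical toy transcriptions (sound string -> word).
TOY_LEXICON = {"ai": "I", "ais": "ice", "skrim": "scream", "krim": "cream"}

parses = segmentations("aiskrim", TOY_LEXICON)
# parses == [["I", "scream"], ["ice", "cream"]]
```

The same sound string yields two complete parses, and only the position of the word boundary separates them.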
What is meant by cross-modal priming?
When hearing a word aids the recognition of a subsequent visual word in lexical decision making tasks
Hearing a semantically related word can reduce reaction times for visually presented words
What was Shillcock’s cross-modal priming experiment?
Illustrated priming for “ghost words” i.e. we can find priming effects for words that aren’t even being said
“New discovery” primes the word “nudist” (simultaneous and congruent) –> faster reaction time than when “novel discovery” is heard
What was an explanation for Shillcock’s results?
Hearing “new discovery” activates all possible words the phonemes could correspond to i.e. “nudist” has phonetic similarity so is temporarily activated in the mental lexicon
What did Shillcock suggest?
People are able to temporarily perceive embedded words that aren’t physically there but which are formed by phonemes within the speech stream
Rapid inhibition using context means we don’t consciously perceive the incorrect words activated (subliminal effect) and instead we identify the correct words
The embedded word is, however, active just long enough to prime the visual target
How can Shillcock’s process be summarised?
We continuously generate hypotheses about what we are hearing –> all possible words activated –> evidence accrues and some activations continue while others are dampened
What was Warren’s phoneme restoration effect?
We are constantly trying to separate speech from background noise
In Warren’s experiment, a phoneme in a spoken sentence was replaced with a cough, but subjects were unaware of the missing sound. The brain uses context to fill in missing information, so we normally hear the right word despite degraded auditory input (similar to what is done when recognising words)
What was Warren and Warren’s experiment into missing phonemes?
Subjects heard sentences with a phoneme missing from a critical word (e.g. *eel) –> makes the word ambiguous
Sentence context disambiguated the word to the point that subjects understood the sentence and didn’t actually register that there was a sound missing
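A rough sketch of how sentence context could pick one completion of *eel. The candidate words, their cue words, the example sentences and the function name are illustrative assumptions for this sketch, and simple word matching stands in for whatever the listener actually does.

```python
# Illustrative candidates for "*eel" and the context words that support them.
# These pairings are assumptions made for the sketch.
CONTEXT_CUES = {
    "wheel": {"axle"},
    "heel": {"shoe"},
    "peel": {"orange"},
    "meal": {"table"},
}


def restore_with_context(sentence, cues=CONTEXT_CUES):
    """Pick the candidate for '*eel' whose cue words appear in the sentence."""
    words = set(sentence.lower().split())
    scores = {candidate: len(cue_set & words) for candidate, cue_set in cues.items()}
    best = max(scores, key=scores.get)
    # With no supporting context the word stays ambiguous.
    return best if scores[best] > 0 else None


wheel_case = restore_with_context("it was found that the *eel was on the axle")
peel_case = restore_with_context("it was found that the *eel was on the orange")
no_context = restore_with_context("it was found that the *eel was")
```

In both example sentences the disambiguating word arrives after the ambiguous one, so the sketch only settles on a word once the whole sentence is available.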
Why was Warren and Warren’s experiment so interesting?
Contextual information doesn’t come until the end of the sentence, suggesting that its influence is very RAPID, to the point where we aren’t even aware it is happening
What did Samuel find regarding WHEN phoneme restoration occurs?
Investigated whether context influence is perceptual or post-perceptual (do listeners notice that a phoneme is missing, register any ambiguity?)
Cough sound played over or completely replaced phoneme
Found that lexical disambiguation (e.g. in words such as “legislature”) directly affected perception: subjects couldn’t discriminate whether the sound was missing or covered, i.e. heard no error
Sentential disambiguation affects post-perceptual processes, e.g. when a sentence contains an ambiguous word such as *eel
What was Elman and McClelland’s TRACE computer model?
Designed to mimic aspects of spoken word comprehension e.g. segmentation errors and lexical context effects
- Assumes early auditory processing of phonemes which are separated out (segmented) to form words
- Both bottom up and contextual (top-down) processing occurs
- Competition occurs at each level
- Present phonemes activate potential word candidates which then compete and the winner completes the missing information
What happens in the TRACE model when there are ambiguous phonemes in words such as “legislature”?
There is only one possibility so only one word is activated –> immediate feedback to phoneme level to fill in the missing sound
This happens so rapidly that it is a perceptual effect
What happens when we have a missing phoneme in a more ambiguous word?
May have to draw on additional information to help decision making e.g. sentential context
This is a slower process and so the effect is post-perceptual
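The contrast between the two cases can be sketched as a lexicon lookup with one sound missing. This is a simplified illustration, not the TRACE implementation: letters stand in for phonemes, and the function names and toy lexicon are assumptions.

```python
def lexical_candidates(pattern, lexicon):
    """Words matching `pattern`, where '*' marks the missing sound.
    Letters stand in for phonemes in this toy version."""
    return [
        word for word in lexicon
        if len(word) == len(pattern)
        and all(p == "*" or p == w for p, w in zip(pattern, word))
    ]


def restore_from_lexicon(pattern, lexicon):
    """Fill the gap straight away only when exactly one word fits."""
    matches = lexical_candidates(pattern, lexicon)
    if len(matches) == 1:
        return matches[0]
    # Several words fit: the lexicon alone cannot decide, so further
    # information (e.g. sentence context) would be needed.
    return None


# Hypothetical toy lexicon.
TOY_LEXICON = ["legislature", "heel", "peel", "reel"]

unique_case = restore_from_lexicon("legi*lature", TOY_LEXICON)
ambiguous_case = restore_from_lexicon("*eel", TOY_LEXICON)
```

With “legi*lature” a single entry fits and the gap is filled at once; with “*eel” three entries fit, so the lookup returns nothing until more information arrives.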
How does TRACE “hear” embedded words?
Phonemes get processed one at a time by the computer and the system activates candidate words that are consistent with the current phoneme information
Candidates compete and the winner is selected while competitors are inhibited
But the ghost words show a level of activation up to a critical point
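The time course can be sketched by feeding in one phoneme at a time and listing the words still consistent with the input. This is a deliberately simplified stand-in, not TRACE itself: candidates are either in or out, where TRACE uses graded activation and inhibition between competitors. The transcriptions, the three-word lexicon and the function name are assumptions.

```python
def active_candidates(stream, lexicon):
    """After each phoneme, list the words still consistent with the input."""
    history = []
    live = []  # (word, start index) hypotheses that still match the input
    for t, phoneme in enumerate(stream):
        # Any lexicon entry could be starting at this position.
        live += [(word, t) for word in lexicon]
        # Keep hypotheses whose next expected phoneme matches the input.
        # Words that have already been heard in full drop out here too.
        live = [
            (word, start) for word, start in live
            if t - start < len(lexicon[word])
            and lexicon[word][t - start] == phoneme
        ]
        history.append(sorted({word for word, _ in live}))
    return history


# Hypothetical toy transcriptions, one character per phoneme.
TOY_LEXICON = {"new": "nju", "nudist": "njudIst", "discovery": "dIskVvri"}

history = active_candidates("njudIskVvri", TOY_LEXICON)  # "new discovery"
```

In this run “nudist” stays in the candidate list for the first six phonemes and drops out at the seventh, where the input has “k” but “nudist” needs “t”. That is the point at which the ghost word stops being supported.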