spoken word recognition (goslin lecture 2) Flashcards
what is meant by speech variability?
variability in acoustic waveforms
what are some of the variabilities in acoustic waveforms?
-Speaking rate
-Intonation
-Noise & distortion
-Accents
what are phones?
basic unit of sound in speech
what are characteristics of phones?
-A speech segment that possesses distinct physical or perceptual properties
-Acoustic or articulatory distinction (articulatory phonetics looks at how the sound is produced through movements of the mouth)
Approximately 4000 available
what is a representation of speech production?
phones
what are allophones?
different phones that are perceptually equivalent in a language
e.g. the aspirated [pʰ] in "pin" and the unaspirated [p] in "spin": a clear acoustic distinction that listeners lose by focusing only on the sound differences that are useful in the language being spoken
what are phonemes?
a set of phones that are cognitively equivalent
-phone for me
what are characteristics of phonemes?
basic unit that distinguishes words
-a change in a phoneme will produce a different word, e.g. in the minimal pair "pin" and "bin", /p/ and /b/ distinguish the two words
-require perceptual distinction -> language specific and an abstract unit
-English has 44 phonemes: 20 vowels, 24 consonants
how is speech picked up?
acoustics
specifically changes in sound frequency around a person
how are phones and phonemes influenced by top-down effects?
the same phone can be perceived as one phoneme or another depending on context
-Ganong effect
-phoneme restoration effect
-McGurk effect, 1976
what is the ganong effect?
(Ganong, 1980)
categorical boundaries which are modulated by context
-e.g. voice onset time
-when participants pronounce syllables, voice onset time is longer for "ta" than for "da": more pressure builds up while producing /t/, so there is a longer gap before the /a/
-the perceptual system is actively trying to make sense of what is being said -> the sound that is perceived changes with context
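A minimal sketch of a context-dependent category boundary, assuming a logistic identification function over voice onset time. The boundary (30 ms), slope and lexical shift (5 ms) are made-up illustrative values, not figures from Ganong (1980), and the example contexts ("task", "dash") are assumptions chosen for illustration.

```python
import math

def p_voiceless(vot_ms, boundary_ms=30.0, slope=0.5, lexical_shift_ms=0.0):
    """Probability of reporting /t/ rather than /d/ for a given VOT.

    lexical_shift_ms is a hypothetical parameter: a positive value lowers
    the boundary (bias towards /t/), a negative value raises it.
    """
    effective_boundary = boundary_ms - lexical_shift_ms
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - effective_boundary)))

ambiguous_vot = 30.0  # a sound sitting exactly on the neutral boundary
neutral = p_voiceless(ambiguous_vot)
t_word_context = p_voiceless(ambiguous_vot, lexical_shift_ms=5.0)   # e.g. "_ask": only "task" is a word
d_word_context = p_voiceless(ambiguous_vot, lexical_shift_ms=-5.0)  # e.g. "_ash": only "dash" is a word
```

The same acoustic input is labelled /t/ more often in the first context and /d/ more often in the second, which is the pattern a modulated boundary predicts.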
what is meant by the phoneme restoration effect?
(Warren, 1970)
if there is a gap in an ambiguous word, and a coughing sound is played where the gap occurs, ppts report hearing /h/ or /p/ depending on the sentence they were primed with -> an illusion driven by preconceptions of what should be heard, because the medium is so noisy and ineffective at communication
-effects stronger in words than non-words and in strongly biased contexts
what is the McGurk effect?
(McGurk & MacDonald, 1976)
the effect of visual speech information on the audio stream shows that speech perception is multimodal (we use info from every sense to understand what we think someone will say -> adjust linguistic systems to what we predict people will say)
-interference caused by seeing one phoneme and hearing another
what allows for active speech perception?
top down and bottom up information -> allows us to overcome imperfect speech input
what is segmentation?
speech signals are continuous (the waveform is caused by continuous movements of the articulators), so word boundaries are not evident in the speech signal
we have to actively find the gaps within sentences as sound is constantly produced -> need to segment it
-Shillcock & Tabossi 1990
what did Shillcock & Tabossi, 1990 find?
-evidence of continuous attempts at word segmentation of the speech stream
-through cross-modal priming: participants hear sentences but respond to written words in a lexical decision task
-participants are given a button to press, and the task is to press it when the prime word is produced
what was the design of the Shillcock & Tabossi 1990 experiment?
primes = "the scientist made a new discovery last year" vs "the scientist made a novel discovery last year"
target = "nudist", primed by: "the scientist made a new discovery last year"
"new dis" matches the start of "nudist", so the listener guesses that /t/ will follow the sequence
what is the priming effect found in Shillcock & Tabossi 1990?
priming effect caused by a temporary segmentation error, and there was no report of perceiving the word "nudist" in the prime -> evidence of continuous segmentation attempts
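The temporary segmentation error can be sketched as a partial match that ignores word boundaries. This is only an illustration: the pseudo-phonemic spellings, the toy lexicon and the `min_overlap` threshold are all hypothetical, not materials from the experiment.

```python
# Hand-made pseudo-phonemic spellings (assumption, not real transcriptions).
LEXICON = {"nu": "new", "nudist": "nudist", "diskuveri": "discovery"}

def partial_matches(stream, lexicon, min_overlap=3):
    """Return (word, overlap) for entries whose onset matches the input so far."""
    hits = []
    for form, word in lexicon.items():
        overlap = 0
        for heard_seg, form_seg in zip(stream, form):
            if heard_seg != form_seg:
                break
            overlap += 1
        # Short forms count as a hit when they match completely.
        if overlap >= min(min_overlap, len(form)):
            hits.append((word, overlap))
    return hits

heard = "nudis"  # "new dis..." with no boundary marked in the signal
candidates = partial_matches(heard, LEXICON)
```

Here "nudist" becomes a candidate even though it spans the boundary between "new" and "discovery", which is the kind of temporary activation the priming result points to.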
how are words accessed in the lexicon?
what is the cohort model?
Marslen-Wilson & Welsh 1978
-words are picked from a large cohort of candidates that tells us what word we are expecting to hear
-access stage -> selection stage -> integration stage
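A minimal sketch of how a cohort might shrink as input arrives, assuming a toy lexicon and using letters as a stand-in for phonemes. It covers only the access and selection stages; integration with context is not modelled.

```python
# Toy lexicon (assumption); letters stand in for phonemes.
LEXICON = ["strawberry", "straw", "stray", "string", "blackberry", "blackbird"]

def cohort_over_time(spoken, lexicon=LEXICON):
    """Return the surviving cohort after each successive segment of input."""
    cohorts = []
    for i in range(1, len(spoken) + 1):
        heard_so_far = spoken[:i]
        # Access: every word consistent with the onset stays in the cohort.
        # Selection: candidates drop out as soon as they mismatch the input.
        cohorts.append([w for w in lexicon if w.startswith(heard_so_far)])
    return cohorts

cohorts = cohort_over_time("stray")
```

After "s" the cohort holds every word beginning with that segment; by the final segment only "stray" remains.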
what are characteristics of word recognition?
-fast -> shadowing & word monitoring tasks found latencies of 250-275 ms
-intuitively immediate - recognised before the end of the word is reached
-evidence from gating (Grosjean 1980): ppts are presented with fragments of a word of gradually increasing duration
-speed and robustness depends on words in context -> system actively seeks matches to input
how is a lexical decision task used?
Press a button when a presented stimulus is a real word
-Words vs non-words: "spinach" (word) vs "splinger" (non-word)
-Fast response = easy access (e.g. 400 ms)
-Slow response = hard access (e.g. 500 ms)
what affects lexical decision times?
-Word Length
-Word frequency
High frequency words = common words (“cat, mother, house”)
Low frequency words = uncommon words (“accordion, compass”)
-Uniqueness point
-early uniqueness point = strawberry (there are no other English words beginning with "strawb")
-late uniqueness point = blackberry (not unique at /b/ of berry; blackbird, blackbeetle,…)
-Neighbourhood
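The uniqueness point can be sketched as the first position at which no other word shares the same onset. This is a rough illustration that uses spelling in place of phonemes and a small made-up lexicon; `uniqueness_point` is a hypothetical helper.

```python
# Toy lexicon (assumption); a real count would use phonemic transcriptions.
LEXICON = ["strawberry", "straw", "strap", "blackberry", "blackbird", "blackbeetle"]

def uniqueness_point(word, lexicon):
    """1-based position where `word` diverges from all other entries, else None."""
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        onset = word[:i]
        if not any(other.startswith(onset) for other in others):
            return i
    return None  # the whole word is the onset of a longer word

up_strawberry = uniqueness_point("strawberry", LEXICON)
up_blackberry = uniqueness_point("blackberry", LEXICON)
up_straw = uniqueness_point("straw", LEXICON)
```

In this toy lexicon "strawberry" becomes unique at "strawb", while "blackberry" only becomes unique two segments later, once "blackbird" and "blackbeetle" have both been ruled out. A word like "straw" never reaches a uniqueness point because it is the onset of a longer entry.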
what are problems with the cohort model?
-not robust to distortion of initial phonemes
-Ganong effect occurs for initial, as well as non-initial, phonemes
-lexical decision latencies are proportional to frequency-weighted neighborhood size
-requires segmentation before word identification can begin
what is the TRACE (interactive activation) model?
-three sets of interconnected detectors: feature, phoneme and word detectors
-within a set, connections are inhibitory
-between sets, connections are excitatory
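A heavily simplified sketch of the two connection types, assuming only a word level fed by matching input segments. The feature level, feedback to phonemes and time-aligned unit copies of the full model are left out, and the `excite` and `inhibit` weights are made-up values.

```python
WORDS = ["lip", "lick", "lad"]  # toy lexicon; letters stand in for phonemes

def step(word_act, heard, excite=0.2, inhibit=0.1):
    """One update of word activations given the input heard so far."""
    new_act = {}
    for word in word_act:
        # Between-level excitation: input segments matching this word by position.
        support = sum(1 for a, b in zip(heard, word) if a == b)
        # Within-level inhibition: competition from the other word units.
        competition = sum(act for other, act in word_act.items() if other != word)
        updated = word_act[word] + excite * support - inhibit * competition
        new_act[word] = min(1.0, max(0.0, updated))  # keep activation in [0, 1]
    return new_act

act = {w: 0.0 for w in WORDS}
for heard in ["l", "li", "lip"]:
    act = step(act, heard)
```

After the full input "lip" ends up most active, "lick" stays partly active because it shares two segments, and "lad" trails but is not switched off.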
what did Luce et al 1990 find?
lexical activation in TRACE
-after a certain level of activation has been reached, "lick" and "lip" carry on being activated over time, but "lad" is still a potential candidate
what is evidence supporting the trace model?
-broadly compatible with lexical effects on phoneme identification, explaining them in terms of feedback from the lexical level to the phonemic level
-Ganong effect & phonemic restoration effect
-TRACE recognises words even if the initial phoneme is distorted
-can find word boundaries
what is a problem of the trace model?
requires massive duplication of units and connections, copying over and over again the connection patterns that determine which features activate which phonemes and which phonemes activate which words
is spoken word recognition an active process?
yes, observed through the Ganong effect, the phonemic restoration effect, the McGurk effect, segmentation, and lexical access (cohort & TRACE)