language - speech perception Flashcards
components of the mental lexicon
syntax
phonology
semantics
orthography
challenges for lexical access (6)
- speech is a continuous stream
- Homonyms - Bank (£) versus Bank (river)
- homophones - Aisle vs Isle
- Co-articulation
- Different Accents
- Invariance problem - no invariant acoustic definition of phonemes, syllables, or words
ambiguity in speech stream - word boundaries comedy example
Four candles or Fork handles
disambiguating the speech stream - how word boundaries are distinguished (4)
Categorical perception
- Ability to distinguish between sounds on a continuum based on voice onset times
Voice Onset Time
- the time before vocal cord vibration begins - e.g. voiced "VVVVa" (early voicing) vs voiceless "FFa" (late voicing)
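To make this concrete, a minimal Python sketch (not from the lecture) of categorical perception along a VOT continuum - a continuous acoustic value is forced into discrete voicing categories by a single boundary. The ~30 ms boundary and the /b/ vs /p/ labels are illustrative assumptions.

```python
# Toy categorical perception: a continuous voice onset time (VOT) is
# mapped onto a discrete voicing category by a single boundary.
# The 30 ms boundary and the labels are illustrative assumptions.

def categorise_voicing(vot_ms: float, boundary_ms: float = 30.0) -> str:
    """Map a continuous VOT onto a discrete voiced/voiceless category."""
    return "voiced (e.g. /b/)" if vot_ms < boundary_ms else "voiceless (e.g. /p/)"

# A smooth continuum of VOTs is heard as just two categories.
for vot in (0, 10, 20, 25, 35, 50, 80):
    print(f"VOT {vot:>2} ms -> {categorise_voicing(vot)}")
```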
Perceptual Learning
- Adjust categorical perception based on sounds we hear
- Seems to be hard-wired - babies and primates can do this
Top-Down Processing
- e.g. “The state governors met with their respective legislatures convening in the capital city.”
- a cough replaces one sound in "legislatures" - top-down processing allows us to recognise the word without being able to identify which sound the cough replaced
spreading activation
predictions of what may be coming up next via activation of items that are related to the acoustic input
e.g. apple –> appeal, apron, apollo, apply
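A hedged sketch of spreading activation (illustration only): hearing a word boosts the activation of items linked to it in a toy lexicon, pre-activating likely upcoming words. The association list and weights are invented.

```python
# Toy spreading activation: activation flows from a heard word to items
# associated with it in a small hand-made lexicon. Weights are invented.

associations = {
    "apple": {"appeal": 0.4, "apron": 0.3, "apollo": 0.2, "apply": 0.4},
    "doctor": {"nurse": 0.8, "hospital": 0.6},
}

def spread(heard: str, activation: dict, gain: float = 1.0) -> None:
    """Add activation to the heard word and to everything associated with it."""
    activation[heard] = activation.get(heard, 0.0) + gain
    for neighbour, weight in associations.get(heard, {}).items():
        activation[neighbour] = activation.get(neighbour, 0.0) + gain * weight

activation = {}
spread("apple", activation)
# Related items end up partially active, i.e. pre-activated predictions.
print(sorted(activation.items(), key=lambda kv: -kv[1]))
```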
lexical characteristics - speed of lexical access (3)
word length –> long words = slower to process
neighbourhood density –> lots of neighbours = slower to process
frequency –> more frequently accessed words in lexicon = quicker access
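A toy scoring function (invented weights, illustration only) showing the direction of the three effects above: length and neighbourhood density slow access, frequency speeds it up.

```python
# Toy 'access cost' combining the three lexical characteristics.
# Higher cost = slower access. Weights and example values are invented.

def access_cost(length: int, neighbours: int, freq_per_million: float) -> float:
    return 10 * length + 5 * neighbours - 2 * freq_per_million

# Short, high-frequency word with few neighbours -> low cost (fast access)
print(access_cost(length=3, neighbours=2, freq_per_million=50))
# Long, low-frequency word with many neighbours -> high cost (slow access)
print(access_cost(length=9, neighbours=12, freq_per_million=1))
```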
5 things lexical access is based on
- Bottom Up - Acoustic input
- Top down processing - disambiguating the speech stream
- Lexical characteristics
- Context
- Spreading activation that facilitates predictions
3 options for mechanics of lexical access
- gradually activate the word that matches the acoustic input
- activate all words that start with the same sound as the acoustic input and gradually de-activate words that no longer match the sounds
- gradually activate the word that matches the acoustic input more than other words
mechanism of lexical access - 1. gradual activation of the word that matches the sound
the word is built up sound by sound (like phonics in reverse)
as the successive sounds of the word are produced, activation gradually converges on the one word that matches
e.g. a –> ape –> april –> apricot
mechanism of lexical access - 2. activate all words that sound like the start of the word, then gradually deactivate non-matches
as word is said, sounds are processed as the word is built up
e.g.
a –> a, ape, april, apricot
ape –> ape, april, apricot
april –> april, apricot
apricot –> apricot
mechanism of lexical access - 3. gradually activate word that matches acoustic input more than other words
words containing the sound anywhere, not just at word onset, are activated
e.g.
a –> ape, pay, say, april, apricot
ape –> ape, april, apricot
april –> april, apricot
apricot –> apricot
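A small sketch (toy lexicon, illustration only) contrasting option 2, which keeps only words whose onset still matches the input heard so far, with option 3, which activates any word containing the input wherever it occurs:

```python
# Lexical access options 2 and 3 on a tiny invented lexicon.

LEXICON = ["ape", "april", "apricot", "pay", "say"]

def option2_onset_cohort(heard: str) -> list:
    """Option 2: activate words sharing the onset; drop those that stop matching."""
    return [w for w in LEXICON if w.startswith(heard)]

def option3_any_position(heard: str) -> list:
    """Option 3: activate any word containing the input, wherever it occurs."""
    return [w for w in LEXICON if heard in w]

for fragment in ("a", "ap", "apr", "apric"):
    print(f"'{fragment}'  option 2: {option2_onset_cohort(fragment)}"
          f"  option 3: {option3_any_position(fragment)}")
```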
2 models of speech perception
Marslen-Wilson (1987) –> The Cohort Model
access words in the lexicon by activating all words that share initial features, then gradually de-activating words that stop matching the features (option 2)
McClelland & Elman (1986) –> The TRACE model
features activate phonemes, which activate words; activation builds gradually in the words that match the features until the word with the most activation wins (option 3)
cohort model - lexical activation
activation of cohort that match the input
e.g. “ap-“ –> apricot, apex, apple, apart, april
then gradual deactivation of items that fail to match input
e.g. “apri-“ –> april, apricot
then find a uniqueness point - only one word activated
e.g. “apric-“ –> apricot
items that do not match the word onset are not activated, even if they match sounds elsewhere in the word (e.g. "cot", "prickly" overlap with sounds inside "apricot")
neighbourhood effects in cohort model
words that match the acoustic input compete for activation
e.g. apricot and aprikol
learning the novel neighbour "aprikol" slows down recognition of "apricot" because the two compete for activation
frequency effects in cohort model
high frequency = high resting states = less activation required to recognise high frequency words
apricot would be recognised more quickly than aprikol (lower frequency word)
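To pull the last few cards together, a hedged Python sketch of Cohort-style activation: the onset cohort is activated, mismatching candidates are deactivated as more of the word is heard, recognition happens at the uniqueness point once the remaining word reaches threshold, and frequency sets a resting activation level so high-frequency words get there on less input. The lexicon, resting levels, increments and threshold are all invented.

```python
# Toy Cohort-model dynamics: onset cohort, deactivation of mismatches,
# uniqueness point, and frequency-based resting activation.
# All words, resting levels, increments and the threshold are invented.

RESTING = {"apricot": 0.3, "april": 0.2, "apex": 0.1, "apart": 0.1, "aprikol": 0.0}
THRESHOLD = 1.5

def recognise(word: str) -> None:
    activation = dict(RESTING)              # high-frequency words start higher
    for i in range(1, len(word) + 1):
        fragment = word[:i]
        cohort = [w for w in activation if w.startswith(fragment)]
        for w in activation:
            if w in cohort:
                activation[w] += 0.25       # facilitatory signal for matches
            else:
                activation[w] = 0.0         # inhibitory signal: drop mismatches
        if len(cohort) == 1 and activation[cohort[0]] >= THRESHOLD:
            print(f"'{fragment}' -> recognised '{cohort[0]}'")
            return
    print(f"'{word}' not recognised before its end")

recognise("apricot")   # high resting level: recognised at 'apric'
recognise("aprikol")   # low-frequency neighbour: needs more input ('apriko')
```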
evidence for cohort model - gating experiments
Warren & Marslen-Wilson 1987, 1988
participants are presented with fragments of words that gradually reveal the whole word and asked to guess what the word is after each presentation
example:
“john went to the zoo and saw a ca-“
= activate: cap, cat, caterpillar, camel, can, cannery, kangaroo
“…cam-“
= camel, can, cannery,
“…came-“
= camel
Grosjean (1980)
presented the word "stretcher" in the same way and found the same pattern
how gating experiments support cohort model
recognition of a word is a gradual process that starts from word onset and continues until the end of the word
candidate words that no longer fit the acoustic input are eliminated
cohort model - structure
Marslen-Wilson & Warren (1994)
bottom up processing has priority
from speech input to lexical items (words)
via:
- facilitatory signals are sent to words that match the speech input
- inhibitory signals are sent to words that do not match the speech input
issue with cohort model - bottom up processing
priority given to bottom-up processing
doesn’t account for phoneme restoration effect - missing sounds are perceived by listeners
cohort model - 3 stages to word recognition
access
- acoustic-phonetic information is mapped onto lexical items
selection
- candidate words that mismatch the acoustic input are de-selected
integration
- semantic and syntactic properties of the word are integrated and checked against the sentence
cohort model - impact of context
sentence context does not influence the process of lexical access
lexical selection is based on activation of phonology and semantic information
integration is affected by sentence context, but only to a limited extent
context processing - priming paradigm
priming paradigm:
prime = doctor
target = nurse
semantically related words - spreading activation allows “nurse” to become active when “doctor” is presented
whereas if prime is “sheep”, “nurse” would not be activated
cross modal priming
Zwitserlood (1989)
prime = auditory
target = visual
can have related/unrelated prime-target pairs
e.g. captain –> ship, or captain –> wicket
faster reaction time for related than unrelated pairs
priming effect = difference between related and unrelated reaction times
also use a word fragment as a prime (ambiguous):
related = capt- –> ship
related = capt- –> slave
unrelated = capt- –> wicket
(capt- could be captain or captive)
priming effect seen here too
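A tiny sketch (invented reaction times) of how the priming effect is computed: mean RT after unrelated primes minus mean RT after related primes.

```python
# Priming effect from (invented) reaction times in a cross-modal priming task.

from statistics import mean

related_rts_ms = [520, 540, 515, 530]     # e.g. prime "captain" -> target SHIP
unrelated_rts_ms = [590, 610, 600, 585]   # e.g. prime "captain" -> target WICKET

priming_effect_ms = mean(unrelated_rts_ms) - mean(related_rts_ms)
print(f"priming effect = {priming_effect_ms:.0f} ms")  # ~70 ms faster after a related prime
```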
cross modal priming - with context
could have a neutral priming sentence:
“The men stood around for a while and watched their capt-“
or with context (bias):
“The men had spent many years serving under their capt-“
priming effect seen in both neutral and biased priming sentence with word fragments - ship and slave both activated with “capt-“ fragment
not seen on biased sentence with full word “captain” - only ship activated, not slave
items that match the acoustic input but not the sentence context are still activated initially
they are only deactivated once the word has been selected (i.e. once the whole word has been heard)
supports the cohort model - activation is driven by the acoustic input, and cohort members are deactivated only once the word is selected
revised cohort model
Marslen-Wilson & Warren, 1994
Context influences selection/integration of word into sentence
The word with semantic activation that fits the context of the sentence will be integrated into the sentence.
The men had served for many years under their /capt/
Semantic representation of captain is a better fit to the sentence than the semantic representation of captive and helps to single out ‘captain’ as the appropriate word
debate over context - Zwitserlood vs cohort model
Zwitserlood - context effects were observed before the acoustic input provided enough information to disambiguate the prime word.
The Cohort model predicts that context only affects integration of the word into the sentence
Zwitserlood argued that context affects word selection
summary of cohort model
- Speech perception is based on matching acoustic input to stored representations of words in the lexicon
- Words are recognised via a competitive process that activates a word ‘cohort’
- Cohort candidates do not directly inhibit one another (no lateral competition between candidates)
- Words are identified when they reach their uniqueness point
- Cohort candidates that do not match acoustic input are eliminated
- Context does not constrain activation of initial cohorts but allows for rapid elimination of candidates that do not match sentence context
TRACE model
McClelland & Elman, 1986
words are recognized “incrementally by slowly ramping up the activation of the correct units at the phoneme and word levels”
lexical activation in the TRACE model
“ap” –> apricot, tape, apex, shape, apart, apple, april
“apri” –> apricot, april, apple
“apric” –> apricot
these words are simply more activated than the others - competitors are not actively deactivated; they were never fully activated in the first place
lexical competitive inhibition
as more sounds are heard, other words that match part of the input also become activated, and the activated words compete with (inhibit) one another
e.g. "apri" –> apricot, april
"apric" –> apricot, prickly
TRACE model - connectionist principles
Processing units (nodes) correspond to mental representations of:
- features (e.g. voicing, manner of production)
- phonemes
- words
TRACE model structure
bottom up and top down
bottom up:
- each level is connected via facilitatory connections
- activation spreads up from features to lexical items
- features (acoustic-phonetic patterns) –> phonemes –> lexical items
top down:
- facilitatory connections also travel down from the lexical level to the phoneme level and the feature level
- lexical items –> phonemes –> features
connections between nodes within each level are inhibitory
e.g. /v/, /b/ and /d/ at the phoneme level inhibit one another, which is how they are distinguished
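A very small interactive-activation sketch in the spirit of TRACE (phoneme and word levels only; the feature level is omitted to keep it short). The lexicon, weights and number of cycles are invented; it illustrates the connection scheme above, not the published model.

```python
# TRACE-style interactive activation: facilitatory connections between levels
# (bottom-up and top-down) and inhibitory connections within the word level.
# Lexicon, weights and cycle count are invented; feature level omitted.

LEXICON = {"van": ["v", "a", "n"], "ban": ["b", "a", "n"], "vat": ["v", "a", "t"]}
PHONEMES = sorted({p for phones in LEXICON.values() for p in phones})
EXCITE, INHIBIT = 0.10, 0.05

def run(input_phonemes, cycles=10):
    phon = {p: (0.5 if p in input_phonemes else 0.0) for p in PHONEMES}
    words = {w: 0.0 for w in LEXICON}
    for _ in range(cycles):
        # Bottom-up facilitation: phonemes excite words that contain them.
        new_words = {w: words[w] + EXCITE * sum(phon[p] for p in phones)
                     for w, phones in LEXICON.items()}
        # Within-level inhibition: each word is suppressed by its competitors.
        for w in new_words:
            new_words[w] -= INHIBIT * sum(a for o, a in words.items() if o != w)
        # Top-down facilitation: active words feed back to their phonemes.
        new_phon = dict(phon)
        for w, phones in LEXICON.items():
            for p in phones:
                new_phon[p] += EXCITE * words[w]
        words = {w: max(0.0, a) for w, a in new_words.items()}
        phon = {p: min(1.0, max(0.0, a)) for p, a in new_phon.items()}
    return words

# Hearing /v/ /a/ /n/: 'van' ends up most active; 'vat' and 'ban' stay
# partially active (shared phonemes) but are held down by lexical inhibition.
print(run(["v", "a", "n"]))
```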
TRACE model - process example of word “van”
trying to perceive “van” - bottom up processing
FEATURES = acoustic features activate the candidate phonemes /v/, /b/, /d/ (the input could initially be any of these)
PHONEMES = the input is matched to /v/, which inhibits the other candidate sounds that no longer fit
LEXICAL ITEMS = the surviving phonemes activate words that match the input (vat, vamp, van, ban, can, pan)
"van" then inhibits these other options
top-down processing also occurs - the activated word reinforces the activation of the nodes selected at the previous levels
TRACE - radical activation model
any consistency between input and representation may result in some degree of activation
- nodes influence each other according to their activation levels & strengths of connections
- activation develops as a pattern of excitation from facilitation and inhibition
- candidate words are activated based on the pattern of activation
- bottom up and top down processes:
- bottom up - Activation from feature to word level
- top down – Activation from word to feature level
evidence for TRACE - activation of words in lexicon
Allopenna et al (1998)
eye tracking study
visual world paradigm
found that rhyme competitors (words with overlapping phonology that do not share the onset of the speech input) are activated during speech perception
method:
- grid containing images of rhyming items, e.g. beaker, beetle, speaker, pram - and shapes
- task = click on the “beaker” and place it under the “triangle”
- monitored eye movements whilst they complete the task
- if words related to beaker are active in the lexicon participants will look towards those items
predictions:
- if TRACE = look at the beaker, beetle, and speaker (rhyme and onset)
- if cohort = look at beaker and beetle, not speaker (onset only)
results:
- look at beaker and beetle in first 400ms
- then beetle begins to drop
- also look at speaker between 400-600ms after word heard - as beetle drops
discussion:
- words that rhyme with sounds in any part of a word may become activated
- the initial Cohort of words activated in response to the speech stream is not limited to words with the same onset
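A rough sketch of why "speaker" behaves as a rhyme competitor: a position-independent overlap score (in the spirit of TRACE) gives it credit for shared later segments, while an onset-only score (in the spirit of the Cohort model) gives it none. The phoneme segmentations below are approximate and invented.

```python
# Onset overlap vs any-position overlap for the visual-world items.
# Segmentations are rough approximations, for illustration only.

WORDS = {
    "beaker":  ["b", "ee", "k", "er"],
    "beetle":  ["b", "ee", "t", "l"],
    "speaker": ["s", "p", "ee", "k", "er"],
    "pram":    ["p", "r", "a", "m"],
}

def onset_overlap(a, b):
    """Cohort-style score: number of shared segments from word onset."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def any_position_overlap(a, b):
    """TRACE-friendly score: shared segments regardless of position."""
    return len(set(a) & set(b))

target = WORDS["beaker"]
for name, phones in WORDS.items():
    if name != "beaker":
        print(f"{name:8} onset: {onset_overlap(target, phones)}  "
              f"any position: {any_position_overlap(target, phones)}")
```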
top-down processing of speech perception - supporting evidence
Mirman et al (2008)
facilitatory links between words and phonemes should result in more accurate detection of phonemes in words compared to non-words
method:
participants were asked to detect a /t/ or /k/ in words (e.g., "heighten") and non-words (e.g., "vinten"); they should find it easier to identify the /t/ in "heighten" than in "vinten"
results:
faster identification of /t/ and /k/ in words than non-words - participants expect them to be there
demonstrates top-down
top-down processing of speech perception - contradicting evidence (2)
phoneme detection –> participants can accurately detect phonemes in non-words that are word-like
e.g., the /t/ in "vocabutary"
participants failed to resolve an ambiguous final phoneme in favour of the phoneme that would create a real word unless the stimuli were degraded
e.g., identifying "sh" as the final phoneme of "fiss" (to make "fish")
TRACE vs cohort model
top-down:
- TRACE emphasises top down processing
- Cohort model minimises the impact of top down processing
lexical access/activation:
- TRACE model accommodates the activation of rhyming competitors
- Cohort model predicts that lexical access is biased towards activation of words with shared onsets
context:
- TRACE model does not provide an account of how context might affect speech perception
- Cohort model only brings context in at the very end (integration), and only to a limited extent
- the evidence also suggests a general bias towards activating words that share onset sounds, in line with the Cohort model
speech perception - which model is better according to Jusczyk and Luce (2002)
There is “now strong consensus in the field that activation-competition models of spoken word recognition—much in the spirit of the original Cohort theory— best capture the fundamental principles of spoken word recognition”