Introduction to speech and word recognition Flashcards
Speech segmentation
the process of breaking continuous speech into individual words or meaningful units
Challenges in Speech Segmentation
A lack of clear pauses in speech.
Homophones and ambiguity - words can sound the same but have different meanings.
Variability in speech
How does speech vary in speakers?
- Variability in voice - accent, speech rate
- Variability in how clear the speech is - outside noise
- Variability in how carefully words are pronounced; in casual speech, speakers often reduce or drop sounds rather than saying the word in full.
- Variability in how phonemes sound depending on the neighbouring sounds.
assimilation
Variability in how phonemes sound: the pronunciation of a phoneme changes depending on the surrounding phonemes.
How do we segment words?
- using pauses
- stress patterns
- phonotactics
- prosodic cues
stress patterns
In English, the beginning of a word is often stressed, so a stressed syllable is a cue that a new word may be starting.
phonotactics
Some sound sequences can only occur in certain positions within a word. E.g., /nd/ is allowed at the end of a word (‘end’), but not at the onset (‘nde’ is illegal).
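As a rough illustration of how a phonotactic constraint can point to a word boundary, the sketch below checks whether a consonant cluster could begin a word. Everything in it is an assumption made for illustration: letters stand in for phonemes, `LEGAL_ONSETS` is a small hand-picked sample rather than a full inventory of English onsets, and `is_legal_onset` and `boundary_inside_cluster` are hypothetical helper names.

```python
# Toy sample of legal English word-initial clusters (illustrative, not exhaustive).
LEGAL_ONSETS = {"s", "n", "d", "t", "st", "sn", "sl", "tr", "dr", "str"}


def is_legal_onset(cluster: str) -> bool:
    """Return True if the cluster may begin a word in this toy inventory."""
    return cluster in LEGAL_ONSETS


def boundary_inside_cluster(cluster: str) -> list[tuple[str, str]]:
    """List the ways to split a cluster so the right-hand part is a legal onset.

    The left-hand part is assumed to close the previous word.
    """
    splits = []
    for i in range(1, len(cluster)):
        left, right = cluster[:i], cluster[i:]
        if is_legal_onset(right):
            splits.append((left, right))
    return splits


print(is_legal_onset("nd"))           # False: 'nd' cannot start a word
print(boundary_inside_cluster("nd"))  # [('n', 'd')]
```

Because no word can begin with /nd/, a listener who hears that cluster can rule out a boundary in front of it; any boundary has to fall inside the cluster or after it.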
prosodic cues
Syllables at the start of a word are longer than medial syllables. The length of the initial syllable is also influenced by word length (shorter when it is part of a longer word).
Stages of spoken word recognition
Activation → selection → integration
what are the two pathways of lexical activation?
serial or parallel processing
describe serial processing of language
Lexical access starts only after the whole word has been heard or seen: find an exact match, then retrieve its meaning.
describe parallel processing
We try to identify the word as quickly as we can: identify the first sounds/letters and look for (partial) matches, then modify the shortlist as more input comes in. Processing is incremental, with multiple candidates considered at once.
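The shortlist idea can be sketched in a few lines of code. This is a minimal illustration and not a model of human processing: letters stand in for speech sounds, `LEXICON` is a made-up word list, and `narrow_candidates` and `recognise_incrementally` are hypothetical names.

```python
# Toy lexicon; letters stand in for phonemes (an assumption for illustration).
LEXICON = ["snow", "snap", "snail", "snake", "slow", "camel", "camera"]


def narrow_candidates(heard_so_far: str, lexicon: list[str]) -> list[str]:
    """Return every word that is still consistent with the input heard so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]


def recognise_incrementally(word: str, lexicon: list[str]) -> list[tuple[str, list[str]]]:
    """Feed the word in one segment at a time and record the shortlist after each."""
    history = []
    for end in range(1, len(word) + 1):
        fragment = word[:end]
        history.append((fragment, narrow_candidates(fragment, lexicon)))
    return history


for fragment, shortlist in recognise_incrementally("snow", LEXICON):
    print(fragment, shortlist)
```

With this toy lexicon, the shortlist holds four words after "sn" and only "snow" after "sno", so the word is identified before its final segment arrives.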
evaluate serial
+ Accuracy
- Slow, and word endings aren’t clearly marked in speech, so it is unclear when the word has finished and the search can start
evaluate parallel
+ Fast: you don’t have to wait for the whole of ‘snow’; candidates are activated as soon as you hear ‘sn’
- You might commit to the wrong word, and revision would then be needed
is it more likely we use serial or parallel processing of language?
parallel
What happens during the Activation stage?
Recognize phonemes (speech sounds)
Activate possible lexical candidates (words that match the sounds so far)
What happens during the Selection stage?
Choose the best-fitting word from the activated candidates
What happens during the Integration stage?
Use stored knowledge about the chosen word
Integrate it with sentence context for meaning
What is the overall process of spoken word recognition?
Spoken input → Recognize phonemes → Activate word candidates → Select the best match → Integrate into sentence meaning
what are gating studies?
Gating studies are a research method used to investigate how listeners recognize spoken words over time as they hear more of the speech signal.
how do gating studies work
Participants hear partial word fragments (e.g., “c-”, “ca-”, “cam-”).
After each fragment, they guess the word.
Researchers measure when the word is first recognized.
What are the two key measures in gating studies?
Isolation Point – The moment a listener first correctly identifies the word.
Recognition Point – The moment the listener is fully confident in their guess.
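To make the two measures concrete, here is a small sketch of how one gating trial might be scored. The data are invented, the 1-10 confidence scale and the threshold of 8 are arbitrary assumptions, and `isolation_point` and `recognition_point` are hypothetical function names rather than part of any standard toolkit.

```python
from typing import Optional

# One gating trial: (fragment heard, listener's guess, confidence 1-10).
# All values are made up for illustration.
TRIAL = [
    ("c", "cat", 2),
    ("ca", "cat", 3),
    ("cam", "camel", 4),
    ("came", "camera", 5),
    ("camer", "camera", 9),
    ("camera", "camera", 10),
]


def isolation_point(trial, target: str) -> Optional[int]:
    """Return the 1-based gate at which the target is first guessed correctly."""
    for gate, (_fragment, guess, _confidence) in enumerate(trial, start=1):
        if guess == target:
            return gate
    return None


def recognition_point(trial, target: str, threshold: int = 8) -> Optional[int]:
    """Return the 1-based gate at which a correct guess first reaches the threshold."""
    for gate, (_fragment, guess, confidence) in enumerate(trial, start=1):
        if guess == target and confidence >= threshold:
            return gate
    return None


print(isolation_point(TRIAL, "camera"))    # 4
print(recognition_point(TRIAL, "camera"))  # 5
```

In this invented trial the listener first guesses "camera" at gate 4 but only becomes confident at gate 5, which shows how the recognition point can lag behind the isolation point.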
What do gating studies reveal about word recognition?
Words can be recognized before they are fully heard.
Context speeds up recognition.
Frequent words are recognized faster than rare ones.
Words with many similar-sounding competitors take longer to recognize.
How does context affect gating studies?
Words in sentences are recognized earlier than isolated words because listeners use context clues to predict meaning.