speech perception Flashcards

(45 cards)

1
Q

challenges of speech perception

A
  • no clear gaps between words
  • co-articulation: the acoustic realisation of speech depends on what you’ve just said and what you are about to say > the same word can come out differently each time

pronunciation also varies from speaker to speaker (accents, etc.)

2
Q

how do we produce speech

A
  • lungs push air up the trachea
  • the moving air vibrates the vocal cords in the larynx
  • sounds from the vocal cords are then shaped by the supralaryngeal vocal tract
3
Q

labial consonants

A

the lips are used/touch, e.g. /p/ /b/ /m/

4
Q

alveolar consonants

A

the tongue touches the ridge just behind the teeth, e.g. /t/ /d/ /n/

5
Q

velar consonants

A

the tongue touches the back of the mouth (the velum), e.g. /k/ /g/

6
Q

stop

A

air flow stops completely

7
Q

voiced

A

when you say them, the vocal cords vibrate

8
Q

unvoiced

A

no vibration

9
Q

fricative

A

the constriction is not complete - air forced through the narrow gap creates friction, e.g. /f/ /s/

10
Q

nasal

A

airflow redirected to nasal cavity

11
Q

sound waves

A
  • periodic displacement of air molecules, creating increases and decreases in air pressure
  • molecules come closer together or move further apart - increasing and decreasing pressure
  • when we plot changes of sound pressure over time

–> forming waveforms

12
Q

spectrograms

A

split sound into different frequencies at each moment

amplitude: indicated by colour

  • splits info into different frequency channels - depicts info that the brain gets
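A minimal numpy sketch of what a spectrogram computes (the window/hop sizes and the 440 Hz test tone are illustrative assumptions, not from the cards):

```python
import numpy as np

def spectrogram(signal, win=256, hop=128):
    """Split the sound into frequency channels at each moment in time.
    Returns a (time-frames x frequency-bins) matrix of magnitudes."""
    window = np.hanning(win)
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        segment = signal[start:start + win] * window
        # magnitude of the FFT = energy in each frequency channel
        frames.append(np.abs(np.fft.rfft(segment)))
    return np.array(frames)

# a pure 440 Hz tone sampled at 8 kHz should show energy in one channel
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
S = spectrogram(tone)
peak_hz = S[0].argmax() * sr / 256   # frequency of the strongest channel
```

Plotting S with amplitude mapped to colour gives the familiar spectrogram picture.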
13
Q

source and filter theory

A

source: the vibrations of the vocal cords

filter: the supralaryngeal vocal tract structures that shape the sound produced by the source
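A toy numpy illustration of source and filter (the 100 Hz pulse rate, the 700 Hz resonance, and the two-pole resonator are illustrative assumptions, not the cards' content): the source is a glottal pulse train, and one resonance standing in for the vocal-tract filter shapes its spectrum.

```python
import numpy as np

sr, f0, n = 8000, 100, 8000
source = np.zeros(n)
source[::sr // f0] = 1.0            # source: vocal-cord pulses at 100 Hz

def resonator(x, freq, bw, sr):
    """Two-pole resonator: a toy stand-in for one vocal-tract resonance (formant)."""
    r = np.exp(-np.pi * bw / sr)
    theta = 2 * np.pi * freq / sr
    a1, a2 = 2 * r * np.cos(theta), -r * r
    y = np.zeros_like(x)
    for i in range(len(x)):
        y[i] = x[i] + a1 * y[i - 1] + a2 * y[i - 2]
    return y

# filter: shape the source with one resonance at 700 Hz (roughly F1 of /a/)
speech_like = resonator(source, 700, 100, sr)
spectrum = np.abs(np.fft.rfft(speech_like))
peak_hz = spectrum.argmax() * sr / n   # the filter's resonance dominates the spectrum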

14
Q

source only

A

can maybe interpret whether the sound is a question or a statement

also the speaker’s gender, and whether they sound happy or sad

15
Q

source AND filter

A
  • intelligible speech
  • the filter (supralaryngeal vocal tract, lips, teeth) is important for speech sounds - PHONEMES
  • filtering appears as bands of energy at certain frequencies in spectrograms, called FORMANTS
16
Q

lowest three formants

A

F1 F2 F3

these are important cues for identifying vowels

  • brain can know which vowel it is hearing by detecting these auditory CUES
17
Q

formants for vowels

A

F1 F2 F3

18
Q

formants for consonants

A

F2 F3

19
Q

CATEGORICAL PERCEPTION
DEMONSTRATION

A

demonstrated:
a continuum of sounds

one end: one sound ‘ba’
other end: another sound ‘da’

middle: sounds that are ambiguous between the two

  • task: report which sound they heard

1st signature of categorical perception = PHONEME BOUNDARY - where participants are equally likely to respond ‘ba’ as ‘da’

20
Q

1st signature of categorical perception

A

Phoneme boundary: where participants are equally likely to respond ‘ba’ as ‘da’

21
Q

2nd signature of categorical perception

A
  • discrimination peak near the phoneme boundary
22
Q

CATEGORICAL PERCEPTION

A

the tendency to perceive gradual sensory changes in a discrete fashion

23
Q

3 hallmarks of categorical perception

A
  1. abrupt change in identification at the phoneme boundary
  2. discrimination peak at the phoneme boundary
  3. discrimination predicted from identification (sounds only ‘sound different’ if they are different phonemes)
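The three hallmarks can be sketched numerically. This Python example uses a made-up 9-step ba-da continuum with a logistic identification curve; hallmark 3 is implemented as: a pair 'sounds different' only to the extent that the two stimuli receive different phoneme labels.

```python
import numpy as np

# hypothetical identification data: P(respond 'da') along a 9-step ba-da continuum
steps = np.arange(9)
boundary, slope = 4.0, 2.0
p_da = 1 / (1 + np.exp(-slope * (steps - boundary)))   # abrupt change at the boundary

# predict discrimination of adjacent pairs from identification alone:
# P(different) = P(the two stimuli get different labels)
p_diff = p_da[:-1] * (1 - p_da[1:]) + p_da[1:] * (1 - p_da[:-1])

peak_pair = int(p_diff.argmax())   # discrimination peaks at the phoneme boundary
```

The peak in p_diff falls at the pairs straddling step 4, where identification crosses 50% - the phoneme boundary.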
24
Q

context effects

A
  • speech perception depends on prior knowledge and contexts

‘McGurk effect’: lipreading with different sound - what we hear is changed by what we see

25
Q

McGurk effect

A

‘McGurk effect’: lipreading with a different sound - what we hear is changed by what we see
26
Q

Ganong effect

A
  • continuum paradigm: use the same ambiguous sound, but tell people it ranges from ‘giss’ to ‘kiss’ or from ‘gift’ to ‘kift’
  • the bias between /k/ and /g/ shifts towards whichever one makes a real word
27
Q

motor theory of speech perception (LIBERMAN) component 1

A

speech perception = the result of a specialised speech module that operates separately from the mechanisms involved in perceiving non-speech sounds AND is UNIQUELY HUMAN

EVIDENCE: speech, and not other sounds, is perceived categorically, e.g. yanny OR laurel - not both

has basically been proven wrong
28
Q

motor theory of speech perception (LIBERMAN) component 2

A

the objects of speech perception are intended articulatory events rather than acoustic events

EVIDENCE: speech sounds are highly variable, so we are interpreting gestures rather than sounds

may still be right
29
Q

motor theory fMRI evidence FOR

A

task: listen to meaningless monosyllables

outcome: auditory cortex activated (audio is being processed) BUT motor and premotor areas are also activated, which is evidence that we interpret sound gesturally
30
Q

motor theory TMS evidence FOR

A
  • TMS over premotor areas interferes with phoneme discrimination in noise but not colour discrimination
  • MOTOR AREAS ARE CAUSALLY INVOLVED IN SPEECH PERCEPTION
31
Q

motor theory evidence AGAINST

A
  • categorical perception can also be demonstrated for non-speech sounds (e.g. musical intervals) > so not the result of a specialised speech module
  • with training, chinchillas show the same phoneme boundary for a da/ta continuum as humans > not uniquely human
32
Q

classic model of the brain basis of speech perception

A
  • superior temporal gyrus for speech perception (Wernicke’s area)
  • inferior frontal gyrus for speech production (Broca’s area)
  • left hemisphere dominant
33
Q

more up-to-date model of the brain basis of speech perception: dorsal and ventral streams

A
  • 2 streams for speech processing that are engaged in a task-dependent manner
  • dorsal stream: mapping speech sounds onto articulatory representations - activated for tasks focusing on perception of speech sounds, e.g. phoneme perception
  • ventral stream: mapping speech sounds onto lexical representations - activated for tasks focusing on comprehension, e.g. word recognition
  • can explain why some aphasics can’t tell apart phonemes but can recognise words, and vice versa
34
Q

dorsal stream - brain basis of speech perception

A
  • mapping speech sounds onto articulatory representations
  • activated for tasks focusing on perception of speech sounds, e.g. phoneme perception
  • left hemisphere dominant
  • Broca’s area = involved in perception, NOT JUST PRODUCTION
35
Q

ventral stream - brain basis of speech perception

A
  • mapping speech sounds onto lexical representations
  • activated for tasks focusing on comprehension, e.g. word recognition
  • bilateral - left AND right hemispheres
36
Q

evidence for ventral stream processing

A
  • anterior temporal damage associated with semantic impairment (ventral)
  • inferior temporal damage associated with comprehension deficits (ventral)
37
Q

evidence for dorsal stream processing

A
  • listening to syllables activates motor and premotor areas (dorsal)
  • TMS over premotor areas interferes with phoneme discrimination in noise but not colour discrimination (dorsal)
38
Q

process of recognising spoken words: cohort model

A
  • a set of word representations in your mind of what words should sound like - the lexicon
  • if you hear ‘c’, all words in the lexicon that start with the sound ‘c’ are activated
  • as time goes on and you hear more and more of the word, fewer potential words remain activated until...
  • UNIQUENESS POINT: the time-point in the speech when only one word remains consistent with the speech input > the word is recognised at the UP, even before the whole word has been produced
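The cohort dynamics above can be sketched in a few lines of Python (the mini-lexicon and the word 'captain' are made-up examples):

```python
lexicon = ["captain", "capital", "captive", "cat", "dog"]

def cohort(prefix, lexicon):
    """Words still consistent with the speech input heard so far."""
    return [w for w in lexicon if w.startswith(prefix)]

def uniqueness_point(word, lexicon):
    """Position at which only one lexicon word remains consistent with the input."""
    for i in range(1, len(word) + 1):
        if len(cohort(word[:i], lexicon)) == 1:
            return i
    return len(word)

up = uniqueness_point("captain", lexicon)   # 'capta' rules out capital and captive
```

With this lexicon the cohort shrinks 4 > 4 > 3 > 2 > 1, so 'captain' is recognised two segments before it ends.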
39
Q

uniqueness point

A

the time-point in the speech when only one word remains consistent with the speech input > the word is recognised at the UP, even before the whole word has been produced = optimal efficiency
40
Q

key features of cohort model

A
  • words are activated immediately upon minimal input
  • multiple words are activated
  • words compete for recognition - lexical competition
41
Q

cohort model: evidence from shadowing task

A
  • average response latency was 250ms
  • average duration of words was 375ms
  • > listeners were recognising words at the uniqueness point, before the words ended
42
Q

limitations of cohort model

A
  • verbal model - so hard to evaluate
  • solution: a computer model
43
Q

TRACE computer model of speech perception

A

three levels: words, phonemes, acoustic features
  • connections between levels are bi-directional and excitatory (TOP-DOWN EFFECTS)
  • connections within levels are inhibitory, producing competition between alternatives
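A heavily simplified interactive-activation sketch in Python (the units, weights, and update rule are all made-up; the real TRACE model is much richer). It shows the two connection types from the card: excitatory links between levels and inhibitory links within the word level.

```python
phonemes = ["g", "k", "i", "s", "f", "t"]
words = {"kiss": ["k", "i", "s"], "gift": ["g", "i", "f", "t"]}
p_act = {p: 0.0 for p in phonemes}
w_act = {w: 0.0 for w in words}

# ambiguous first sound: the input supports 'g' and 'k' equally,
# while the rest of the input matches 'kiss'
bottom_up = {"g": 0.5, "k": 0.5, "i": 1.0, "s": 1.0, "f": 0.0, "t": 0.0}

for _ in range(20):
    for w, ps in words.items():                      # phoneme -> word (excitatory)
        w_act[w] += 0.1 * sum(p_act[p] for p in ps)
    for p in phonemes:                               # input + word -> phoneme (top-down)
        top_down = sum(w_act[w] for w, ps in words.items() if p in ps)
        p_act[p] += 0.1 * (bottom_up[p] + top_down)
    old_kiss, old_gift = w_act["kiss"], w_act["gift"]
    w_act["kiss"] -= 0.05 * old_gift                 # within-level inhibition:
    w_act["gift"] -= 0.05 * old_kiss                 # words compete
```

'kiss' out-competes 'gift', and its top-down support pushes the ambiguous first phoneme towards 'k' - a Ganong-style context effect.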
44
Q

TRACE model evidence from eye tracking

A
  • task: take one item and move it to a different location
  • as more of the spoken word for an object is revealed, participants increasingly look at that object, and even at objects with rhyming/similar names
  • the task can be run on the computer model too - similar results mean the model does a good job of capturing the dynamics of human word recognition
45
Q

TRACE and context

A

for the gift/kift and kiss/giss continua: if you hear an ambiguous sound in the context of the rest of a word, you are biased towards the phoneme that makes a real word - that’s why given ‘-ift’, gift is preferred, and given ‘-iss’, kiss is preferred