Speech Perception & Comprehension Flashcards
1
Q
SPEECH = VARIABLE
A
- every word takes dif acoustic shape each time it’s uttered; due to:
1) speaker (vocal tract size/regional accent/socio-economic status)
2) articulation rate (4-5 syllables/sec in sentences)
3) prosody (music of speech ie. rhythm/melody/amplitude)
4) mode (voiced/whispered/creaky)
5) coarticulation (individual phonemes influenced by preceding/upcoming segments ie. regressive/progressive assimilation)
2
Q
VISUALISING SOUND
A
- 2 main ways:
1) WAVEFORM - y-axis represents amplitude (w/ 0 at the horizontal midline); x-axis represents time
2) SPECTROGRAM - derived from Fourier transform; x-axis represents time
- y-axis = frequency
- colour = 3rd dimension representing energy (ie. amplitude) aka. brighter = stronger (see sketch below)
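- a minimal Python sketch (assuming numpy/scipy/matplotlib; the signal is a synthetic tone mix standing in for speech) of how the 2 visualisations are built:

```python
import numpy as np
from scipy.signal import spectrogram
import matplotlib.pyplot as plt

fs = 16000                                 # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)              # 1 second of samples
# toy "speech-like" signal: two tones plus a little noise (illustration only)
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
x += 0.05 * np.random.randn(len(t))

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)

# 1) WAVEFORM: amplitude (y) over time (x), zero at the midline
ax1.plot(t, x, linewidth=0.5)
ax1.set_ylabel("amplitude")

# 2) SPECTROGRAM: short-time Fourier transform -> time (x), frequency (y),
#    colour = energy (brighter = stronger)
f, seg_t, Sxx = spectrogram(x, fs=fs, nperseg=512)
ax2.pcolormesh(seg_t, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
ax2.set_ylabel("frequency (Hz)")
ax2.set_xlabel("time (s)")
plt.show()
```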
3
Q
SPEECH = QUASI-CONTINUOUS
A
- no unique/systematic way to flag word boundaries aka. rarely silence between 2 words
- short silences (~100ms) typically correspond to the vocal tract closing to produce a so-called plosive/STOP consonant (ie. the /k/ closure in “pocket”)
4
Q
SPEECH = LEXICALLY AMBIGUOUS
A
- words = made of limited number of sounds/syllables aka. embedded words = everywhere inside other words
- ie. captain -> cap
- ambiguity also arises from words straddling a word boundary as soon as we put 2 words together
- ie. clean ocean -> notion (see sketch below)
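- a minimal Python sketch of why embedded/straddling words are everywhere; the mini-lexicon & rough phoneme strings are purely hypothetical illustrations:

```python
# hypothetical mini-lexicon, keyed on rough phoneme strings (illustration only)
LEXICON = {
    ("k", "a", "p"): "cap",
    ("k", "a", "p", "t", "i", "n"): "captain",
    ("t", "i", "n"): "tin",
    ("k", "l", "ii", "n"): "clean",
    ("ou", "sh", "a", "n"): "ocean",
    ("n", "ou", "sh", "a", "n"): "notion",
}

def embedded_words(phonemes):
    """Return every lexical item whose phonemes occur contiguously in the input."""
    hits = []
    for i in range(len(phonemes)):
        for j in range(i + 1, len(phonemes) + 1):
            word = LEXICON.get(tuple(phonemes[i:j]))
            if word:
                hits.append((i, word))
    return hits

# "captain": 'cap' & 'tin' sit inside it -> [(0, 'cap'), (0, 'captain'), (3, 'tin')]
print(embedded_words(["k", "a", "p", "t", "i", "n"]))
# "clean ocean": 'notion' straddles the boundary -> [(0, 'clean'), (3, 'notion'), (4, 'ocean')]
print(embedded_words(["k", "l", "ii", "n", "ou", "sh", "a", "n"]))
```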
5
Q
SPEECH = AUDIOVISUAL
A
- visual info given by lips/adjacent facial areas about articulation = integral to speech perception when available
6
Q
MCGURK & MACDONALD’S ILLUSION (1976)
A
- visual signal should be weakly constraining for it to work aka. visual /ga/ = more ambiguous than visual /ba/
- /ga/ = don’t actually see whether speaker is making the back (velar) closure
- so visual cues = also compatible w/ /da/
- visual /ba/ = unambiguous as you see lips closing, preventing illusion from occurring
- visual signal must be compatible w/both back/medial closure of vocal tract (/ga/ VS /da/); conflict w/front closure implied by auditory /ba/ attracts perception towards mid-point between front/back of mouth (/da/)
FUSION - /ga/ (vision) + /ba/ (audition) = /da/ (perception)
7
Q
INFO FOR IDENTIFYING WORDS
A
PHONEMES
SUPRA-PHONEMIC INFO
8
Q
PHONEMES
A
- building blocks of vocab
- smallest units in signal allowing meaning distinction (ie. bat/mat have 3 phonemes & differ by 1st one)
- limited number so words are created by combining them in unlimited ways specific to language
- English = 20 vowels & 24 consonants
9
Q
SUPRA-PHONEMIC INFO
A
- prosody/music of speech (ie. rhythm/melody/energy) ie:
1) lexical stress/accentuation (ADmiral/admiRAtion)
2) tones (same string of phonemes can have dif meanings depending on pitch contour in some languages ie. ma in Mandarin (horse/mother/scold))
10
Q
SUPRA-PHONEMIC INFO: DAHAN ET AL. (2001)
A
- carried by larger chunks than phonemes ie. syllables
- languages vary in terms of importance of supra-phonemic info for recognising words (ie. French < English < Mandarin)
- phonemic/prosodic info is needed for lexical distinctions BUT word recognition = also sensitive to subtle articulatory details ie. co-articulation cues
- the way in which vowel is pronounced/sounds depends on identity of following consonant
11
Q
SPEECH = MENTAL CATEGORIES
A
- when presented w/exemplars along continuum of syllables between 2 end-points (ie. gi-ki) we perceive a whole section of the continuum as 1 category (ie. gi) while the rest is perceived as a separate category (ie. ki), despite physical changes within each category
- aka. step-like shift indicating category boundary at some point in continuum
- we experience stimulus as either 1 or other BUT not as in-between aka. categorical perception
- most obvious in consonants (ie. rapid acoustic changes), less so in vowels/tonal info (steadier/continuous) (see sketch below)
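- a minimal Python sketch contrasting linear VS categorical (step-like) identification along a hypothetical /gi/-/ki/ VOT continuum; boundary location & slope are assumed values, not real data:

```python
import numpy as np

vot = np.linspace(0, 60, 7)          # hypothetical /gi/-/ki/ continuum, VOT in ms
boundary, slope = 30.0, 0.5          # assumed boundary location / steepness (illustrative)

linear = vot / vot.max()             # if perception tracked the physical change proportionally
categorical = 1 / (1 + np.exp(-slope * (vot - boundary)))   # step-like logistic curve instead

for v, lin, cat in zip(vot, linear, categorical):
    print(f"VOT {v:4.0f} ms   linear P(/ki/)={lin:.2f}   categorical P(/ki/)={cat:.2f}")
```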
12
Q
CATEGORICAL PERCEPTION IN DISCRIMINATION TASKS
A
- can also occur in discrimination tasks
- hearing dif between 2 adjacent exemplars in continuum is maximal at category boundary (ie. across categories) BUT at chance within category
- category boundary lies at roughly the same location on the continuum for all speakers of a given language
13
Q
CATEGORICAL PERCEPTION IN CONSONANT CONTRASTS
A
- cannot be easily demonstrated on all contrasts as you need to identify key parameters involved in contrast & latter must be easily manipulated
- ie. voicing distinction (pa/ba; ga/ka) = regulated by 1 acoustical parameter aka. Voice Onset Time (VOT) corresponding to the noisy segment from the consonant’s release burst up to the start of periodicity (voicing) in the vowel
- aka. voiced consonants (b/d/g) = shorter VOT than voiceless counterparts (p/t/k) in English
14
Q
VOICE ONSET TIME (VOT)
A
- can be manipulated to create continuum from voiced consonant to voiceless counterpart (ie. gi VS ki) & see if perception follows progression along continuum linearly VS showing mental categories
- pps asked if 2 stimuli adjacent on continuum = same/dif acoustically -> maximal discrimination occurs at perceptual boundary & would be at chance for all other adjacent comparisons (see sketch below)
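- a minimal Python sketch of a simplified labelling-only prediction (illustrative assumptions, not the exact published model): adjacent pairs are told apart only when the 2 steps receive dif labels, so accuracy peaks at the boundary & sits near chance within a category:

```python
import numpy as np

vot = np.arange(0, 70, 10, dtype=float)               # adjacent continuum steps, 10 ms apart
boundary, slope = 30.0, 0.5                           # assumed boundary / steepness (illustrative)
p_ki = 1 / (1 + np.exp(-slope * (vot - boundary)))    # step-like identification curve

# labelling-only idea: respond "different" when the 2 steps get dif labels, guess otherwise
for v1, v2, p1, p2 in zip(vot[:-1], vot[1:], p_ki[:-1], p_ki[1:]):
    p_diff_label = p1 * (1 - p2) + p2 * (1 - p1)          # P(the 2 steps are labelled differently)
    p_correct = p_diff_label + 0.5 * (1 - p_diff_label)   # guessing at chance when labels match
    print(f"{v1:2.0f} vs {v2:2.0f} ms -> predicted discrimination {p_correct:.2f}")
```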
15
Q
WERKER & TEES (1984)
A
- examined ability of English infants to discriminate non-native (ie. Hindi/Salish) contrasts during 1st year of life
- cross-sectional/longitudinal approaches using conditioned head-turn paradigm
- newborns come to life equipped to deal w/any possible phonetic contrast
- ability to discriminate non-native contrasts declines w/exposure to the native language BUT native contrasts = maintained
- aka. infants transform language-general phonetic skills -> language-specific phonological abilities via “winnowing” (aka. narrowing down) initial set of “innate” discrimination abilities