Problem 9 - DONE Flashcards
speech perception
how do we produce sounds?
- lungs: respiration
- vocal tract: articulation
- -> oral tract + nasal tract
- vocal folds: phonation/voicing/fundamental frequency of speech
- -> located in the larynx
speech sounds
= produced by position/movement of structures within vocal apparatus which produce acoustic signal (= pressure changes in air)
- air is pushed up from lungs
- past vocal cords
- into vocal tract
- shape of vocal tract: altered by moving articulators (= structures such as tongue, lips, teeth, jaw)
- -> articulation
production of vowels
- by vibration of vocal cords
- -> specific sounds of vowels: by changing overall shape of vocal tract
1. change in shape
2. changes resonant frequency of vocal tract
3. produces formants
- formants = frequencies at which peaks of acoustic energy occur (first formant F1: lowest frequency; second formant F2: the next highest, …)
- -> each vowel sound: characteristic series of formants
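As a rough illustration of how vocal-tract shape sets the formants (not from the flashcards, just a standard first approximation): treat the tract as a uniform tube closed at the vocal folds and open at the lips, so its resonant frequencies follow F_n = (2n − 1)·c / 4L.

```python
# Rough sketch (illustrative assumption, not from the flashcards): the vocal tract
# modelled as a uniform tube, closed at the vocal folds and open at the lips.
# Its resonances approximate the formants of a neutral vowel (schwa).

SPEED_OF_SOUND = 343.0  # m/s, in air at ~20 °C

def tube_formants(tract_length_m: float, n_formants: int = 3) -> list[float]:
    """Resonant frequencies of a quarter-wavelength tube: F_n = (2n - 1) * c / (4L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4 * tract_length_m)
            for n in range(1, n_formants + 1)]

# An adult vocal tract is roughly 0.17 m long -> resonances near 500, 1500, 2500 Hz.
# Changing the tract's shape/length (moving the articulators) shifts these
# resonances, which changes the formant pattern and hence the vowel we hear.
print(tube_formants(0.17))   # approx. [504, 1513, 2522]
print(tube_formants(0.15))   # shorter tract -> higher formants
```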
sound spectrogram
= three-dimensional display;
- -> horizontal axis: time
- -> vertical axis: frequency
- -> colour (or darkness, in a grey-scale plot): amplitude (intensity); redder/darker = greater intensity
- -> indicates pattern of frequencies and intensities over time that make up the acoustic signal
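A minimal sketch of how such a display can be computed, assuming NumPy, SciPy and Matplotlib are available; the test signal is invented, real speech would be loaded from an audio file.

```python
# Minimal spectrogram sketch: time on the horizontal axis, frequency on the
# vertical axis, colour = intensity.
import numpy as np
from scipy.signal import spectrogram
import matplotlib.pyplot as plt

fs = 16_000                                   # sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)                 # 1 second of a made-up test signal
signal = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 2300 * t)

# f: frequency axis, times: time axis, Sxx: intensity at each (frequency, time) point
f, times, Sxx = spectrogram(signal, fs=fs, nperseg=512)

plt.pcolormesh(times, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.colorbar(label="Intensity (dB)")
plt.show()
```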
production of consonants
- by constriction/closing of the vocal tract
- -> movements of articulators: create patterns of energy in acoustic signal
- formant transitions = rapid shifts in frequency preceding/following formants
- -> associated with consonants
phonation
= vocal folds are made to vibrate when air pushes out of lungs
articulation
= shape of vocal tract is altered by moving articulators
resonant frequency/characteristics
= changing size and shape of space through which sound passes increases/decreases energy at different frequencies
frequency spectra
= plot of amplitude against frequency; used to represent sounds that do not vary over time
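A minimal sketch of a frequency spectrum for a steady tone, assuming NumPy (illustrative only): amplitude at each frequency, with no time axis.

```python
# Frequency spectrum of a time-invariant sound: the whole signal collapses into
# one amplitude-versus-frequency plot.
import numpy as np

fs = 8_000
t = np.arange(0, 1.0, 1 / fs)
tone = np.sin(2 * np.pi * 440 * t)            # a steady 440 Hz tone

spectrum = np.abs(np.fft.rfft(tone)) / len(tone)
freqs = np.fft.rfftfreq(len(tone), d=1 / fs)

print(freqs[np.argmax(spectrum)])             # ~440.0: the peak sits at the tone's frequency
```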
phoneme
= shortest segment of speech
- change phoneme: change meaning of word
- number of phonemes: varies across languages
- number of vowel phonemes: greater than the number of vowel letters, because the same letter can be pronounced in different ways
- phonemes form –> syllables –> form words
variability problem
= variable relationship between acoustic signal and sound we hear
- -> a particular phoneme can be associated with a number of different acoustic signals
- variability from context
- variability from different speakers
variability from context
- -> context in which phoneme occurs influences acoustic signal
- coarticulation = overlap between articulation of neighbouring phonemes
- -> perceptual constancy: we perceive sound of phoneme as same even though acoustic signal is changed by coarticulation
variability from different speakers
- -> different speakers pronounce in different ways
- individual differences: pitch of voice, pace of speaking
- sloppy pronunciation: speakers often do not articulate each word individually
solutions for variability problem
- categorical perception
- information from facial expression
- information from our knowledge of language
categorical perception
= occurs when stimuli that exist along a continuum are perceived as divided into discrete categories
- (vision analogue: wavelengths along the visible spectrum are perceived as about five colour categories)
- speech: continuum = voice onset time (VOT)
–> voice onset time (VOT) = time delay between when a sound begins and when the vocal cords begin vibrating
–> phonetic boundary = VOT when the perception changes from one category to the next
=> even though the VOT is changed continuously, the listener perceives only two categories: /da/ on one side of the phonetic boundary and /ta/ on the other
–> perceptual constancy: all stimuli on same side of the phonetic boundary are perceived as the same category
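A toy sketch of the idea; the 35 ms boundary value used here is an illustrative assumption, not a figure taken from these notes.

```python
# Toy illustration of categorical perception along the VOT continuum.
PHONETIC_BOUNDARY_MS = 35.0   # assumed boundary value, for illustration only

def perceived_category(vot_ms: float) -> str:
    """All VOTs on the same side of the boundary are heard as the same phoneme."""
    return "/da/" if vot_ms < PHONETIC_BOUNDARY_MS else "/ta/"

# The physical stimulus changes continuously, but the percept jumps in one step:
for vot in range(0, 81, 10):
    print(f"VOT = {vot:2d} ms  ->  heard as {perceived_category(vot)}")
```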
information provided by the face
- speech perception = multimodal: can be influenced by number of different senses
- audiovisual speech perception = influence of vision on speech perception
=> auditory information: major source of information; visual information: strong influence on what we hear
McGurk effect
–> listener hears the sound /ba-ba/ from the audio; when visual stimulation showing a person making the lip movements for /ga-ga/ is added, the listener begins hearing the sound /da-da/
information from our knowledge of language
- -> easier to perceive phonemes that appear in meaningful context
- phonemic restoration effect = sounds missing from speech can be restored by the brain and appear to be heard
- -> speech perception can be determined by:
- bottom-up processing: nature of acoustic signal
- top-down processing: context that produces expectations in listener
- -> can be influenced by meaning of words following missing phoneme
perceiving words in incomplete sentences
- -> words/sentences can be read even if incomplete
- familiarity with the language in which the sentence is presented
- -> knowledge of rules (grammar)
- meaningfulness makes it easier to perceive spoken words
speech segmentation
= perception of individual words in a conversation
- not only based on energy stimulating the receptors –> knowledge of meanings of words + use of context in which these words occur
transitional probabilities
= ways sounds follow one another in a language; chances that one sound will follow another sound
- -> certain sounds are more likely to follow one another within a word, while other sounds are more likely to be separated by the boundary between two words
- -> statistical learning = process of learning about transitional probabilities and about other characteristics of language
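A small sketch of what computing transitional probabilities could look like; the syllable stream and the "words" it is built from are invented here for illustration.

```python
# Transitional probability P(next syllable | current syllable) estimated from a
# continuous syllable stream with no pauses between words.
from collections import Counter, defaultdict

# Stream built from two made-up words, "bidaku" and "golatu":
stream = ["bi", "da", "ku", "go", "la", "tu", "bi", "da", "ku", "bi", "da", "ku",
          "go", "la", "tu", "go", "la", "tu", "bi", "da", "ku", "go", "la", "tu"]

pair_counts = defaultdict(Counter)
for current, nxt in zip(stream, stream[1:]):
    pair_counts[current][nxt] += 1

def transitional_probability(current: str, nxt: str) -> float:
    total = sum(pair_counts[current].values())
    return pair_counts[current][nxt] / total if total else 0.0

# High probability within a word, lower probability across a word boundary:
print(transitional_probability("bi", "da"))   # 1.0: "bi" is always followed by "da"
print(transitional_probability("ku", "go"))   # < 1: "ku" ends a word, several syllables can follow
```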
characteristics of speaker
indexical characteristics = characteristics that carry information about speaker (age, gender, place of origin, emotional state, whether they are being sarcastic or serious)
- speaker’s tone of voice
- speaker’s identity
speech perception
depends on:
- bottom-up information: acoustic signal
- top-down information: meanings of words/sentences, listener’s knowledge of the rules of grammar, information listener has about characteristics of speaker’s voice
cortical locations of speech perception
- Broca’s area: in the frontal lobe
- Wernicke’s area: in the temporal lobe
- voice area: superior temporal sulcus (STS)
- -> activated by human voices
- -> voice cells: neurons in the temporal lobe that respond to voice sounds
brain damage and speech perception
- aphasia = language problem caused by damage to specific areas in the brain
- -> symptoms: depending on area damaged + extent of damage
(1) Broca’s aphasia = laboured and stilted speech, can only speak in short sentences; capable of comprehending what others are saying
(2) Wernicke’s aphasia = speak fluently, but what they say is extremely disorganised and not meaningful; great difficulty understanding what other people are saying
- -> most extreme form: word deafness = cannot recognise words, even though the ability to hear pure tones remains intact
(3) damage in parietal lobe: difficulty discriminating between syllables
dual-stream model of speech perception
- ventral (what) pathway:
- -> identifying sounds/recognising speech
- -> starting in temporal lobe
- dorsal (where) pathway:
- -> locating sounds/linking the acoustic signal to the movements used to produce speech
- -> starting in parietal lobe