Chapter 12: Perceiving Speech And Music Flashcards
Speech Perception
- Deals with how language sounds are perceived
- Involves relationship between perception and production
Phonemes
- smallest unit of sound that can change meaning of word
- sounds we can pronounce
Morphemes
Smallest unit of language that carries meaning in a word
International Phonetic Alphabet (IPA)
- alphabet in which each symbol stands for different speech sounds
- provides distinctive way to write each phoneme in all human languages currently in use
Producing the sounds of speech
- speech starts in the brain
- after a speaker determines what to say, the other parts of the sound production system come into play
Difference in fundamental frequency
Male (85 Hz)
Female (150-200 Hz)
Children (300+ Hz)
Vocal Folds
Aka vocal cords
- pair of membranes within larynx
Larynx
Aka voice box
- part of vocal tract that contains vocal folds
Pharynx
Uppermost part of throat
Uvula
Flap of tissue that hangs off posterior edge of soft palate
- can close off nasal cavity
Speech Production System
Influenced by contraction and relaxation of throat muscles and tongue activity
Vowels
Produced with relatively unrestricted flow of air through pharynx and oral cavity
- uninterrupted, unrestricted flow
Formants
Frequency bands with relatively high amplitude in harmonic spectrum of vowel sound
Consonants
Produced by restricting flow of air at one place or another along path of airflow from vocal folds
Place of Articulation
In production of consonants, points in vocal tract at which airflow is restricted, described in terms of anatomical structures involved in creating restriction
- closing of lip
- top teeth and bottom lip
- tongue behind upper teeth
Manner of articulation
Nature of restriction of airflow in vocal tract
- whether air is fully stopped or just restricted
Voicing
Specifies whether vocal folds are vibrating or not (whether consonant is voiced or voiceless)
- whether vocal folds vibrate or not
Vowel Sounds: Production and Frequency Spectrum
Each vowel sound has its own pattern of formants
Formants = harmonics with increased amplitude for a specific sound
Sound Spectrogram
Graph that includes dimensions of frequency, amplitude, and time, showing how frequencies corresponding to each sound in utterance change over time
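A rough sketch of how a spectrogram can be computed, assuming a toy signal (a 500 Hz tone followed by a 1000 Hz tone) rather than real speech. The sample rate, frame length, and function names are illustrative choices, not from the chapter. Each frame of the signal gets its own magnitude spectrum, so the result has the three dimensions named above: time (frame), frequency (bin), and amplitude (value).

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (assumed for this toy example)
FRAME = 80           # samples per frame, giving 100 Hz per frequency bin


def magnitude_spectrum(frame):
    """Amplitude at each frequency bin for one frame (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len=FRAME):
    """List of spectra, one per non-overlapping frame (time x frequency)."""
    return [
        magnitude_spectrum(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


# Toy "utterance": 500 Hz for the first half, 1000 Hz for the second half.
signal = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(400)]
signal += [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(400)]

spec = spectrogram(signal)
bin_hz = SAMPLE_RATE / FRAME
# Strongest frequency in each frame, showing the change over time.
peaks_hz = [row.index(max(row)) * bin_hz for row in spec]
```

Real spectrograms usually use overlapping, windowed frames and a fast Fourier transform; the naive version here only keeps the idea visible.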
Phonemes cannot be identified by mapping specific frequencies to specific phonemes
Speech sounds vary, even for the same speaker, for a variety of reasons
- sloppy enunciation
- speaking with mouth full
- coarticulation
Coarticulation
Influence of one phoneme on acoustic properties of another due to articulatory movements required to produce them in sequence
Variability in the Acoustics of Phonemes
Ex. The difference between “resisting arrest” and “resisting a rest”
Categorical perception of phonemes
- refers to the perception of different sensory stimuli as identical, up to a point at which further variation in the stimulus leads to a sharp change in the perception
- means that a change in some variable along a continuum is perceived not as gradual but as an instance of discrete categories
- opposite of continuous perception (no sharp changes in perception)
Categorical perception and voice onset time (VOT)
Categorical perception happens when the categories an observer possesses influence that observer’s perception
Ex Ba vs Da
Voice Onset Time (VOT)
In production of stop consonants, interval between initial burst of frequencies and onset of voicing
*varies due to consonant place and manner of articulation
Phonemic Boundary
VOT at which stop consonant transitions from being perceived as voiced to being perceived as voiceless
*when sound started vs when voicing started
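A minimal sketch of what a phonemic boundary looks like numerically, assuming a logistic identification curve. The 30 ms boundary and the steepness value are hypothetical illustration values, not figures from the chapter. Equal steps in VOT barely change the response when both sounds fall on the same side of the boundary, but change it sharply when the step crosses the boundary.

```python
import math


def p_voiceless(vot_ms, boundary_ms=30.0, steepness=1.0):
    """Probability of hearing 'voiceless' at a given VOT (assumed logistic curve)."""
    return 1.0 / (1.0 + math.exp(-steepness * (vot_ms - boundary_ms)))


def label(vot_ms, boundary_ms=30.0):
    """Category a listener would report under this toy model."""
    return "voiceless" if vot_ms >= boundary_ms else "voiced"


# Same 10 ms step in VOT, very different perceptual consequences.
within_category = p_voiceless(20) - p_voiceless(10)   # both heard as voiced
across_boundary = p_voiceless(35) - p_voiceless(25)   # crosses the boundary
```

Under continuous perception the two differences would be about the same size; here the second is far larger than the first.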
McGurk effect
In perception of speech sounds, when auditory and visual stimuli conflict, the auditory system tends to compromise on a perception that shares features with both seen and heard stimuli
Knowledge takes 3 forms
- Knowledge of the grammatical rules of the language and the context in which an utterance is produced
- Knowledge about the probability of various sequences of phonemes within words or across words in the language they’re hearing
- Knowledge of the specific words that are expected in a particular situation
Grammatical speech is easier to perceive than ungrammatical speech
- grammatical > anomalous > ungrammatical
- ungrammatical speech is characteristic of damage to Wernicke’s area
Word Segmentation
- listeners generally perceive clear separations between words (segmentation)
- in reality, talking involves creating a continuous, connected stream of sounds, except when pausing
Word Segmentation: A different type of perceptual challenge relates to the indistinct boundaries between words in the sound stream of normal speech
Infant Learning of Transition Probabilities
Infants can predict what words come next
Phoneme Transition Probabilities
For any particular sequence of phonemes, the chances that the sequence occurs at start of a word, at end of a word, or across boundary between two words
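A small sketch of how transition probabilities can be estimated from a stream of speech, assuming syllables stand in for phonemes and using a made-up stream with no pauses. The function name and the stream are illustrative only. Sequences inside a word ("pre" then "tty") tend to have a higher probability than sequences that span a word boundary ("tty" then "ba").

```python
from collections import Counter


def transition_probabilities(units):
    """P(next unit | current unit), estimated from adjacent pairs in the stream."""
    pair_counts = Counter(zip(units, units[1:]))
    first_counts = Counter(units[:-1])
    return {pair: count / first_counts[pair[0]] for pair, count in pair_counts.items()}


# Continuous stream: "pretty baby pretty doggy pretty kitty" with no word breaks.
stream = "pre tty ba by pre tty do ggy pre tty ki tty".split()
tp = transition_probabilities(stream)

within_word = tp[("pre", "tty")]     # "pre" is always followed by "tty"
across_words = tp[("tty", "ba")]     # "tty" is followed by several different units
```

With such a tiny stream some cross-boundary pairs also come out at 1.0; a longer and more varied stream is needed before the estimates separate words reliably.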
Phonemic Restoration Effect
Kind of perceptual completion in which listeners seem to perceive obscured or missing speech sounds
Results of Shahin and Miller’s (2009) study reinforce the conclusion that knowledge is important in phonemic restoration
*knowledge of the mouth movements associated with specific words and their phonemes
Aphasia
Impairment in speech production, comprehension, or both, caused by damage to speech centers in brain
Broca’s Aphasia: speech production
Wernicke’s aphasia: speech comprehension
Conduction aphasia: damage to arcuate fasciculus
Speech Pathways in Brain
Ventral Pathway: meaning and combo of words (“what”)
Dorsal Pathway: production of speech using motor system (“where/how”)
Music: No other creature seems to have the ability to compose music other than humans
- music has the ability to evoke emotional responses from humans
- understanding of music requires an appreciation of pitch, loudness, timing, and timbre combinations that composers can use to create a musical experience
Pitch
Perceptual basis of organization with notes separated by proportionally equivalent intervals
- notes separated by an octave are perceptually more similar than notes separated by some other intervals
- semitone intervals are perceptually equivalent to one another
Octave
Sequence of notes in which fundamental frequency of last note is double the fundamental frequency of first note
Ex. A3 = 220 Hz, A4 = 440 Hz
Semitones
12 proportionally equivalent intervals between notes in octave
- the interval between A1 and A1# is perceived as equivalent to the interval between A8 and A8#
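A short sketch of the arithmetic behind octaves and semitones, assuming equal-tempered tuning (each semitone multiplies frequency by the twelfth root of 2). The function name and the example frequencies are illustrative. Twelve semitone steps double the frequency, and the ratio for one semitone is the same for a low note as for a high note, which is what "proportionally equivalent" means.

```python
def semitone_up(freq_hz, n=1):
    """Frequency n equal-tempered semitones above freq_hz."""
    return freq_hz * 2 ** (n / 12)


# Twelve semitones above 220 Hz is one octave up: double the frequency.
octave_above_220 = semitone_up(220.0, 12)

# One semitone is the same ratio whether the starting note is low or high,
# even though the difference in Hz is very different.
ratio_low = semitone_up(55.0) / 55.0
ratio_high = semitone_up(3520.0) / 3520.0
step_low_hz = semitone_up(55.0) - 55.0
step_high_hz = semitone_up(3520.0) - 3520.0
```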
C3 and C4 are perceptually more similar than B4 and C4
Pitch helix illustrates similarity among pitches geometrically
- Tone chroma
- Tone height
- distance between successive notes along helix is constant, reflecting perception of constant difference in pitch
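One way to picture the pitch helix, sketched under the assumption that tone chroma maps to the angle around the helix and tone height maps to the vertical position. The radius, the rise per octave, and the function name are arbitrary illustration choices. Notes an octave apart sit directly above one another, and every semitone step covers the same distance along the helix.

```python
import math


def helix_point(semitones_from_ref, radius=1.0, rise_per_octave=1.0):
    """(x, y, z) position of a note on the pitch helix."""
    angle = 2 * math.pi * (semitones_from_ref % 12) / 12   # tone chroma
    height = rise_per_octave * semitones_from_ref / 12      # tone height
    return (radius * math.cos(angle), radius * math.sin(angle), height)


start = helix_point(0)
octave_up = helix_point(12)   # same chroma, one turn higher

# Straight-line distance between each pair of successive notes over two octaves.
step_lengths = [math.dist(helix_point(n), helix_point(n + 1)) for n in range(24)]
```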
Tone Chroma
Difference in pitch within octave
Tone Height
Octave in which pitch appears
Dynamics
Manner in which loudness varies as a piece of music progresses
Rhythm
Temporal patterning of events in a musical composition
- Tempo
- Beat
- Meter
Tempo
How fast or slow overall piece is
Beat
Equally spaced pulses that can express fast or slow tempo
Meter
Temporal patterning of strong and weak pulses in beat over time
Dimensions of Music: Timbre
Attack and decay: ways in which harmonic components begin and then fade away
Melody
Sequence of musical notes arranged in particular rhythmic pattern, which listeners perceive as single, recognizable unit
Melodies can be recognized even when the notes are transposed to a different musical key
- infants recognize melody at 6 months
Transposition
Two versions of same melody, containing same intervals but starting at different notes
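A tiny sketch of transposition, assuming notes are written as semitone numbers (the MIDI convention, where 60 is middle C); the melody itself is made up. Shifting every note by the same number of semitones changes the starting note but leaves the sequence of intervals untouched, which is why the melody stays recognizable.

```python
def transpose(notes, semitones):
    """Shift every note by the same number of semitones."""
    return [note + semitones for note in notes]


def intervals(notes):
    """Semitone distance between each pair of successive notes."""
    return [later - earlier for earlier, later in zip(notes, notes[1:])]


melody = [60, 62, 64, 60]        # hypothetical four-note melody
shifted = transpose(melody, 5)   # same melody, starting five semitones higher
```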
Scales
Particular subset of notes in octave
- major and minor scales
Key
Scale that functions as basis of musical compositions
Harmony
Consonance and Dissonance
- some combinations of notes are consonant while others are dissonant
- this is due to the harmonicity or lack thereof in the harmonics of the combined tones
Consonance
Quality exhibited by combo of 2+ notes from scale that sounds pleasant
Dissonance
Quality exhibited by combo of 2+ notes from scale that sounds unpleasant
Harmonicity
Extent to which harmonics of notes played in combo coincide with harmonics of notes with lower fundamental frequency
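A crude index of harmonicity, offered as a sketch rather than the chapter's own measure: the fraction of the upper note's harmonics that land exactly on harmonics of the lower note. It assumes whole-number frequencies in simple ratios (200 Hz with 400, 300, and 225 Hz), and the function name and the choice of 12 harmonics are arbitrary.

```python
def shared_fraction(lower_hz, upper_hz, count=12):
    """Fraction of the upper note's first `count` harmonics that coincide
    with a harmonic of the lower note (integer frequencies assumed)."""
    upper_harmonics = [upper_hz * k for k in range(1, count + 1)]
    coinciding = [h for h in upper_harmonics if h % lower_hz == 0]
    return len(coinciding) / count


octave = shared_fraction(200, 400)        # 2:1 ratio, every harmonic coincides
fifth = shared_fraction(200, 300)         # 3:2 ratio, every second harmonic coincides
major_second = shared_fraction(200, 225)  # 9:8 ratio, very few coincide
```

On this index the octave scores highest and the 9:8 combination lowest, matching the general ordering from more consonant to more dissonant.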
Gestalt Principles of Melody
- Proximity
- Similarity
- Closure
Neural Basis of Music Perception
Once neural information leaves the primary auditory cortex, the brain has areas that are more active when processing certain types of sounds
Fixed-pitched sequence vs silence
More active left and right auditory cortex
Changing-pitch sequence vs fixed-pitched sequence
Only right auditory cortex responds
Color Music Synesthesia
Perceiving color whenever music is played
*each note has its own color
Mirror Neurons in Music
When non-musicians listened to songs they learned on piano, brain areas for music perception and finger movements were activated
Absolute Pitch
Ability to listen to isolated notes and name them accurately and effortlessly
Antonia
Can’t match or identify pitch
Amusia
Profound impairment in perceiving and remembering melodies and in distinguishing one melody from another
- congenital or developed after brain damage
- 4% of population
- can sound like pots and pans
Musical training and experience
Solidifies musical areas in brain
- magnitude of brain activity is higher in experienced musicians than in nonmusicians
- both groups showed equivalent patterns of activity
Language, Culture, and Music
- Music and language: some languages are more lyrical than others
- Learning and culture affect music perception
Musical Illusions
Shepard Tones
Octave Illusion
Tritone Paradox
Shepard Tones
Layered tones separated by one octave
- top line gets quieter, middle line stays the same, and bottom line gets louder
- pitch seems to keep increasing
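A sketch of how the components of a Shepard tone could be laid out, assuming six octave-spaced components and a raised-cosine loudness envelope; the base frequency, component count, and function name are illustrative. As the tone steps upward, the highest component fades out while the lowest fades in, and after 12 semitone steps the components are back where they started, so the pitch can seem to rise without end.

```python
import math


def shepard_components(step, n_octaves=6, base_hz=27.5):
    """(frequency, amplitude) pairs for one Shepard tone, `step` semitones up."""
    components = []
    for k in range(n_octaves):
        # Position of this component within the stack, from 0 (bottom) to 1 (top).
        position = (k + (step % 12) / 12) / n_octaves
        freq = base_hz * 2 ** (position * n_octaves)
        # Quiet at both ends of the stack, loudest in the middle.
        amp = 0.5 * (1 - math.cos(2 * math.pi * position))
        components.append((freq, amp))
    return components


first = shepard_components(0)
almost_octave = shepard_components(11)
full_octave = shepard_components(12)   # identical to the starting tone
```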
Octave Illusion
2 notes that are one octave apart are played alternately
Tritone paradox
Sequentially paired Shepard tones that some people perceive as ascending and others perceive as descending
Automatic Speech Recognition
Accurate perception of human speech by machines
Speech Perception by Machines
- Many modern devices incorporate automatic speech recognition (ASR) to allow user control via spoken commands and requests
- The first step in the process of ASR is to convert the waveform of the utterance into a set of feature vectors that capture the spectral information in the speech
- Then the acoustic parameters that characterize the speech sounds are computed
- Finally, linguistic and lexical information is used to help guide the search for the most probable word sequence corresponding to the acoustic parameters (hypothesis search)
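A toy sketch of the final hypothesis-search step, assuming the earlier steps have already produced an acoustic score for each candidate word sequence. All probabilities and names below are invented for illustration. The acoustically better match can lose once linguistic and lexical knowledge about likely word sequences is added in.

```python
import math

# Hypothetical scores: how well each candidate matches the acoustic parameters.
acoustic_logp = {
    "recognize speech": math.log(0.40),
    "wreck a nice beach": math.log(0.45),
}

# Hypothetical scores: how probable each word sequence is in the language.
language_logp = {
    "recognize speech": math.log(0.010),
    "wreck a nice beach": math.log(0.0001),
}


def best_hypothesis(candidates):
    """Pick the word sequence with the highest combined log probability."""
    return max(candidates, key=lambda words: acoustic_logp[words] + language_logp[words])


acoustic_only = max(acoustic_logp, key=acoustic_logp.get)
combined = best_hypothesis(list(acoustic_logp))
```

Real recognizers search over a very large space of candidates instead of a two-entry table, but the scoring idea is the same.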