Week 8 - Audition and Music Flashcards
Recap; Soundwaves
= Alternating periods of acoustic energy that reach the ear and cause vibration of the tympanic membrane
Vibration along the pathway of soundwaves
Soundwaves cause vibration of the tympanic membrane, which vibrates the ossicles and sends vibrations through the cochlea.
This causes hair cell activity at frequency-specific locations of the basilar membrane. This triggers electrical signals that travel up through the ascending pathway to the cortex
Where and What processing streams in the auditory system
Are for localisation and identification of sound
This is maintained all the way up to the cortex and within the cortex (A1, belt and parabelt regions that form the auditory cortex)
Hearing and Hearing Loss According to Psychophysics; an intro to methods and frequency-loudness interactions
There are specific methods that deal with the potential experimental confounds of frequency-loudness (SPL) interactions
– eg. the auditory threshold for a 100 Hz tone may be 50 dB, but the auditory threshold for a 1000 Hz tone may be 5 dB
Audibility Functions
Is a graphical representation of audibility
In this, everything below the line is inaudible (refer to graph in notes)
This line is a curve, showing that the absolute threshold of sound (0 dB) isn’t that absolute. Low frequencies require a greater intensity (dB SPL) to be heard.
As such, we need careful experimental design in hearing tests, and bass boost in stereos to hear low tones
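A minimal sketch (in Python) of how an audibility function could be applied, using a handful of illustrative threshold points loosely based on the 100 Hz ≈ 50 dB / 1000 Hz ≈ 5 dB example above. The frequency points, threshold values and interpolation are assumptions for illustration, not the standardised ISO audibility curve:

```python
# Illustrative audibility check: a tone is audible only if its level exceeds
# the (interpolated) threshold curve at that frequency.
import numpy as np

freqs_hz = np.array([100, 250, 500, 1000, 4000, 8000])  # assumed sample points
thresh_db = np.array([50, 30, 15, 5, 0, 15])             # illustrative thresholds (dB SPL)

def is_audible(freq_hz: float, level_db_spl: float) -> bool:
    """Interpolate the threshold curve in log-frequency and compare."""
    threshold = np.interp(np.log10(freq_hz), np.log10(freqs_hz), thresh_db)
    return level_db_spl >= threshold

print(is_audible(100, 40))   # False: low frequencies need more intensity
print(is_audible(1000, 40))  # True: mid frequencies are heard at much lower SPL
```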
Hearing Loss
Is the most common sensory disability
Affects 5% of the world's population and is estimated to reach 2.5 billion people by 2050 due to our noisier lifestyles and aging population
There is a range of severities of hearing loss depending on your absolute threshold for sound
0-20 dB threshold = normal/healthy
20-40 dB = mild loss
40-60 dB = moderate loss
60-80 dB = profound loss
>82 dB threshold for sound = clinically deaf
Note: diagnosis is based on thresholds for sounds between 125 and 8000 Hz; sensitivity in other frequency ranges may differ
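A minimal sketch of how these severity bands could be applied in code. The bands are taken from the notes above; treating the unspecified 80-82 dB gap as profound is an assumption:

```python
# Classify hearing-loss severity from an absolute threshold (dB),
# using the bands given in these notes.
def classify_hearing(threshold_db: float) -> str:
    if threshold_db <= 20:
        return "normal/healthy"
    elif threshold_db <= 40:
        return "mild loss"
    elif threshold_db <= 60:
        return "moderate loss"
    elif threshold_db <= 82:
        return "profound loss"   # notes leave 80-82 dB unspecified; treated as profound here
    return "clinically deaf"

print(classify_hearing(15))  # normal/healthy
print(classify_hearing(70))  # profound loss
```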
Earplugs
Reduce SPL by about 35 dB
But be aware of sensory sensitisation - thresholds can shift lower to compensate for the earplugs
6 Types of Hearing loss (list)
- Conductive
- Sensorineural
- Mixed
- Age-related (Presbycusis)
- Exposure
- Sociocusis
Conductive Hearing Loss
= an impairment in the outer/middle ear's ability to transmit/amplify sound, which results in reduced sensitivity to all frequencies
Could be the result of ear wax, infection, tympanic membrane perforation/rupture, Eustachian tube dysfunction, otosclerosis of the ossicles, etc.
Is a mechanical issue and is broad spectrum, as it affects all hearing frequencies
Sensorineural Hearing Loss
In this, the outer and middle ear have no impairments; instead the inner ear and the signal transduction/transmission (to the brain) systems are impaired. This causes a reduction in sensitivity to specific sound frequencies, or a total loss
Can be broad spectrum or specific to certain frequencies
Mixed (Sensory and Conductive) Impairments
Combination of inner and outer/middle ear factors
eg. a factory worker who damages a cochlea from exposure to loud noise and also has profound blockages or a burst eardrum in the other ear
Age-Related Hearing Loss
Aka Presbycusis (Gr. presbys 'old' and akouo 'to hear')
Men lose more capability than women with aging
At age 25, the upper limit of hearing is < 20 kHz
at 30, < 15 kHz
at 50, < 12 kHz
This is the process of natural aging. Could be due to a loss of cochlear elasticity, or a lack of nutrients vital to cochlear health, but the true cause is unknown. Likely a sensory/neural issue rather than a middle or outer ear impairment
Exposure
Exposure deafness refers to non-work noise-induced hearing loss
- a big confound, as most studies of presbycusis don't account for people's exposure to noise over their lifespan
So is the loss an aging effect or the product of a noisy environment? (likely both)
Can be the result of acutely loud, damaging noises, but also of sustained long-term exposure to noise over 80 dB.
Linked to
- music - loud earphones, concerts etc
- household noise - vacuum, blenders, etc
- Transport - cars, trains, motorcycles
- Crowds - bars, sporting events
- Occupation - factories, construction, baristas
Hearing and Hearing Loss - Sporting Game (Hodgetts and Liu 2006)
Researchers went around an ice hockey stadium and measured noise exposure, plotting this across time
80 dB or more is dangerous to hearing, and sound only dropped below this marker at the very end of the game when people were leaving the stadium
- Attendees received about 8100% of their daily allowable noise dose
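For context, a "daily allowable noise dose" is normally computed from exposure level and duration. A rough sketch assuming a NIOSH-style criterion (85 dBA for 8 hours, 3 dB exchange rate); the lecture only quotes the resulting percentage, so the criterion and the example game level are assumptions:

```python
# Daily noise dose as a percentage of the allowable exposure.
def allowed_hours(level_dba: float, criterion_db: float = 85.0,
                  exchange_rate_db: float = 3.0) -> float:
    """Permissible exposure time (hours) at a given level (3 dB exchange rate)."""
    return 8.0 / (2 ** ((level_dba - criterion_db) / exchange_rate_db))

def noise_dose(exposures) -> float:
    """exposures: list of (level_dBA, duration_hours); returns dose in % of daily allowance."""
    return 100.0 * sum(hours / allowed_hours(level) for level, hours in exposures)

# Hypothetical 3-hour hockey game averaging ~104 dBA:
print(round(noise_dose([(104, 3.0)])))  # roughly 3000% of the daily allowable dose
```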
Noisy Lifestyle and Hearing Loss ; Case study
The Mabaan people of Sudan are a group of 'super-hearers'.
Their lifestyle involves very little loud sound; they speak quietly and can hear quiet speech far better than most
Ways to Mitigate Hearing loss (list)
- Amplify Acoustic Energy; eg. hearing aids
- Prosthetics to replace ossicles
- Cochlear Implants
- Regenerating hair cells
Hearing Aids
A means to mitigate hearing loss; works by amplifying incoming acoustic energy:
- A microphone picks up incoming sound
- A computer processor separates the 'signal' from the background 'noise' (tuned to your needs)
- The amplifier then increases the magnitude of the relevant sounds
- A speaker then projects these into the ear (sketched in code below)
This would be best for sensorineural deafness
Some people do not enjoy the social stigma of a hearing aid
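A toy sketch of the amplify-per-band idea, assuming Python with numpy/scipy; the band edges and gains are made-up illustrative values, and real devices also do compression, noise reduction, feedback cancellation, etc.:

```python
# Split sound into frequency bands and apply a different gain to each,
# as a hearing aid would (tuned to the listener's audiogram).
import numpy as np
from scipy.signal import butter, sosfilt

fs = 16_000                                       # sample rate (Hz), assumed
bands = [(125, 500), (500, 2000), (2000, 7900)]   # low / mid / high bands (illustrative)
gains_db = [0, 6, 20]                             # e.g. boost highs for high-frequency loss

def hearing_aid(x: np.ndarray) -> np.ndarray:
    out = np.zeros_like(x)
    for (lo, hi), g in zip(bands, gains_db):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out += sosfilt(sos, x) * 10 ** (g / 20)   # dB gain -> linear factor
    return out

# Example: a quiet high-frequency tone mixed with a low hum
t = np.arange(fs) / fs
x = 0.05 * np.sin(2 * np.pi * 3000 * t) + 0.05 * np.sin(2 * np.pi * 200 * t)
y = hearing_aid(x)
```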
Prosthetic Ossicles
In some cases the ossicles become damaged or calcified
Replacing any or all of the ossicles is possible and can assist in amplifying sound waves
It isn’t a cure all for deafness and is an invasive procedure
Cochlear Implant
Is a small electronic device, surgically implanted, that electrically stimulates the cochlear nerve
Bypasses the damaged portions of the ear (including the organ of Corti)
It is invasive
Some people also find the implant quite harrowing, as hearing after long periods of sound deprivation can be overwhelming
It also isn't an instant fix. Once you can hear with the implant, patients have to learn sounds. This is the role of plasticity: reorganisation of the auditory cortex, plus speech therapy to understand/comprehend speech, can take months to years
Regenerating Hair Cells
Is a not yet available treatment
Because there are no stem cells in the human ear, we cannot regenerate hair cells naturally once they're damaged or lost. The idea is to grow cells ex vivo and insert them
Speech Perception
Audition serves two functions
1. Identify, localise and react to things in the environment
and
2. Communicate
- audition is key for receiving and producing communication
- Allows for different modes of communication too (non-speech vocalisation, speech vocalisation and music)
Vocalisation
= any sound made via respiratory system and used in communication
eg. laugh or heavy sigh
Speech
= Any sound made by the respiratory system that produces acoustic patterns that accord with the phonetic structure of a language
Speech Perception and the Vocal Cords
The vocal cords vibrate and can remain open, closed or partially open, which differentially affects the vibrations as air passes them, and therefore the vocalisation
Speech Production (4 stages)
- Initiation; air is pushed from the lungs and provides the molecular disturbance
- Phonation; air crosses the vocal cords
- Oro-nasal process; resonates through the larger oral/nasal cavities
- Articulation; by throat, tongue, lips, teeth, jaw; 100s of fine movements a second
Speech production Vowels Versus Consonants
Vowels - produced by noise from an open flow through the vocal tract
Consonants - noise involving flow through patterned constrictions of the vocal tract
Speech Perception Requires
- Hearing
- a functional auditory system
- the 3 basic levels of auditory processing (identification, localisation, separating signal from noise)
- Speech processing
- semantic information
- paralinguistic information = everything aside from the words, like pitch, intonation, knowledge of the person speaking, etc. (ie. identifying the person speaking via their speed and use of gestures, and their affective state/intentions via the intonation of speech)
Phonemes
= the Fundamental building blocks of speech
= the distinct sounds used to create words in spoken language
= the fundamental ‘unit’ of speech, if you change a phoneme, you can change the meaning of a word
are written as /b/, /p/, /t/, etc. (as in bear, pear, tear)
Every language has its own set of phonemes
These tend to occur at around ~12 phonemes per second at a normal rate of speech (cadence)
Each phoneme has a specific pattern of acoustic energy (specific to the phoneme and to the individual)
Speech Perception; Semantic Information ; Formants
When we vocalise a phoneme, there are peaks of acoustic energy at specific frequencies. These are called formants
Each vowel-related phoneme has a characteristic formant pattern made up of these frequencies
Consonants provide formant transitions - or rapid changes in frequency
Formants can be displayed on a spectrogram.
Vowels produce stable formant frequencies, so in spectrograms they appear like flat lines/bands
Formant transitions are unique frequency changes associated with the onset of a consonant sound and are different for different consonants (good diagrams in notes)
Spectrograms
Formants and A string of Phonemes can be shown in spectrograms.
These plot frequency over time (or across a word) to show the unique pattern of sound frequencies associated with a given vocalisation
These are specific to the word and specific to the individual
These unique frequency patterns are good for identifying who the speaker is
Also play a role in technology - ie. voice recognition
Spectrograms have information on frequencies, cadence, amplitude (indicated by thickness of lines)
There are rapid changes that occur across short time scales
It's not known how phonemes are strung into different words in the brain, or how the brain processes spectrogram-like information to perceive different vocalisations, but it's important to understand this as it could be used to treat perceptual difficulties
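A minimal sketch of how a spectrogram is computed from a sound waveform, assuming Python with scipy; the "vowel-like" test signal and its formant frequencies are made-up illustrative values:

```python
# Compute a spectrogram (frequency x time energy map) of a vocalisation-like
# signal; formants appear as bands of concentrated energy, and formant
# transitions as rapid shifts in those bands.
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                       # sample rate (Hz), assumed
t = np.arange(fs) / fs            # 1 s of signal
# Stand-in "vowel": a 120 Hz voice source plus energy near two formants
x = (np.sin(2 * np.pi * 120 * t)
     + 0.6 * np.sin(2 * np.pi * 700 * t)     # ~F1 of an /a/-like vowel (illustrative)
     + 0.4 * np.sin(2 * np.pi * 1200 * t))   # ~F2 (illustrative)

f, times, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=384)
# Sxx[i, j] = energy at frequency f[i] and time times[j]; plotting this grid
# (e.g. with matplotlib's pcolormesh) gives the familiar spectrogram picture.
```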
How many children have perceptual speech impairment?
~5%
Typically involves a deficit in the fast (tens of milliseconds) temporal processing needed to distinguish brief formant transitions
So we need to know how the brain binds phonemes into words to understand and treat these perceptual deficits
Theory of Speech Perception (semantic information) - pattern recognition
Asks how the nervous system processes different speech sounds and whether this is a bottom-up process. Ie. maybe each phoneme has a corresponding pattern of specific fibre activation
eg. If the ear receives a /p/ sound there is coactivation of fibres 1,60 and 95.
This theory seems logical, but the frequency content of a sound changes when a phoneme is paired with certain vowels.
So, many phonemes involve different frequencies in different contexts (yet we hear the same sound). This means the stimulus the ear receives is different and yet the perception is the same, so pattern coding cannot explain this
Also, there are individual differences when it comes to formant sequences.
Perceptual Constancy
= When the stimulus changes, but the percept remains the same
In the speech system, perceptual constancy occurs at the word and sentence level.
eg. words with different intonations are still perceived with the same meaning
or "what are you doing" is understood the same as "whaddya doing"
Top-down influences on speech perception
Clearly there are bottom-up influences of speech due to the need for incoming sensory signals
There is however, also a role of cortical processing to assign meaning to these signals (aka top-down processing)
This requires experience, as experience builds a knowledge base that assists in interpreting and understanding sensory information (the recognition stage of perception), as we match sounds to a mnemonic template
Role of Auditory Cortex
A1 = primary auditory cortex, plus secondary and tertiary bands (aka belt and parabelt regions)
A1 does not preferentially respond to speech - it responds to any noise/sound
But some areas do preferentially respond to speech;
eg. Wernicke's area; critical for understanding spoken words
Broca's area; critical for articulating words (motor)
(both are in the left hemisphere)
Broca's Aphasia
A condition resulting from damage to Broca's area in the left hemisphere
Hearing is intact but the person cannot vocalise speech
Wernicke's Aphasia
Hearing is intact but the person is unable to comprehend speech
Results from damage to the left Wernicke's area
Note about Broca’s and Wernicke’s Aphasia
- while damage to the left Broca's or Wernicke's area triggers these characteristic deficits, we should not infer that these regions alone are responsible for speech comprehension and vocalisation
Left Broca's area actually inactivates during vocalisation. It is a premotor, not a motor, area. And cooling it during surgery does not halt articulation - it just slows it down
Left Broca's area is involved in preparation for articulation - it gates the motor program required to execute speech
Paralinguistic Information
Multiple areas of the right hemisphere are involved in processing information on affect, intention and the person speaking (so they are part of the language network)
eg. the STS (for socially meaningful information)
or the limbic system, to detect the emotion of the speaker
or the prefrontal cortex, for person perception and social perception
Note: these regions are connected. They show strong activation to emotional intonation and non-speech utterances, and strong activation during speaker identification
Damage to the STS, PFC and limbic regions
These areas are associated with processing paralinguistic information
So, damage to these areas = phonetic content processing is intact, but the person is unable to understand intonation (dysprosody) or unable to identify the speaker (phonagnosia)
They can still recognise the emotional content of speech unless the limbic system is damaged
Right Homologue of Broca’s Area
Associated with processing paralinguistic info
Process prosody (rhythm, stress, intonation) and helps to interpret emotion in speech content
Lateralisation of Speech Perception
The functional auditory system is present on the left and right (this describes the pathway from the pinna to A1)
A1 and the belt/parabelt regions are also present in both the left and right hemispheres
The above cover the principles of hearing (the 3 basic levels of auditory processing)
Then, there is speech processing.
Content is processed in the left hemisphere, in Wernicke's and Broca's areas
Paralinguistic information about the person speaking, their affective state and intentions is processed in the right hemisphere, in the STS, PFC and limbic system
What aspects of different parts of speech perception are bottom-up or top-down
- Phoneme recognition
  - acoustic energy pattern (bottom up)
  - phonemic restoration effects (top down)
  - indexical characteristics (top down)
- Boundaries
  - language-appropriate combinations of syllables (top down)
  - knowledge of vocabulary (top down)
- Context
  - language-appropriate combinations of words (top down)
  - topic-of-conversation-appropriate combinations of phrases (top down)
Role of topdown/bottom up processing in phoneme recognition
- Phoneme recognition
  - acoustic energy pattern (bottom up); a combination of phonemes generates a unique sound spectrogram associated with a specific word or sound
  - phonemic restoration effects (top down); we use our knowledge of the language to automatically 'fill in' missing phonemes (use of our lexicon)
  - indexical characteristics (top down); we use our knowledge of the speaker to correctly interpret their phonemes, as we know their speech template
Phonemic Restoration Effect
We watched a video/audio tape where the 's' sound was masked, but we still understood the word. This is the top-down filling in of sounds we may miss in the environment
Role of topdown/bottom up processing in Boundaries
- Boundaries
  - language-appropriate combinations of syllables (top down)
  - knowledge of vocabulary (top down)
Breaks between words are an auditory illusion. We identify word boundaries, but in reality they do not exist in the acoustic signal.
We perceptually hear the words as separated due to our prior knowledge of the language
The only time we tend to pause in speech is to take a breath
This is why foreign-language speakers seem unnaturally fast to many people, as we can't identify where their word boundaries are
Boundary perception is based on the statistical probability of sounds being followed by other sounds (high-probability combinations are more likely to be in the same word)
Role of topdown/bottom up processing in Context
- Context
  - language-appropriate combinations of words (top down)
  - topic-of-conversation-appropriate combinations of phrases (top down)
Critical for language learning
and for how toddler speech is still understandable despite being broken
We learn to gauge the meaning of a sentence or the conversation topic from hearing just a few word combinations
Role of topdown/bottom up processing in Supplementary Sensory Input
Additional factors;
- bottom up; incoming visual input - ie. lip reading and facial/hand gestures (cross-modal inputs from vision and audition)
- we are natural lip readers - it greatly aids speech processing for people with a cochlear implant or hearing aids, and for the normal-hearing population
- top down; previously gained knowledge of how to interpret that input (see Language Development in Babies below)
The McGurk Effect
Is a visual dominance effect
where the same soundbite is paired with a different visual of lips moving; we tend to perceive the sound in alignment with the visual of the moving lips rather than the soundbite in isolation
Language Development in Babies
It takes years to become a native listener, even for a baby learning its first language - this is a matter of practice (top-down knowledge built through experience)
Your phoneme set is honed in infancy, with unimportant phonemes being ignored from around the 6-month mark (ie. in Japanese, /r/ and /l/ discrimination is not useful, so Japanese babies learn to ignore this distinction)
Then important vowels and consonants become delineated
As knowledge of language progresses, top-down recognition of language-appropriate syllables, words, boundaries and context occurs
Motherese or Baby Talk
= the exaggerated speech used with babies, in which parents adopt high-pitched speech with frequent fluctuations in intonation, short sentences and a slower tempo
This is consistent across languages
In NZ we do this the most, with Wellingtonians being the most expressive baby talkers
All of this is thought to help children learn language structure and emotional content
But we shouldn't do this forever
Babies are born with the capacity to recognise the global phoneme library but with time become language specific in their ability to hear and notice different phonemes etc
- experience-dependent plasticity occurs in infancy and only frequently heard phonemes persist
So, bilingual infants as young as 20 months of age are able to efficiently and accurately process two languages
and being bilingual confers many advantages throughout the lifespan most evident later in life.
What are the effects music can have on us?
Can be emotional, cognitive and even immune (likely because music reduces stress, and lower stress supports immune function)
What is music
There is no perfect definition as music is a social construct. We all agree it exists and we know what it isn’t (eg. it’s not language) but it isn’t objectively definable
Aspects of music
1. contains melody (a tune - a sequence of musical notes perceived as one entity) which is often repeated (so has some sort of rhythm and cadence)
2. we use music for entertainment, not just communication
(so it inherently differs from communication)
323 definition
= organised sound used for entertainment, typically involving melody
Components of Music
The basic ‘unit of music’ = acoustic energy with a defined frequency
- Every note has a specific pitch or frequency
This is usually between 28 Hz and 4186 Hz (the range of a piano) - this bandwidth is likely the most pleasant, as higher pitches become uncomfortable
Notably, a melody would rarely use this whole range - it usually sticks to a few octaves - and each note can vary in loudness and duration (the attack and decay of notes)
Cochlear Implants and Music
Although most music doesn't span the full audible frequency bandwidth (20-20,000 Hz), the limited number of stimulating electrodes in a cochlear implant means pitch cannot be distinguished finely enough for music to be enjoyed.
In speech, the relevant pitch changes span thousands of Hz, so a cochlear implant can distinguish changes across the ranges its stimulating electrodes cover; a musical melody may span only hundreds of Hz, making these nuances in pitch harder to detect, as only one or two channels would activate
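A rough back-of-the-envelope sketch of this channel-resolution problem, assuming a 22-channel implant with logarithmically spaced analysis bands from roughly 188 to 7938 Hz (typical-sounding values chosen for illustration, not figures from the lecture):

```python
# Each implant channel covers a slice of the frequency range; with ~22
# log-spaced channels over the speech range, one channel spans about
# three semitones, so small melodic steps often land in the same channel.
import numpy as np

n_channels = 22
edges = np.geomspace(188, 7938, n_channels + 1)   # channel band edges (Hz), assumed

def channel_of(freq_hz: float) -> int:
    """Index of the analysis channel a frequency falls into."""
    return int(np.searchsorted(edges, freq_hz) - 1)

octaves_per_channel = np.log2(7938 / 188) / n_channels
print(f"~{12 * octaves_per_channel:.1f} semitones per channel")

# A one-semitone melodic step (E4 330 Hz -> F4 349 Hz) stays in one channel:
print(channel_of(330), channel_of(349))
# A large speech cue (e.g. an F2 shift from ~1200 Hz to ~2300 Hz) crosses several channels:
print(channel_of(1200), channel_of(2300))
```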
To perceive music well, we need…
Very high acoustic resolution. Cochlear implants would need a very high channel count to do this
Maximal Acoustic Energy in Specific Frequencies
For instruments such as a piano, every separate key produces maximal acoustic energy at a given specific frequency.
Going up an octave doubles this frequency
eg. there is a maximal disruption of the sound waves linked to the key being hit.
However, hitting one key will also produce other frequencies
Fundamental Frequency
= the specific frequency at which a key/note produces its maximal acoustic energy
So, when the A1 key is hit, the greatest energy produced will be at 55 Hz; smaller disruptions at other frequencies will also occur
Harmonic Frequency
Harmonics = the lesser waves of acoustic energy produced at other frequencies, following a mathematical pattern
eg. when A1 is hit, the greatest energy disruption is at 55 Hz, but there is smaller production of sound at 110 Hz, 165 Hz, and so on
These are positive integer multiples of the fundamental frequency (positive meaning that you don't get harmonics of lesser frequency than the fundamental)
A single note will therefore have a corresponding complex sound wave made up of multiple frequencies.
The peak wave is the sum of a set of in-phase frequencies occurring at the same time (good diagram on slide 87)
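A minimal sketch of building one note as a complex wave from its fundamental plus integer-multiple harmonics (the A1 = 55 Hz example above); the harmonic amplitudes are arbitrary illustrative values:

```python
# Sum in-phase sinusoids at f0, 2*f0, 3*f0, ... to form one complex tone.
import numpy as np

fs = 44_100                      # sample rate (Hz)
t = np.arange(fs) / fs           # 1 second
f0 = 55.0                        # fundamental frequency of A1

def complex_tone(f0, amplitudes, t):
    """Weighted sum of harmonics: amplitudes[k] scales the (k+1)th multiple of f0."""
    return sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t)
               for k, a in enumerate(amplitudes))

note = complex_tone(f0, amplitudes=[1.0, 0.5, 0.3, 0.2], t=t)
# The waveform repeats at 55 Hz even though it contains energy at
# 110, 165 and 220 Hz as well - one note, multiple frequencies.
```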
Harmonic Profiles
Different instruments or voices will produce different combinations of harmonics - this is why the same note sounds different when played on different instruments
This is called the harmonic profile and perceptually contributes to timbre - the distinctive quality/character of an instrument (or voice)
Timbre is determined by the frequencies involved, the SPL of each frequency, and the time profile of the note, ie. how frequency and SPL vary over time (the attack and decay of a note)
What sort of information can one note contain in music?
Loudness (eg. forte)
Pitch (eg. A at 220 Hz plus harmonics)
Duration (eg. staccato)
Timbre; determined by the note's harmonic profile and how its loudness and frequency content vary over its duration
Perceptual Constancy in Music
Different instruments have different timbres because of the frequency spectra they generate, but you can still perceive them as playing the same pitch, because only the fundamental frequency needs to be the same
And even if you remove the fundamental frequency and keep the relevant harmonics, your brain can fill in the gap and generate the perception of the missing fundamental note
(eg. in the tonotopic map of A1, neurons in regions encoding the missing pitch will fire even though the relevant auditory signal is absent )
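A minimal sketch of a "missing fundamental" stimulus: harmonics of 55 Hz with no energy at 55 Hz itself (amplitudes and frequencies are illustrative):

```python
# Harmonics of an absent 55 Hz fundamental; the perceived pitch is
# nonetheless typically 55 Hz.
import numpy as np

fs = 44_100
t = np.arange(fs) / fs
harmonics = [110, 165, 220]                    # integer multiples of the absent 55 Hz
missing_fundamental = sum(np.sin(2 * np.pi * f * t) for f in harmonics)
# A spectrum of this signal shows no peak at 55 Hz, yet the waveform still
# repeats every 1/55 s - one cue the auditory system may use to fill in the pitch.
```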
Inter-individual Differences in Pitch Perception
Pitch perception differs between individuals, in both identification and discrimination (JND)
eg. Absolute pitch (aka perfect pitch) - rare, but some blind people acquire it, so there is a role for plasticity. Such people can identify or even produce a given pitch on cue, without any additional reference. Occurs in about 1 in 10,000 people. Associated with early musical training, early blindness and early linguistic exposure
eg. Relative Pitch; can identify notes, but only in reference to one another, often seen in musicians
eg. Normal Pitch; can discriminate differences in pitch, but can’t identify notes
eg. Amusia (tone deafness); cannot discriminate small differences in pitch - can cause failure to recognise out-of-tune notes (and thus a tendency to produce them)
Experience-Dependant Plasticity and Music
Highly skilled musicians have a 25% increase in cortical representation of musical tones
Sound localisation processing is highly developed in conductors, for example
- this is associated with changes in the brainstem olivary nuclei, where sound is localised
Brain processing and Music Perception
Musical notes themselves are supposedly non-referential but we can identify a feeling from certain melodies
When we listen to music we activate a lot of brain regions ;
- eg. auditory cortex for pitch and harmony
- eg. Limbic system for emotion
- eg. Hypothalamus for visceral responses - heart rate, hormone release and physiological responses
- eg. Medial temporal lobe for memory
Plus motor areas if you’re tapping or singing along
Perceiving musical tones as pleasant or unpleasant appears to be learned and has an emotional and visceral component due to the brain areas active when we listen.
Mood Therapy and Music
We all frequently use music for ‘mood therapy’
This became popular in the 70s and is now used to treat neurological conditions like Alzheimer's and TBI
It's not known whether the benefit is a mood alteration or some form of plasticity
Treating Aphasia and Music
Music therapy is used to treat aphasia, which is associated with damage to the left hemisphere, ie. Broca's or Wernicke's area
- but music is thought to engage the right belt and parabelt regions as well as the right-hemisphere homologue of Broca's area
Oliver Sacks suggested that Broca's area could be inhibited in some forms of aphasia, and that when the right homologue is activated by music it then disinhibits left Broca's area.
Notably, electrically stimulating right Broca's area does seem to improve symptoms of aphasia
Other research shows that right Broca's area is hyperactive in some aphasics, possibly to compensate for damage in left Broca's, and that inhibiting right Broca's may improve symptoms of aphasia - so why does music work?
Effects of Music Therapy
Can emphasise rhythm, pitch, memory and vocal/oral motor components.
Essentially a way of learning to speak (again)
Binaural Beats and The Brainstem
Suggests that slightly different tones presented to each ear produce an auditory illusion - a third tone.
The frequency of the third tone = the absolute difference between the frequencies of the real tones
This is typically perceived as a low-frequency beat
The detection of this originates in the superior olive of the brainstem
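A minimal sketch of generating a binaural-beat stimulus, assuming Python with numpy; the tone frequencies are arbitrary illustrative values:

```python
# One pure tone per ear; the perceived beat frequency is |f_left - f_right|.
import numpy as np

fs = 44_100
t = np.arange(5 * fs) / fs           # 5 seconds
f_left, f_right = 220.0, 226.0       # illustrative tones; beat = 6 Hz

left = np.sin(2 * np.pi * f_left * t)
right = np.sin(2 * np.pi * f_right * t)
stereo = np.stack([left, right], axis=1)   # shape (samples, 2) for stereo playback

print("expected beat frequency:", abs(f_left - f_right), "Hz")
# Writing `stereo` to a WAV file (e.g. with scipy.io.wavfile.write) and
# listening over headphones produces the illusory low-frequency beat.
```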
Some studies stress the ability of binaural beats to reduce anxiety and increase quality of life, but the evidence is poor.
In one study, monaural conditions entrained the cortex more strongly, and both binaural beats and the control condition failed to regulate mood.
So, studies are mixed - but it is a cool illusion