auditory perception Flashcards
cause of sound
vibration of an object
movement alternately squeezes air molecules together and pulls them apart
creates a longitudinal pressure wave in air
function of pressure against time
high points - portions where pressure is high
air molecules squished together
low point - low pressure
air molecules pulled apart
amplitude
distance between baseline and peak of wave
amplitude used to derive intensity
loudness
decibels
logarithmic scale of relative intensities
reduces wide range of amplitudes to smaller scale
calculated with reference to our hearing threshold
used to determine loudness
sound intensity level
way of representing amplitude relative to a reference perception of loudness
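A minimal sketch of the decibel calculation described in the cards above. The reference values (20 micropascals for pressure, 1e-12 W/m^2 for intensity) are the conventional hearing-threshold references in air and are not stated in these notes; the function names are illustrative.

```python
import math

P_REF_PA = 20e-6      # assumed reference pressure (conventional hearing threshold)
I_REF_W_M2 = 1e-12    # assumed reference intensity

def spl_db(pressure_pa, ref_pa=P_REF_PA):
    """Sound pressure level: log of the pressure amplitude relative to the reference."""
    return 20 * math.log10(pressure_pa / ref_pa)

def intensity_level_db(intensity_w_m2, ref_w_m2=I_REF_W_M2):
    """Sound intensity level: log of the intensity relative to the reference."""
    return 10 * math.log10(intensity_w_m2 / ref_w_m2)
```

A pressure equal to the reference gives 0 dB, and each tenfold increase in pressure adds 20 dB, which is how the scale compresses a very wide range of amplitudes.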
period
time taken to complete one cycle of the wave
frequency
number of cycles (periods) per second, measured in Hz
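Period and frequency are reciprocals of each other. A small sketch of that relationship, with illustrative function names:

```python
def period_to_frequency(period_s):
    """Frequency in Hz for a wave whose cycle lasts period_s seconds."""
    return 1.0 / period_s

def frequency_to_period(freq_hz):
    """Duration of one cycle, in seconds, for a wave of freq_hz."""
    return 1.0 / freq_hz
```

For example, a wave that takes 2 ms per cycle repeats 500 times per second.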
pitch
attribute in terms of which sound can be ordered on a musical scale
timbre
refers to quality which can make two sounds with the same pitch and loudness seem dissimilar
related to complexity
pure and complex tones
pure = a single frequency
complex = made of more than one frequency
can be broken down into individual pure tones
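A rough sketch of breaking a complex tone down into its pure-tone components, using a naive discrete Fourier transform. The sample rate, duration, threshold and function names are assumptions chosen so the example stays small; they are not from the notes.

```python
import cmath
import math

def complex_tone(freqs_hz, sample_rate=400, duration_s=0.5):
    """Sum of equal-amplitude pure tones (sine waves), one per frequency."""
    n = int(sample_rate * duration_s)
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(n)]

def component_frequencies(samples, sample_rate=400, threshold=0.25):
    """Return the frequencies whose DFT amplitude exceeds the threshold."""
    n = len(samples)
    found = []
    for k in range(1, n // 2):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        if 2 * abs(coeff) / n > threshold:
            found.append(k * sample_rate / n)
    return found
```

Passing `complex_tone([50, 120])` to `component_frequencies` recovers the two pure tones, while `complex_tone([50])` yields a single component.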
how do sounds have a clear pitch?
partials must be integer multiples of the fundamental frequency
called harmonics
if a sound has inharmonic partials it will be unpitched
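The harmonic rule can be checked directly. This is an illustrative sketch; the tolerance value and function name are assumptions.

```python
def is_harmonic(partials_hz, fundamental_hz, tolerance=0.01):
    """True if every partial is (within tolerance) an integer multiple of the fundamental."""
    for partial in partials_hz:
        ratio = partial / fundamental_hz
        nearest = round(ratio)
        if nearest < 1 or abs(ratio - nearest) > tolerance:
            return False
    return True
```

Partials at 220, 440 and 660 Hz are harmonics of 220 Hz, so the sound has a clear pitch; partials at 220, 507 and 731 Hz are inharmonic.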
outer ear
visible part of the ear - auricle
not vital for perception but has an effect
shape of ear important to perception of sounds
ear canal
extends down to eardrum (tympanic membrane)
resonant frequency = 1-5kHz
middle ear
two membranes joined by bones:
eardrum - tympanic membrane
ossicles - 3 tiny bones
hammer, anvil and stirrup
oval window - membrane like eardrum
why are the bones needed?
vibrations must travel from air to fluid
creates an impedance mismatch
harder for vibrations to move through fluid than air
middle ear helps to deal with it
= impedance matching device
function of the 2 membranes
eardrum bigger than oval window
power of vibrations concentrated into oval window
lever action of hinged bones
action amplifies strength of vibrations
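A back-of-envelope sketch of the two mechanisms above (area ratio and lever action). The default numbers are commonly quoted approximate textbook values and are assumptions; they do not come from these notes.

```python
import math

def middle_ear_pressure_gain(eardrum_area_mm2=55.0,      # assumed effective eardrum area
                             oval_window_area_mm2=3.2,   # assumed oval window area
                             lever_ratio=1.3):           # assumed ossicle lever advantage
    """Pressure gain from concentrating force onto a smaller membrane, times the lever action."""
    area_ratio = eardrum_area_mm2 / oval_window_area_mm2
    return area_ratio * lever_ratio

def gain_in_db(pressure_gain):
    """Express a pressure gain on the decibel scale."""
    return 20 * math.log10(pressure_gain)
```

With these assumed values the pressure gain is a little over 20-fold, roughly 27 dB, which helps offset the loss from the air-to-fluid impedance mismatch.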
inner ear
cochlea structure
snail shaped
two chambers separated by the cochlear partition
filled with perilymphic fluid
cochlea function
works as a frequency analyser
breaks incoming complex sounds down into pure tone components
also works as a transducer
converts mechanical energy at these different frequencies into electrical activity to travel to the brain
cochlear partition
splits cochlea in 2
basilar membrane
runs the length of the cochlea
on top of the membrane is the organ of Corti
organ of Corti contains
tectorial membrane - hinges over top of basilar membrane
hair cells (inner and outer) - topped with stereocilia (smaller hairs)
how vibrations move around the cochlea
vibrations flow from oval window through first chamber, through helicotrema (gap at end), down to round window (another membrane) and are reflected back
vibrations move around through cochlea
base end - end nearest middle ear
apex - tip of curled up formation
transduction process
basilar membrane moves in response to vibration in the perilymph
membrane vibrates at the same frequencies as the incoming sound
these vibrations bend the stereocilia on the inner hair cells against tectorial membrane
allows positively charged ions to enter cell
triggers release of neurotransmitters and an electrical signal is sent up auditory nerve to the brain
how is pitch detected?
place and temporal coding
place coding
at the base, basilar membrane is anchored, narrow and stiff
at apex, it is free, wide and loose
means it has different resonant frequencies at different points along length
points of maximum BM displacement = frequencies of incoming sound
(displacement where BM is vibrating most)
stimulate specific sets of inner hair cells
activates specific auditory nerve fibres
the tonotopy (place coding) is represented all the way up to auditory cortex
temporal coding
BM moves in response to vibration in the perilymph
BM vibrates at the same frequencies as incoming sound
stereocilia stimulated by peaks in BM vibration
means firing occurs at the same period of the incoming waveform
- known as phase locking
stereocilia stimulated at peak as hairs are brushed against tectorial membrane
- at point of maximum displacement
means firings happen at peaks of BM motion
so firings correspond to period of incoming waveform
- allows brain to detect the pitch of sounds
place coding at different frequencies
less reliable at low frequencies
areas of vibration on BM bigger at lower frequencies
less specific and precise encoding of pitch
temporal coding at different frequencies
breaks down at high frequencies
not enough time for cells to recharge
leads to temporal smearing
peaks happening too close for coding to work
firings overlap and brain doesn't know which firings correspond to which peak of waveform
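A simplified sketch of why temporal coding breaks down at high frequencies: a cell that needs a recovery time between firings cannot fire on every peak once the period is shorter than that recovery time. The 1 ms refractory period and the function name are assumptions for illustration only.

```python
def spike_times(frequency_hz, n_peaks, refractory_period_s=0.001):
    """Fire at each waveform peak unless the cell is still recovering from the last firing."""
    period_s = 1.0 / frequency_hz
    spikes = []
    last_spike = None
    for i in range(n_peaks):
        peak_time = (i + 0.25) * period_s  # a sine wave peaks a quarter of the way through each cycle
        if last_spike is None or peak_time - last_spike >= refractory_period_s:
            spikes.append(peak_time)
            last_spike = peak_time
    return spikes
```

At 200 Hz every one of 10 peaks produces a firing, so the spacing of firings matches the period. At 4500 Hz most peaks are missed, so the firing pattern no longer follows the waveform peak for peak.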
function of inner hair cells
detect motion of basilar membrane
ion flow causes electrical signals to brain
function of outer hair cells
amplify and sharpen motion of basilar membrane
= cochlear tuning
means we have good ability to discriminate different pitches
ion flow causes mechanical changes
expand and contract
motions amplify and sharpen basilar membrane action = more precise
timbre coding
pitch represented by 2 mechanisms:
place coding = where on the BM firing comes from
temporal = when the firing is occurring
timbre represented by which combinations of fibres are active at the same time
intensity coding
relies on the fact that there are low and high threshold auditory nerve fibres
low threshold fibres discriminate quiet and moderate sounds
- discriminate between low and medium amplitude sounds
high threshold fibres kick in to discriminate moderate and loud sounds
- discriminate between moderate and high amplitude sounds
loudness of sound = total neural activity
loud sounds mean all fibres are responding, quiet sounds mean only low threshold fibres are responding
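A toy sketch of intensity coding with two fibre types. The thresholds (0 dB and 40 dB), the 40 dB dynamic range and the linear-then-saturating response are all assumptions chosen for illustration, not measured values.

```python
def fibre_rate(level_db, threshold_db, dynamic_range_db=40.0):
    """Normalised firing rate: 0 below threshold, rising linearly, saturating at 1."""
    fraction = (level_db - threshold_db) / dynamic_range_db
    return min(1.0, max(0.0, fraction))

def total_activity(level_db, low_threshold_db=0.0, high_threshold_db=40.0):
    """Loudness modelled as summed activity of low and high threshold fibres."""
    return fibre_rate(level_db, low_threshold_db) + fibre_rate(level_db, high_threshold_db)
```

In this sketch the low threshold fibre saturates at 40 dB, which is where the high threshold fibre starts responding, so total activity keeps growing across the whole range.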
auditory pathway
starts with cochlea
sent through auditory nerve to cochlear nucleus
- acts as a relay station
- send neural activity to other nuclei in brainstem
then travels through superior olive
- analyses location, where sounds are coming from
- relies on precise timing so happens early
then to inferior colliculus and medial geniculate
- analyse pitch
- relies on fairly precise timing, not quite as early as location processing
ends in primary auditory area
- analyses higher order features, such as timbre
- less reliant on precise timing
auditory pathway simple
cochlear nucleus
superior olive
inferior colliculus
medial geniculate
primary auditory cortex
azimuthal plane
whether sound is coming from the left or right
interaural timing difference (ITD)
sound from one side reaches that side first
eg left reaches left first
allows brain to detect direction sound is coming from
brain very sensitive to these small differences in time
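A simple path-difference sketch of the interaural timing difference. It treats the ears as two points a fixed distance apart and ignores the curvature of the head; the 0.18 m ear separation and 343 m/s speed of sound are assumed values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air

def itd_seconds(azimuth_deg, ear_separation_m=0.18):
    """ITD for a distant source; positive azimuth = source to the right, positive ITD = right ear leads."""
    return ear_separation_m * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND_M_S
```

A source straight ahead gives an ITD of zero, and a source directly to one side gives the largest ITD, about half a millisecond under these assumptions, which shows how small the differences are.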
interaural level difference (ILD)
if sound coming from left, right is shielded by the head
sound arrives at far ear later
arrives there quieter than other side
when sound directly from side, ILD depends on the frequency
effect of frequency on ITD
to use ITDs, need to be able to match specific peaks in sound waves across both ears
at higher pitch = shorter wavelength
harder for brain to determine whether left or right came first
misinterprets where peaks are relative to each other
ITDs ambiguous or misleading at higher frequencies
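A rough sketch of the ambiguity: peaks can only be matched across the ears if the time between a peak and the next trough (half a period) is longer than the largest possible ITD. This half-period criterion is a simplification, and the 0.7 ms maximum ITD is an assumed value.

```python
def itd_is_unambiguous(frequency_hz, max_itd_s=0.0007):
    """True if half the period exceeds the largest ITD, so peaks cannot be mismatched."""
    half_period_s = (1.0 / frequency_hz) / 2
    return half_period_s > max_itd_s
```

Under these assumptions a 200 Hz tone is unambiguous, while a 4000 Hz tone has several peaks within the maximum ITD and so cannot be matched reliably.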
effect of frequency on ILD
sound diffracts and bends around objects smaller than its wavelength
sound blocked by larger objects
low frequencies diffract around the head
- some sound reaches far ear
- gets there later but at a similar level
high frequencies do not diffract
- creates a head shadow
- limited sound at the far ear
- creates a larger ILD
ILDs only useful at higher frequencies
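A sketch of the wavelength comparison behind the head shadow. The head diameter (0.18 m) and speed of sound (343 m/s) are assumed values, and the hard cut-off is a simplification of what is really a gradual change.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air

def wavelength_m(frequency_hz):
    """Wavelength of a sound of the given frequency in air."""
    return SPEED_OF_SOUND_M_S / frequency_hz

def casts_head_shadow(frequency_hz, head_diameter_m=0.18):
    """True when the wavelength is shorter than the head, so the head blocks the sound."""
    return wavelength_m(frequency_hz) < head_diameter_m
```

A 200 Hz tone has a wavelength well over a metre and bends around the head, whereas a 6000 Hz tone has a wavelength of a few centimetres and is blocked.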
how is elevation of sound detected?
ITDs and ILDs don't help
pinnae do
amplify some frequencies and reduce others
create spectral cues by changing incoming frequency spectrum
change the timbre
complex waveforms represented as different component frequencies
represented on a spectrum
when sounds come from different heights
shape of ears filters sounds in different ways
- creates different timbres depending on elevation of noise
= unique resonance patterns according to height
sounds from the front vs back
sound from directly in front or behind = ITD of 0
- takes just as long for sound to get to ears from front as back
- equally as loud
pinnae create small level differences
we can also rotate our heads
neural coincidence model
axons transmitting electrical signals representing sound from both ears
- acts as delay lines
axons connected to neurons
- act as neural coincidence detectors
means neurons only fire when stimulated at same time by information from left and right ear
measuring using ITD tuning curves
look at different neurons and measure firing rate across different ITDs
curve supports this model for some animals
- terrestrial more likely to use opponent process analysis
how does the neural coincidence model work to determine angle of sound?
sound waves from straight ahead come in from both ears
signals travel along the axons until those from the left and right ears reach and feed into the same neuron at the same time - then it fires
this specific neuron firing allows the brain to know that the sound is coming from directly in front
if sound comes from right, hits right ear first
signal comes in from right before left
left stimulated later than right
sounds reach neuron together causing it to fire
different specific neuron fired - allows brain to determine that it is from an angle
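A toy sketch of the delay-line idea described above. Each neuron receives the left and right signals through axons of different lengths, and fires only when the two arrive together. The number of neurons, the 100 microsecond step and the function names are assumptions for illustration.

```python
def build_delay_lines(n_neurons=7, step_us=100):
    """Each neuron gets a (left_delay, right_delay) pair; delays run in opposite directions."""
    return [(k * step_us, (n_neurons - 1 - k) * step_us) for k in range(n_neurons)]

def firing_neuron(left_arrival_us, right_arrival_us, delay_lines):
    """Index of the neuron where left and right signals coincide, or None if none match."""
    for index, (left_delay, right_delay) in enumerate(delay_lines):
        if left_arrival_us + left_delay == right_arrival_us + right_delay:
            return index
    return None
```

With the default lines, a sound from straight ahead (both ears at time 0) fires the middle neuron, index 3. A sound that reaches the right ear 200 microseconds before the left fires index 2, and the mirror-image case fires index 4, so which neuron fires signals the angle.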
opponent process analysis
involves 2 sets of broadly tuned neurons
neurons tuned to left half of auditory space - in right hemisphere
neurons tuned to right half of auditory space - in left hemisphere
system calculates the difference between the two sets to work out where a sound is coming from
- if most firing from left half, assume sound from left and vice versa
- smaller differences mean closer to centre
- no difference, assume sound straight in front
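A toy sketch of opponent process analysis with two broadly tuned channels. The sigmoid tuning shape, the 30 degree slope and the function names are assumptions, not a description of real neural tuning curves.

```python
import math

def channel_response(azimuth_deg, preferred_side, slope_deg=30.0):
    """Broad sigmoid tuning: responds more the further the sound is towards the preferred side."""
    sign = 1.0 if preferred_side == "right" else -1.0
    return 1.0 / (1.0 + math.exp(-sign * azimuth_deg / slope_deg))

def opponent_signal(azimuth_deg):
    """Right-tuned response minus left-tuned response; positive azimuth = sound to the right."""
    return channel_response(azimuth_deg, "right") - channel_response(azimuth_deg, "left")
```

The difference is zero for a sound straight ahead, positive for sounds to the right, negative for sounds to the left, and larger the further the sound is from the centre.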
how are speech sounds made?
vibrations from the larynx travel upwards through vocal tract
spectrum of sound is shaped by the articulators including:
- soft palate, hard palate, tongue, teeth and lips
and the resonant spaces (resonators) including:
- chest, throat, mouth and nasal cavities
larynx (voice box) contains folds of material
air forced from lungs through the folds, making them vibrate
- vibrations create sounds
speech perception relies on more than the acoustic signal
also relies on both top down (cognitive) and bottom up (perceptual) factors
fricatives
sounds made by forcing air through narrow gap in articulators
eg s, sh, z
a lot of energy through lots of frequencies
but particularly in high frequencies
shown by dark bands
vowel sounds
articulators can be used to shape our vocal tracts
creates vowel sounds
shown by stripy bands at lower frequencies
peaks in spectrum
- frequency components in vowel sounds that are particularly strong
coarticulation
the same sound is actually different depending on the acoustic context (neighbouring sounds)
however we perceive these as being the same
known as perceptual constancy
reflects top down contribution of articulatory knowledge
know what is needed to make these sounds so can remove small perceptual difference
McGurk effect
perceptual illusion
visually presented sounds affect perception of the acoustic signal
visual information is used to inform interpretation, drawing on articulatory knowledge
linguistic knowledge
our perception of speech sounds is affected by the meaning of the context
phoneme restoration
if a familiar word is distorted (eg a missing sound) we put it back in without realising
influence of lexical knowledge
Miller and Isard
perception of speech sounds is affected by the meaning of the context
normal sentences
grammatical string of words
played people these sentences and asked to repeat them back
- all sentences should have been equally intelligible
but intelligibility decreased as stimuli became less meaningful or grammatically correct
shows influence of semantic and syntactic knowledge
- regardless of quality of acoustic inputs
- contextual knowledge of linguistic factors essential for perception
sine wave speech
formants replaced with pure tones
tracking the intensity modulations of those formants over time
sine wave speech can be learnt
has no fricative sounds etc only formants
but due to formants and coarticulations can be learnt
(formants influence perception of surrounding speech)
shows influence of phonological, lexical and syntactic and semantic knowledge