Lecture 21 Flashcards
location
sounds that come from the same place are assumed to belong together
onset time
sounds that appear at the same time tend to belong to the same object
similarity of timbre and pitch
sounds with similar timbre and pitch belong to the same object
these cues help us break up an auditory scene into separate streams
bach invention: two different melodies at different octaves; the notes within the left hand and within the right hand were more similar in pitch, so each hand grouped into its own stream
proximity in time
sounds that appear in rapid succession belong to the same object
bach invention: when two notes were trilled, those two notes became their own auditory stream
auditory continuity
if a sound of the same pitch or timbre changes in a constant fashion (even if interrupted) we think it belongs to the same object
effect of past experience
like gestalt familiarity
strong top-down cue: what you expect based on what you’ve heard before
if you’ve experienced a pattern of sounds before, you will attribute it to the object it came from in the past
if you’re familiar with a piece or composer, you know what to expect: which transition is coming up, or how the two different streams interact
experiment by dowling and harwood
melody schema
Melody “Three Blind Mice” is played with notes alternating between octaves
- Listeners find it difficult to identify the song
- But after they hear the normal melody, they can then ‘hear’ (recognize) it in the modified version using melody schema
once you’ve activated this “melody schema,” your interpretation of the same information changes
Auditory space
surrounds an observer and exists wherever there are sound sources.
Researchers study how sounds are localized in space along three dimensions:
– Azimuth - position left to right
– Elevation - position up and down
– Distance from observer
azimuth
horizontal left to right
elevation
position up and down
distance
distance from observer
figure-ground
trilling becomes the ground
main melody becomes the figure
problem with auditory localization
the signal is mixed right at the receiving organ: there is no separation
Location cues are not contained in the receptor cells (as they are on the retina).
location for sounds MUST BE CALCULATED
On average, people can localize sounds which are:
– Directly in front of them most accurately.
– To the sides and behind their heads least accurately.
auditory localization cues
- binaural
- monaural
Binaural cues
location cues based on the comparison of the signals received by the left and right ears
azimuth
two main cues:
• Interaural time difference (ITD), useful for low frequencies
• Interaural level difference (ILD), useful for high frequencies
Interaural time difference (ITD)
the difference in the amount of time it takes a sound stimulus to reach one ear compared to the other ear
Interaural level difference (ILD)
the amplitude of that signal hitting one ear vs. the amplitude hitting the other ear
monaural
using 1 ear
elevation
Spectral cues and the head-related transfer function (HRTF)
spectral cues
using frequency information in the signal
the information for location comes from the spectrum of frequencies
Interaural time difference (ITD)
maximally effective when something is off to the side
- difference between the times at which sounds reach the two ears.
• When the distance to each ear is the same, there are no differences in time (point A).
• When the source is to the side of the observer, the times will differ (point B).
• This cue is better for LOW FREQUENCY tones (< 800 Hz).
− May use temporal coding in the cochlea (phase locking: different neurons respond to the peaks of the incoming frequency; as frequencies get higher, phase locking becomes more difficult, which is why this cue works best for low frequencies).
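The geometry above can be sketched numerically. This is only an illustrative approximation (Woodworth’s spherical-head formula; the head radius and speed of sound are typical textbook values, not from the lecture):

```python
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound_mps=343.0):
    """Woodworth's spherical-head approximation of the interaural time
    difference (ITD) for a distant source.
    azimuth_deg: 0 = straight ahead, 90 = directly to one side."""
    theta = math.radians(azimuth_deg)
    # extra path length to the far ear ~ r * (theta + sin(theta))
    return (head_radius_m / speed_of_sound_mps) * (theta + math.sin(theta))

print(f"{itd_seconds(0) * 1e6:.0f} us")    # 0 us  (point A: no difference)
print(f"{itd_seconds(90) * 1e6:.0f} us")   # ~656 us (point B: source to the side)
```

The ITD grows from zero straight ahead to a few hundred microseconds at the side, which is why it is "maximally effective when something is off to the side."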
Interaural level difference (ILD):
the difference in the sound pressure (amplitude) of a stimulus hitting one ear vs. the other
as a sound stimulus comes in, its intensity is reduced as it passes the head to the far ear
• Reduction in intensity occurs for high frequency sounds for the far ear.
– The head casts an acoustic shadow, blocking out (reflecting) and absorbing some of the high-frequency pressure wave.
• This effect doesn’t occur for low frequency sounds. Works best above 800 Hz.
the head interferes with the sound pressure waves
acoustic shadow
blocking out (reflecting) and absorbing some of the high-frequency pressure wave.
Interaural level difference (ILD) is largest at locations….
…farther to the side.
– This is shown in psychophysics experiments that measure the frequencies reaching each ear.
Limitations of ITD & ILD
they are not effective for detecting differences in elevation.
• An additional cue is necessary for reliable elevation judgments (monaural cues).
ILD and ITD provide useful cues for judging locations along most of the….
…azimuth plane.
cone of confusion
– Consider points A and B. The level and time differences of signals reaching each ear are zero.
– There is a similar ambiguity at points C and D.
– These points are said to lie on a cone of confusion.
– Multiple (infinite!) cones of confusion are possible.
all these points are the same distance from each ear and produce the same shadowing effect, so you get no time or level differences between them
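One way to see the ambiguity: under a simplified sine-based ITD model (an illustrative sketch, not the lecture’s formula), a source 30° in front and its mirror image 150° behind produce identical time differences:

```python
import math

def itd_sine_model(azimuth_deg, interaural_dist_m=0.175, c_mps=343.0):
    """Simplified ITD model: path difference ~ d * sin(azimuth).
    Azimuth 0 = straight ahead, 180 = directly behind."""
    return (interaural_dist_m / c_mps) * math.sin(math.radians(azimuth_deg))

# A source 30 deg in front and its front/back mirror at 150 deg give
# the same ITD -- one pair of points on a cone of confusion.
print(math.isclose(itd_sine_model(30), itd_sine_model(150)))  # True
```

Every front/back mirrored pair is equally ambiguous, which is why there are infinitely many cones of confusion.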
the pinna
is a spectral filter
when sound comes in, it bounces off the folds of the pinna and the frequencies are affected
the head-related transfer function (HRTF)
what kind of cue?
unique to each head
how the shape of the pinna and the head uniquely affect the intensities of frequencies
– This is a spectral cue since the information for location comes from the spectrum of frequencies.
pinna exp.
Measurements have been performed by placing small microphones in ears and comparing the intensities of frequencies with those at the sound source.
“what it’s like to sound like you”
Hofman et al. (1998) psychophysics experiment
investigating spectral cues
an example of what else?
what was unaffected? what does that tell us?
manipulated the pinna
– Listeners’ accuracy was measured while locating sounds differing in elevation (baseline).
– They were then fitted with an ear mold that changed the shape of their pinna.
– If the shape of the pinna is important for elevation judgments, subjects should show a large decrease in performance (compared to baseline)…
– Right after the molds were inserted, performance was poor for elevation but was unaffected for azimuth (spectral cues are not important for left and right).
– After 19 days (they kept the molds in this whole time), performance for elevation was close to original performance.
– Once the molds were removed, performance remained high (didn’t go back to some baseline where they had to re-learn).
– This suggests there might be two different sets of neurons, one for each set of spectral cues (new and old) = a sparse code, OR a more broadly distributed population code that could adjust and encode both.
– Nice example of experience-dependent plasticity.
Which auditory cue would be most useful for locating a low frequency tone coming from the top of the screen?
it’s a low frequency tone, but where you’re sitting in the room changes whether the best cue is ITD or HRTF (depending on your elevation relative to the source)
Why are we best at localizing sounds in front of us?
– It appears that ITD and ILD azimuth cues are most effective to the side, but we have difficulty detecting those on the cone of confusion.
– Remember that perception is an active process!! No single observation is used to localize a sound. We orient to make ITD and ILD zero in front of us (until we’re facing the source directly).
– Spectral cues (HRTF) are always used: very good when facing a source straight on because that’s what we have the most experience with.
– We also use expectations (top-down information) to help localize.
auditory localization cues: binaural and monaural. They work together effectively, but…
are mainly for orienting.
In real applications, we move our heads and take many ‘auditory samples’ with all localization cues.
Physiological representation of auditory space
Two basic approaches have been proposed (using ITD as a model)
- Narrowly tuned ITD neurons
- Broadly tuned ITD neurons
Narrowly tuned ITD neurons
neurons that respond only to small, specific differences in timing
• Respond to specific time differences only. One neuron gives a location. This is a form of specificity coding.
Broadly tuned ITD neurons
many neurons, each responding to a broad range of timing differences
• Respond to broad range of time differences. Location
is calculated from a range of neural responses. This is
a form of distributed coding.
Jeffress Model for narrowly tuned ITD neurons
explains how the system can be sensitive to small timing differences
– These neurons receive signals from both ears.
– Coincidence detectors fire only when signals arrive from both ears simultaneously (sensitive to the difference in when the sound hits one ear vs. the other).
– If signals reach both ears at the same time, then ITD is zero.
– If a signal comes in on the right earlier, it will travel farther along these cells and meet the left signal at a different location.
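The coincidence-detector idea can be sketched as a toy simulation (hypothetical code, not the lecture’s model: each detector applies a different internal delay, and the one whose delay exactly cancels the ITD responds most strongly):

```python
import numpy as np

def jeffress_best_delay(left, right, fs, max_delay_s=700e-6):
    """Toy Jeffress model: each 'coincidence detector' delays the
    right-ear signal by a different internal lag and sums how well the
    two ears then coincide. The detector with the strongest response
    gives the estimated ITD (in seconds)."""
    max_lag = int(max_delay_s * fs)
    lags = range(-max_lag, max_lag + 1)
    responses = [np.sum(left * np.roll(right, lag)) for lag in lags]
    best_lag, _ = max(zip(lags, responses), key=lambda lr: lr[1])
    return best_lag / fs

# A 500 Hz tone reaching the right ear ~400 us before the left:
fs = 44100
t = np.arange(0, 0.05, 1 / fs)
tone = np.sin(2 * np.pi * 500 * t)
delay_samples = int(round(400e-6 * fs))   # 18 samples
left = np.roll(tone, delay_samples)       # left ear hears it later
right = tone
print(round(jeffress_best_delay(left, right, fs) * 1e6))  # ~408 (nearest whole sample)
```

The array of detectors is a place code for ITD: which detector fires maximally tells you how far off to the side the source is.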