Unit 4 Flashcards
Temporal patterns are informational ____
Substrate
What are the 2 aspects of temporal processing?
- Temporal resolution/acuity
- Temporal integration
What is temporal resolution/acuity?
- How to follow the temporal changes
- Ability to follow quick changes, mostly in the envelope of the sound
What is temporal integration?
- Increasing sensitivity by integrating information over a long duration
- Integrate information to improve hearing
Temporal patterns are very important because they carry ____
Information
The temporal envelope of speech results from ____, in addition to the amplitude changes in the pronunciation of vowels and consonants.
Speech on and off
Explain the temporal envelope
How we quickly follow change of signal in a time line (not in fine structure, but in an envelope)
What is the envelope frequency?
Ranges from a few Hz to several hundred Hz in speech
What is the spectrum of the envelope? What does it correspond to?
Peaked at 3-4 Hz, corresponding to the speed of words/sec.
What do vowels contain?
Vowels contain Fn (n = 0, 1, 2, 3: the fundamental frequency (F0) and the formants F1, F2, F3)
The interaction among the formants produces ____
Temporal fluctuation
Therefore, speech (esp. vowels) can be mimicked by ____
Amplitude/frequency modulation
How many peaks does speech have in frequency?
- 0-4 peaks
- Peak 0 (F0) is the fundamental frequency (FF) of speech
- F1, 2, 3 = formants (they are separated across the frequency range of speech); interaction among them creates modulation (amplifying signal over time = creates the temporal pattern; speech)
The ____ is more prominent than the ____ in speech
Envelope, fine structures
With music, the ____ is more prominent than the ____
Fine structure, envelope
The ____ and ____ are both important for sound
Envelope, fine structure
What are two types of temporal resolution?
Within- or cross-channel resolution (Within channel is closer to real life, but in some studies, we use cross channel)
What is an example of within or cross channel resolution?
Example, gap markers in the same (within) or different (cross-channel) frequency bands
Peripheral versus Central limitation
- Limitation from synaptic transmission, bottom-up
- Peripheral is bottom up
- Limitation due to the need of top-down process
- Central is top down
More ____ = more time delay and poor temporal resolution
Synapse
Simple Estimates of Within-Channel Acuity
- what numbers do you use
- what is the minimal click interval
- what does it give you a good estimation of
- Use a single number (index) to indicate temporal resolution
- Clicks are presented in sequence; the minimal click interval is ~6 ms.
- This is a good estimation of auditory temporal resolution
Explain why the minimal click interval is ~6ms?
Clicks are presented at equal intervals, and the interval is decreased to find the point where the subject can no longer detect separation and hears a continuous sound (around 6 ms). At intervals longer than ~6 ms (rates below ~160 clicks/sec) we hear separate clicks; above that rate we hear a continuous sound.
Temporal resolution to click trains (6)
- how to present clicks
- how does the rate of the train change
- how long does the sense of separated clicks remain
- what is the approximate temporal resolution with click trains
- what shows a similar result
- Clicks presented in train
- The rate of the train is increased from low to high
- The sense of separated clicks remains up to the rate of 150/s or 160/s
- So ~6 ms is the approximate temporal resolution with click trains.
- Similar result is seen using tone burst of 4 kHz, in which the resolution was evaluated as the minimal intervals between the tones.
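The link between the 150-160/s click rate and the ~6 ms figure is just the reciprocal of the rate. A minimal sketch (the function name is made up for illustration):

```python
def click_interval_ms(rate_per_s: float) -> float:
    """Interval between successive clicks, in ms, for a given click rate."""
    return 1000.0 / rate_per_s


# The two limiting rates quoted in the cards above.
for rate in (150, 160):
    print(f"{rate} clicks/s -> {click_interval_ms(rate):.2f} ms between clicks")
```

This prints 6.67 ms for 150/s and 6.25 ms for 160/s, which is where the "about 6 ms" estimate comes from.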
What are the methods for evaluating temporal resolution? (5)
- Paired clicks
- Duration discrimination
- Gap detection
- Amplitude modulated noise
- Neural coding
What if we use paired clicks?
- Two pairs of clicks
- In first pair, the first click is louder/weaker than the second one
- The order is reversed in the second pair (or equal)
- Two pairs are identical in frequency spectrum (different in amplitude)
- Listeners differentiate them by detecting temporal order
- Resolution is indicated by the minimal interval upon which the order can be told correctly
If we use paired clicks, what does temporal resolution go down to?
2 ms
Each interval presents a pair of clicks
Using clicks, the temporal resolution is in the range of ____
2-6ms
Binaural testing, ____ is much smaller
Temporal resolution
Effect of overall duration on Discrimination of signal duration (WF)
- The Weber’s fraction should be constant (or Weber’s law is well followed).
- No difference related to bandwidth of signals
How does duration discrimination work?
- Ask the subject to tell which signal is longer or shorter (only changing with duration, same signal)
- The duration threshold is presented against the baseline duration (linear scale)
- From these data we can predict that Weber’s law (WL) is largely followed in the duration discrimination test
What is gap detection?
Ability to identify a (silent) gap between two sounds or an interruption of a sound by varied formats
What is the gap?
Gap can be a silent period or one in which sound intensity is largely reduced
What is the gap threshold?
- Gap threshold (Δt) is defined as the minimal gap duration that can be identified.
- Below that, the subject tells the sound to be continuous
How is gap detection measured?
Gap detection can be measured with a behavioral test or with an objective test such as evoked potentials
Explain the gap markers
- Before the gap you have a pre-gap marker (first signal) and after the gap there is a post-gap marker (second signal)
- The signal can be a tone burst or a noise burst (noise burst is better)
Explain gap markers
- what noise is used
- what type of noise can contaminate frequency cues?
- Can use broadband noise and narrow band signals
- Contamination of Frequency cues (if narrow band signal is used)
- An issue when using narrow band signal and sudden on/off
How to overcome the frequency cues? (3)
- Masking with notch noise: can also splatter when turning masker on and off
- Bandpass filter to get rid of contamination: cannot always eliminate contamination
- Ramping: making it difficult to define gap duration - causes the gap to be unclear
Can notch noise have frequency splattering?
Sometimes the notch noise can have frequency splattering
What sound has splattering?
When you use tone burst, you need to think about the frequency splattering at onset and offset (provides frequency cues, not temporal cues), so you are unable to tell the temporal resolution as it is contaminated
Can bandpass filter always eliminate contamination?
No
Gap detection is based upon sensation change around gap
- what can the pre and post gap marker be different in and what does this cause
- Pre- and post-gap marker can be different in terms of amplitude, duration and frequency.
- If different in frequency, then tests cross-channel
- If both are same in frequency, within channel test.
Explain how we detect a gap (and how that happens)
- The sensation change is due to the onset and offset of signal
- When the signal turns on, our sensation takes time
- When the signal turns off, our sensation of sound gradually goes down (it takes time)
- If the gap is short enough, the sensation will not start from zero, but from the declined curve (whatever has not disappeared from the pre gap off)
- Then, the sensation takes time to reach plateau
Explain delta s and delta t with gap detection
- Delta s depends upon the gap
- If delta s is smaller, the off-set marker (in the second tone) will continue to get higher up on the first tone
- Shorter delta t, delta s is reduced (and eventually become 0) and the second tone will not be sensed (sensed as only one tone)
Effect of intensity: equal marker intensity
- The impact of sound level is seen near hearing threshold
- Not changed by sound level well above threshold
Examining the effect of gap marker bandwidth
- Gap marked with equal sounds
- Markers: broadband signals and narrow signals
- The broader the marker BW, the better (smaller) the gap detection threshold
Our system can integrate information across broad frequency region to improve ____
Temporal resolution
What is a very influential aspect of gap detection?
Gap marker BW
smaller thresholds with broader ____
Bandwidth
The broader the bandwidth of a gap marker, the lower the ____
Threshold
The impact of hearing loss on gap threshold
- High-frequency HL: deteriorated gap threshold when using broadband markers
- Attributed to the reduced audibility
- Evidence: When high-pass masking is used in normal hearing subjects, similar changes were seen
- High frequency channels have better temporal resolution
Gap detection performance goes down (the gap threshold gets larger) with ____
Hearing loss (especially SNHL)
SNHL and fake HL
- High pass filter just masks high frequency region, creating a fake hearing loss (if you compare artificial hearing loss and real hearing loss, there is no difference)
- When you have HF HL, you naturally reduce the bandwidth of gap marker (if you use a broadband signal, the HFs will be useless due to HL)
- This is why individuals with HF HL have poor temporal resolution
Gap detection in different setting (3)
(a): within channel (same frequency band)
(b) /(c) between channels (different)
(c): diff in onset (must be “between channels”)
Detection of Sinusoidally Amplitude Modulated Noise - modulation depth
- % modulation:
- Average amplitude/p-p% (peak to peak);
- or (peak - trough)/average, in %
- dB: 20log(m), with the depth m written as a proportion, e.g., 10%: 20log(0.1) = -20 dB
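The percent-to-dB conversion for modulation depth can be checked with a short sketch (the function name is illustrative; the base-10 logarithm is assumed, as is standard for dB):

```python
import math


def modulation_depth_db(depth_percent: float) -> float:
    """Convert modulation depth in percent to dB: 20 * log10(m), m as a proportion."""
    m = depth_percent / 100.0
    return 20.0 * math.log10(m)


for pct in (100, 50, 10, 1):
    print(f"{pct:>3}% modulation -> {modulation_depth_db(pct):.1f} dB")
```

Full (100%) modulation is 0 dB, and each tenfold reduction in depth lowers the value by 20 dB, so smaller (more negative) dB values mean shallower modulation.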
Detection of Sinusoidally Amplitude Modulated Noise - minimal depth of modulation
Minimal depth of modulation (that subject can detect)—detection threshold with modulation frequency—Modulation transfer function (MTF)
What are the 3 different ways to modulate noise?
- modulated vs. unmodulated
- modulation with different MF
- modulation with different depth
What is the modulation transfer function (MTF)?
The MTF addresses the ability to detect the presence of amplitude modulation in a sound
Explain what type of function MTF is
- normal
- temporal processing deficits
- The MTF is typically a low-pass function: the modulation depth threshold gets larger at higher modulation frequencies (MF).
- In subjects with temporal processing deficits, an even larger modulation depth threshold is seen at high MF.
Neural Coding for ____ Analysis
time
Temporal Processing in Cochlea
- Phase locking or synchronization
- Envelope coding
Demonstration of envelope coding
- PSTH
- ISIH
- PRH
- These require repeated stimulation
How are neurons really coded in your brain?
Imagine the real neuronal envelope coding as working by the volley principle (this is what really happens in your brain)
Example of ISIH
- Inter spike interval histogram
- If phase locking is perfect, each individual auditory nerve will produce 1 spike per period
- When the frequency of signal is high, the auditory nerve cannot follow that (can’t see a clear interval or period)
- By chance, we are likely to see the interval the same as 1 period (1 spike per period)
- If the AN skips one or two periods, you will see an interval of 2 or 3 periods
- This is how we show periodicity of response
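The ISIH idea can be sketched with a toy spike train. Everything below is hypothetical: the 500 Hz tone, the list of cycles on which the fibre fires, and the assumption of perfect phase locking (spikes always at the same phase).

```python
from collections import Counter


def isi_histogram(spike_cycles, freq_hz):
    """Count inter-spike intervals, expressed in whole stimulus periods."""
    period_ms = 1000.0 / freq_hz
    # Perfect phase locking is assumed: each spike sits at the same phase of its cycle.
    spike_times_ms = [c * period_ms for c in spike_cycles]
    intervals = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    return Counter(round(i / period_ms) for i in intervals)


# Hypothetical fibre: fires on these cycles of a 500 Hz tone, skipping some cycles.
cycles = [0, 1, 2, 4, 5, 8, 9, 11, 12]
hist = isi_histogram(cycles, 500)
for n_periods in sorted(hist):
    print(f"interval = {n_periods} period(s): {hist[n_periods]} occurrence(s)")
```

All intervals land on integer multiples of the 2 ms period, with most at 1 period and fewer at 2 or 3, which is the pattern the cards describe.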
Synchronization in AN firing
- Increasing sound level causes better phase locking (the spikes are more likely to occur at a certain phase)
- Distribution becomes narrower and narrower with better phase locking
Summary of temporal coding by auditory nerve
- how do ANs encode temporal information of sound?
- how is phase locking established?
- Auditory nerves encode temporal information of sound by phase locking
- Phase-locking is established by integration of responses from many neurons.
In reality, do we need to repeat stimuli many times to detect sound?
In reality, we don’t need to repeat stimuli many times to detect sound (volley principle)
Modulation Transfer function of Single Neurons
- Modulation transfer function typically shows low pass function
- However, by a single neuron, this is different (bandpass transfer function)
MTFs of IC single neurons: different best MF
- Results show a very sharp MTF for each individual neuron
- The peak points are the best modulation frequency
- For each neuron, there is a typical best modulation frequency (can detect temporal modulation better at a specific frequency)
Majority of neurons have best modulation between ____ (results from IC)
30-100 Hz
The behavioral modulation transfer function has a ____ pattern; the single-neuron modulation transfer function has a ____ pattern
Low pass, band pass
Place code for best modulation frequency
- Concentric distribution of neurons with the same BMF
- The neurons on the surface of the cone show the same BMF
- The iso-BMF surface is cone-shaped, tapering toward the dorsal side (low frequency)
- Another example of place code in auditory processing
In the central auditory cortex, neurons with the same CF are distributed on a ____
Flat plane
Neurons with the same best modulation frequency in the IC are distributed in a ____
Cone shape
Neuronal responses follow the ____ of the signal
Envelope
Masking Pure Tones with White Noise: increased masked threshold with frequency
- 10 dB/decade = 3 dB/octave
- Masking with white noise is more effective on the high frequency side due to white noise density (it will mask more)
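The equivalence between 10 dB/decade and about 3 dB/octave follows from the number of octaves in a decade. A small check (function name is illustrative):

```python
import math


def db_per_octave(db_per_decade: float) -> float:
    """Convert a slope in dB/decade to dB/octave."""
    octaves_per_decade = math.log2(10)  # about 3.32 octaves in a tenfold frequency change
    return db_per_decade / octaves_per_decade


print(f"10 dB/decade = {db_per_octave(10):.2f} dB/octave")
```

The result is 3.01 dB/octave, the same number as 10*log10(2): doubling the frequency doubles the critical bandwidth and therefore the effective masker power.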
Signal(energy)/Noise (spectrum) level required for detection
- For a higher frequency signal, we need a larger SNR to hear.
- Higher masking effect.
- Remember the spectrum of the noise is flat.
Concept of Sound Density
- White noise has equal density across frequency
- Density = power in unit frequency range.
- Total power in a frequency range = density * delta f
- The energy/power effective for masking exists in the critical band: P = density * CB
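In dB terms, P = density * CB becomes "spectrum level plus 10*log10(bandwidth)". The sketch below assumes a flat (white) noise spectrum; the 20 dB spectrum level and the two CB widths are made-up values for illustration only.

```python
import math


def band_level_db(spectrum_level_db: float, bandwidth_hz: float) -> float:
    """Level of a flat-spectrum noise band: spectrum level (dB per Hz) + 10*log10(bandwidth)."""
    return spectrum_level_db + 10.0 * math.log10(bandwidth_hz)


# Hypothetical white noise at 20 dB per Hz, seen through a narrow and a wide critical band.
for cb_hz in (100, 1000):
    print(f"CB = {cb_hz} Hz -> effective masker level = {band_level_db(20, cb_hz):.1f} dB")
```

With the same noise density, the wider critical band collects 10 dB more masker power, which is why white noise masks high-frequency tones more effectively.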
Concept of Critical Band (CB)
- For a particular signal, only the energy in a certain band around the frequency of this signal impacts the hearing of this signal. This band is called the critical band
- It can also be defined as the frequency spectrum that one neuron will respond to.
- Therefore, in a broad band masker, only the energy in CB will produce masking.
Effective masking and critical band
- Noise energy beyond CB is not useful
- Only energy inside CB will produce masking
The width of Critical band changes with ____
CF
Effective masking increases with frequency because the CB increases with ____
Frequency
Higher the frequency, larger the ____
CB
Effective Masker: bandwidth consideration
- Only energy within CB is effective
- CB increases with CF in linear scale
- But it stays constant on a ratio scale: 20% of CF, or 1/3 octave around CF
- Therefore, the masker for pure tone should be narrowband noise of 1/3 octave.
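To see how close "1/3 octave" is to "20% of CF", the width of a 1/3-octave band can be computed directly. This sketch assumes the band is geometrically centred on CF (edges at CF * 2^(-1/6) and CF * 2^(+1/6)), which is the usual convention but is not stated in the cards.

```python
def fractional_octave_bandwidth(cf_hz: float, fraction: float = 1 / 3) -> float:
    """Width in Hz of a band spanning `fraction` of an octave, geometrically centred on cf_hz."""
    lower = cf_hz * 2 ** (-fraction / 2)
    upper = cf_hz * 2 ** (fraction / 2)
    return upper - lower


for cf in (1000, 2000, 4000):
    bw = fractional_octave_bandwidth(cf)
    print(f"CF = {cf} Hz: 1/3-octave band = {bw:.0f} Hz ({100 * bw / cf:.0f}% of CF)")
```

The 1/3-octave band comes out at roughly 23% of CF at every centre frequency, so the two rules of thumb agree approximately rather than exactly.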
What type of masker is used in clinic?
Narrowband masker (to reduce the total level of masking so it is more acceptable by client)
When ____ is used, the effective masker level actually increases with ____
white noise, CF
Therefore masked threshold increases with ____
CF
Measuring CB with masking
- Keep the total intensity of the masker the same
- Increase bandwidth of the masker from zero
- Within CB, masked threshold should be? Maintained
- When beyond CB, masked threshold will be? Decreased (because some energy gets lost, and threshold goes down)
- The turning point tells CB.
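The band-widening procedure above can be sketched with a simple power model. Assumptions, none of which come from the cards: the masker has a flat spectrum centred on the signal, the critical band is a rectangular filter, the masked threshold tracks the masker power falling inside the CB, and the CB is 160 Hz (a made-up value).

```python
import math


def masked_threshold_shift_db(masker_bw_hz: float, cb_hz: float) -> float:
    """Masked-threshold change (dB) relative to the within-CB plateau,
    for a masker of fixed total power."""
    # Fraction of the (constant) total masker power that still falls inside the CB.
    fraction_inside_cb = min(1.0, cb_hz / masker_bw_hz)
    return 10.0 * math.log10(fraction_inside_cb)


CB_HZ = 160  # hypothetical critical bandwidth
for bw in (40, 80, 160, 320, 640):
    print(f"masker BW = {bw:>3} Hz -> threshold shift = {masked_threshold_shift_db(bw, CB_HZ):.1f} dB")
```

The shift stays at 0 dB up to 160 Hz and then falls by about 3 dB per doubling of bandwidth; the knee of that curve is the CB estimate.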
What happens to the masked threshold within the CB?
Threshold won’t change
What happens to the masked threshold beyond CB?
- Beyond CB, the masked threshold will decrease, because some masker energy got into other channel so that is not effective.
- Outside the CB, energy from the masker is useless (only energy inside the CB is useful)
How do we need to change the level of the masker to have the signal just masked?
-within CB
-beyond CB
- Within CB: masker level should not be changed.
- Beyond CB: masker level should be increased
Test hearing threshold while increasing signal bandwidth within CB. What will happen?
The threshold will not change, energy is in the CB
However, if the signal frequency range is beyond CB, then what will happen?
Beyond CB, the power thins out so there is not enough energy to evoke a response within CB. The total sound level has to be boosted to reach threshold, so the measured threshold goes up.
Co-modulation masking release
-signal band vs. flanking band
- Signal band: band around signal
- Flanking band: band far apart from the signal band
- Flanking bands do not change the masked threshold because they are far away from the CB
Comodulated vs. Uncomodulated
- Comodulated: see release or decrease in masked threshold
- Uncomodulated: no change in masked threshold
Does co-modulation change the CB?
Does not change the CB
Adding of more maskers in flanking band reduces ____, if co-modulated
Masking in the signal band
Overshoot
- Masking effect depends on the time relationship for signal in masker
- Larger masking when the signal is close to the onset of masker
- Up to 10-15 dB
- Disappears when the delay between masker onset and signal onset is > 200 ms
Masking effect is larger when signal is closer to the ____ of the masker
Onset
When the signal moves away from onset, masking effect ____
Reduces and plateaus (>200ms)
Temporal masking
The masker can be presented after signal (backward masking) or before (forward masking), or combined
Monotic, dichotic, and diotic
- Monotic = signal and masker go to the same ear
- Dichotic = signal goes to one ear, masker goes to other ear (no interaction between masker and signal in cochlea)
- Masking occurs in the brain
- Diotic = real life
What hypothesis does forward masking use?
Forward masking uses the line-busy hypothesis: the vibration produced by the masker leaves the cochlea partially occupied, and this occupancy declines with time after masker offset. This is why the masking effect goes down with time. (This does not happen in backward masking, because the signal occurs earlier.)
Where does the largest masking occur in forward masking?
Forward masking = largest masking occurs closer to the offset of the masker
- Masking effect reduces with time as the cochlea occupation goes down
Where does the largest masking occur in backward masking?
Backward masking = largest masking occurs closer to the onset of the masker
Mechanisms for temporal masking
- Forward masking is relatively clear:
- Overlap in BM vibration,
- Neural adaptation,
- Central masking (indicated by cochlear implant)
- Backward masking: not sure if there is a central role
Similarities between central and peripheral masking
Difference
- Masker to contralateral ear
- Similarities between central and peripheral masking:
- Frequency relationship
- The masking effect and time-relationship between masker and signal
- Difference: much smaller threshold shift in central masking
Is there interaction between the masker and cochlea in central masking?
No
Closer the masker and signal = larger the ____
Masking effect
Is the central masking a large effect?
Central masking causes a smaller threshold shift (the masking effect is not as large for central masking as it is for peripheral masking)
Central masking is ____
Dichotic
Peripheral masking ____
Monotic
Masking effect is stronger at the ____ of the masker
Onset or offset
Informational masking
- Interaction between masker and signal at higher level of auditory pathway
- No overlap between masker and signal
- Opposite to peripheral masking (in cochlea), which is also called energetic masking
- Depends upon the overlap of the masker and the signal
- Targeted tone (or speech) in the presence of multi-frequency masker (similarity and uncertainty impacts performance)
- Test masked threshold in 2IFC
- Subject chooses which one contains a signal
- There is a central component because it relies on context
- Frequencies of the masker randomized
- CB around targeted signal is “protected”—to avoid energetic masking
Central masking = max masking effect of ____ dB
15
Informational masking = max masking effect is ____ dB
30
Informational masking is used in ____
2IFC
The informational masking of ____dB is typically larger than the effect of central masking (in dichotic presentation)
30
____ has a smaller masking effect than ____
Informational masking, energetic masking
Informational masking by uncertainty
- The effect due to randomization
- Larger the randomization, larger the effect
- Difference becomes smaller with increasing number of components in the masker
More uncertainty with ____
Fewer components (harder to hear the signal)
More certainty with ____
More components (easier to hear the signal)
Larger the randomization, larger the effect of ____
Masker (harder to hear the tone)
Why do we need to know the masking and the mechanisms? (4)
- To understand how masking changes our hearing.
- To use masking as research tools.
- Notice the gaps between what we have discussed and what we need for the signal detection under masking.
- Further learning is required.
How do we detect signals in noisy background? (4)
- Spatial filtering: Detect signals by differentiating the source from noise—depending on binaural process.
- Spectral filtering/frequency selectivity: selectively filter out noise—but won’t work if the spectrum of noise is largely overlapped with that of signals.
- Temporal filtering: distinguish signals based upon the time difference, e.g., signals in the trough of noise.
- Cognitive processing: Detect signals by using experiences (familiarity to the signals)—depending on top-down process. This is shown in part of “attentional filtering”.
- At a party you recognize familiar voices
The neuro-mechanisms contributing to signal detection in noise
- The function of binaural processing—related to temporal processing, inhibition, efferent etc; to spatial filtering.
- The role of inhibition to spectral filtering and other process.
- The role and the mechanisms of high temporal resolution in the auditory system.
- The role of low-SR ANFs and efferent control of them on noise resistance in hearing.
- The interaction between ascending and descending pathways.
- The role of cochlear efferent control—the masking release effect.
- Cognition and selective attention
- This changes with HL and age
Two methods considerations for frequency and pitch
Pulsed signals and frequency modulation (FM)
How to reduce frequency splattering at on/off for pulsed signals (gated pedestal)?
- Use (slow) ramp
- Masking (such as notch noise)
Cannot present two tones simultaneously for frequency discrimination, because of ____. Therefore, ____ is not useful.
pitch fusion, continuous pedestal
Using FM, the frequency discrimination limen increases with ____
Baseline frequency
What is ramping?
Slowly increase volume to turn on and slowly decrease volume to turn off
Frequency discrimination limen (DL)
Discrimination limen is the smallest change in frequency that you can detect
DL - what if the baseline frequency is below 500 Hz?
When the baseline frequency is below 500 Hz, the DL is constant (it is easier to detect a difference in frequency at low frequencies, below 500 Hz)
DL - what if the baseline frequency is above 500 Hz?
Above 500 Hz, the DL increases with frequency (a larger difference is needed to notice a change)
Discrimination limen is best at ____ (for FM)
Low frequency
Weber’s fraction obtained with FM signals
- Slight decrease w/ intensity at high frequency, much larger at low frequency
- Weber’s law: correct above 500 Hz
- WF = 0.7% = 0.007
- Below 500 Hz, delta F doesn’t change, but above 500 Hz, it does (pure tone)
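The two regimes (constant delta F below 500 Hz, constant Weber fraction above) can be put into one rough sketch. The value used below 500 Hz is an assumption: it is simply set equal to the 500 Hz value (0.007 * 500 = 3.5 Hz) so that the two regimes join smoothly; the cards do not give that number.

```python
def fm_frequency_dl_hz(f_hz: float, weber_fraction: float = 0.007,
                       corner_hz: float = 500.0) -> float:
    """Rough frequency DL for FM signals: constant below corner_hz, WF * f above it."""
    # Below the corner, max() pins the DL at its corner value (an assumed constant).
    return weber_fraction * max(f_hz, corner_hz)


for f in (125, 250, 500, 1000, 2000, 4000):
    dl = fm_frequency_dl_hz(f)
    print(f"f = {f:>4} Hz: DL = {dl:.1f} Hz, Weber fraction = {100 * dl / f:.2f}%")
```

Below 500 Hz the DL in Hz is flat while the Weber fraction grows; above 500 Hz the Weber fraction holds at 0.7% while the DL in Hz grows, which is the sense in which Weber's law is "correct above 500 Hz".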
Using FM, the frequency DL will increase with the ____
Baseline frequency
The JND in frequency DL gets bigger with ____
Increasing frequency (it’s getting worse)
Results from non-FM signals (gated pedestal of tones)
- Pulsed tone, or band noise with different cutoff
- deltaf and deltaf/f are smaller than FM by factor of 3
- WF: 0.2% (versus 0.7% for FM)
- Level dependent at low SL
Using the FM method, WF is ____ as when using pulsed signals. What range does this difference apply to?
- 3x larger
- This difference mainly applies to low-middle frequencies up to 2-3 kHz; above that, FM is better than pulsed signals
Pulsed better at ____, FM better at ____
Low frequency, high frequency
What are the results of gated pedestal?
- where is the smallest delta f?
- When the signal level is way above the threshold, the impact of level becomes smaller
- The smallest delta F is around 1 Hz
- Sound level way above threshold, we can discriminate the frequency as small as 1 Hz
For gated pedestal, uncertainty at ____ and ____
Low SL, low frequency
For gated pedestal, at ____, more of a sound level impact
Low frequency
Explain increasing the sound level at low vs high frequencies (for gated pedestal)
At LF, increasing the sound level improves discrimination performance, but there is no more improvement well above threshold. This improvement does not have as large of an effect for HF.
Impact of the interval
- In intensity discrimination, the interval between the two pulses impacts the performance: the larger the interval, the larger the discrimination threshold (decay of short-term memory)
- In frequency discrimination, the increase of the interval improves the performance in a certain range.
- Likely due to the reduction of pitch fusion with increasing interval.
- For longer interval, performance will go down.
When the interval is very small, there is a chance that the two tones will fuse together to make ____
One pitch (this makes frequency discrimination difficult)
Comparison between pulsed signal and FM (2000 Hz)
- Poorer performance using FM in frequency < 2000 Hz
- Better performance using FM in frequency > 2000 Hz.
Pulsed signal gives better performance by a factor of ____
3 (in the middle frequency region)
What is DLF?
DLF: difference limen for frequency. Two tone pulses are presented in sequence (two pairs), and subjects indicate in which of the two successive pairs the second pulse was higher in frequency.
What is DLC?
DLC: difference limen for change. Subjects indicate which pair differed in frequency in two successive pairs of two tone pulses (2IFC).
Overall, in the low frequency region (<2k), ____ result in better performance in frequency discrimination
Pulsed signals
Effect of Stimulus Duration: Weber’s fraction is reduced with ____
Increasing duration
The effect of increasing duration is a ____ effect on frequency discrimination
Temporal summation
Beyond ____, you will not see better frequency discrimination with duration
200ms
Increase the BW within CB; density of the sound will ____
Decrease
However, this change in density will not change the threshold, if ____
BW is in CB
Signal bandwidth within CB, threshold will ____
Not change
Signal bandwidth beyond CB…
- Signal energy leaks into other bands and is wasted there (it is below threshold and useless); therefore, the energy inside the CB becomes smaller and falls below threshold (the sound will not be heard)
- Bandwidth beyond the CB, we need to increase the sound level, therefore increasing threshold
- This is threshold testing
Define CB by masking procedure
Only the energy of a masker in CB around probe tone makes contribution to masking
Only energy around the CB produces ____
Effective masking
Loudness sensation - When BW<CB, loudness will?
Within CB, sense of loudness is the same
Loudness sensation - When BW> CB, loudness will?
Beyond CB, sense of loudness becomes louder (stimulating more of the cochlea)
Define CB by AR
- When the signal goes beyond CB, more auditory channels are activated and sound is louder (AR will be stronger) and threshold goes down.
- When we test absolute threshold, the energy that goes to other bands is wasted and the threshold goes up (because AR is way above threshold)
Why is the impact of BW on the two thresholds (one for hearing sensitivity and the other for AR) different?
- Hearing threshold goes up with BW
- The AR threshold goes down with BW
What is the lowest CB?
80Hz
CB= ____ Hz to CF up to 500 Hz
100
Beyond 500 Hz, CB = ____, or ____ octave
20%CF, 1/3
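The two-part rule (100 Hz up to 500 Hz, 20% of CF beyond) can be written as one small function. This is only the rule of thumb from these cards, not a fitted psychoacoustic formula, and it ignores the 80 Hz figure quoted for the lowest band.

```python
def critical_bandwidth_hz(cf_hz: float) -> float:
    """Rule-of-thumb critical bandwidth: 100 Hz up to 500 Hz, 20% of CF above."""
    if cf_hz <= 500.0:
        return 100.0
    return 0.2 * cf_hz


for cf in (250, 500, 1000, 2000, 4000):
    print(f"CF = {cf:>4} Hz -> CB = {critical_bandwidth_hz(cf):.0f} Hz")
```

The two branches meet at 500 Hz (0.2 * 500 = 100 Hz), so the rule is continuous across the corner frequency.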
The width of one CB should be around ____
100Hz
CB = 20% of CF, therefore, each CB contains 100 JDDF if…. ____
Below 500 Hz
Unit for critical band: ____
Bark
Critical band rate scale: broad spectrum into ____
CBs
1st: from ____, 2nd from ____
0-100Hz, 100-200 Hz
From 0-16000Hz, we have ____ abutting CBs
24
6.3x4 = 25.2, in each CB, there are ____ just detectable steps of frequency change
25
The fre. change for ____ length of BM
0.2 mm
From 0-16000Hz, we generally have ____ steps
600
In total, we have ____ IHCs in each ear
~3600
We have ____ IHCs in each step
6
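The numbers in the last few cards chain together by simple arithmetic, using only the values quoted above (variable names are illustrative):

```python
N_CRITICAL_BANDS = 24  # abutting CBs from 0 to 16000 Hz
STEPS_PER_BAND = 25    # just detectable frequency steps per CB
TOTAL_IHCS = 3600      # approximate number of IHCs per ear

total_steps = N_CRITICAL_BANDS * STEPS_PER_BAND
ihcs_per_step = TOTAL_IHCS / total_steps

print(f"total steps: {total_steps}")
print(f"IHCs per step: {ihcs_per_step:.0f}")
```

24 bands times 25 steps gives the 600 steps, and 3600 IHCs spread over 600 steps gives the 6 IHCs per step.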
Why is having 6 IHCs in each step redundant?
If you have 1 IHC in each frequency band, you are okay (as long as the OHCs are working) therefore, the number of IHCs are not huge (so having 6 IHCs in each step is redundant)
Neurological basis for frequency discrimination (how does our system analyze frequency?)
- Place code starting from ANFs, auditory channels with frequency selectivity, inherited in CAS
- OHCs (active amplification) increase the frequency selectivity of ANFs.
- Temporal coding enhances frequency coding in cochlea.
- Efferent control to the cochlea enhances frequency selectivity.
- Central inhibition enhances frequency selectivity by “masking” the response at edges, enhancing contrast.
- Without central inhibition, there will be very broad tuning
Sound-induced pitch exists only when the sound is ____
Heard
2 frequencies that are too close (but > JDD) may produce ____
Identical pitch (pitch fusion)
Unresolved vs. resolved pitch
- Resolved = two signals that can be differentiated based upon their activation of auditory channels (they can produce distinguishable vibration upon the cochlea)
- If the two signals are in two different CBs (widely different from each other) this is resolved; two different representations in cochlea
- Unresolved = no corresponding vibration in cochlea, or the vibrations produced by the two signals cannot be differentiated (too close to each other and fall into the same CB)
Harmonic vs. Non-harmonic pitch
- Harmonic: frequencies can be divided by the same integer number
- Non-harmonic: components do not have that relationship, they are different in terms of generating pitch
Entities vs partials
- Entities: overall impression about the whole sound (music at a concert); we appreciate the overall pitch produced by the instruments
- Partials: pitch represented by individual instruments (refers to the different frequency components of that instrument)
Integration vs. segregation - synthetic vs. analytic hearing
- Integration: normally we get this (easy for everyone)
- Segregation: need good training for this (a conductor tells when one person makes a mistake)
Analytic pitch vs synthetic pitch
- Analytic pitch: can hear the different parts (segregation)
- Synthetic pitch: focus on the whole sound (integration)
Pitch vs timbre
- Pitch: relating to the frequency component
- Timbre: a concept that is not clearly defined, it is the quality of sound (takes into account temporal changes)
- Takes into consideration the frequency components and the temporal changes of sound
Measurement of pitch
- Mel (Stevens): 1000 mels for a 1000 Hz tone at 40 dB SL
- Double or half: 2000 mels or 500 mels
- No linear relationship with frequency
Explain the doubling of mels and why it isn’t linear
- Starting at 1000 Hz (1000 mels), to reach 2000 mels you will need a frequency change of roughly 3 times to feel the doubling of mels
- Starting at slightly above 1000Hz (1200 Hz), you will need to increase the frequency more than 6 times to feel the doubling of mels
- This shows the relationship is not linear
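The non-linearity can be illustrated with one widely used analytic approximation of the mel scale, mel = 2595 * log10(1 + f/700). This formula is not from these cards and is not Stevens's original data, so its numbers will not match every figure quoted above; it is only meant to show the shape of the relationship.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Common analytic approximation of the mel scale (an assumption, not from the cards)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


start_hz = 1000.0
start_mel = hz_to_mel(start_hz)
doubled_hz = mel_to_hz(2.0 * start_mel)
print(f"{start_hz:.0f} Hz is about {start_mel:.0f} mels")
print(f"doubling the mel value needs about {doubled_hz:.0f} Hz "
      f"({doubled_hz / start_hz:.1f} times the frequency)")
```

Under this approximation, doubling the pitch from 1000 mels takes well over a threefold increase in frequency, not a twofold one.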
____ mels is one step that is covered by 6 IHC
4.5
____ mel = a shift of 12 neurons or 0.8 IHCs
1
____ mel = one bark = 1.3 mm = 150 IHCs
100
100 mel covers a distance of ____ across the cochlea
1.3mm
100 mel = ____ CB
1
Effect of intensity on pitch
- Mid-frequency, no effect
- High-frequency, pitch increase with intensity
- Low-frequency, pitch decrease with intensity
____ is the main contributor of pitch perception
Frequency
Explain pitch above and below threshold
- When we hear sound way above threshold, we have clear pitch perception
- When sound is close to threshold, pitch is not clear
Pitch changes with ____
Intensity level (this is frequency dependent)
At high frequency, what do we have to do in order to maintain the same pitch?
High frequency (7000 Hz): to maintain the same pitch as intensity increases, we have to decrease the frequency (or else the pitch will rise)
At low frequency, what do we have to do in order to maintain the same pitch?
Low frequency, we have to increase frequency to maintain pitch (or else we will feel pitch decrease)
Effect of duration on pitch
- Very short duration (<3 ms): tone sounds like click
- > 3-4 ms or 6 cycles is required to have pitch sensation
- > 10 ms for f>1000, clear tonal sensation; improved over duration up to 250ms
- > 250 ms, stable pitch sensation
We need more than ____ or ____ to have a good sensation of pitch
3ms, 6 cycles
Effect of ramping
- explain
- what gives better tonal sensation
- The longer the rise/fall time, the less spectral splatter
- But transients are represented more poorly
- So, better tonal sensation with slow ramping and longer duration
We can’t explain periodic pitch, but ____ makes a contribution to pitch sensation
Periodicity
Periodic pitch: no need for ____
Place code
Explain why periodic pitch has no need for place code
- Missing fundamental: interaction across harmonics causing temporal fluctuation (periodically)
- Residual pitch is not processed in the cochlea (not by place code)
- Low frequency region of cochlea is not required in producing pitch
- The missing fundamental relies upon the CF
Broadband noise produces pitch sensation when ____ to produce periodicity.
Modulated (by low frequency signal)
Temporal pattern (or periodicity) produces pitch; carried on by neurons with ____.
CFs as carrier frequency
Missing fundamental and the shifting
- Missing fundamental is well explained by periodicity theory. 60-Hz shift breaks the rule of common denominator.
- Integrated pitch: does not require vibration in 200 Hz region
- Shifting the frequency of each component by 60 Hz keeps the interval at 200 Hz, but the components no longer share 200 Hz as a common denominator; the pitch shifts slightly higher (due to shifting of periodicity)
- Because of the 60 Hz upward shift, I1 becomes shorter and pitch is increased (periodicity contributes to pitch perception)
Up shift of 60 Hz causes ____ of equivalent peak interval
Shortening
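A small sketch of why the 60 Hz shift "breaks the rule of common denominator". The component frequencies (800, 1000, 1200 Hz) are hypothetical, chosen only because they are harmonics of 200 Hz; the notes do not list the actual components used.

```python
from functools import reduce
from math import gcd

def common_denominator_hz(components_hz):
    """Largest frequency that divides every component (greatest common divisor)."""
    return reduce(gcd, components_hz)

def spacing_hz(components_hz):
    """Set of intervals between neighbouring components."""
    return {b - a for a, b in zip(components_hz, components_hz[1:])}

harmonic = [800, 1000, 1200]            # hypothetical harmonics of 200 Hz
shifted = [f + 60 for f in harmonic]    # every component shifted up by 60 Hz

print(common_denominator_hz(harmonic), spacing_hz(harmonic))
print(common_denominator_hz(shifted), spacing_hz(shifted))
```

The spacing stays at 200 Hz after the shift, but the common denominator drops from 200 Hz to 20 Hz. Listeners report a pitch slightly above 200 Hz rather than 20 Hz, so the common-denominator rule alone does not predict the result.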
Two signals with frequencies very close to each other will produce ____
Beats (modulation frequency is the difference between the two tones); this is a way to produce amplitude modulation
Why do beats happen?
The two tones move in and out of phase periodically
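A minimal sketch of why the beat rate equals the frequency difference, using the identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2). It assumes two equal-amplitude sine tones; the function names are illustrative.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Beat (modulation) frequency of two tones: the difference between them."""
    return abs(f1_hz - f2_hz)

def two_tone(f1_hz, f2_hz, t_s):
    """Sum of two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t_s) + math.sin(2 * math.pi * f2_hz * t_s)

def envelope(f1_hz, f2_hz, t_s):
    """Slowly varying amplitude of the sum; it repeats every 1/|f1 - f2| seconds."""
    return abs(2 * math.cos(math.pi * (f1_hz - f2_hz) * t_s))

print(beat_frequency(400, 410))      # beats per second for 400 + 410 Hz
print(envelope(400, 410, 0.0))       # tones in phase: maximum amplitude
print(envelope(400, 410, 0.05))      # half a beat period later: tones cancel
```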
Pitch sensation of 2-tone combination
- When diff is small, beat is produced (smooth modulation: 400+410 Hz)
- When diff is widened (but <CB), roughness rather than discernible beats (400+440 Hz)
- Further widening (>CB): separated pitches (400 +600 Hz).
Understand periodicity
- where must periodicity be seen?
- how should harmonics be separated?
- what does the separation cause?
- what if the separation is larger than CB?
- Periodicity must be seen in the cochlea, not only in the stimulation (in order to cause phase locking in ANF)
- Separation of harmonics must not be larger than the width of the critical band (i.e., the harmonics should be unresolved)
- So that the two harmonics can interact with each other, causing periodicity in the CB (unresolved).
- If the separation is larger than CB, you will hear separated pitches.
Problems of periodicity theory (4)
- Patterson: a phase change leads to a change in temporal pattern, but not in pitch
- Hall and Peters: the missing fundamental can be heard with sequential presentation of three harmonics (each 40 ms, interval 10 ms) in noise, but in quiet the pitch of each harmonic is heard
- Goldstein: pitch can be established by presenting different harmonics dichotically.
- Non-harmonic sounds can still produce pitch (such as dual-tone multi-frequency signal for telephone pads)
Physical Factors that Influence Pitch Perception
- Onset Time: two sets of partials with different onsets will be segregated as different partials
- Harmonic Partials—Principle of dominant component
3rd, 4th and 5th harmonic component are dominant (if > 10 dB SL)
Not fixed to harmonic number but to frequency range in which the sound is well resolved in cochlea
In speech, ____ harmonics are dominant
3, 4, and 5
____ contribute more to pitch perception
Middle frequencies
The effect of modulation and co-modulation
- what does modulation of one harmonic cause?
- what happens to the pitch when tones to each ear are commonly modulated?
- what happens to the harmonics when there is coherent modulation of all harmonic partials?
- what happens when you add more harmonics?
- Modulation of one harmonic component breaks down the entity
- Fused pitch when the tones to each ear are commonly modulated
- With coherent modulation of all harmonic partials, the harmonics remain an entity
- Number of Harmonics: adding a new component increases the sense of entity
Breaking down the entity means you can…
Hear the other pitches
If you present tones of different frequencies dichotically but modulate them with the same signal, you will get ____
Integrated pitch
The more harmonics you add, fundamental pitch becomes ____
More clear
If the spectrum is smooth, you get a ____
Better sensation of pitch (if it is not smooth, pitch perception is poor)
influencing factors of pitch (5)
- Tone Duration
- Sound Pressure Level
- Relative Phases: least important
- Spatial Origin (binaural hearing)
- Context Effects–Stream Segregation (see in lecture of binaural hearing)
Subjective Factors that Influence Pitch Perception
- Musical training
- People with good training in music have a stronger ability to identify partials
- Selective attention
Theories for pitch detection
- Spectral theory (two stages)
- 1 frequency analysis
- 2 pattern recognition (spectrum)
- Temporal theory
- Neither of those theories can account for all pitch perception
What is the spectral theory?
-Frequency analysis in the cochlea based upon place code
-Pattern recognition based upon spectrum (all ranges of hearing)
What is the temporal theory?
Phase locking, temporal relationship
What is a combined model for pitch perception?
Pattern perception model
Timbre (what are the 2 factors and what is the best example)
- Timbre is not clearly defined; it is roughly determined by 2 factors: spectrum and dynamic characteristics
- Music is the best example of timbre (different instruments)
Factors of timbre
- Spectral factor—steady-state feature or tone color
- Dynamic characteristics (separating percussive from blown instrument)—the role of signal envelope (temporal pattern).
____ has high dynamic attributes and spectral attributes
Piano
Binaural Summations/Benefits
- Increase loudness
- Improvement in differential limen
- Better perception in noise: spatial filtering
- Binaural fusion and beats
Binaural hearing increases loudness by ____
~6 dB
Binaural improvement in diff limen. especially at what frequency?
Binaural hearing is better than unilateral in discrimination, esp at low sensation levels (SL)
Binaural advantage in intensity discrimination
The binaural benefit cannot be attributed to binaural summation of loudness because it would require more than a 30 dB difference in loudness to produce such a difference in discrimination.
Binaural advantage in frequency discrimination
Binaural is better
Binaural benefit in perception in noise
- who is it seen in?
- who is it reduced in?
- what 2 things give this benefit?
- Seen in normal hearing subjects
- Reduced in subject with aging and SNHL
- Big benefit w/ binaural hearing aids and cochlear implants
Potential Mechanisms for binaural benefits of hearing in noise (5)
- Separates target sound from noise (spatial filtering)
- Improves discrimination
- Improves stream tracking of target sound
- Unmasking (via efferent control and others)
- Reduced in aging and SNHL
Binaural fusion
- binaural differences in what 3 things
- explain a fused image
- Binaural cues for acoustic image in space: Binaural differences in intensity, spectrum, and timing
- Fused image: we do not feel that two ears work separately, but…
- Dichotic signals can be different or similar, but should be connected in certain ways
Commonality is required for binaural fusion
- what is commonality
- what is an example of commonality?
- Binaural fusion from two ears receiving similar signals: commonalities
- Example of commonality:
- Co-modulation of harmonics presented dichotically (different components go to each ear).
- Different speech components to each ear: complementary for speech.
- Residual pitch harmonics are presented dichotically.
Binaural beats (BB) vs monaural beats (MB)
- ____ occurs in CAS, while ____ in cochlea.
- ____ occurs in lower frequency range than ____.
- ____ can occur at larger level difference between the two tones; one tone can be below audible level.
- ____ can occur at larger frequency difference between the two tones.
- For ____, the two tones must be closer in terms of level
- BB occurs in CAS, while MB in cochlea.
- BB occurs in lower frequency range than MB.
- BB can occur at larger level difference between the two tones; one tone can be below audible level.
- BB can occur at larger frequency difference between the two tones.
- For MB, the two tones must be closer in terms of level
Gestalt Principle
- Grouping units together
- More than simple addition
- Whole is larger than the simple sum of all parts
Cues for complex auditory task
Example: tracking a target talker in a cocktail party—multiple cues may be used.
- Spectrum profile of the talker’s speech
- Temporal stream of the speech
- Spatial separation/identification
- Many more (such as familiarity, dynamic cues etc)
- Bottom-up and top-down process involved.
Segregation by speed
- Two tones are played in sequence (high low high low high low)
- Depends on speed: when it is slow, you hear one stream; when it is fast, you hear the streams separately
Stream 1 and 2 Effect of both speed and frequency difference…
- When the frequency separation is larger, you hear two streams at higher speed. However, you always hear one stream when the frequency separation is small.
- This shows the impact of frequency separation on the streams
- Separation is high
- Separation is small (higher speed merges it into one)
Other phenomena and terms in signal processing
- Proximity (similarity): e.g.: similar signals for easy dichotic fusion
- Common fate (e.g., on and off together): if the onset and offset are the same, we are more likely to group the sounds into one stream
- Good continuation
- Primitive process - bottom up (based upon physical features of the sound)
____ learning is an active process in which learners construct new ideas or concepts based on their existing knowledge
Schema-based
Importance of common onset: example of common fate
- A: simple masking: one band of masker on the signal
- B: co-modulation masking release (CMR): reduced masking effect when the noise in the signal band and side bands are co-modulated.
- C: CMR disappears due to the mismatched onset of noise between the signal band and the side bands.
Example of primitive process: ____ leads to vowel sensation
FM (we won’t feel that it’s a vowel until it is frequency modulated)
Example of good continuation in vision
- Continuation is perceived when the object is blocked by a fence, not by a blank gap. Top-down process is involved, especially in the shape perception in the rightmost graph.
- If we replace the fence or block with a blank space, the continuity is no longer good
- When the silent gap is filled with noise, you feel the tone continue without interruption
picket-fence effect in hearing
- Bottom-up and top-down in combination
- We should be able to understand speech better when it is interrupted by noise rather than silent gap
Azimuth vs. elevation
- Horizontal plane/azimuth (the plane that we live in)
- Azimuth + vertical plane forms the location of any spot on 3D space
Localization vs discrimination
- Localization error: difference between apparent location and physical location
- Spatial discrimination: measured as minimal audible angle
Listening conditions
- Open field: stereophony (sound comes to both ears from a speaker) leads to extracranial localization (we feel the sound source outside our head)
- Closed field: using headphones leads to intracranial lateralization (we feel the sound inside our head)
- Reason: loss of external ear resonances with earphone hearing
Two general issues
- What are the cues for sound localization?
- How are they used?
Approaches (to answer the questions): behavior studies and neurological mechanisms
Duplex theory-for localization in azimuth
ITD/IPD
- Determined by the size of the head: the larger the head, the larger the ITD
- Humans: 22-23 cm, 660 microseconds (at 90 degrees azimuth)
- Time difference sensitivity: 10 microsec
difference across frequency
- Time converts to angle and phase
IID/ILD
- From shadow effect of head
- Larger for high frequency sound
ITD and IPD
- ITD = interaural time difference
- IPD = interaural phase difference
- Because of time difference, sound reaches both ears at different times
- Has to do with size of head and location
- Time difference can be converted into phase difference
IID and ILD
- IID = interaural intensity difference
- ILD = interaural level difference
- Our head is an obstacle that blocks the flow of sound
How does the time difference vary with angle in azimuth?
- Middle line: 0 degree (no time/phase difference no matter the signal)
- The difference becomes largest at 90 degrees (lateralized to your head)
- Past 90 degrees the difference becomes smaller again and returns to 0 at 180 degrees
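The shape of this curve can be sketched with the Woodworth spherical-head approximation, ITD = (r/c)(θ + sin θ). The formula, the head radius (8.75 cm), and the speed of sound (343 m/s) are assumptions brought in for illustration; they are not given in these notes.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air

def itd_seconds(azimuth_deg):
    """Approximate ITD for a source at 0-180 degrees azimuth (spherical-head sketch)."""
    if not 0.0 <= azimuth_deg <= 180.0:
        raise ValueError("azimuth must be between 0 and 180 degrees")
    # Behind the interaural axis the ITD mirrors the frontal value.
    folded = azimuth_deg if azimuth_deg <= 90.0 else 180.0 - azimuth_deg
    theta = math.radians(folded)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))

for azimuth in (0, 45, 90, 135, 180):
    print(azimuth, round(itd_seconds(azimuth) * 1e6), "microsec")
```

With these assumed values the ITD is 0 at 0 and 180 degrees and peaks at roughly 650-660 microseconds at 90 degrees, close to the 660 microseconds quoted for humans.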
Explain the shadow effect
- Shadow effect is remarkable (largest) at high frequency, because of the shorter wavelength
- Near ear: no shadow effect
- Far ear: shadow effect (sound attenuated by head)
At 250 Hz, the difference is less than ____ dB (low F)
4
At 10,000 Hz, the difference is ____ dB (high F)
20
ITD is determined by azimuth and has little effect of ____
Distance
ITD produces ____ for periodical signals
IPD
IPD also depends on ____: for a given ITD, high f will have a larger IPD value
Frequency
However, larger IPD does not equal ____
Stronger signal
For a fixed IPD, ITD decreases with ____
Frequency
IPD is a useful cue when < ____ degrees (F(IPD) = F(IPD - 360), where F is a trigonometric function)
360
When 360° > IPD > 180°, the IPD acts the same as ____
IPD < 180°
Useful phase difference must be smaller than ____ degrees
180
Half the period of the signal must be greater than the ____ (660 microsec)
maximal time difference
The higher the frequency, the shorter the ____ for IPD < 180°
Time difference
MTD and the highest frequency for useful IPD
- To make the ½ period >MTD, Frequency must be smaller than a value.
- In order to generate useful phase difference, signal frequency must be low
- The time difference between both ears does not depend on frequency; IPD does
If MTD = 0.7 ms (700 microsec), what is the max frequency to ensure this?
- To make period/2 > 0.7 ms, the period must exceed 1.4 ms, so frequency < 1000/1.4 = 714 Hz
- Therefore, the max frequency for IPD < 180° is ~700 Hz
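The arithmetic above can be written as a short sketch: half the period must exceed the maximal time difference, so the limit is f = 1 / (2 × MTD). The function names are illustrative.

```python
def max_frequency_for_useful_ipd(mtd_seconds):
    """Highest frequency whose half period is still at least the maximal time difference."""
    return 1.0 / (2.0 * mtd_seconds)

def ipd_degrees(itd_seconds, frequency_hz):
    """Phase difference produced by a given time difference at a given frequency."""
    return 360.0 * frequency_hz * itd_seconds

limit_hz = max_frequency_for_useful_ipd(0.7e-3)
print(round(limit_hz))                          # frequency limit for MTD = 0.7 ms
print(round(ipd_degrees(0.7e-3, limit_hz)))     # IPD at that limit, in degrees
print(round(ipd_degrees(0.7e-3, 500.0)))        # a lower frequency stays below 180
```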
In order to generate useful phase difference, signal frequency must be ____
low
The time difference between both ears does not depend on frequency; ____ does
IPD
Shorter the ITD, ____ for 180 phase difference.
higher the frequency
ITD and IPD - why low frequency?
- Temporal coding is better for low f
- No ILD available at low f
ITD and IPD - limitation
- Identity circle
- IPD must be < 180 degrees, and IPD changes with frequency
- At 90 azimuth, ITD of 650 microsec = ½ cycle of 770 Hz
- At 45 azimuth, ITD of 350 microsec = ½ cycle of 1400 Hz
- Close to 0 azimuth, ITD approaches 0 and the frequency limit increases, but low-frequency signals still give strong IPD cues
The size of phase difference should only be compared at the ____.
same frequency
Sound localization accuracy in azimuth
- Always the best at 0 azimuth
- No matter what types of cues are used
- Largest ITD/IPD/ILD at 90 degree
- But larger ITD/IPD/ILD does not mean stronger stimulation
- Neurons for localization are so organized that they work best at 0 azimuth
Duplex theory
- what does HF rely on?
- what does LF rely on?
- what is this study dependent on?
- High frequency, rely on ILD
- Low frequency, rely on ITD, predictable from sphere model
- This is dependent on a study using pure tone (real life is complex tone)
At high frequency, ____ also plays a role
ITD
Localization accuracy across frequency
- where is the poorest performance with the duplex theory
- Performance is poorest at middle frequencies (which is unusual, because this is where hearing performance is typically best)
- This poorer mid-frequency performance is what the duplex theory predicts
limitations of duplex theory
- We do not rely upon pure tone for localization
- High frequency sound can have time cues, e.g., when modulated by low frequency
- Pinna cues break down front-back confusion and the identical circle
Using loudspeakers or Earphones
- Why use earphones: to change phase, level independently in each ear (e.g., one ear receives stronger sound but later phase than the other)
- Halverson: 500 Hz tone, 0-180° change in phase converted to 0-90° azimuth
- A phase change leads to a change in image position when frequency < 1400 Hz
When using earphones
- Relative effectiveness of ITD and ILD can be evaluated
- Localization versus lateralization
- Sound trapped in head
- Due to the loss of pinna effect
- ITD: only works at onset and offset
- IPD: works for continuous signals
- ITD more important than IPD: Earphone test provide answers.
Contributions from the ITD at onset and offset
- In virtual hearing (via earphone): early onset in the near ear leads to sound coming from the nearer ear (the effect of onset discrepancy), whereas early offset in the near ear leads to sounds coming from the farther ear (the effect of offset discrepancy)
- Overall, the onset ITD is dominant: in real hearing, we localize sound based upon the onset discrepancy.
- Also there are studies comparing the effect of onset ITD and ILD: in virtual hearing, near ear can have early onset but weak sound.
Progress with improvements in technology
- From simple sound to complex signals
- Use of headphones: lost spectrum cues
- Digital technology can put back the spectrum cues
Minimum Audible Angle (MAA)
- Much more accurate than sound localization
- Largest around 1-3 kHz (middle frequency)
- Smallest at 0° azimuth
- Yost: IPD shift required for image shift remains constant when f < 900 Hz
- IPD shift required for image shift increases with original IPD
- Upper freq limit: 1200 Hz
- Concurrent MAA (CMAA): two signals at same time
Minimal angle for earphones
- Remains constant for up to 900 Hz
- Proportional to original (or baseline) phase difference
- At 500 Hz, a just detectable phase angle is 2 degrees or 11 microsec
MAA in lateralization
- Just noticeable phase diff at 500Hz: 2 degrees or 11 microsec
- At 1200 Hz: 12 degrees or 27 microsec
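The degree and microsecond values on this card are two ways of stating the same threshold. A minimal sketch of the conversion, time = (phase / 360) × period, with an illustrative function name:

```python
def phase_to_time_microsec(phase_deg, frequency_hz):
    """Time difference (microseconds) equivalent to a phase difference at one frequency."""
    period_microsec = 1e6 / frequency_hz
    return phase_deg / 360.0 * period_microsec

print(round(phase_to_time_microsec(2, 500), 1))     # 2 degrees at 500 Hz
print(round(phase_to_time_microsec(12, 1200), 1))   # 12 degrees at 1200 Hz
```

The results are about 11.1 and 27.8 microseconds, matching the 11 and (rounded down) 27 microsec listed above.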
MAA in localization
just noticeable phase difference at 100 Hz: 3 degrees or 5.83 microsec
MAA is much smaller than ____
localization error
Cone of confusion
At any point on the cone surface, binaural cues are the same.
Spectral cues: localization in middle plane
- Sources
- From ear canal resonance
- From pinna effect
- Head-related transfer function (HRTF)
- Spectrum (HRTF) change with direction
- Timbre changes
- Roles: localization in vertical plane & avoiding error in azimuth
Explain HRTF
- HRTF: the difference in sound spectrum between what is measured in open space and what is measured in the real ear canal near the eardrum.
- HRTF is directionally related with sound source
Spectral cues break down the ____
Confusion
Dynamic Cues
- Dynamic versus stationary (referred to location)
- Stationary stimuli become dynamic when the head is moving
- Break front-back confusion by moving head
- Head movement helps monaural localization
- Small effects reported
With unilateral hearing loss, you lose the ability to localize ____ (so you depend purely on ____)
sound, spectrum cues
Cues for distance estimation
- Sound level
- Ratio of direct-to-reverberant energy
- Spectral shape
- Binaural cues (ILD)
- Familiarity or experience
Precedence effect
- Sound that is heard first takes dominant role in localization
- Classical click experiment
- Fusion occurs when click interval is less than 5 ms.
- Summing localization: fused when the interclick interval is < 1 ms
- Localization dominance, when click interval 2-5 ms, pair interval between 10-100 ms.
- Discrimination suppression
____ time difference is more important than ____ time difference
Onset, offset
Localization dominance
When T1, T2 and T3 are small enough (T1, T2 < 5 ms, 10 ms < T3 < 100(?) ms), the listener reports that the sound comes from the left ear (the summing localization of the leading ear dominates the result).
Discrimination suppression (summing localization vs. localization dominance)
- The perceived location of the fused image is affected by the size of the delay between the two signals.
- Summing localization occurs for delays shorter than 1 ms, in which case the perceived location of the fused image is affected by both the leading and lagging clicks
- Localization dominance occurs when the location of the fused image is determined by the leading signal. This occurs when the delay between the first and second clicks is between about 1 to 5 ms.
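The delay ranges above can be summarized as a simple lookup. This is a sketch only: the 1 ms and 5 ms boundaries are the approximate values from these notes, treated here as sharp cut-offs, and the function name and return labels are illustrative.

```python
def precedence_region(delay_ms):
    """Classify the lead-lag click delay using the approximate ranges in the notes."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 1.0:
        # Both leading and lagging clicks influence the perceived location.
        return "summing localization"
    if delay_ms <= 5.0:
        # The fused image is localized at the leading click.
        return "localization dominance"
    # Beyond about 5 ms the two clicks are no longer fused.
    return "no fusion"

for delay in (0.5, 3.0, 8.0):
    print(delay, "ms:", precedence_region(delay))
```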
Masking level difference (MLD): difference between dichotic and diotic presentations
a) monotically (signal and masker to same ear) = sound not audible (0 dB)
b) diotically (signal and masker in both ears) = sound not audible (0 dB)
c) similar to a), but noise is added (signal monotic, noise diotic) = previously masked signal is audible (9 dB)
d) similar to b), but reverse the phase of noise = signal audible (13 dB)
e) similar to b), but reverse the phase of signal = signal audible (15 dB)
Larger MLD for ____ frequency
Low
Larger MLD for higher ____ of noise
Spectrum level
If baseline masking effect is small, ____ will also be small
MLD
Spatial distribution of binaural neurons
- Low F neurons in MSO and IC: EE type dominant
- High F neurons in LSO: IE type dominant; in IC: EI type dominant
Neuron response varies according to the…
Time response to both ears (they are mimicking the sound wave)
Neurons have certain sensitivity to certain ____
Time difference
MSO neurons have ____
Characteristic delay
What is the maximal interaural delay?
0.8 ms
IPD depends on ____
Frequency
Difference in latency is due to the signal traveling from the ____ to the neurons
Cochlea
Jeffress's coincidence theory
- Interaural delay is countered by neural delay to produce coincidence
- With the contra ear as the near ear, sound takes a shorter time to reach that ear (longer for the ipsi ear)
- The longer internal delay for contra stimulation is compensated by the shorter external delay
- Coincidence = certain neurons are excited by both ears at the same time because an internal delay mirrors the external delay; together, they make the stimuli from both ears arrive at the same time
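The idea of an internal delay cancelling the external delay can be sketched as a bank of coincidence detectors. Everything here is a simplified illustration: the 100-microsecond spacing of the characteristic delays is hypothetical, and real neurons respond in a graded way rather than by a single best match.

```python
def best_detector(itd_microsec, characteristic_delays_microsec):
    """Return the characteristic delay whose detector gets the most nearly coincident inputs.

    The external ITD and the detector's internal delay cancel, so the arrival
    mismatch at each detector is |ITD - characteristic delay|.
    """
    return min(characteristic_delays_microsec, key=lambda d: abs(itd_microsec - d))

# Hypothetical bank spanning the maximal interaural delay of 0.8 ms in each direction.
delays = list(range(-800, 801, 100))

print(best_detector(0, delays))      # source straight ahead
print(best_detector(300, delays))    # source off to one side
print(best_detector(-660, delays))   # source far to the other side
```

Each ITD excites a different detector most strongly, so the identity of the most active neuron indicates the direction of the sound.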
What are the 3 different ways to modulate noise?
- modulated vs. unmodulated
- modulation with different MF
- modulation with different depth