Midterm Review Flashcards
What occurs with finite vs. infinite duration of a signal?
Shortening duration causes spectral spreading or splatter.
Increasing duration reduces the spectral splatter.
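As a rough illustration of the duration effect, the sketch below assumes a tone that is switched on and off abruptly (a rectangular gate). For that case the main lobe of the spectrum is 2/T Hz wide between its first nulls, where T is the duration. The function name is made up for this example.

```python
def main_lobe_width_hz(duration_s):
    """Null-to-null main-lobe width (Hz) of a rectangularly gated tone: 2 / T."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return 2.0 / duration_s

# Shorter tones spread their energy over a wider range of frequencies.
for duration_ms in (5, 50, 500):
    width = main_lobe_width_hz(duration_ms / 1000)
    print(f"{duration_ms:>4} ms tone: main lobe about {width:.0f} Hz wide")
```

A 5 ms tone burst therefore covers roughly 100 times the bandwidth of a 500 ms tone.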
What is the effect of windowing signals?
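Windowing (tapering the onset and offset) lowers the spectral sidelobes, which increases stopband attenuation and limits spectral splatter. The cost is a wider main lobe, so frequency resolution is slightly reduced.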
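The reduction in splatter can be checked numerically. The sketch below compares an abruptly gated tone with the same tone shaped by a Hann window and measures how much energy lands 525 Hz away from the tone. The sample rate, tone frequency, duration, and probe offset are all arbitrary choices for illustration, and the helper names are hypothetical.

```python
import cmath
import math

FS = 8000        # sample rate in Hz (arbitrary choice)
TONE_HZ = 1000   # tone frequency in Hz (arbitrary choice)
N = 160          # number of samples, i.e. 20 ms at FS

def level_db(samples, freq_hz):
    """Level (dB) of the discrete-time Fourier transform at one frequency."""
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / FS)
                for i, x in enumerate(samples))
    return 20 * math.log10(abs(total) + 1e-12)

def splatter_db(samples, offset_hz):
    """Level at TONE_HZ + offset_hz relative to the level at the tone itself."""
    return level_db(samples, TONE_HZ + offset_hz) - level_db(samples, TONE_HZ)

tone = [math.sin(2 * math.pi * TONE_HZ * i / FS) for i in range(N)]
hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (N - 1)) for i in range(N)]
windowed = [x * w for x, w in zip(tone, hann)]

rect_db = splatter_db(tone, 525)
hann_db = splatter_db(windowed, 525)
print(f"abrupt gating: {rect_db:.1f} dB re: tone peak")
print(f"Hann window:   {hann_db:.1f} dB re: tone peak")
```

The Hann-windowed tone should show far less off-frequency energy than the abruptly gated one.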
What is the time vs. frequency domain trade-off?
Better time resolution = poorer frequency resolution
Poorer time resolution = better frequency resolution
Describe the envelope.
Slowly-moving, overall amplitude
Describe the stimulus fine structure.
Fast phase changes
Describe the general properties of pulses used in cochlear implants.
Biphasic pulses with a rate of ~1000 pps per electrode (~10000 pps)
Amplitude changes follow speech envelope
Describe current spread in cochlear implants.
Monopolar (MP) spread
- Ground electrode is located outside of the cochlea
- Current spread (bandwidth)
- Use monopolar stimulation in modern speech processing
- Electrical fields can only cover about 1/8 of the cochlea
- Bigger current spread is better because some part of the BM is getting stimulated
Different current spread configurations (bipolar and tripolar)
- Makes current spread smaller and smaller (can result in poorer speech understanding)
- Must consider interface between electrodes and neural tissue
Describe channel interactions in cochlear implants.
Only want one electrode responding at one time due to the interactions of the electrical fields.
Simultaneous firing can result in distortion of the acoustic signal.
How can signals create confounds in some psychoacoustical experiments?
Tone duration affects BW of the signal because of spectral splatter.
It is difficult for subjects with hearing loss because they may have better off-frequency listening than on-frequency listening to the CF of the target tone.
Signal with a wideband spectrum should be used, such as a pulse train.
Why do you not need to worry about off-frequency listening in CI listeners?
CI listeners are only going to receive information from the deliberately stimulated electrode
Why do you need level roving in experiments?
Level roving ensures that the subject is evaluating the differences in signals based on spectral shape and not on intensity.
It ensures that the subject is paying attention to spectral shape and not loudness (can result in better thresholds).
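A minimal sketch of level roving, assuming the stimulus is described by a list of component levels in dB and that one random offset is applied to the whole stimulus on each presentation. The 20 dB rove range and the function name are assumptions for illustration.

```python
import random

def rove_levels(component_levels_db, rove_range_db=20.0, rng=None):
    """Shift every component by the same random offset within +/- rove_range_db / 2."""
    rng = rng or random.Random()
    offset = rng.uniform(-rove_range_db / 2, rove_range_db / 2)
    return [level + offset for level in component_levels_db]

levels = [60.0, 60.0, 65.0, 60.0]   # third component carries a 5 dB "bump"
roved = rove_levels(levels, rng=random.Random(1))
print([round(level, 1) for level in roved])
```

The overall level changes from presentation to presentation, but the 5 dB difference between components (the spectral shape) is preserved, so overall intensity is no longer a reliable cue.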
What are psychometric functions?
Plot P(correct) as a function of stimulus level.
Can evaluate an individual’s threshold (subject hears stimulus 50% of the time)
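A psychometric function can be sketched as a cumulative Gaussian, assuming a yes-no detection task in which the 50% point is taken as threshold. The threshold and spread values below are invented for illustration.

```python
from statistics import NormalDist

def p_detect(level_db, threshold_db=20.0, spread_db=3.0):
    """Cumulative-Gaussian psychometric function: probability of hearing the stimulus."""
    return NormalDist(mu=threshold_db, sigma=spread_db).cdf(level_db)

# Probability of detection rises smoothly with level and crosses 50% at threshold.
for level in (14, 17, 20, 23, 26):
    print(f"{level} dB: P(detect) = {p_detect(level):.2f}")
```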
What is the definition of threshold?
Minimum detectable level of sound in the absence of any other external sounds.
Why is the traditional definition of threshold inaccurate?
Due to a subject’s internal decision criterion to determine if the stimulus is present or not present (Signal Detection Theory)
What are the different types of tasks?
1) Detection (Yes-No)
- Subject answers if the signal is present or if it’s not
2) Discrimination (Forced-Choice)
- Subject determines in which interval the signal is present
3) Scaling
- Subject rates stimulus, or stimulus property, on a scale
4) Matching
- Subject makes a particular sound like the other (e.g., pitch and loudness)
What are the assumptions underlying SDT?
1) There exists an internal decision variable that is monotonically related to the magnitude of the stimulus
2) The value of the decision variable fluctuates from trial-to-trial
3) The variability arises from noise (internal or external), which follows a Gaussian distribution
4) Addition of a signal only changes the mean of the distribution, not the variance
5) The subject establishes a cutoff point, the decision criterion, to determine if a signal is present or not
How can you calculate hits, misses, false alarms, and correct rejections from sample distributions?
P(Miss) = 100 - P(Hit)
P(CR) = 100 - P(FA)
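The four outcomes can be read off two Gaussian distributions and a criterion. The sketch below assumes the equal-variance SDT model (signal shifts the mean only) and reports percentages. The means, criterion, and function name are illustrative assumptions.

```python
from statistics import NormalDist

def yes_no_outcomes(signal_mean, criterion, noise_mean=0.0, sd=1.0):
    """Outcome percentages for a yes-no task under equal-variance Gaussian SDT."""
    noise = NormalDist(noise_mean, sd)
    signal = NormalDist(signal_mean, sd)
    p_hit = 100 * (1 - signal.cdf(criterion))  # signal trials above the criterion
    p_fa = 100 * (1 - noise.cdf(criterion))    # noise trials above the criterion
    return {"hit": p_hit, "miss": 100 - p_hit, "fa": p_fa, "cr": 100 - p_fa}

outcomes = yes_no_outcomes(signal_mean=1.0, criterion=0.5)
for name, value in outcomes.items():
    print(f"P({name}) = {value:.1f}%")
```

Moving the criterion changes hits and false alarms together, which is why threshold depends on the subject's decision criterion.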
Describe the relationship between hits, misses, false alarms, and correct rejections.
P(Hit) + P(Miss) = 100% (both come from signal trials)
P(FA) + P(CR) = 100% (both come from no-signal trials)
P(Hit) + P(FA) has no fixed total (they come from different trial types)
How can you calculate percent correct in yes-no task?
P(correct) = (P(Hit) + P(CR)) / 2, assuming equal numbers of signal and no-signal trials
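As a small worked example, P(correct) is the average of P(Hit) and P(CR) when signal and no-signal trials are equally likely. The sketch below also allows an unequal proportion of signal trials. The `p_signal` parameter is an addition for illustration.

```python
def percent_correct(p_hit, p_cr, p_signal=0.5):
    """Overall percent correct, weighting hits and correct rejections by trial type."""
    return p_signal * p_hit + (1 - p_signal) * p_cr

# 80% hits and 60% correct rejections with equal trial counts.
print(percent_correct(80, 60))
```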
Describe how typical thresholds change as a function of frequency for an adult?
Some higher frequencies have a lower threshold (4 kHz, 2 kHz, 1 kHz, 0.5 kHz)
Some lower frequencies have higher thresholds (250 Hz, 125 Hz)
What is the time-intensity tradeoff of detection thresholds?
Longer duration = more energy
Shorter duration = less energy
What is the integration window and integration function for different signals?
Integration window: there is a maximum signal duration needed for a threshold, which is stimulus dependent
Tones: 300 ms (0.3 sec)
- Threshold will improve (decrease) until a certain point, then saturate, and the threshold will plateau
Clicks: 2-3 ms (0.003 sec)
- Threshold is going to worsen as inter-click duration increases until a certain time, at which it will saturate, and the threshold will stay constant (plateau)
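The tone case can be sketched with an idealized integration function: threshold falls by about 3 dB per doubling of duration (constant energy) until the integration window is reached, then plateaus. This assumes perfect energy integration, which real data only approximate. The 300 ms window comes from the notes above and the 10 dB plateau threshold is an invented value.

```python
import math

def tone_threshold_db(duration_ms, plateau_threshold_db=10.0, window_ms=300.0):
    """Threshold level (dB) assuming complete energy integration up to window_ms."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    effective_ms = min(duration_ms, window_ms)  # no benefit beyond the window
    return plateau_threshold_db + 10 * math.log10(window_ms / effective_ms)

for duration in (30, 75, 150, 300, 600):
    print(f"{duration:>3} ms: threshold about {tone_threshold_db(duration):.1f} dB")
```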
Describe the multiple looks theory.
Absolute thresholds of sounds depend on the duration.
Threshold intensity decreases with increasing duration because a longer stimulus provides more detection opportunities. (More chances through repeated sampling)
Viemeister and Wakefield (1991) Experiment 2
- Performance improved with 2 pulses instead of 1 pulse because there are twice as many opportunities to detect the signal
- Results suggested that some type of “integration” occurs
- Input is sampled at a fairly high rate and these samples are stored in memory and can be accessed selectively
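The "more chances" idea can be illustrated with a simple probability-summation sketch. It assumes each look is independent and has the same chance of detecting the signal, which is a simplification and not necessarily the model used in the study above.

```python
def p_detect_any(p_single_look, n_looks):
    """Probability that at least one of n independent looks detects the signal."""
    return 1 - (1 - p_single_look) ** n_looks

# One look vs. two looks (e.g. one pulse vs. two pulses).
for looks in (1, 2, 4):
    print(f"{looks} look(s): P(detect) = {p_detect_any(0.5, looks):.3f}")
```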
What is the history of psychoacoustics?
350 BC, Aristotle: suggested that sound is carried by air movement
1500, Leonardo da Vinci: motion of sound is in waves
1600, Galileo Galilei: found relationship between pitch and frequency
1700, Hooke: confirmed relationship between pitch and frequency
1711, Shore: created tuning fork; main method to study pitch
1800-1900s, Helmholtz: move from observation to collection of psychophysical data, stated in terms of physical acoustic variables
1966, Green and Swets: Signal Detection Theory
What is the method of constant stimuli? What are the pros and cons?
Test all signal levels and conditions.
Randomize order
Pros: Order and learning effects distributed over all conditions
What is the adaptive procedure? What are the pros and cons?
Starts easier and then gets harder.
Make rule for adaptation (e.g., 2 down, 1 up)
Choose number of turnarounds.
Average values at turnarounds to get JND (just noticeable difference)
Pros: Less time required to complete task
Cons:
- Can be affected by subject variables
- Learning effects may occur
- Get stuck in bad place if non-monotonic psychometric function
- Participant fatigue
- Do multiple measurements over multiple conditions with multiple randomizations
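The adaptive procedure can be sketched as a simulated 2-down, 1-up staircase. The simulated listener is an assumption: a two-interval forced-choice psychometric function running from 50% to 100% correct, with an invented true threshold, slope, starting level, and step size.

```python
import random
from statistics import NormalDist, mean

def simulate_staircase(true_threshold=30.0, slope=4.0, start=50.0, step=2.0,
                       n_reversals=8, seed=0, max_trials=10000):
    """Run a 2-down, 1-up track and return (mean of reversal levels, reversals)."""
    rng = random.Random(seed)
    listener = NormalDist(true_threshold, slope)
    level, correct_in_row, direction = start, 0, None
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        p_correct = 0.5 + 0.5 * listener.cdf(level)  # chance is 50% in 2AFC
        if rng.random() < p_correct:
            correct_in_row += 1
            if correct_in_row == 2:          # two correct in a row: make it harder
                correct_in_row = 0
                if direction == "up":
                    reversals.append(level)  # turnaround from up to down
                direction = "down"
                level -= step
        else:                                # one error: make it easier
            correct_in_row = 0
            if direction == "down":
                reversals.append(level)      # turnaround from down to up
            direction = "up"
            level += step
    return mean(reversals), reversals

estimate, reversals = simulate_staircase()
print(f"estimated threshold: {estimate:.1f} (from {len(reversals)} turnarounds)")
```

The track starts well above threshold (easy), descends, and then oscillates around the level where two correct responses in a row are about as likely as one error.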
What is the definition of masking?
Interaction of sounds
- Targets: things you want to hear
- Maskers/Interferers: things keeping you from hearing the target
If threshold doesn’t change when a masker is introduced, then there is no masking
If threshold does change when a masker is introduced, then there is masking
Define energetic masking
Peripheral for noise that is “certain” or “predictable”
- Neural excitation evoked by the competing speech exceeds the excitation produced by the target speech
- Caused by physical interactions between signal and masker
Define informational masking
Central for “uncertain” noise
- Interfering effect of the informational component of the masker, over and above its energetic masking effect
- Degradation of auditory detection or discrimination of a signal embedded in a context of other similar sounds
Describe the concept of the critical band.
Narrowband of frequencies surrounding the CF of the target tone
Only energy within a critical band of frequencies surrounding the signal is effective in masking the signal
- Cochlea consists of series of bandpass filters, called auditory filters
- BM responds to a limited range of frequencies, so each point corresponds to a filter with a different CF
Power Spectrum Model
- When detecting a tone in noise, a listener is assumed to be using the auditory filter with CF close to tone
- Filter passes signal, but removes much of the noise
- Only components of the noise that pass through the filter have an effect on masking the tone
- All the energy that matters is what is in the critical band
What is an excitation pattern?
Derived from calculations of auditory filter output as a function of their CF
Traveling wave gets larger as you go up in frequency on a linear scale
- Excitation pattern follows the shape of the traveling wave (opposite of the auditory filter shape)
- Asymmetrical pattern
Compare and contrast Psychophysical tuning curves with neural tuning curves.
Psychophysical tuning curves:
- Uses target and masking signals to likely measure the tuning curve of multiple neurons
- Target is fixed in level and the masker level is varied
- Threshold is the masked threshold of the target
- Sharper tips than neural tuning curves, but follow same asymmetric shape
Neural Tuning Curves
- Measure tuning curve for 1 neuron
- Threshold is measured as change in spike rate
- Target signal is varied in level
- One target signal and no masking signal
How is Basilar Membrane motion related to the critical band?
Basilar membrane vibrates in response to incoming acoustic stimuli
The traveling wave grows in amplitude until it hits the CF of the critical band
- After hitting the CF, the traveling wave dissipates quickly
What is the upward spread of masking?
Occurs when low-frequency sounds mask high-frequency sounds
If the masker is higher in intensity, then the target gets masked
A low frequency masker is more effective than a high frequency masker
What is the noise notch experiment?
Way to determine auditory filter shape that prevents off-frequency listening
Signal is fixed in frequency and masker is a noise with a band-stop or notch centered at the signal frequency
- Assume filter is symmetric on a linear frequency scale
As the width of the spectral notch increases, less and less noise passes through the auditory filter (Threshold of the signal improves)
What is the band-widening experiment?
Present pure tone in the presence of a broadband noise
Both signal and noise are presented simultaneously
Signal is kept constant (frequency & intensity)
Bandwidth of the noise is increased (spectrum level is kept constant)
Results:
- Masked threshold of the signal increased
- After a certain bandwidth, no more change in the signal threshold
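The band-widening result can be sketched with the power spectrum model: only noise inside the critical band counts, so the masked threshold tracks the noise power within min(noise bandwidth, critical bandwidth). The spectrum level, the 160 Hz critical bandwidth, and the efficiency constant `k_db` are illustrative assumptions.

```python
import math

def masked_threshold_db(noise_bw_hz, spectrum_level_db=30.0,
                        critical_bw_hz=160.0, k_db=0.0):
    """Masked threshold (dB) from the noise power falling inside the critical band."""
    effective_bw_hz = min(noise_bw_hz, critical_bw_hz)
    noise_power_db = spectrum_level_db + 10 * math.log10(effective_bw_hz)
    return noise_power_db + k_db

# Threshold grows about 3 dB per doubling of bandwidth, then stops changing.
for bw in (20, 40, 80, 160, 320, 640):
    print(f"{bw:>3} Hz noise: threshold about {masked_threshold_db(bw):.1f} dB")
```

The bandwidth at which the threshold stops growing is the estimate of the critical bandwidth.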
What are the effects of level on the auditory filter shape?
Width of critical band increases with level
- Due to auditory nerve recruitment
Tuning is broader with higher intensity
- More and more neurons are responding when the width of the CB increases
Describe tonal masking.
Target = signal tone (sine tone 1); Masker = masking tone (sine tone 2)
Target and masker are presented simultaneously
2 alternative-forced choice task
- 1 interval has masker, 1 interval has target + masker
Can result in the listener perceiving beats when f1 and f2 are close
- Summation of the 2 tones leads to an increase in intensity
- More masking on the high frequency side
- Normal combination tones (adds energy at all multiples of the original center frequency)
- Evidence of the cochlear nonlinearity
Perception of beating can be reduced or go away if there is a short enough duration
Describe co-modulation masking release (Hall et al. 1984).
Not just energy that matters in masking
- If the temporal information becomes predictable, then the thresholds can start improving
- Only true for co-modulated noise with a repeated phase (temporal information)
Wideband noise should have fast modulations in the envelope
On average, 100 Hz tone is modulated at 64 Hz (64%)
Signals:
- Adding similar modulations across different frequency bands
Results
- Thresholds increased with bandwidth with the random noise up until a certain point
- Co-modulated noise increases in threshold and then decreases (Have envelope modulations over multiple critical bands (auditory filters))
What is profile analysis?
Spectral Shape Discrimination
Take a complex tone
- Listener has to determine if the tones presented in intervals 1 and 2 are different
- Compare the frequency energy from trial to trial
- Include level roving from trial to trial
a. Removes intensity cues and forces subjects to pay attention to the different spectral shapes
More tones give you more auditory filters for comparison
- Can make a subject’s threshold worse as a result of masking
Evidence for cross frequency comparison
Describe the typical psychophysical tuning curves in NH, HI, and CI listeners.
NH: sharpest
- Presence of active component in outer hair cells
HI: middle
- Some hair cells that are still firing
- Broader tuning curves, but not the broadest
- Cannot be fixed with hearing aid use
CI: broadest
- No active component in the auditory system
- No hair cells are being activated
What is temporal masking?
Not simultaneous
Generally, less effective than simultaneous masking
More “real world” such as speech, music, etc.
Forward masking: noise occurs before the tone
Possible Forward Masking Mechanisms:
- BM Ringing
- Short-term adaptation in auditory nerve
- Persistent activity in central nervous system
- Central inhibition
- Efferent system
Backward masking: noise occurs after the tone
Compare NH and HI psychophysical tuning curves. Why are these results important for masking?
Carney and Nelson (1983)
Purpose: measure tuning curves for NH listeners at lower levels and higher levels (more comparable to people with HL)
- Does the level make the tuning broader or does the outer hair cell loss make the tuning broader?
Stimuli: 2 tones played for half a second
- Not worried about spectral splatter
- If the 2 tones are close together in frequency, then you should be concerned with beats
NH listeners:
- PTCs were sharp
- At higher probe levels, broader tuning curves were recorded, indicating that level does matter
HI listeners:
- PTCs were flat, erratic, and/or inverted when compared to NH PTCs
What are the effects of informational masking (e.g., speech on speech masking)?
Complex sounds
Masking caused by stimulus uncertainty
Put a tone at 1000 Hz and tell the subject their job is to tell if that tone moved 100 Hz in frequency
- Complete the task while masking tones are presented
- The task becomes much harder when other tones are present: which tone should the subject pay attention to?
When speech is used as both target and masker (informational masking), subjects get confused and cannot differentiate between the two
What are equal loudness contours?
Not based on how loud a tone is, but the level that a 1000-Hz tone must have in order to sound equally loud (phon)
Subject adjusts the level of a 1000-Hz tone until it has the same loudness as the test sound
ISO standard equal-loudness contours
- Equal loudness contours for binaural listening for loudness levels from 10-100 phons
- Similar shape to the MAF (absolute threshold) curve
- Tend to become flatter at high loudness levels
Rate of growth of loudness level with increasing level is greater for low/very high frequencies than for middle frequencies
Phon: same loudness as a 1000 Hz tone at a certain loudness level
What is the effect of BW on loudness?
Low intensity levels: loudness is nearly independent of BW
Moderate intensity levels or greater:
- If BW>CBW, loudness increases
- If BW<CBW, loudness is roughly independent of BW
What is the effect of BW on excitation patterns?
As BW increases, loudness patterns decrease in height and become broader
As BW increases beyond CBW:
- Will yield broader patterns
Any time you’re within the critical band, the excitation pattern will look the same
Recruiting other neurons will change the excitation pattern (broader patterns)
What is Weber’s Law? What is the near miss?
Just-noticeable difference (ΔS) is a constant proportion (k) of the initial stimulus magnitude (S): ΔS = kS
Near miss
- Weber’s Law does not hold for sine tones
2 possible reasons for the near miss:
- Non-linear change in excitation patterns with increasing level
- Ability of individuals to use different auditory filters (More improvement and change in slope)
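To make Weber's Law concrete, the sketch below treats the stimulus magnitude as intensity and uses an invented Weber fraction k. Under strict Weber's Law the just-noticeable change expressed in dB is the same at every level, and the near miss is the observation that for sine tones it actually shrinks somewhat as level goes up.

```python
import math

def jnd(magnitude, k):
    """Just-noticeable difference under Weber's Law: a fixed proportion k of magnitude."""
    return k * magnitude

def jnd_in_db(k):
    """The same increment expressed as a level change in dB: 10 * log10(1 + k)."""
    return 10 * math.log10(1 + k)

k = 0.25  # hypothetical Weber fraction
for intensity in (1.0, 100.0, 10000.0):
    print(f"S = {intensity:>7}: JND = {jnd(intensity, k)}, i.e. {jnd_in_db(k):.2f} dB")
```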
How can humans discriminate intensity, especially at loud presentation levels?
If intensity discrimination is based on changes in the firing rates of neurons, we expect discrimination to worsen above 60 dB SPL
Mechanisms responsible for coding intensity changes at high intensities:
- Role of neuron firing rate
- Changes in center of excitation patterns
- Spreading of excitation patterns
- If intensity increases, the width of the skirts increases
- Skirts matter; recruit more filters to encode intensity
- Phase locking
- Helps encode spectral definition for suprathreshold vowels
- Limit to phase locking (May not be critical because intensity discrimination can occur at high frequencies (>5 kHz))
What is loudness fatigue?
Subject’s absolute threshold is recorded at a particular frequency, after which the subject would be exposed to a fatiguing tone of a particular frequency and intensity for a period of time.
The threshold at that frequency is recorded again and the shift in threshold is considered to be a measure of auditory fatigue.
What is adaptation?
Decreased firing rate over time when presented with the same stimulus at a fixed intensity
Way to measure temporary threshold shifts
What is TTS (Temporary Threshold Shift)?
Sometimes called synaptopathy (hidden hearing loss)
Logarithmically related to duration up to 8-12 hours
TTS (mostly) decreases after time
Increases with:
- Increasing level (L)
- Increasing duration (D)
- Frequency of exposure stimulus (Fe)
- Frequency of test stimulus (Ft)
- Decreasing recovery interval (Rl)
Two processes involved:
- Short-lived recovery process that may correspond to neural activity
- Longer process that involves hair cell/metabolic changes
What’s really happening with TTS?
- Synapses to auditory nerves are getting damaged
- Over short period of time, hair cells can get themselves back in order and cause firing patterns that were similar to before (but damage has been done)
What is loudness recruitment?
Loudness recruitment is an abnormally-rapid growth in loudness with increases in suprathreshold stimulus intensity
Perception of loudness changes with sensorineural hearing loss
- Low levels no longer audible
- High levels are the same
Effect simulated by measuring loudness in masking noise
What are 2 explanations for loudness recruitment?
Loose tuning (more auditory filters)
- Broader critical bands, which means there are wider excitation patterns
- If you have a broader excitation pattern, you’re grabbing more neurons with increasing intensity
Loss of nonlinear process (compression)
Calculate equivalent threshold shifts using the equal energy rule.
Sounds of same energy create same threshold shift
Every 3 dB increase in level halves the duration that produces the same threshold shift (every 3 dB decrease doubles it)
- Duration = 10 minutes, Level = 120 dB
- Duration = 20 minutes, Level = 117 dB
- Duration = 40 minutes, Level = 114 dB
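The 3 dB exchange can be written as a short calculation. The sketch below takes 10 minutes at 120 dB as the reference exposure and halves or doubles the duration for every 3 dB step. The function and parameter names are made up for this example.

```python
def equivalent_duration_min(level_db, ref_level_db=120.0,
                            ref_duration_min=10.0, exchange_db=3.0):
    """Duration giving the same threshold shift as the reference exposure."""
    return ref_duration_min * 2 ** ((ref_level_db - level_db) / exchange_db)

# Each 3 dB drop in level doubles the equivalent duration.
for level in (123, 120, 117):
    print(f"{level} dB: {equivalent_duration_min(level):.0f} minutes")
```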